Break The Rules: Why Rule-Based DLP is Failing

Once upon a time, cybersecurity technology was governed by rules

Administrators had to pre-define every possible event, sorting them into one of two categories: events that were allowed to take place, and events that needed to be restricted. And so, data loss prevention used to take just one form: legacy rule-based systems.

These have three key weaknesses:

  • Ineffective implementation
  • End user burden
  • Failure to effectively prevent data loss across the spectrum

Traditional DLP platforms work - in theory - by identifying specific properties or text, as well as security classification tags, when a document is being sent within or outside the network.

These properties are basic and sensitive information can fall through the cracks

Let’s use an example from the healthcare sector: an email that contains patient data, but has no attachments and no keywords, will likely never be caught by the system. The inflexibility of the rules that govern this form of DLP means that the system needs to be constantly reworked: more and more rules must be written as exceptions pop up like mushrooms.
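The healthcare example can be sketched in a few lines of Python. This is only an illustration of the general idea, not how any particular DLP product works: the `Email` class, the `BLOCKED_KEYWORDS` and `BLOCKED_TAGS` rule sets, and the `rule_based_flag` function are all hypothetical names invented for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Email:
    recipient: str
    body: str
    tags: set = field(default_factory=set)  # classification tags, if any


# Hypothetical rule set; a real deployment would carry far more rules.
BLOCKED_KEYWORDS = {"confidential", "patient record", "diagnosis"}
BLOCKED_TAGS = {"restricted"}


def rule_based_flag(email: Email) -> bool:
    """Return True if any predefined rule matches the email."""
    text = email.body.lower()
    if any(keyword in text for keyword in BLOCKED_KEYWORDS):
        return True
    if email.tags & BLOCKED_TAGS:
        return True
    return False


# Contains a listed keyword, so the rules catch it.
caught = Email("colleague@hospital.example",
               "Attached is the confidential patient record.")

# Contains patient data but no listed keyword or tag, so it slips through.
missed = Email("friend@mail.example",
               "Mr. Smith in bed 4 tested positive yesterday; please call his wife.")

print(rule_based_flag(caught))  # True
print(rule_based_flag(missed))  # False
```

Closing the gap means adding yet another keyword or pattern, and the next exception then needs another rule of its own.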

Because the system can never catch up with its users, it inevitably produces both false positives and false negatives. Heuristic policies struggle to tell sensitive information from non-sensitive information, creating two problems: alerts sent when none is necessary, and alerts not sent when one is. As the false positives pile up, users tend to tune the alerts out, so the alerts become less and less effective over time.
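To make the two kinds of error concrete, here is a minimal sketch that counts them for a batch of emails. The `alert_errors` function and the sample data are assumptions made up for illustration; they are not measurements from any real system.

```python
def alert_errors(alerts, sensitive):
    """Count false positives and false negatives, and compute alert precision.

    alerts[i]    -- True if the rules raised an alert for email i
    sensitive[i] -- True if email i really contained sensitive data
    """
    pairs = list(zip(alerts, sensitive))
    true_pos = sum(1 for a, s in pairs if a and s)
    false_pos = sum(1 for a, s in pairs if a and not s)   # unnecessary alerts
    false_neg = sum(1 for a, s in pairs if not a and s)   # missed leaks
    raised = true_pos + false_pos
    precision = true_pos / raised if raised else 0.0
    return false_pos, false_neg, precision


# Made-up sample: five emails, three alerts, only one alert justified.
alerts = [True, True, False, False, True]
sensitive = [True, False, True, False, False]

false_pos, false_neg, precision = alert_errors(alerts, sensitive)
print(false_pos, false_neg)  # 2 1
```

In this made-up sample only one alert in three is justified, which is the kind of ratio that teaches people to click through warnings without reading them.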

Not only is traditional DLP largely constrained from a technical viewpoint, but it also causes problems from a UX perspective

The administrative overhead is a nightmare, as operators have to define, maintain, and iterate on an ever-growing list of rules. Implementation suffers too, because end users require training that ultimately forces them to change their behavior. Let’s return to our healthcare example: a doctor who has just worked a fifty-hour shift is probably not going to be rigorously double-checking an email for a misaddressed “to” field.

Rule-based platforms are only effective against threats that occur in tightly controlled environments, where conditions are predefined and unchanging. Ultimately, these pitfalls of policy mean one thing: a rule-based system is only as good as its user. And while rule-based systems are able to secure the network, the majority of threats originate at the endpoint.

Human behavior is dynamic, unpredictable, and open

Rule-based systems just cannot keep up. Human error is the largest source of data breaches reported to the Information Commissioner’s Office (ICO). While the focus is on threats caused by malicious external parties, like ransomware and malware, simple mistakes (such as sending an email to the wrong person) are statistically much more likely to occur.

Unlike rule-based methods, machine learning is probabilistic: it uses statistical models rather than deterministic rules. This means that machine-intelligent DLP is easy to implement, light-touch (with no burden on the end user), and effective at preventing data loss from all angles.

The basic operation of a machine learning process is to analyze historical outcomes in order to forecast future ones: it uses its knowledge of the past to make informed predictions about the future, stopping data breaches in real time. Machine intelligence allows enterprises to be proactive about their data security, taking ownership of a solution that is more preventative than any alternative.
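As a rough illustration of learning from history rather than from hand-written rules, the sketch below scores how unusual a recipient is for a given sender, based only on past sending behavior. It is a deliberately simple frequency model with Laplace smoothing, not a description of any vendor's engine; `build_history`, `anomaly_score`, the 0.9 threshold, and the sample addresses are all assumptions made for this example.

```python
from collections import Counter


def build_history(sent_log):
    """Count how often each (sender, recipient) pair appears in past mail."""
    return Counter(sent_log)


def anomaly_score(history, sender, recipient):
    """Return a score between 0 and 1; higher means less typical for this sender."""
    total = sum(count for (s, _), count in history.items() if s == sender)
    known = {r for (s, r) in history if s == sender}
    seen = history[(sender, recipient)]  # 0 for a never-seen recipient
    # Laplace-smoothed estimate of P(recipient | sender), with one extra
    # slot reserved for "a recipient this sender has never emailed".
    probability = (seen + 1) / (total + len(known) + 1)
    return 1 - probability


# Made-up history: one doctor, two regular recipients.
history = build_history(
    [("dr.lee", "ward@hospital.example")] * 8
    + [("dr.lee", "lab@hospital.example")] * 2
)

THRESHOLD = 0.9  # illustrative cut-off, not a recommended value

usual = anomaly_score(history, "dr.lee", "ward@hospital.example")
typo = anomaly_score(history, "dr.lee", "ward@hospita1.example")

print(usual > THRESHOLD)  # False: a familiar recipient
print(typo > THRESHOLD)   # True: never seen before, worth a warning
```

No one wrote a rule about the misspelled address. The warning follows from the sender's own history, and the scores shift as that history grows.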

Machine intelligent systems aren’t just more effective at stopping data breaches: they’re also better for the end user

ML systems come with virtually no administrative overhead - and the platform is invisible to the end user, working hard behind the scenes to make sure that the show goes on. Not only are they light-touch from a UX perspective, but they also allow for real-time threat reporting and risk analysis, demonstrating exactly what is being done in advance to keep data secure and prevent data loss.

Machine learning is the most effective and the most efficient way to prevent highly sensitive information from being leaked outside organizations, whether maliciously or inadvertently. Forget the rule-based systems of the past - machine-intelligent DLP is the future of cybersecurity.

About Tessian

Tessian is building the world’s first Human Layer Security platform to fulfil our mission to keep the world’s most sensitive data and systems private and secure. Using stateful machine learning to analyze historical email data, Tessian’s Parallax Engine can predict, for this user at this point in time, whether an email looks like a security threat.

Book a demo to learn more about our email security platform.