# What's the probability of a data breach happening to you? Or is that the wrong question?

Correctly calculating the probability of risk is becoming critical to organizations. And it’s not just because it is essential and fundamental to good Risk Management practice, but also because new laws such as GDPR are mandating it. Security measures must be appropriate to the risk, and the risk is suffering a data breach. So, calculating the probability of a data breach happening, regardless of scope, is vital to determining appropriate security measures.

ISACA, previously known as the Information Systems Audit and Control Association but now known solely by its acronym, talks about the probability of risk as:

**RISK = PROBABILITY x IMPACT**

However, the word *probability* is frequently replaced by *likelihood*. Beware! These two words do not mean the same thing. Probabilities have numerical values derived from statistical analysis. Statistics is a formal discipline using somewhat complex mathematics. This discipline is not well understood and statistics are often misused. This was recognized in the 19th century by the statement popularized by Mark Twain: "There are three kinds of lies: lies, damned lies, and statistics."

All we need to do now is agree on a numerical value for probability based on statistical information. Unfortunately, if we asked three experts to arrive at a probability for a serious data breach occurring, we would end up with three quite different answers. This would be because each expert would have their own preferred statistics, drawn from their own personal experiences. In addition to the lack of agreement from one expert to the next, the real world gets involved. Boards don’t spend millions on cybersecurity defenses to hear that the probability of a serious data breach is high, or even medium. We are driven to implement security controls until we can give estimations that the probability of a data breach is low.

Once we have arrived at a fixed conclusion that the probability of a serious data breach occurring is low, only then can we all go about our business with a confident swagger.

Imagine the look of horror when, twelve months later, we are informed that a data breach occurred three months ago and that an unknown quantity of sensitive information has been exfiltrated. Yet this is the pattern of events for the vast majority of breached organizations.

**The Breach Chain**

Probability is not fixed in stone. Probability is conditional and changes with circumstance and time.

The odds of being struck by lightning are very low. They change, however, if you walk out into a thunderstorm holding a 10-meter copper pole. (Don’t try this at home.)

The breach chain (shown above) highlights the common sequence of events leading to a data breach. The first link in the chain is the encounter stage: we encounter malware every day, through email, the web, personal devices, and more. The probability of any single encounter leading to a data breach is low. However, because encounters keep accumulating, at some point one of them will result in an infection of our internal network, the second event in the breach chain.
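To see why a low per-encounter probability still ends in an infection, here is a minimal sketch of the arithmetic. It assumes that encounters are independent and that each carries the same small chance of infection, which is a simplification. The function name and the 1-in-10,000 figure are hypothetical and chosen only for illustration.

```python
def prob_at_least_one_infection(p_per_encounter: float, encounters: int) -> float:
    """Chance that at least one of `encounters` encounters leads to an infection.

    Assumes independent encounters with an identical per-encounter
    probability (an illustrative simplification, not a measured model).
    """
    if not 0.0 <= p_per_encounter <= 1.0:
        raise ValueError("p_per_encounter must be between 0 and 1")
    if encounters < 0:
        raise ValueError("encounters must be non-negative")
    # P(no infection in n tries) = (1 - p)^n, so P(at least one) = 1 - (1 - p)^n
    return 1.0 - (1.0 - p_per_encounter) ** encounters


if __name__ == "__main__":
    p = 0.0001  # hypothetical: 1 in 10,000 encounters results in an infection
    for n in (100, 10_000, 100_000):
        print(f"{n:>7} encounters -> {prob_at_least_one_infection(p, n):.4f}")
```

Under these assumed numbers, 10,000 encounters already put the chance of at least one infection at roughly 63 percent, even though each individual encounter is almost always harmless.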

**Persistence**

Not all infections are created equal. Some infections, like cryptominers, are nuisances. Others, like point-of-sale (POS) malware, have a very different and more damaging set of capabilities and therefore carry a higher level of risk. Finally, there is persistence: how long highly capable threats are able to exist in your environment. The longer high-capability threats are in your environment, the greater the probability of a serious data breach.

Being able to spot conditional changes through the breach chain is the key to minimizing the probability of a data breach occurring. Accordingly, we can simplify the calculation of risk to focus on the amount of time various threats are in your network.

**PROBABILITY = THREAT CAPABILITY x TIME**

Substituting into the ISACA version, it becomes:

**RISK = THREAT CAPABILITY x TIME x IMPACT**

With this definition of risk, you can align your organization around continuous vigilance across the breach chain, looking for changes to the probability of a data breach happening. Instead of stating that the risk of a data breach is fixed at low, you can now say how you monitor changes to probability when breach chain events occur. When an event (or realized threat capability) indicates an increase in the probability of a data breach, you are able to remediate the threat and the associated increase in probability, returning risk to a low level.
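As a rough illustration of RISK = THREAT CAPABILITY x TIME x IMPACT, the sketch below scores a few threats and ranks them for remediation. The 1-to-5 scales for capability and impact, the use of days as the unit of time, and the `Threat` and `risk_score` names are all assumptions made for this example. The result is a relative index for comparing threats, not a statistical probability.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    name: str
    capability: int    # hypothetical scale: 1 (nuisance) to 5 (highly capable)
    days_present: int  # persistence: time the threat has been in the network
    impact: int        # hypothetical scale: 1 (minor) to 5 (severe)


def risk_score(threat: Threat) -> int:
    """RISK = THREAT CAPABILITY x TIME x IMPACT, as a unitless relative index."""
    return threat.capability * threat.days_present * threat.impact


def rank_by_risk(threats: list[Threat]) -> list[Threat]:
    """Return threats ordered from highest to lowest risk score."""
    return sorted(threats, key=risk_score, reverse=True)


if __name__ == "__main__":
    observed = [
        Threat("cryptominer", capability=1, days_present=30, impact=1),
        Threat("POS malware", capability=5, days_present=2, impact=5),
    ]
    for threat in rank_by_risk(observed):
        print(f"{threat.name}: {risk_score(threat)}")
```

With these made-up inputs, POS malware that has been present for two days already outranks a cryptominer that has been present for a month, and its score keeps climbing for every additional day it persists. That is the change in probability the monitoring is meant to catch.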

*Andy Norton is director of threat intelligence at Lastline.*