Honesty about Effective Cyber Risk Modeling for Insurance Companies

Background
As cybersecurity plays a larger and larger role in our daily lives and the global economy, it is increasingly important that we accurately understand and price cyber-related risks. Doing this correctly, transparently, and reliably is an important part of engendering better behavior in both public and private institutions as a function of governance. But in this nascent industry, cyber risk modeling companies and some carriers are seeking to profit from the growth opportunities in a new class of business without adequate consideration of the “tail” risk. In addition to questions about specific policy wordings (e.g. attribution-related clauses which are inappropriate at best), many seem to be attempting to drown clients and regulators in a tidal wave of “information” that correlates poorly with whether or not a company will be breached. External “scan” data is necessary but not sufficient for modeling cyber risk in terms of either frequency or severity.

All too often, current data brokers and modelers make bold assertions about their ability to correctly model the core aspects of cyber-related risks with models that inadequately consider the context necessary to be predictive. The novelty and uniqueness of modeling cyber-related risks, and the difficulty of correctly linking measures of exposure to the propensity to incur losses, mean that all modeling claims should be met with substantial caution. Any honest modeling firm, broker or (re)insurer should be candid with their clients about modeling approaches and associated limitations. There are three key features unique to cyber that make both accuracy and precision difficult for even deterministic models, a difficulty that is further amplified when attempting to create stochastic models:

  1. Access to the data necessary to build a holistic model.
  2. The heterogeneity of the ultimate risk pool under consideration.
  3. Proper data collection, aggregation and sampling.

Data Accessibility
First, access to the data necessary to build successful and sufficiently complete models is a major current challenge. Although a substantial amount of data is collected by insureds, it is held in fragments with no incentive to share it. Historical loss events and claims handling processes also do not really link the pre-event health data or telemetry with mid-event or post-event data. In other words, modelers are sometimes promising a blow-by-blow account of an event when they only have a few snapshots of what happened. Further, there’s no obligation to disclose the specifics of breaches (or, even more significantly, of attacks not known to have caused a data loss event but which may have caused integrity or availability issues) that could be used to improve modeling. This has also meant that most models being employed are limited exclusively to Confidentiality concerns and ignore the Integrity and Availability concerns that complete the “CIA” triad typically referenced in information security.

This challenge is further compounded by the current approach to network and organization characterization for both third-party risk and health characterization. Companies today are far too focused on the limited external indicators of organizational security, even though modern network architectures and segmentation severely limit visibility from external sources and therefore render many of these metrics dubious at best. Internal metrics about network state, health, and configuration are much more informative and important when considering the organizational security posture and its correlation with eventual losses or lack thereof (at least outside of the most grossly negligent errors).

Some modelers think that, even with limited data, they can establish correlations that inform us as to the underlying causes of loss. However, even when using the most methodical process and logical criteria, limited data means that models can only reveal relative risk. While it’s a helpful metric, there’s a substantial difference between relative and absolute risk.

Additionally, models are currently being fit to historical data that may have little to do with the evolving tactics, techniques, and procedures (TTPs) of both state and non-state threat actors in cyberspace. Traditional actuarial techniques focus on extrapolation of historical experience within a given peril into the future based on consistent relationships between various parameters. In cyber, the dynamic between attacker and defender mixes with a devastating rate of technological change. For example, machine learning (ML) for anti-virus and malware detection was initially quite promising, but binary executable transforms and other techniques already make it possible to create unique malware that can easily evade even the latest ML systems. Similarly, memory-related attacks which were common several years ago are less common today as credential-based exploits and tooling have massively increased, enabling stealthier breaches which leverage lateral movement techniques to avoid detection for longer periods of time. In other words, the ongoing changes across people, process, technology, and data sources (on the side of both attackers and defenders) mean that we should continue to look at all the data available while recognizing that some retrospective data may not maintain value or meaning as the techniques and capabilities of participants continue to evolve. As new threats and vulnerabilities continue to arise, retrospectively-dominated forecasting and modeling techniques are unlikely to be appropriate. A focus on simulation-based techniques is best in order to explore a broader range of futures rather than extrapolating from past events.
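
The gap between relative and absolute risk noted above can be made concrete with a small worked example. The sketch below is purely illustrative: the base rates, the 2x relative risk, and the severity figure are invented numbers, not estimates from any real portfolio. It shows that the statement "company B is twice as risky as company A" is compatible with very different absolute expected losses, depending on a base rate that limited data cannot pin down.

```python
def expected_annual_loss(base_rate, relative_risk, severity):
    """Absolute expected loss = absolute breach probability x severity."""
    probability = min(1.0, base_rate * relative_risk)
    return probability * severity


# All figures below are hypothetical, chosen only for illustration.
SEVERITY = 1_000_000.0          # assumed loss given a breach
RELATIVE_RISK_B_VS_A = 2.0      # what a relative-risk model can tell us
CANDIDATE_BASE_RATES = {"low": 0.005, "high": 0.10}  # what it cannot

results = {}
for label, base_rate in CANDIDATE_BASE_RATES.items():
    loss_a = expected_annual_loss(base_rate, 1.0, SEVERITY)
    loss_b = expected_annual_loss(base_rate, RELATIVE_RISK_B_VS_A, SEVERITY)
    results[label] = (loss_a, loss_b)
    print(f"{label} base rate: A={loss_a:,.0f} B={loss_b:,.0f} "
          f"ratio={loss_b / loss_a:.1f}")
```

In both cases the ratio between B and A is the same, yet the expected losses differ by a factor of twenty, which is the quantity that actually matters for pricing.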

Risk Pool Heterogeneity
The second problem in modeling cyber risk is more profound: the risk pool is extremely diverse. Understanding financial risk requires an understanding of a heterogeneous group of threat actors (including strategically motivated nation-states and non-state actors with their own economic motives), the companies/assets themselves (with different levels of investment and culture), and ultimately the business impact (stemming from substantially different process exposure to the underlying IT systems and data). Typically this means we develop a comprehensive view of the assets of interest, a vulnerability model for those assets, a hazard model, and then a financial model which can be assessed across each of the many different scenarios to be evaluated. The practical dynamics of cyber risk as an adversarial domain, where a victim can be either a strategic or an opportunistic target, undermine the typical insurance industry approach of treating clients and losses as a homogeneous risk pool. Additionally, unlike other domains of risk assessment such as natural catastrophe, geography is often not a useful factor when looking at potential risk accumulation, since common technologies, services, or assets do not cluster within traditional geographic boundaries, nor is their impact limited to them. In a largely virtual world such boundaries have little meaning. Cyber’s collection of heterogeneous risks leads to a constantly shifting universe, which makes accurate assessment more difficult than it is under geography- and sector-based accumulation management approaches. Cyber requires more dynamic and diverse factors to be accounted for to address accumulation risks.
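
The accumulation point above can be illustrated with a toy portfolio. This is a minimal sketch under stated assumptions: the `Insured` record, the `accumulate` helper, the provider names, and the limits are all hypothetical and exist only to show how the same book of business looks when grouped by region versus by a shared technology dependency.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Insured:
    name: str
    region: str
    cloud_provider: str  # hypothetical shared dependency
    limit: float         # exposed policy limit, in millions


def accumulate(portfolio, key):
    """Sum exposed limits grouped by the given attribute name."""
    totals = defaultdict(float)
    for insured in portfolio:
        totals[getattr(insured, key)] += insured.limit
    return dict(totals)


# Invented portfolio: geographically spread, technologically concentrated.
PORTFOLIO = [
    Insured("A", "EU", "provider_x", 5.0),
    Insured("B", "US", "provider_x", 10.0),
    Insured("C", "APAC", "provider_x", 5.0),
    Insured("D", "US", "provider_y", 2.0),
]

by_region = accumulate(PORTFOLIO, "region")
by_provider = accumulate(PORTFOLIO, "cloud_provider")
print("worst region:  ", max(by_region.values()))
print("worst provider:", max(by_provider.values()))
```

A geographic view suggests the book is diversified, while the dependency view shows most of the limit exposed to a single provider event. A real accumulation model would need many such dimensions (software, services, suppliers), and they change over time.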

Another common claim from modelers is that they have completed the development of full stochastic models. Such claims should currently be taken with a grain of salt. Purely stochastic modeling approaches are completely inappropriate for cyber. While the ability to sample from a range of distributions for hazard, vulnerability, asset composition, and financial loss is a step towards a sustainable modeling regime, the continued quest for precision without accuracy remains foolish. Cybersecurity models must address the inconvenient truth that adversaries can learn, and the models must reflect that. Traditional stochastic modeling techniques from natural perils are inappropriate for non-ergodic systems like cybersecurity and terrorism, where adversaries learn and thus the system cannot sample randomly from a distribution when considering the hazard component of the model. There are ways of sampling more intelligently, but hurricanes and storms are not sentient beings with specific intent or the ability to transfer knowledge about tactics, techniques, procedures or even impacts from others’ exploits.
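
To see why a learning adversary breaks the assumption of sampling from a fixed hazard distribution, consider the following sketch. It is not a method proposed in this article: the attack mix, the per-technique success probabilities, and the multiplicative update rule used to represent "learning" are all assumptions chosen for illustration. The function computes the expected breach rate per round, first with a fixed historical mix of techniques and then with an attacker who shifts effort toward whatever works.

```python
def breach_rate_by_round(weights, success_prob, rounds, learning_rate=0.0):
    """Expected breach rate per round for a given mix of attack techniques.

    With learning_rate=0 the mix never changes (a stationary hazard).
    With learning_rate>0 the attacker reweights toward techniques with
    higher success probability (a hypothetical learning rule).
    """
    weights = list(weights)
    rates = []
    for _ in range(rounds):
        rates.append(sum(w * p for w, p in zip(weights, success_prob)))
        weights = [w * (1.0 + learning_rate * p)
                   for w, p in zip(weights, success_prob)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return rates


# Invented numbers: historically, the most effective technique is the rarest.
HISTORICAL_MIX = [0.7, 0.2, 0.1]
SUCCESS_PROB = [0.02, 0.05, 0.30]

stationary = breach_rate_by_round(HISTORICAL_MIX, SUCCESS_PROB, rounds=20)
adaptive = breach_rate_by_round(HISTORICAL_MIX, SUCCESS_PROB, rounds=20,
                                learning_rate=5.0)
print(f"stationary final rate: {stationary[-1]:.3f}")
print(f"adaptive final rate:   {adaptive[-1]:.3f}")
```

Both runs start from the same historical mix, so a model fit to early history would match either one. Only the stationary run stays there; under the assumed learning rule the breach rate climbs toward the success rate of the most effective technique, which historical frequencies alone would not reveal.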

About the Authors
Jason Crabtree is CEO and Co-founder of QOMPLX
Alastair Speare-Cole is Managing Director, QOMPLX:Insurance

QOMPLX

Published a month ago