Monday, July 8, 2013

The key to measuring cyber security: inferences, not calculations

(It's getting very late here, so my apologies for any slips or gaps in communication.)

In many of the previous posts on the Ten Dimensions of Cyber Security Performance, I've hinted or suggested that these could be measured as a performance index.  But I'm sure many readers have been frustrated because I haven't spelled out any details or given examples.  Still other readers will be skeptical that this can be done at all.

Sorry about that.  There's only so much I can put in each post without them becoming book chapters. In this post, I'll describe an approach and general method that I believe will work for all or nearly all the performance dimensions.  At the time of writing, this is in the idea stage, and thus it needs to be tested and proven in practice.  Still, I suggest it's worthy of consideration.

[Update 7/19/2013:  After some searching in Google Scholar, I discovered that the method I'm suggesting below is called Bayesian Structural Equation Modeling.  I'm very glad to see that it is an established method that has both substantial theory and software tool support.  I hope to start exploring it in the near future.  I'll post my results as I go.]

Here are my ideas on how to measure performance in each dimension.

Set objectives for each dimension

This is the prerequisite for any measurement.  "Performance" is relative to a set of objectives defined by managers in the context of their overall goals, objectives, and resources.  Management by Objectives (MBO) was originally proposed and popularized by management expert Peter Drucker.  MBO is described in his books Managing for Results and Management: Tasks, Practices, and Responsibilities.

Define a performance index

I recommend using a dimensionless performance index on an interval or ratio scale, similar to the SAT score, FICO credit score, or the Consumer Confidence Index.  For now, let's assume that the performance index ranges from 0 to 100, where "0" means "clear evidence of a complete lack of performance, the lowest possible" and "100" means "clear evidence of the highest possible performance".  An interval on the scale (e.g. 10) might represent either a linear difference in performance or possibly a multiple of the log of the difference.  I would conjecture that a large population of organizations of similar maturity would form a bell-shaped (Gaussian) distribution on the performance index, with a single mode.  (Of course, this needs to be tested.)

It's probably a good idea to scale the performance index by levels in the capability maturity model.  Thus, CMM Level 1 would have an index ranging from 0 to 100, CMM Level 2 would have its own index ranging from 0 to 100, and so on.  Reaching "100" would then mean "you are performing as well as possible at the current maturity level".  It might be possible to calibrate these index ranges against each other to estimate a performance index across all maturity levels, but this may not add much value in practice to guide management decisions or plans.

Estimate the index using inference from evidence rather than calculations

This is my big "aha!":  the best way to measure aggregate performance on each of these dimensions is through an inference process based on a mass of evidence rather than through a calculation process that merely uses arithmetic to combine lower-level metrics.

Most performance indexes I've seen are weighted sums of their components.  Two simple examples are the Dow Jones Industrial Average and the S&P 500.  Each stock in the index is given a "weight" which, when multiplied by the stock price or market value, equals that stock's contribution to the total index value.  Thus, the composite index is a linear sum of its components.
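In code, such a weighted-sum index is just a linear combination.  Here's a minimal sketch; the component metric names, values, and weights are entirely hypothetical:

```python
# A minimal sketch of a weighted-sum composite index, in the spirit of a
# stock-market average. Component metrics and weights are hypothetical.
components = {
    "patch_latency_score": 62.0,   # each component already scaled 0-100
    "training_coverage": 88.0,
    "incident_response": 45.0,
}
weights = {
    "patch_latency_score": 0.5,
    "training_coverage": 0.3,
    "incident_response": 0.2,
}

# The composite index is a linear sum: weight times component value.
index = sum(weights[k] * components[k] for k in components)
# 0.5*62 + 0.3*88 + 0.2*45 = 66.4
```

Note that this formula demands a value for every component: drop one metric, or feed in a noisy one, and the index silently degrades.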

But composite indexes defined as weighted sums don't do a very good job of capturing the non-linear behavior of systems like cyber security.  It's possible to use fancier math to build more complicated composite indexes, but those are also more prone to "wild" fluctuations that may not make sense compared to the component metrics.  With more and more complications, they can become "voodoo" formulas that are hard to justify or calibrate against any ground truth.

Another problem is what to do with incomplete or imprecise component metrics. Because of the properties of arithmetic, all the component metrics should have about the same precision, accuracy, and credibility, with no missing components.

But what if you treat all component metrics simply as evidence in an inference process where the goal is to use the evidence to estimate a performance index value (or, more accurately, a distribution of values)?

An Illustrative Example

To help you visualize how this might work, I'll describe a toy example using Bayesian analysis with continuous distributions:

Let's say we are trying to measure performance in the dimension 7. Effective External Engagement.  Before we have any evidence from component metrics or other indicators, we are ignorant and thus have no basis for estimating any value of the index.  This "prior of all prior" distributions is usefully modeled as a uniform distribution over the entire range of the performance index (0 to 100).

Now let's say that we have three component metrics and one indicator, taking these values:
  1. The existence of an inventory of external partners (an indicator)
  2. External partner satisfaction survey score = 75 out of 100  (coverage: 40% of partners)
  3. Partner engagement processes defined and implemented = 6 out of 8 possible
  4. Top 5 risk drivers identified = 2 out of 8 possible
For estimation purposes, we assume that these metrics are statistically independent.  (If not, they could be adjusted or substituted.)

Viewed as items of evidence, each of these would be converted into a probability distribution over the range of the performance index.  For example, the first indicator might convert into a uniform distribution over the range 10 to 100, meaning that it provides evidence that performance is at least 10, but otherwise doesn't distinguish between higher values.  In contrast, metric #2 might be considered highly specific and highly informative and thus map to a Gaussian distribution with mean of 75 and standard deviation of 5 (a narrow bell curve) on the performance index range.  Metric #3 might be considered moderately specific but highly informative and thus map to a Gaussian distribution with mean of 75 and standard deviation of 20 (a broader bell curve).   Finally, metric #4 might be the most informative if more than half of relevant vendors are covered, but very much less informative if less than that.  Thus it might map to a Gaussian distribution with mean of 40 and standard deviation of 40.

The next step is to apply Bayes rule for continuous distributions to each of these sequentially, starting with a uniform distribution as a prior.  The resulting distribution would be an accurate estimate of the performance index value given the available information.  The moments of the distribution provide information about the relative certainty or uncertainty of the estimate.  And if the distribution is multi-modal, it provides information about possible ambiguity in the estimate because of conflicting information.
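The sequential update above can be sketched numerically by discretizing the index range onto a grid.  This is only an illustration of the toy example: the evidence-to-distribution mappings are the assumed ones described above, and in practice they would need to be calibrated.

```python
import numpy as np

# Discretize the performance index range (0 to 100) onto a grid so that
# Bayes' rule for continuous distributions can be applied numerically.
x = np.linspace(0, 100, 1001)

def gaussian(mean, sd):
    """Gaussian evidence distribution, normalized over the grid."""
    p = np.exp(-0.5 * ((x - mean) / sd) ** 2)
    return p / p.sum()

def uniform(lo, hi):
    """Uniform evidence distribution over [lo, hi], normalized over the grid."""
    p = ((x >= lo) & (x <= hi)).astype(float)
    return p / p.sum()

# The four items of evidence from the toy example, mapped to distributions
# as described above (the mappings themselves are assumptions to be tested).
evidence = [
    uniform(10, 100),  # 1. partner inventory exists: performance at least 10
    gaussian(75, 5),   # 2. satisfaction survey: highly specific, informative
    gaussian(75, 20),  # 3. engagement processes: moderately specific
    gaussian(40, 40),  # 4. risk drivers: weakly informative
]

# Start from the "prior of all priors" (uniform over the whole range),
# then apply Bayes' rule sequentially, treating each item as a likelihood.
posterior = uniform(0, 100)
for likelihood in evidence:
    posterior = posterior * likelihood
    posterior = posterior / posterior.sum()  # renormalize

# Moments of the posterior summarize the estimate and its uncertainty.
mean = float((x * posterior).sum())
sd = float(np.sqrt(((x - mean) ** 2 * posterior).sum()))
```

With these particular mappings, the narrow survey distribution dominates: the posterior lands near 74 with a standard deviation of about 5, and the weakly informative metrics shift it only slightly.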

By the way, Bayesian inference isn't the only method.  There's Dempster-Shafer belief functions and fuzzy logic, to name two.  At this stage, I'm not sure which is most appropriate and, for all I know, they might give very similar results given the same evidence.


Regardless of the details, a major benefit of this approach is that it would allow a heterogeneous set of component metrics and indicators, and also allow that set to evolve, as long as the rules for mapping component metrics to distributions on the performance index range remained consistent.  Thus, there would be much less need to get everyone to use the same set of component metrics.


That's it for now.  When I get more time, I'll do another post with pretty graphics and more details on the inference method(s).
