Tuesday, March 25, 2014

RAND Report on Innovation in the Cybercrime Ecosystem

This is an excellent report -- well-researched and well-written -- on the growth and development of the cybercrime ecosystem:
Though it's sponsored by Juniper Networks, I don't see any evidence that the analysis or the report was slanted.  This report should be useful for people in industry, government, and academia (a rare feat!).

While the authors survey the cybercrime ecosystem broadly, they examine botnets and zero-day exploit markets in detail.  What's important about this report is that it provides a thorough analysis of the innovation capabilities and trajectories in the cybercrime ecosystem.  Understanding these is vital for guiding investment, architecture, and R&D decisions beyond a one-year time horizon.

Here's a timeline that documents the growing sophistication and innovation capability:

Black Market timeline (part 1) -- click to enlarge
Black Market timeline (part 2) -- click to enlarge

Monday, March 24, 2014

Review of Whitsitt's "B-side Cyber Security Framework" (Mapped to the Ten Dimensions)

My colleague Jack Whitsitt (@sintixerr) has proposed a B-side version of the NIST Cyber Security Framework (NIST CSF) in this blog post.  In this post I will give my comments on Jack's framework, and do so by mapping it to the Ten Dimensions.

The NIST CSF is a catalog of information security practices, organized into categories and maturity tiers. I've criticized the NIST CSF here, here, and here, and proposed an alternative -- the Ten Dimensions.  Jack has posted commentary and critiques here, here, and here.  Jack has the advantage of having participated in all five workshops, plus several side meetings with various players.

Here's a diagram of Jack's framework:

Short Summary

I like Jack's B-side framework. I see a lot of overlap between it and my Ten Dimensions.  They aren't identical, but the same themes come through in both. His has the advantage of simpler interpretation (a top-down layer cake with half as many dimensions).  It has shortcomings as well.  In its current form, it lacks performance measurement and, in my opinion, gives inadequate attention to "Effective Response, Recovery, & Resilience", "Effective External Engagement", "Optimize Cost of Risk", and organizational learning loops.

Sunday, March 16, 2014

Precision vs Accuracy

Whenever you do any kind of measurement, it's important to understand the uncertainties associated with it.  Two characteristics of measurement that are inversely related to uncertainty are 'precision' and 'accuracy' (also known as 'fidelity').  The following graphic, from this blog post, nicely demonstrates the difference between these two characteristics.

Other measurement characteristics include stability (repeatability from measurement to measurement), resolution (number of significant digits), sensitivity (ability to detect very small signals), linearity, range (from smallest valid value to largest valid value), and sampling rate (time slice or number of samples to establish a valid measurement).
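The distinction is easy to see in a simulation.  The sketch below (my own illustration, not from the graphic; the instruments and numbers are hypothetical) compares two measuring processes: one precise but inaccurate, one accurate but imprecise.  Bias captures accuracy; spread captures precision.

```python
import random
import statistics

random.seed(42)
TRUE_VALUE = 10.0

# Instrument A: precise but inaccurate -- readings cluster tightly
# around a biased value (small spread, large offset from the truth).
precise_inaccurate = [random.gauss(12.0, 0.1) for _ in range(1000)]

# Instrument B: accurate but imprecise -- readings scatter widely
# around the true value (large spread, negligible offset).
accurate_imprecise = [random.gauss(10.0, 2.0) for _ in range(1000)]

for name, readings in [("A (precise, inaccurate)", precise_inaccurate),
                       ("B (accurate, imprecise)", accurate_imprecise)]:
    bias = statistics.mean(readings) - TRUE_VALUE   # accuracy: closeness to truth
    spread = statistics.stdev(readings)             # precision: repeatability
    print(f"{name}: bias = {bias:+.2f}, spread = {spread:.2f}")
```

Instrument A would look impressive if you only checked its repeatability, which is exactly why you need to characterize both uncertainties.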

S Kauffman on Emergent Possibility Spaces in Evolutionary Biology, Economics, & Tech. (great lecture)

Below is a great lecture by Stuart Kauffman on the scientific and philosophical consequences of emergent possibility spaces in evolutionary biology and evolutionary economics, including technology innovation and even cyber security. This web page has both the video and the lecture notes.

The lecture is very accessible to anyone who reads books or watches programs on science aimed at the general public -- especially evolution, ecology, complexity, and innovation. He does mention some mathematical topics related to Newtonian physics and Quantum Mechanics, but you don't need to know the details of any of the math to follow his argument.  He gives very simple examples for all the important points he makes.

There are very important implications for epistemology (what do we know? what can be known?), for scientific methods and research programs, and for the causal role of cognition, conception, and creativity in economic and technological change. This last implication is an important element in my dissertation. I'll write more on that later.

Monday, March 10, 2014

Boomer weasel words: 'high net worth individuals' as euphemism for 'rich people'

For some time, I have been noticing that rich people and the people who sell services to them don't use the phrase "rich people" any more.  Instead, they say "high net worth individuals".  When did this happen and why?

(You might compare this to my previous post regarding "Baby on Board" signs.)

The following graphs are from Google Ngram Viewer, which show relative frequency of word phrases in American English books up to the year 2008. Notice that the phrase "high net worth individuals" appears first around 1980.

(click to enlarge)

Saturday, March 8, 2014

Ideal book for self-study: "Doing Bayesian Data Analysis"

In this post, I'd like to heartily recommend a book for anyone doing self-study who doesn't have much statistics or math in their background:
This book is head-and-shoulders better than the others I've seen.  I'm using it myself right now.  Here's what's good about it:
  • It builds from very simple foundations.
  • Math is minimized.  No proofs.
  • From start to finish, everything is demonstrated through R programs. Anyone learning statistics today should be learning a statistics programming language at the same time.  R is the most popular choice and by some measures the best choice.
  • It helps you learn Empirical Bayesian methods from every angle.  It handles both the fundamental concepts and the practical applications well.
  • It takes you as far as you want to go, even into advanced territory, but you don't have to read the whole textbook to benefit.
For what it's worth, this book was voted most popular introductory book on Stack Exchange.
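To give a flavor of the sort of foundations the book builds from, here is the classic conjugate Beta-Binomial update, which early chapters of introductory Bayesian texts like this one work through.  This sketch is in Python rather than the book's R, and the coin-flip data are hypothetical.

```python
from math import isclose

# Prior belief about a coin's bias: Beta(a, b). Beta(1, 1) is uniform
# (no opinion about the bias before seeing data).
a_prior, b_prior = 1.0, 1.0

# Hypothetical observed data: 7 heads in 10 flips.
heads, flips = 7, 10

# Conjugacy: the posterior is simply Beta(a + heads, b + tails),
# so no numerical integration is needed for this model.
a_post = a_prior + heads
b_post = b_prior + (flips - heads)

posterior_mean = a_post / (a_post + b_post)  # (1 + 7) / (2 + 10) = 8/12
print(f"Posterior: Beta({a_post:.0f}, {b_post:.0f}), mean = {posterior_mean:.3f}")
# → Posterior: Beta(8, 4), mean = 0.667
```

Once this closed-form case is comfortable, the same logic extends to models without conjugate priors via the MCMC sampling that the book teaches through R programs.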

Mining only 'digital exhaust', Big Data 1.0 won't revolutionize information security

I was asked during this interview whether 'Big Data' was revolutionizing information security.  My answer was, essentially, 'No, not yet'. But I don't think I did such a great job explaining why and where the revolution will come from, if it comes.

Basically, Big Data 1.0 in information security is today focused on mining 'digital exhaust' -- all the transactional data emitted and logged by computing, communications, and security devices and services.  (The term "data exhaust" was probably coined in 2007 by consultant Jerry Michalski, according to this Economist article.)  This can certainly be useful for many purposes, but I don't think it is or will be revolutionary.  It will help tune spam filters, phishing filters, intrusion detection/prevention systems, and so on, but it won't change anything fundamental about how firms architect security or how they design and implement policies, and it does almost nothing to address the social or economic factors.

Here's a great essay that explains why Big Data 1.0 isn't revolutionary, and what it will take to make it revolutionary.  Though it's not about information security, it doesn't take much to extend his analysis to the InfoSec domain.
Huberty, M. (2014). I expected a Model T, but instead I got a loom: Awaiting the second big data revolution. Prepared for the BRIE-ETLA Conference, September 6-7, 2013, Claremont California.
Huberty points toward Big Data 2.0 which could be revolutionary:
"...we envision the possibility of a [Big Data 2.0]. Today, we can see glimmers of that possibility in IBM’s Watson, Google’s self-driving car, Nest’s adaptive thermostats, and other technologies deeply embedded in, and reliant on, data generated from and around real-world phenomena. None rely on “digital exhaust”. They do not create value by parsing customer data or optimizing ad click-through rates (though presumably they could). They are not the product of a relatively few, straightforward (if ultimately quite useful) insights. Instead, IBM, Google, and Nest have dedicated substantial resources to studying natural language processing, large-scale machine learning, knowledge extraction, and other problems. The resulting products represent an industrial synthesis of a series of complex innovations, linking machine intelligence, real-time sensing, and industrial design. These products are thus much closer to what big data’s proponents have promised–but their methods are a world away from the easy hype about mass-manufactured insights from the free raw material of digital exhaust.


The big gains from big data will require a transformation of organizational, technological, and economic operations on par with that of the second industrial revolution." [emphasis added]

Highlighting somewhat different themes in the context of Digital Humanities, Brian Croxall presents an insightful blog post called "Red Herrings of Big Data", which includes slides and this 2 minute video:

Here are his three 'red herrings' (i.e. distractions from the most promising trail), turned around to be heuristics:

Main message

Don't be naïve about Big Data in information security. To drive a revolution, it will need to be part of a much more comprehensive transformation of what data we gather in the first place and how data analysis and inference can drive results.  Just mining huge volumes of 'digital exhaust' won't do it.

Monday, March 3, 2014

Video interview with BankInfoSecurity, plus "Down the Rabbit Hole" podcast episode

Here's a 12 minute interview of me by Tracy Kitten (@BnkInfoSecurity), filmed at the RSA Conference last week:

(click to open a new page for www.bankinfosecurity.com with video)
Topics discussed:
  • The difference between "performance" and "best practices"
  • How big data is expected to revolutionize information security (some myth busting)
  • Where innovation will be coming from, and where it won't
  • Why encouraging security professionals to pursue training in statistics and data visualization is so critical

But wait...there's more!  Here's a link to episode 82 of the Down the Rabbit Hole podcast, where I'm a guest along with Bob Blakely and Lisa Leet.  (Here's the podcast itself in mp3 file format -- 43:15 in length.) From Rafal's summary, here's what we talk about:

  • Does it make sense, in a mathematical and practical sense, to look for 'probability of exploit'? 
  • How does 'game theory' apply here? 
  • How do intelligent adversaries figure into these mathematical models? 
  • Is probabilistic risk analysis compatible with a game theory approach? 
  • Discussing how adaptive adversaries figure into our mathematical models of predictability... How do we use any of this to figure out patch priorities in the enterprise space? 
  • An interesting analogy to the credit scoring systems we all use today 
  • An interesting discussion of 'unknowns' and 'black swans' 
  • Fantastic practical advice for getting this data-science-backed analysis to work for YOUR organization