Friday, October 31, 2014

Presentation: Topological View on Radical Innovation

I'm presenting today at the 6th Annual Complexity in Business Conference, sponsored by the University of Maryland Center for Complexity in Business.  Here are my slides.  (FYI: no information security content here, unless you are interested in institutional innovation.)

If you are really, really interested in this topic and want all the details and references, here is a paper I just completed for a Directed Reading class (89 pages, PDF).  It's a little rough around the edges due to time constraints.

Thursday, October 9, 2014

SIRAcon presentation

I'm presenting at SIRAcon today: "How to aggregate ground-truth metrics & indicators into a performance index".  It will be recorded and will be available to SIRA members on the SIRA web site.  Here are the slides.  Here is the blog post with background and tutorial.

Wednesday, June 25, 2014

My inputs to DHS on cyber economics & incentives

I'm at the 3rd day of Workshop on Economics of Information Security (WEIS) at Penn State.  The focus of this day is to provide input and ideas to the Science & Technology (S&T) Directorate in US Department of Homeland Security regarding R&D on cyber economics and incentives.

Here is the 2007 working paper I co-authored: "Incentive-based Cyber Trust -- A Call to Action".  I think many of the arguments and ideas are still relevant.  (It's long -- 27 pages -- but I think readers will be rewarded.)

Here are my slides.

Thursday, May 1, 2014

Splitting this blog and moving to Octopress

I've decided to split this blog to separate my academic posts from my industry posts.  I'm going to be blogging more about my dissertation and related works in progress.  I suspect that most of my industry readers won't be interested, and I don't want to dilute my posts on industry topics -- information security, risk, performance metrics, etc.

Google's Blogger has worked well for me, but I've decided to move to Octopress.  I'll spare you all the details of the decision process but here's a post that describes the process and benefits.  I'm also following in the footsteps of others in my community (e.g., Data Driven Security, and Adam Elkus).

The industry blog will be renamed "Meritology Blog" and will have a new URL.  The academic blog will be "Exploring Possibility Space" and will also have a new URL.  I aim to move all the Blogger posts to these so that the archives are available under both.

I'll let you know when this goes live, and hopefully there will be redirection once the move is complete.

Thursday, April 17, 2014

"Creative Destruction": 500 word entry for Schneier's Movie Plot Contest

Since I won last year, I wasn't going to enter this year. But my imagination started turning and this came out.  Hope you enjoy it.

Bruce Schneier's 7th Annual Movie Plot Contest

Theme: NSA wins!  But how? (full description and all entries are here)

My entry:
Creative Destruction
June 2014 – March 2015: Stock market booms.

June 2014: Snowden revelations trigger international political scandals.

July: Feinstein-Rogers Intelligence Reform Bill passes, breaks up NSA. “Largest garage sale in history”.

Headline: “NSA Nuked”

August: 10,000 NSA workers are laid off.

September – December: Open Source projects, Working Groups see influx of volunteers.

September – November: “NSA garage sale” draws small contractors and public-private partnerships spread over 50 states. Private equity firms are buyers – Flatiron Partners, Narsil Capital, and Tech Disruptions.

September – November 2014: Flurry of privacy and security scandals hit big firms. Lawsuits, investigations, and criminal indictments follow.

November: “Alt Apps Group” formed: “Secure, private, and ad-free”. Most members are majority owned by Flatiron, Narsil Capital, or Tech Disruptions.

July – December: Symantec goes on buying spree: Webroot, Cloudflare, StackExchange, Disqus, Rapid7, and MaaS360 – all funded by Flatiron, Narsil Capital, or Tech Disruptions.

December: Puerto Rico Bridge Initiative announces completion of 50GB fiber optic cable.

January 2015: Private equity firms, led by Flatiron, make offer for Symantec.

February: Loren Reynolds, rookie Equity Analyst at JPMorgan Chase working on Symantec project, is accidentally copied on email from Flatiron:
“Confirming that Launch has been accelerated to March. Don’t use email anymore.”
Loren is puzzled by the distribution list:
  • Puerto Rico Bridge Initiative (PRBI) 
  • Economic Development Corporation Utah (EDCU)
  • Fatherly (formerly GoDaddy)
  • DuckDuckGo 
  • Safebook (startup)
March: Traffic and membership surges at Alt Apps Group members. Fatherly achieves 70% share in Certificate Authority market.

March: Loren receives email from friend Zoltin, networking expert:
“PRBI isn’t 50GB. It’s 75TB – the highest capacity in the world!!! WTF! There’s more. Same capacity cables to Bermuda and to Azores. Looks like they are bypassing US-Europe cables. Big news: Big data center on PR now complete.”
April: Google earnings due 4/15; then Facebook, Apple, Twitter, Microsoft, and Verizon on 4/22.

April: Over beers, Loren hears rumors of large short positions in technology stocks from a few “weird” hedge funds.

April 4: Loren receives email from Sarah:
“EDCU is gatekeeper on NSA Utah Data Center.”
April 15: Google disappoints. Earnings down 40% on flat revenue. Stock falls 30%, overall market falls 10%. 
April 20: Loren discovers link between former NSA executives and Flatiron, Narsil Capital, and Tech Disruptions. Finds NSA people behind the tech firm scandals of the previous fall. Tip of the iceberg, she suspects.

April 21: Loren sends IM to her husband, an Assistant DA in the Southern District of NY:

“Must see you ASAP. NSA & GCHQ live on. They’ve gone legit – running private businesses and funds. THEY ARE KILLING THE INTERNET AD BUSINESS.”
After clicking “send”, her computer freezes. She reaches for her smart phone to call her husband, but the directory is empty. She dials the number manually, but gets an “out of service” signal, followed by “low battery”. The phone dies.

Running down seven flights of stairs, Loren races to her car. She jumps in, starts the engine, and backs out with a screech. One turn from the exit the car engine suddenly cuts out and brakes lock. The car crashes into a cement pillar. The airbag fails to deploy. Loren is out cold.
[This has a few lines added, so it's beyond 500 word limit.  But the entry on Bruce's site is below the limit.]

Tuesday, March 25, 2014

RAND Report on Innovation in the Cybercrime Ecosystem

This is an excellent report -- well-researched and well-written -- on the growth and development of the cybercrime ecosystem:
Though it's sponsored by Juniper Networks, I don't see any evidence that the analysis or report was slanted.  This report should be useful for people in industry, government, and academia (a rare feat!).

While they do a broad survey of the cybercrime ecosystem, they examine botnets and zero-day exploit markets in detail.  What's important about this report is that it provides a thorough analysis of the innovation capabilities and trajectories in the cybercrime ecosystem.  Understanding these is vital for guiding investment, architecture, and R&D decisions beyond a one-year time horizon.

Here's a timeline that documents the growing sophistication and innovation capability:

Black Market timeline (part 1) -- click to enlarge
Black Market timeline (part 2) -- click to enlarge

Monday, March 24, 2014

Review of Whitsitt's "B-side Cyber Security Framework" (Mapped to the Ten Dimensions)

My colleague Jack Whitsitt (@sintixerr) has proposed a B-side version of the NIST Cyber Security Framework (NIST CSF) in this blog post.  In this post I will give my comments on Jack's framework, and do so by mapping it to the Ten Dimensions.

The NIST CSF is a catalog of information security practices, organized into categories and maturity tiers. I've criticized the NIST-CSF here, here, and here, and proposed an alternative -- the Ten Dimensions.  Jack has posted commentary and critiques here, here, and here.  Jack has the advantage of having participated in all five workshops, plus several side meetings with various players.

Here's a diagram of Jack's framework:

Short Summary

I like Jack's B-sides framework. I see a lot of overlap between it and my Ten Dimensions.  They aren't identical, but the same themes come through in both. His has the advantage of simpler interpretation (a top-down layer cake, with half as many dimensions).  It has shortcomings as well.  In its current form, it lacks performance measurement and, in my opinion, gives inadequate attention to "Effective Response, Recovery, & Resilience", "Effective External Engagement", "Optimize Cost of Risk", and organizational learning loops.

Sunday, March 16, 2014

Precision vs Accuracy

Whenever you do any kind of measurement, it's important to understand the uncertainties associated with it.  Two characteristics of measurement that vary inversely with uncertainty are 'precision' and 'accuracy' (also known as 'fidelity').  The following graphic, from this blog post, nicely demonstrates the difference between these two characteristics.

Other measurement characteristics include stability (repeatability from measurement to measurement), resolution (number of significant digits), sensitivity (ability to detect very small signals), linearity, range (from smallest valid value to largest valid value), and sampling rate (time slice or number of samples to establish a valid measurement).
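The distinction behind the graphic can be made concrete with a quick simulation (my sketch, not from the post above): a fixed offset models inaccuracy (bias), while random noise models imprecision (spread).

```python
import random
import statistics

def simulate(true_value, bias, noise_sd, n=10_000, seed=42):
    """Simulate n repeated readings of true_value: a fixed bias models
    inaccuracy; Gaussian noise models imprecision."""
    rng = random.Random(seed)
    return [true_value + bias + rng.gauss(0, noise_sd) for _ in range(n)]

true_value = 100.0

# Precise but inaccurate: readings cluster tightly around the wrong value.
precise = simulate(true_value, bias=5.0, noise_sd=0.5)

# Accurate but imprecise: readings scatter widely around the right value.
accurate = simulate(true_value, bias=0.0, noise_sd=5.0)

print(round(statistics.mean(precise) - true_value, 1))   # bias ≈ 5
print(round(statistics.stdev(precise), 1))               # spread ≈ 0.5
print(round(statistics.mean(accurate) - true_value, 1))  # bias ≈ 0
print(round(statistics.stdev(accurate), 1))              # spread ≈ 5
```

Neither summary statistic alone tells you whether a measurement is trustworthy -- you need to look at both.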

S Kauffman on Emergent Possibility Spaces in Evolutionary Biology, Economics, & Tech. (great lecture)

Below is a great lecture by Stuart Kauffman on the scientific and philosophical consequences of emergent possibility spaces in evolutionary biology and evolutionary economics, including technology innovation and even cyber security. This web page has both the video and the lecture notes.

The lecture is very accessible to anyone who reads books or watches programs on science aimed at the general public -- especially on evolution, ecology, complexity, and innovation. He does mention some mathematical topics related to Newtonian physics and Quantum Mechanics, but you don't need to know the details of any of the math to follow his argument.  He gives very simple examples for all the important points he makes.

There are very important implications for epistemology (what do we know? what can be known?), scientific methods and research programs, and the causal role of cognition, conception, and creativity in economic and technological change. This last implication is an important element in my dissertation. I'll write more on that later.

Monday, March 10, 2014

Boomer weasel words: 'high net worth individuals' as euphemism for 'rich people'

For some time, I have been noticing that rich people and the people who sell services to them don't use the phrase "rich people" any more.  Instead, they say "high net worth individuals".  When did this happen and why?

(You might compare this to my previous post regarding "Baby on Board" signs.)

The following graphs are from Google Ngram Viewer, which shows the relative frequency of word phrases in American English books up to the year 2008. Notice that the phrase "high net worth individuals" first appears around 1980.

(click to enlarge)

Saturday, March 8, 2014

Ideal book for self-study: "Doing Bayesian Data Analysis"

In this post, I'd like to heartily recommend a book for anyone doing self-study who doesn't have much statistics or math in their background:
This book is head-and-shoulders better than the others I've seen.  I'm using it myself right now.  Here's what's good about it:
  • It builds from very simple foundations.
  • Math is minimized.  No proofs.
  • From start to finish, everything is demonstrated through R programs. Anyone learning statistics today should be learning a statistics programming language at the same time.  R is the most popular choice and by some measures the best choice.
  • It helps you learn Empirical Bayesian methods from every angle.  It does great both with the fundamental concepts and the practical applications.
  • It takes you as far as you want to go, at least into advanced territory if you want.  But you don't have to read the whole textbook to benefit.
For what it's worth, this book was voted most popular introductory book on Stack Exchange.
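If you want a taste of what the book walks you through, here is about the simplest possible Bayesian update -- the conjugate Beta-Binomial model -- sketched in Python rather than the book's R, with numbers made up for illustration:

```python
def update_beta(a, b, k, n):
    """Observe k successes in n trials; a Beta(a, b) prior is conjugate
    to the binomial likelihood, so the posterior is Beta(a+k, b+n-k)."""
    return a + k, b + (n - k)

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

a, b = 1, 1                          # uniform prior over an unknown rate
a, b = update_beta(a, b, k=7, n=10)  # e.g., 7 detections in 10 trials
print((a, b))                        # (8, 4)
print(round(beta_mean(a, b), 3))     # 0.667
```

The book builds from updates like this one all the way to hierarchical models fit with MCMC, but the core move -- prior plus data yields posterior -- never changes.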

Mining only 'digital exhaust', Big Data 1.0 won't revolutionize information security

I was asked during this interview whether 'Big Data' was revolutionizing information security.  My answer was, essentially, 'No, not yet'. But I don't think I did such a great job explaining why and where the revolution will come from, if it comes.

Basically, Big Data 1.0 in information security is today focused on mining 'digital exhaust' -- all the transactional data emitted and logged by computing, communications, and security devices and services.  (The term "data exhaust" was probably coined in 2007 by consultant Jerry Michalski, according to this Economist article.)  This can certainly be useful for many purposes but I don't think it is or will be revolutionary.  It will help tune spam filters, phishing filters, intrusion detection/prevention systems, and so on, but it won't change anything fundamental about how firms architect security, how they design and implement policies, and it does almost nothing on the social or economic factors.

Here's a great essay that explains why Big Data 1.0 isn't revolutionary, and what it will take to make it revolutionary.  Though it's not about information security, it doesn't take much to extend his analysis to the InfoSec domain.
Huberty, M. (2014). I expected a Model T, but instead I got a loom: Awaiting the second big data revolution. Prepared for the BRIE-ETLA Conference, September 6-7, 2013, Claremont California.
Huberty points toward Big Data 2.0 which could be revolutionary:
"...we envision the possibility of a [Big Data 2.0]. Today, we can see glimmers of that possibility in IBM’s Watson, Google’s self-driving car, Nest’s adaptive thermostats, and other technologies deeply embedded in, and reliant on, data generated from and around real-world phenomena. None rely on “digital exhaust”. They do not create value by parsing customer data or optimizing ad click-through rates (though presumably they could). They are not the product of a relatively few, straightforward (if ultimately quite useful) insights. Instead, IBM, Google, and Nest have dedicated substantial resources to studying natural language processing, large-scale machine learning, knowledge extraction, and other problems. The resulting products represent an industrial synthesis of a series of complex innovations, linking machine intelligence, real-time sensing, and industrial design. These products are thus much closer to what big data’s proponents have promised–but their methods are a world away from the easy hype about mass-manufactured insights from the free raw material of digital exhaust.


The big gains from big data will require a transformation of organizational, technological, and economic operations on par with that of the second industrial revolution. " [emphasis added]

Highlighting somewhat different themes in the context of Digital Humanities, Brian Croxall presents an insightful blog post called "Red Herrings of Big Data", which includes slides and this 2 minute video:

Here are his three 'red herrings' (i.e. distractions from the most promising trail), turned around to be heuristics:

Main message

Don't be naïve about Big Data in information security. To drive a revolution, it will need to be part of a much more comprehensive transformation of what data we gather in the first place and how data analysis and inference can drive results.  Just mining huge volumes of 'digital exhaust' won't do it.

Monday, March 3, 2014

Video interview with BankInfoSecurity, plus "Down the Rabbit Hole" podcast episode

Here's a 12 minute interview of me by Tracy Kitten (@BnkInfoSecurity), filmed at the RSA Conference last week:

(click to open the video in a new page)
Topics discussed:
  • The difference between "performance" and "best practices"
  • How big data is expected to revolutionize information security (some myth busting)
  • Where innovation will be coming from, and where it won't
  • Why encouraging security professionals to pursue training in statistics and data visualization is so critical

But wait...there's more!  Here's a link to episode 82 of the Down the Rabbit Hole podcast, where I'm a guest along with Bob Blakely and Lisa Leet.  (Here's the podcast itself in mp3 file format -- 43:15 in length.) From Rafal's summary, here's what we talk about:

  • Does it make sense, in a mathematical and practical sense, to look for 'probability of exploit'?
  • How does 'game theory' apply here? 
  • How do intelligent adversaries figure into these mathematical models? 
  • Is probabilistic risk analysis compatible with a game theory approach? 
  • Discussing how adaptive adversaries figure into our mathematical models of predictability... How do we use any of this to figure out path priorities in the enterprise space? 
  • An interesting analogy to the credit scoring systems we all use today 
  • An interesting discussion of 'unknowns' and 'black swans' 
  • Fantastic practical advice for getting this data-science-backed analysis to work for YOUR organization

Tuesday, February 25, 2014

Quick links to "Ten Dimensions" resources for #RSAC folks

This post is aimed at folks attending my RSA Conference talk on Wednesday, but could be useful for anyone who wants to catch up on the topics.

My talk is at 10:40am - 11:00am in Moscone West, Room: 2020.  Immediately after the talk, I'll be moving to the "Continuing the Conversation" space in the 2nd floor lobby of Moscone West.  I'll be wearing a black EFF hat, in case you want to pick me out of a crowd.

This is a 20-minute talk, so it will only be an introduction to the topics.  My main goal is to stimulate your interest to learn more and to dig into these resources:
Not directly related to the above, but here are the slides for the talk I gave Monday at BSides-SF:
If we don't connect at the conference for some reason, feel free to email me at russell ♁ thomas ❂ meritology ♁ com.  (Earth = dot; Sun = at)

And if you've come this far and you aren't following me on twitter -- @MrMeritology -- what's wrong with you?  Follow, already! ☺

How to aggregate ground-truth metrics into a performance index

My remix of a painting by William Blake,
with the Meritology logo added. Get it?
He's shedding light on an impossible shape.
(Click to enlarge)
The general problem is this:
How can we measure aggregate performance on an interval or ratio scale index when we have a hodge-podge of ground-truth metrics with varying precision, relevance, reliability, and that are incommensurate with each other?
Here's a specific example from the Ten Dimensions:
How can we measure overall Quality of Protection & Controls if our ground-truth metrics include false positive percentages, false negative percentages, the number of exceptions, various "high-medium-low" ratings, audit results, coverage percentages, and a bunch more?
I've been wrestling with this problem for a long time, both in information security and elsewhere.  So have a lot of other people.  A while back I had an insight that the solution may be to treat it as an inference problem, not a calculation problem (described in this post). But I didn't work out the method at that time.  Now I have.

In this blog post, I'm introducing a new method.  At least I think it's new because, after much searching, I haven't been able to find any previously published papers. (If you know of any, please contact me or comment to this post.)

The new method is innovative, but I don't think it's much more complicated or mathematically sophisticated than the usual methods (weighted average, etc.).  It does, however, take a change in how you think about metrics, evidence, and aggregate performance.  Even though all the examples below are related to information security, the method is completely general.  It can apply to IT, manufacturing, marketing, R&D, governments, non-profits... any organization setting where you need to estimate aggregate performance from a collection of disparate ground-truth metrics.

This post is a tutorial and is as non-technical as I can make it. As such, it is on the long side, but I hope you find it useful.  A later post will take up the technicalities and theoretical issues. (See here for Creative Commons licensing terms.)
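Until the technical post arrives, here's a toy sketch (mine, with made-up numbers -- emphatically not the method from the tutorial) of what the "inference, not calculation" framing means: treat each ground-truth metric reading as noisy evidence about a latent performance level, then combine the likelihoods.

```python
# Latent performance levels we want to infer, starting from a flat prior.
levels = ["poor", "fair", "good"]
prior = {lv: 1 / 3 for lv in levels}

# Likelihood of each observed metric reading given each latent level.
# All numbers are hypothetical, purely for illustration.
evidence = [
    {"poor": 0.6, "fair": 0.3, "good": 0.1},  # e.g., a high false-negative rate
    {"poor": 0.2, "fair": 0.5, "good": 0.3},  # e.g., a "medium" audit rating
    {"poor": 0.1, "fair": 0.3, "good": 0.6},  # e.g., 95% control coverage
]

# Naive-Bayes-style combination: multiply likelihoods, then normalize.
posterior = dict(prior)
for lik in evidence:
    posterior = {lv: posterior[lv] * lik[lv] for lv in levels}
z = sum(posterior.values())
posterior = {lv: p / z for lv, p in posterior.items()}

print(max(posterior, key=posterior.get))  # "fair"
```

The point of the framing: incommensurate metrics never get averaged directly; each one only contributes what it says about the latent performance level, weighted by how informative it is.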

Monday, February 24, 2014

#BSidesSF Prezo: Getting a Grip on Unexpected Consequences

Here are the slides I'm presenting today at B-Sides San Francisco (4pm).  I suggest downloading the PPTX, since it is best viewed in PowerPoint, where you can read the stories in the speaker notes.

Friday, February 21, 2014

Does a model and its data ever speak for themselves? No -- A reply to Turchin

This post is the first of a series replying to Dr. Peter Turchin regarding his PNAS article (full text PDF -- free, thanks to Turchin & team), my letter to PNAS, and his PNAS letter reply.  I wrote a blog post here because I didn't think Dr. Turchin's reply addressed my questions, due to a misunderstanding, and I invited Dr. Turchin to engage in a colloquy via blog posts. I'm happy to say that Dr. Turchin wrote three blog posts (here, here, and here) in reply to my post, and this is my first response.

While this post talks about interpreting simulation results, the general topic of data interpretation applies to all empirical research, and even data analysis in industry.  

Monday, February 17, 2014

Two new #InfoSec books that could transform your way of thinking

Happiness is having great colleagues and collaborators.  I'm very happy to recommend to you two new books by three of my favorite colleagues -- Jay Jacobs (@jayjacobs), Bob Rudis (@hrbrmstr), and Adam Shostack (@adamshostack).  These books not only do a great job covering the topics, they could also transform your way of thinking.

Friday, February 14, 2014

What analysis do we really need to guide vulnerability management?

This is the first of a series of posts on the topic of doing quantitative risk analysis in the face of intelligent and adaptive adversaries.  Later posts will dig into research topics like combining risk analysis with game theory, but this first post is mostly a reaction to what other people have said recently.

Rafał Łoś recently posted an article, and then followed with a guest post from Heath Nieddu, with this general theme (paraphrasing and condensing):
It's senseless and distracting to attempt to use quantitative risk analysis to make decisions about vulnerability remediation, and even for information security as a whole.  Uncertainties about the future are too great; adversaries too agile and intelligent; and the whole quant risk endeavor is too complicated.  Keep it simple and stick with what you know for sure, especially the basics.
In this post I'm going to address some of the issues and questions this skeptical view raises, but I won't attempt a point-by-point counter argument.  For the record, there are many points I disagree with, plus many ideas that I think are confused or just mis-stated.  But I think the discussion will be best served by keeping focused on the main issues.

I'm also appearing on Rafał's podcast, Down the Rabbit Hole, along with some other SIRA members.  I'll let you know when it is posted for listening.

Tuesday, January 28, 2014

Estimating your organization's risk appetite, starting from scratch

On Twitter recently, Phillip Beyer (@pjbeyer) asked: "how do you measure risk appetite in program early stages?".  I gave my answers in a series of tweets, but this question comes up a lot so I think it's worthy of a blog post.

[Edit: Feel free to substitute the term "risk tolerance" for "risk appetite".  They have slightly different origins, but their interpretation in this context is the same.]

First, some people have an aversion to the concept of "risk appetite", and others deny that it even applies to information security (or, more broadly, cyber security).  The argument goes that no rational manager or organization desires to take on information security risk if they can avoid it, and therefore there is no such thing as an appetite for risk.  A different argument against it is based on the belief that risk in information security is not quantifiable, and that attempts to quantify risk appetite are therefore impossible or meaningless.

I believe these two positions are mistaken.  The first objection rests on a misunderstanding of what "risk appetite" really means and how it applies to information security, which I'll explain and, hopefully, clarify.  In this post I'll also address the second objection by showing how risk appetite can be reliably quantified.

"How Complex Systems Fail" Richard Cook, 30min video

This is a wonderful 30 minute lecture that should be interesting to anyone in information security, risk management, operations, and especially CIOs and CISOs.  He gives very good explanations about why agility and learning are so important to resilience.

Nominated for "Best New Security Blog" at #RSAC

Every year at the RSA Conference, there is a meetup for information security bloggers.  As part of the gathering, awards are given to bloggers in various categories -- best corporate blog, best blog post, and so on --  based on votes from their peers.  One category is best new security blog, and I'm happy to report that this blog has been nominated in that category.

If you are a blogger, feel free to vote for "Exploring Possibility Space" using the link on this page.

Realistically, my chances of winning aren't great because I don't focus exclusively on information security, as the others do.  But even so, it's an honor to be nominated (as all nominees say!).

Sunday, January 19, 2014

PNAS letter & reply: You say potāto, I say potəto…

If we have mis-communicated, should we call the whole thing off?
Not just yet.  I say: once more, with FEELING!
Big news: my letter to the Proceedings of the National Academy of Sciences (PNAS) has been published, along with the authors' reply.  (paywall)

This post includes an early draft of my letter plus some commentary.

My published letter: "Does diffusion of horse-related military technologies explain spatiotemporal patterns of social complexity 1500 BCE–AD 1500?"

The authors' reply is here.

The authors are Peter Turchin, Thomas Currie, Edward A. L. Turner, and Sergey Gavrilets.  In case you don't know him, Dr. Peter Turchin is one of the founders of the field called Cliodynamics -- the mathematical modeling of large-scale, long-time-horizon historical dynamics.

The not-so-good news is that the authors misunderstood my objections, so their answers didn't address them.  Thus, we didn't really communicate successfully in the format of PNAS letters. Both my letter and the authors' response were restricted to 500 words, which significantly contributed to the miscommunication.

Message to PNAS editors: Your 500-word restriction on letters is anachronistic, unnecessary, and an obstacle to productive scholarly debate. The limit was presumably meant to save precious paper in the print version of the journal, but since letters are now published only online, there is no justification for it. With online publication, letters should be edited to express their essential meaning without any fixed word count limit.

For copyright reasons, I can't copy verbatim either my letter or the authors' response. Instead, I'll splice them together, along with my commentary on the miscommunication and more details about my objections.  My sincere desire is that the authors will respond, here in the comments or elsewhere, to address these (clarified) objections.

My objections focus on the authors' design and simulation choices, and not their underlying theories of social complexity.

Monday, January 13, 2014

Guest on "Data-driven Security Podcast" Ep. 1

I was a guest on the new Data-driven Security Podcast, episode 1.  There's the usual audio and also a video (1 hour 15 minutes).  Along with hosts Bob Rudis and Jay Jacobs, I joined Michael Roytman and Alex Pinto for a lively conversation about how we all got into the data analysis side of information security and where we see it going.

The podcast and also the web site and blog are associated with a new book with the same title, Data-driven Security, authored by Bob and Jay. I was technical editor, so I can honestly say that I've read the whole book. I heartily recommend it to any information security professional or manager. It is a perfect "on-ramp" into data science and visualization as applied to information security, and it's written in your language.

Why I am not boycotting #RSAC

I'm scheduled to speak at the RSA Conference, San Francisco.  Many prominent speakers have decided not to speak in protest.  I've decided to follow through with this speaking engagement.

I'm bothered by the events and actions that have prompted the boycott -- a secret deal between the NSA and RSA to promote a weakened cryptography system.  I share most of the concerns and strong objections that the protesting speakers have expressed.  But I have decided that, in this case, the benefits of speaking and engaging with attendees outweigh the value of a protest action.

(Edit: But I will be wearing an Electronic Frontier Foundation t-shirt and will give them a shout-out, so that's something.)