While this post talks about interpreting simulation results, the general topic of data interpretation applies to all empirical research, and even data analysis in industry.
In "The Truth is Born in Argument", Dr. Turchin summarizes the main ideas and results from his PNAS paper as a preface to his responses to the critiques raised in my letter and blog posts. However, this first blog post covered only two topics: 1) my complaint about the format of PNAS letters (he agreed that it is too short for both sides and doesn't promote good debate); and 2) his reaction to my "tone of discussion", which he said was "inflammatory" and "destroyed much of the good will" I had created when I suggested a debate via blog posts.
The sentence in my blog post that he reacted to is this:
"From the viewpoint of the scientific method, this sentence is nonsense." [emphasis added]

I assume that his reaction is due solely to my use of the word "nonsense". I apologize to Dr. Turchin if he found this word inflammatory; I didn't mean it to be. Let me rephrase it. But first, I'll add the context that Dr. Turchin left out -- the sentence from their PNAS reply that I was criticizing:
"The ability of our simple model to accurately predict data speaks for itself." [emphasis added]

My statement above applied to this sentence alone, not to any other aspect of the research or to any of the authors. Of course, the phrase "...speaks for itself" is common and acceptable in informal settings, drawing on common-sense notions of self-evidence. But this sentence was presented in a formal setting -- a scientific journal -- and thus it is appropriate to interpret it strictly and formally. My objection is to the assertion that simulation models which generate data that accurately replicates their target (the explanandum) are self-evidently and self-sufficiently valid and significant. Therefore, I'll rephrase my objection this way:
From the viewpoint of the scientific method, this sentence makes no sense, because simulation models and data always require interpretation, validation, and verification to establish scientific significance, and thus can never "speak for themselves".

Dr. Turchin: can you provide any references that justify your original statement from the viewpoint of the scientific method, especially the phrase "...speaks for itself"?
Yes, it's very good when a simulation produces data that accords with the explanandum. But that is only the beginning of the scientific contest among competing theories and critical arguments.
In my previous post, I said "There are infinitely many models that can 'accurately predict the data', and thus the model definitely does not 'speak for itself'." Any researcher needs to defend their particular model and implementation in comparison to alternatives. (More on this in a later post.)
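To make this concrete, here is a toy sketch (my own, not drawn from either paper): two structurally different models that fit the same five data points exactly, yet diverge at the very next step. Goodness of fit alone cannot choose between them.

```python
import numpy as np

# Hypothetical "explanandum": five observations of doubling growth.
t = np.arange(5)
target = 2.0 ** t                      # [1, 2, 4, 8, 16]

# Mechanism A: exponential growth.
exp_pred = 2.0 ** t

# Mechanism B: a degree-4 polynomial fitted through the same five points.
coeffs = np.polyfit(t, target, deg=4)
poly_pred = np.polyval(coeffs, t)

# Both models "accurately predict the data"...
assert np.allclose(exp_pred, target)
assert np.allclose(poly_pred, target)

# ...yet they disagree at the very next time step (32 vs. ~31),
# so fit by itself cannot select the right mechanism.
next_exp = 2.0 ** 5                    # 32.0
next_poly = np.polyval(coeffs, 5.0)    # ~31.0
```

Any number of further mechanisms (logistic, piecewise, agent-based) could be tuned to the same five points; choosing among them requires argument and validation, not fit alone.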
Validation and Verification

Validation and verification are also crucial for simulations, since simulations are situated in a broader ontological and epistemological context. The two diagrams below show some of this context. The first diagram comes from a conference paper called "On the meaning of data"; it covers only the bare bones of empirical research, which has some similarity to simulation-based research. It's simplistic, of course, but it gets across the main point: many factors besides the "model" and the "data" are involved in shaping the final results, especially the crucial role of framing and interpretation.
A schematic of empirical research. (Italic elements added.)
Here's another diagram, from a paper by Sargent, that applies specifically to simulations and the challenge of validating and verifying them.
The paper by Sargent goes into these issues in detail.
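To make "validation" slightly more concrete, here is a minimal sketch of the kind of operational check Sargent describes: compare simulated output against the empirical record using an explicit error metric and an acceptance threshold fixed in advance. All numbers below are illustrative, not from any of the papers under discussion.

```python
def rmse(simulated, observed):
    """Root-mean-square error between two equal-length series."""
    n = len(observed)
    return (sum((s - o) ** 2 for s, o in zip(simulated, observed)) / n) ** 0.5

observed  = [100, 120, 150, 140, 130]   # hypothetical historical series
simulated = [ 98, 125, 147, 138, 135]   # hypothetical simulation output

error = rmse(simulated, observed)
tolerance = 10.0                        # acceptance threshold, chosen a priori
accepted = error <= tolerance           # here: RMSE is about 3.66, so accepted
```

The point is not the particular metric but that the comparison criteria are stated explicitly, so a claim of "accurate prediction" can be checked and contested rather than left to speak for itself.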
A Role Model: The Artificial Anasazi Project

I'll close by offering a different simulation model as a role model -- the Artificial Anasazi project of Dean, Gumerman, Epstein, Axtell, Swedlund, McCarroll, and Parker. I point to this project because it is well known, it has some similarities to Dr. Turchin's project, and good work has been done to validate and verify the model and simulation. (Information and references about the Artificial Anasazi project are provided below.)
To verify the original model and simulation, several replications have been written to reproduce their results, including some using different simulation tools. (See here and here.) Of course, it helps if the original authors make their simulation code available, along with any formal specifications. I hope Dr. Turchin and his team make theirs available, along with the geographic data.
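Replication of this kind is sometimes called "docking": two independently written implementations of the same model specification are run on the same inputs and their outputs compared. Here is a minimal sketch, with a toy growth rule standing in for a real ABM (all names and numbers are mine):

```python
def impl_original(pop, rate, steps):
    """Reference implementation: iterative update pop <- pop + rate * pop."""
    history = [pop]
    for _ in range(steps):
        pop = pop + rate * pop
        history.append(pop)
    return history

def impl_replication(pop, rate, steps):
    """Independent reimplementation: closed form of the same rule."""
    return [pop * (1.0 + rate) ** s for s in range(steps + 1)]

a = impl_original(100.0, 0.02, 10)
b = impl_replication(100.0, 0.02, 10)

# Compare the two trajectories step by step.
divergence = max(abs(x - y) for x, y in zip(a, b))
assert divergence < 1e-9, "implementations diverge: specification ambiguity?"
```

When two implementations disagree, the disagreement itself is informative: it usually points to an ambiguity in the published specification, which is exactly why releasing code and formal specifications matters.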
For validation, there's a very interesting paper by Forrest Stonedahl and Uri Wilensky called "Evolutionary Robustness Checking in the Artificial Anasazi Model", in which they use genetic algorithms to explore a very wide range of parameters for calibration and sensitivity analysis. They found that "by varying multiple parameters within a 10% range, the model can produce dramatically and qualitatively different results... Additionally, the multivariate sensitivity analysis highlighted several instances of anomalous model behavior, leading us to discover a bug in the Artificial Anasazi model's code." Through systematic validation methods, Stonedahl and Wilensky were able to substantiate the claims of the original authors while also identifying several limitations and at least one implementation flaw.
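A toy illustration (mine, not Stonedahl and Wilensky's model) of why such sensitivity analysis matters: a 10% change in a single parameter can push a simple growth process across the r = 1 threshold and flip its qualitative behavior from extinction to explosion.

```python
def final_population(r, x0=100.0, steps=200):
    """Iterate x <- r * x for `steps` steps and return the final value."""
    x = x0
    for _ in range(steps):
        x *= r
    return x

low = 0.95
high = low * 1.10                     # exactly 10% larger: r = 1.045

dies_out = final_population(low)      # ~0.0035: collapses toward zero
explodes = final_population(high)     # ~6.7e+05: grows without bound
```

If a model's qualitative conclusions flip under perturbations this small, the calibrated point estimate cannot carry the explanatory weight on its own; the robustness of the result has to be demonstrated, not assumed.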
Finally, the project and its methods are analyzed by Joshua Epstein in his book Generative Social Science, on pages 12 through 16. In this short analysis, Epstein focuses on computational simulation and Agent-Based Modeling as a research method, and so he identifies the key criteria that simulations must satisfy to have scientific significance. Most important: 1) "Does the hypothesized microspecification suffice to generate the observed phenomena?" and 2) Is it "empirically falsifiable"? This aligns with Dr. Turchin's project and their achievements so far.
Altogether, the Artificial Anasazi project and the related research represent a good role model for any "empirical agent-based research", which is a category that, in my opinion, includes Dr. Turchin's project. When research has reached this level of maturity, especially when the simulation and model have been well validated and verified, then we can say that the results are highly significant.
Dr. Turchin's project is much less mature, and he is inviting others, including me, to help replicate, validate, and expand the simulation. That's good. But as things stand now, Dr. Turchin and his co-authors are making stronger claims than I think are justified. Thus the original thesis of my PNAS letter still stands:
"Because of design and simulation choices, it appears that their ABM provides less support for this hypothesis than the authors claim."
The main PNAS paper is here and a longer working paper is here. Below are several videos: first, a short video comparing simulation results with history, and then a long video providing a full lecture on the project. You can get the simulation code here.