Sunday, January 19, 2014

PNAS letter & reply: You say potāto, I say potəto…

If we have mis-communicated, should we call the whole thing off?
Not just yet.  I say: once more, with FEELING!
Big news: my letter to the Proceedings of the National Academy of Sciences (PNAS) has been published, along with the authors' reply.  (paywall)

This post includes an early draft of my letter plus some commentary.

My published letter: "Does diffusion of horse-related military technologies explain spatiotemporal patterns of social complexity 1500 BCE–AD 1500?"

The authors' reply is here.

The authors are Peter Turchin, Thomas Currie, Edward A. L. Turner, and Sergey Gavrilets.  In case you don't know him, Dr. Peter Turchin is one of the founders of the field called Cliodynamics, the mathematical modeling of large-scale, long-time-horizon historical dynamics.

The not-so-good news is that the authors misunderstood my objections, so their answers didn't address them.  Thus, we didn't really communicate successfully in the format of PNAS letters. Both my letter and the authors' response were restricted to 500 words, and this significantly contributed to the miscommunication.

Message to PNAS editors: Your 500-word restriction on letters is anachronistic, unnecessary, and an obstacle to productive scholarly debate. Since letters are only published online, there is no justification for the 500-word limit, which presumably exists to save precious paper in the print version of PNAS. With online publication, letters should be edited to express their essential meaning without any fixed word count limit.

For copyright reasons, I can't copy verbatim either my letter or the authors' response. Instead, I'll splice them together, along with my commentary on the miscommunication and more details about my objections.  My sincere desire is that the authors will respond here in the comments or elsewhere to address these (clarified) objections.

My objections focus on the authors' design and simulation choices, and not their underlying theories of social complexity.

Objection #1 -- Does Not Adequately Model the Labeled Phenomena
From my letter,  I assert that for their simulation,...
"...the core features are underspecified—too simple and too abstract—and therefore do not adequately model the labeled phenomena." [emphasis added]
Their response:
"The core of Thomas’s critique is that our model is 'too simple and too abstract'. We, however, see model simplicity as a desirable feature in developing theory that is both grounded in sociological mechanisms and tested with historical data. Too many previous attempts at theory building in the social sciences have foundered as a result of being highly complex and including in them too many mechanisms. The ability of our simple model to accurately predict data speaks for itself."

Commentary

The authors misunderstand my objection.  The key phrase in my letter is that their model "does not adequately model the labeled phenomena".  I am not objecting to simplicity or abstraction per se, and I am not arguing in favor of model complexity or multiplicity of mechanisms.  Given the authors' decision to aim for the simplest possible model, my objection is that they have gone too far in simplicity and abstraction.  The evidence of "going too far" is that their computational model does not adequately model the phenomena that are asserted in their hypothesis.  For further analysis, see 1b, below.

Reply #1a -- "The model...speaks for itself"

In the last sentence of this paragraph, the authors say:
"The ability of our simple model to accurately predict data speaks for itself."

Commentary

From the viewpoint of the scientific method, this sentence is nonsense. There are infinitely many models that can "accurately predict the data", and thus the model definitely does not "speak for itself".  I'm very surprised that they would claim that a model "speaks for itself" just because it was able to "accurately predict data" to some degree ("65% of the variance").  In the scientific method, models have to compete with viable alternative models, and the preferred arena of competition is what Francis Bacon called "experimenta fructifera" -- experiments that will conclusively test one theory against another.  It's not enough to just identify causal relationships, however suggestive they might be.  In their original paper, the authors do not test their model against any viable alternative model and therefore they cannot make any strong claim for their model.  At best, they could support a claim that their model was suggestive and worthy of further testing.

Let me offer a null alternative for testing. Consider a simple reaction-diffusion model with appropriate initial conditions for social complexity in 1500 BCE (i.e. delta regions in the Nile, Mesopotamia, Indus Valley, and Northern China) and appropriate diffusion rules (e.g. along trade routes and gradients of agricultural productivity and population, etc.).  This model would simulate mechanisms based on a simple diffusion process and not due to any specific military technology or socio-political dynamic. Importantly, this alternative model would not have any variables for warfare, horse-based military technology, or ultrasocial traits as the authors have defined them.  If the two models yield essentially the same results, then the authors' claim that their model "speaks for itself" would be disproved.
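To make the null alternative concrete, here is a minimal sketch of what such a reaction-diffusion model could look like. It is only an illustration under my own assumptions, not the authors' simulation and not a calibrated model: the one-dimensional strip of cells, the seed positions, the "suitability" values, and the diffusion and growth rates are all hypothetical placeholders. The point is that the code contains no variable for warfare, military technology, or ultrasocial traits.

```python
"""Toy 1-D reaction-diffusion sketch of a null model (all values hypothetical)."""


def step(u, suitability, diffusion=0.2, growth=0.1):
    """Advance the social-complexity field u by one explicit time step.

    u[i]           : social complexity in cell i, between 0 and 1
    suitability[i] : assumed 0..1 carrying capacity (e.g. agricultural productivity)
    """
    n = len(u)
    new_u = []
    for i in range(n):
        # Reflecting boundaries: an edge cell sees itself as its missing neighbor.
        left = u[i - 1] if i > 0 else u[i]
        right = u[i + 1] if i < n - 1 else u[i]
        laplacian = left + right - 2.0 * u[i]
        # Logistic growth toward the local carrying capacity.
        reaction = growth * u[i] * (suitability[i] - u[i])
        value = u[i] + diffusion * laplacian + reaction
        new_u.append(min(1.0, max(0.0, value)))
    return new_u


def simulate(n_cells=50, seeds=(5, 30), steps=300):
    """Seed a few 'early centers' and let complexity spread by diffusion alone."""
    suitability = [1.0] * n_cells  # assumption: uniformly suitable land
    u = [0.0] * n_cells
    for s in seeds:
        u[s] = 1.0
    for _ in range(steps):
        u = step(u, suitability)
    return u


if __name__ == "__main__":
    final = simulate()
    print(" ".join(f"{x:.2f}" for x in final))
```

A realistic version would use the same two-dimensional grid and historical data as the original paper, with trade routes and productivity gradients encoded in the diffusion and suitability terms, so that the two models could be compared on equal footing.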

In #6 below, I propose an alternative causal pathway that has strong empirical support, at least in the early part of the period.  Testing the authors' model against this alternative would be a very strong test, especially if done over smaller geographic scales and time periods (see #2 and #5, below).

Objection #1b

I assert that for their simulation,...
"...the core features are underspecified—too simple and too abstract—and therefore do not adequately model the labeled phenomena. 'Ultrasocial traits' could just as well be interpreted as any assets, resources, knowledge, or capabilities that endow a region with power or influence, and not just ultrasocial traits. Military technology or 'MilTech traits' could just as well be interpreted as any factors that promote socio-cultural hegemony, not just military traits, much less horse-related military traits. The process labeled 'warfare' could just as well be labeled as “accretion” (i.e., any combination of expansion, annexation, colonization, migration or warfare)."
Their response:
"...ultrasocial traits are not just any assets or resources that endow a region with power. In our model (and in real life) these cultural traits yield benefits at the collective level, but impose significant costs at lower levels of social organization. Why they spread, despite these costs, remains a central puzzle of social evolution. Similarly, the process we label as 'warfare' is not simply any “accretion” process because it involves the relative power (a function of size and ultrasocial traits) of the polities involved." 

Commentary

The authors misunderstand my objection.  I'm not objecting to the concept of "ultrasocial traits" nor am I conflating ultrasocial traits with "just any assets".  Likewise, I am not conflating "warfare" with "accretion".  My objection is that their simulation does not model ultrasocial traits in a way that is distinct from the more general class of assets or resources that confer power. Likewise, my objection is that their simulation does not model warfare in a way that is distinct from the more general class of accretion processes that expand territory.

The way the authors have codified and abstracted ultrasocial traits and MilTech traits, there is nothing preventing an observer from interpreting them more broadly.  Likewise, nothing in the behavioral rules implemented in the simulation prevents an observer from interpreting those rules more broadly. Along the same lines, the way they model "warfare" can be interpreted more broadly as a class I call "accretion".  They say that their model of "warfare" involves "relative power" (size and ultrasocial traits).  But size of any polity relative to another can manifest in many ways beyond warfare, including many of the accretion processes I mention.  Likewise, if the vector labeled "ultrasocial traits" is just as well interpreted as "any assets or resources that endow a region with power", then the combination of size and power traits could apply to colonization, annexation, migration, and other accretion processes.

In order to model ultrasocial traits and not any other assets or resources, additional codification and/or behavior rules would be required.  The same goes for modeling warfare in contrast to other accretion processes.

Let me offer this simple example to explain this point.  Imagine that I offer you a model of competition between pairs of businesses based on intellectual capital. This model has a vector of binary elements to encode intellectual capital for each business. The model has one behavioral rule for competition: for each round, the probability of "acquiring" the competition is proportional to the relative sum of each firm's intellectual capital vector.  The outcome of competition is either "no change" (a.k.a. no winner) or "winning firm acquires the losing firm".

Does this adequately model the labeled phenomena?  No.  Just because I've labeled the vector to be "intellectual capital" doesn't mean that it is appropriate to the labeled phenomena.  It could just as well be interpreted as "any factor that determines competitive success".  In order to model "intellectual capital" some additional coding would be required, such as differentiating it from resources that were not intellectual capital (e.g. physical and financial assets) and from competitive factors not based on assets (e.g. market share).

Likewise, just because I define the rule as "...acquires...", does it prohibit a more general interpretation? No, it doesn't.  Because my simple model only accounts for existence or non-existence of individual businesses and doesn't account for business assets, the behavior that results from the rule could just as easily be interpreted as a merger, a bankruptcy, a sale of assets, or another business transaction.
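The toy business model described above can be written in a few lines. This is a sketch under my own assumptions: the function names, the base_rate parameter (which leaves room for the "no change" outcome), and the example vectors are all hypothetical. Notice that nothing in the code depends on the vector actually representing intellectual capital; renaming the arguments would not change a single result.

```python
import random


def compete(capital_a, capital_b, rng, base_rate=0.5):
    """Play one round between firm A and firm B.

    capital_a, capital_b : lists of 0/1 flags labeled "intellectual capital"
    base_rate            : assumed overall chance that a round has a winner
    Returns "A acquires B", "B acquires A", or "no change".
    """
    total = sum(capital_a) + sum(capital_b)
    if total == 0:
        return "no change"
    # Each firm's chance of winning is proportional to its share of the total.
    p_a = base_rate * sum(capital_a) / total
    p_b = base_rate * sum(capital_b) / total
    draw = rng.random()
    if draw < p_a:
        return "A acquires B"
    if draw < p_a + p_b:
        return "B acquires A"
    return "no change"


def tally(capital_a, capital_b, rounds=10000, seed=0):
    """Count outcomes over many independent rounds."""
    rng = random.Random(seed)
    counts = {"A acquires B": 0, "B acquires A": 0, "no change": 0}
    for _ in range(rounds):
        counts[compete(capital_a, capital_b, rng)] += 1
    return counts


if __name__ == "__main__":
    print(tally([1, 1, 1, 0], [1, 0, 0, 0]))
```

The firm with the larger vector sum wins more often, which is all the rule says. Whether the vector means patents, cash, market share, or anything else is a matter of labeling, and "acquires" could equally be read as a merger or a bankruptcy.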

In very similar ways, the authors' model is too simplistic and too abstract to model the labeled phenomena.

Objection #2 -- A Single 3,000 Year Diffusion Process

From my letter:
"...other than parsimony, there seems little justification for the design choice to model MilTech as a single innovation diffusion process lasting 3,000 y."
Their response:
"...we actually consider the independent spread of five military technology traits. Although investigating the effect of timing of different innovations is an interesting question (and one that we are pursuing), this requires introducing several additional parameters and assumptions."

Commentary

The first sentence of their reply is not relevant to the objection.  Their "independent spread of five military traits" are all introduced at the beginning of the simulation and thus are subject to a diffusion process lasting 3,000 years.  There are no additional military traits added later that might be interpreted as new technologies or innovations.

Yes, indeed, it will be necessary to add "several additional parameters and assumptions" to model waves of innovation, but this shouldn't be too complicated.

What I am objecting to is the authors' claim that their current model, which includes only a single wave of horse-related military technology diffusion lasting 3,000 years, is a realistic and plausible model of history compared to a null model based on simple reaction-diffusion (see #1a, above).

Objection #3 -- Elevation is a Poor Surrogate for Defensibility

From my letter:
"...elevation appears to be a poor surrogate for defensibility. Using the same sources, it should be possible to code each region on a scale of ruggedness: for example, (max – min). Substituting ruggedness for elevation might significantly change the results, especially in high plateaus."
Their response:
"...it is not clear why elevation is a 'poor surrogate'. Our results indicated that elevation was not a strong predictor of imperial density, and other variables related to it are unlikely to add greatly to the explanatory power of the model."

Commentary

This issue may seem unimportant, but because of the authors' design choices, it turns out to be vital for their overall results and for the authors' claims.  Significantly, it appears that the elevation variable explains why, in their simulation results, the early centers of social complexity are in the lowlands near the steppes.

To recap, the authors' model only includes two variables that are a function of geography.  The first is a binary variable for agriculture.  The second is an elevation variable, which is a proxy for defensibility. The higher the elevation of a geographic cell, the more defensible it is (i.e. less likely to lose when attacked).

First, the authors offer no support for this design decision, either from theory or from empirical data.

Second, this design choice appears to "tilt" the results in specific directions (i.e. in line with history) and away from counterfactual directions.  Specifically, their model will inhibit social complexity in the high steppe plateaus in Central Asia due to high elevation, and will promote social complexity in coastal regions and river deltas such as the Nile, the Tigris/Euphrates, Danube, etc.

Third, if defensibility is the phenomenon they are trying to model, then some measure of ruggedness would be much more appropriate, and this would involve comparing the minimum elevation with the maximum, or similar.  An ideal measure of ruggedness would be the fractal dimension, but that would require extensive data for each geographic cell.
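As a small illustration of the difference between the two measures, the sketch below computes mean elevation and the (max - min) ruggedness suggested in my letter for three made-up cells. The elevation samples (in meters) are invented for illustration only and are not taken from the authors' data.

```python
def mean_elevation(samples):
    """Average elevation of a cell, the kind of value an elevation proxy relies on."""
    return sum(samples) / len(samples)


def ruggedness(samples):
    """Simple ruggedness measure: highest point minus lowest point in the cell."""
    return max(samples) - min(samples)


# Hypothetical elevation samples in meters (illustrative, not real data).
cells = {
    "river delta": [2, 5, 8, 3, 6],
    "high plateau": [4500, 4520, 4480, 4510, 4490],
    "mountain valley": [900, 2400, 1500, 3100, 1200],
}

if __name__ == "__main__":
    for name, samples in cells.items():
        print(f"{name}: mean={mean_elevation(samples):.0f} m, "
              f"ruggedness={ruggedness(samples)} m")
```

In this made-up data the high plateau has by far the highest mean elevation but is nearly flat, while the mountain valley is lower on average yet far more rugged. The two proxies can therefore rank the same cells in very different orders.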

The second sentence of the authors' response is not relevant to my objection.  It's irrelevant whether elevation alone is or is not a strong predictor of imperial density.  In fact, I know of no theory of social complexity that includes elevation as a primary or secondary variable. Furthermore, the authors offer no theoretical justification for their evaluation of elevation alone.

My point is not that adding ruggedness will increase the explanatory power of their model.  I'm arguing the reverse -- that "elevation" is an unjustified variable in their model that inflates its explanatory power.  If a more realistic measure of ruggedness were included as a proxy for defensibility, then their model would have much less explanatory power.  Remember: their hypothesis is that it is the spread of horse-related military technology over 3,000 years that has primary causal influence on the geographic distribution of social complexity.  Defensibility is central to the way they model the war-centered diffusion process.

As I stated in my letter, if they included a measure of ruggedness rather than elevation, their model would predict much higher social complexity in the high plateaus compared to the historical data.

Objection #4 -- Random Seeding of MilTech is Not a Good Alternative Treatment

From my letter:
"...random seeding of MilTech across the entire space is not a good alternative treatment. A better alternative treatment would be to seed MilTech (as “socio-cultural hegemony traits”) in the lowland civilizations that had already achieved a high level of social complexity in 1500 BCE: Egypt, Mesopotamia, Indus Valley, and Northern China. I suspect that the simulation results would be identical. If so, this result would considerably weaken support for the horse-related MilTech hypothesis."
Their response:
"We lack space to address all specific points raised by Thomas, so we will focus on the most important ones."

Objection #5 -- Running Simulations over Shorter Periods Would Be Better

From my letter:
"...running the simulation over several shorter periods and smaller geographic regions would be more fruitful than the author’s choice of 3,000 y because sharper comparisons could be made between empirical data and simulations. Initial conditions would be more accurate, dark ages could be excluded, and so forth. It would be very compelling if the same model and parameters accurately reproduced empirical data from different periods and regions."
Their response:
"We lack space to address all specific points raised by Thomas, so we will focus on the most important ones."

Objection #6 -- Alternative Causal Pathways

From my letter:
"...the results would be much more compelling if an alternate causal pathway was tested. [...] However, at least for the war chariot in the Late Bronze Age Near East, there is strong evidence that innovation and diffusion was driven by prestige and elite emulation (2) and only later took on major significance in warfare."
(2) Feldman MH, Sauvage C (2011) Objects of prestige? Chariots in the Late Bronze Age Eastern Mediterranean and Near East. Ägypten und Levante 20:67–182. 
Their response:
"We lack space to address all specific points raised by Thomas, so we will focus on the most important ones." 

Commentary

It is very disappointing that they did not address this objection because it goes to the heart of two primary objections to their research.  First, to make strong claims that their model supports their hypothesis, they should be testing it against viable alternative models and theories.  Second, they should be drawing on more empirical research than just geographic distribution of imperial polities.  The paper that I cited was particularly strong because it detailed and summarized many lines of evidence that ultimately support the causal pathway in which war chariot innovation and diffusion was driven primarily by prestige and elite emulation.

Conclusion

The authors were unmoved by my objections though they did say that further refinements and alternative models would be welcome.

Though it would have been better if the authors had addressed my objections, I have sympathy for them in attempting to do so under the PNAS 500-word limit for letters.  I didn't have enough space to fully explain my points and they didn't have enough space to fully reply.

I'm happy with the process from the viewpoint of participating in scholarly debate, so I don't regret anything.

It would be even better if the authors could reply to my objections here in this blog (in comments) or in some other public venue.  Long gone are the days when academic debates could only be conducted in journals.   (I will be emailing the lead author directly with this appeal.)
