It’s always nice to read a paper that is obviously wrong, but where you have to think about why it is wrong. Because it makes you, well, think. And sometimes learn something new. So when I see a paper in TREE with the title “Do simple models lead to generality in ecology?”, it’s clear that it’s going to answer “no”, and that I’m going to disagree.
In their paper, Evans and a plethora of co-authors present this argument:
[W]e argue that there is usually a trade-off between simplicity and generality, such that simpler models are, all other things being equal, less general than are complex models. For example, a nonlinear population growth equation such as dN/dt = αN + βX^(1+a) represents a large family of models, the members of which correspond to the constant parameters α and β being set to particular values (whereas a can take any value). If β is set to zero, we obtain a simpler linear equation, dN/dt = αN. Obviously, the nonlinear equation includes the linear one as a special case. Thus, the more complex equation represents a larger family of models than the linear ones and, therefore, is more general. It can pick out all the real systems that are described by the linear equation plus a range of others.
(I have slightly changed the equation from what the authors presented, to make my argument clearer. I think the original argument is unchanged, though.)
That is, simple models are less general because they cover less of the model space. Which is odd, because I think this is precisely why a simpler model is more general.
The argument for the greater generality of a simple model is that any mechanism not in the model is not relevant, so the conclusions we get from the model are relevant whatever other mechanisms might be acting. In other words, if we are using the simple model dN/dt = αN we are assuming other models like dN/dt = αN + βX^(1+a), dN/dt = αN + βe^Y, dN/dt = α sin(θt) N etc. etc. will show the same behaviour that we are investigating. Of course, they will differ in other ways but the purpose of any model should be to investigate specific phenomena, and how it behaves outside of that is generally not relevant.
On this reading, a more complicated model is less general because it fixes more of the model: dN/dt = αN + βX^(1+a) precludes terms like e^(δX). Less abstractly, if we have built a model with sexual reproduction, we should not be using it to make inferences about asexual populations.
Of course, we can expand a model to have proportions of sexual and asexual reproduction, which would be more general than one that only assumed one mode of reproduction. But in practice, I think more complex models are usually developed in a different direction: they are made more complex by adding more mechanisms (e.g. age-structured reproduction, individual-based behaviours). These end up being much less general: a model of the behaviour of Cromerian giraffes will work well to predict their behaviours (such as cycling), but will not be of much use for predicting the head-perching behaviours of parrots.
Evans et al. suggest that complex models can be made general by exploring their behaviour over a range of parameters:
The price for complexity is that such models usually need to be tied to data from specific systems. We are still left with the problem of how to generate general insights from models that are tied to specific systems. A key strategy currently used to solve this problem is the use of simulation experiments. Such experiments are performed on models, but parallel the kinds of experiment performed on laboratory systems. Techniques include analysing confounding factors, such as heterogeneities and stochasticity, changing the number of types of entity and process considered in the system, and systematically varying the parameters and variables of the model to determine whether its predictions are strongly or weakly influenced by changing values and, thus, which processes are more or less important dynamically. Utilising such simulation experiments requires the systematic consideration of possible (but not actually occurring) scenarios to understand the scope and limits of the model in question.
In tying a complex model to specific systems, a modeller is making assumptions specific to that system, which makes it less general (e.g. that giraffes cannot fly, or unicycle beyond certain points). Exploring the (large) parameter space still only explores the model within these assumptions. In practice the lack of generality is even worse, because with more parameters it is more difficult to explore the parameter space: with a simple model like dN/dt = αN we can fully characterise its behaviour over all possible values of α. But with (say) 20 parameters and a model too complex for an analytic analysis, some choices have to be made about which parameters to explore, and what parameter values to use. We simply don’t have the computing power to explore the full space of parameters. Thus any generality that is gained by adding complexity is lessened simply because we are human, and cannot understand the model across the full range of its generality.
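To put rough numbers on this, here is a minimal Python sketch. The function names, the choice of 10 levels per parameter, and the one-millisecond run time are my own illustrative assumptions, not anything from Evans et al. It contrasts the simple model, whose behaviour is known in closed form for every α, with the size of a full factorial sweep over a many-parameter simulation.

```python
import math

def exponential_growth(n0, alpha, t):
    """Closed-form solution of dN/dt = alpha * N with N(0) = n0."""
    return n0 * math.exp(alpha * t)

def long_run_behaviour(alpha):
    """Qualitative behaviour of dN/dt = alpha * N for any alpha, given N(0) > 0."""
    if alpha > 0:
        return "unbounded growth"
    if alpha < 0:
        return "decay to zero"
    return "constant"

def grid_runs(n_params, levels_per_param):
    """Number of simulations in a full factorial sweep of the parameter grid."""
    return levels_per_param ** n_params

def sweep_years(n_params, levels_per_param, seconds_per_run):
    """Wall-clock years needed to run the whole grid on one machine."""
    seconds = grid_runs(n_params, levels_per_param) * seconds_per_run
    return seconds / (365.25 * 24 * 3600)

# Simple model: three cases cover every possible value of alpha.
for alpha in (-0.5, 0.0, 0.5):
    print(alpha, long_run_behaviour(alpha), exponential_growth(1.0, alpha, 10.0))

# Complex model (hypothetical): 20 parameters, 10 values each, 1 ms per run.
print(grid_runs(20, 10))          # 10**20 runs
print(sweep_years(20, 10, 1e-3))  # on the order of billions of years
```

Even a coarse grid of 10 values per parameter gives 10^20 runs, so in practice a modeller has to pick a tiny subset of the space, which is the point made above.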
The other problem with using complex, specific, models to explore general behaviours is that it becomes difficult to understand why a model is behaving the way it is. Modellers are reduced to carrying out experiments with specific choices of parameters, and then using statistical methods like ANOVA to summarise the behaviour of the model. In essence it is treated as a black box. But the point of a model of general phenomena is to understand the systems where we see the phenomena: to unpack the black box. I suspect that the complexity of models is used as an alternative to thinking: a model is built to predict the behaviour of Cromerian giraffes, and subsequently the modeller thinks that the model can be used to explore general locomotory behaviours, without the bother of thinking about what aspects of locomotion (unicycling, flying, attraction to balding heads etc.) are the most relevant. Thinking first, and then developing a minimal model should be a preferable path, as it removes unnecessary parts of the model. This makes it easier to focus on what is important and thus easier to understand what the model is telling you. It also means there is a smaller parameter space to explore, and if you are simulating, each simulation will usually take less time (because there are fewer functions to evaluate). What’s not to like? Other than having to think, of course.
One strange idea that crops up in this paper is this:
[Simple models] do not need to be tested against specific data because they represent concepts rather than systems.
Unlike in physics, where general models have to make testable predictions, ecology has embraced an approach where models claiming generality are untestable in any real system
Frankly, this idea is rubbish. Evans et al. criticize Volterra as confused for comparing his predator-prey models (which produce cyclic dynamics) to data on fisheries in the Adriatic (where the data have cycles). Likewise:
May and Anderson in 1979 compared the output of their model with data from real populations exposed to diseases. They found that ‘some of the theoretical conclusions can be pleasingly supported by hard data, while others remain more speculative’. This confusion of model purposes gives the false impression that simple demonstration models can provide actual explanations of specific systems.
But no explanation for the confusion is given – apparently it’s just that Volterra, Anderson, and May all erred by not doing things the Evans et al. way. Even if simple models “show that the modelled principles are sufficient to produce the phenomenon of interest” (which is a reasonable point of view), they are still intended as explanations of the real world. So surely they have to be tested against the real world, one way or another. To deny that is to deny that simple models of real-world phenomena are models of the real world. Idealised, yes, but they are still meant to say something about the real world.
I think the test of a simple, general model could be rather informal. Volterra showed that predator-prey dynamics can produce cycles. One does not need to fit his model to the data to say that it works as an explanation: it is enough to see cycles in data with predators and prey. At Intecol last month, Sarah Calba gave a presentation about “prediction”, and argued that the concept is multi-faceted. We predict about the future (e.g. what the distribution of Cromerian giraffe will be in 2063), but we also make different sorts of predictions about the past and present, e.g. that Cromerian giraffe cycle. And this can be tested, by going to Cromer and looking for giraffe, to see if any are cycling. The predictions are more qualitative, and serve a different purpose: they let us test our model explanations of the natural world, rather than tell us about the state of the system that we are predicting for.
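The qualitative prediction that predator-prey dynamics cycle is easy to see in a simulation. Below is a minimal Python sketch of the textbook Lotka-Volterra form, dx/dt = ax − bxy and dy/dt = −cy + dxy, integrated with a hand-written fourth-order Runge-Kutta step. The parameter values, starting densities, step size and function names are all my own illustrative assumptions, not Volterra's and not fitted to any data.

```python
import math

def lotka_volterra(state, a, b, c, d):
    """Right-hand side: prey x grows and is eaten; predator y dies and feeds."""
    x, y = state
    return (a * x - b * x * y, -c * y + d * x * y)

def rk4_step(f, state, dt, *params):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state, *params)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), *params)
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), *params)
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)), *params)
    return tuple(
        s + dt / 6.0 * (p + 2 * q + 2 * r + w)
        for s, p, q, r, w in zip(state, k1, k2, k3, k4)
    )

def simulate(x0, y0, params, dt=0.01, steps=3000):
    """Integrate from (x0, y0) and return the list of (prey, predator) states."""
    trajectory = [(x0, y0)]
    for _ in range(steps):
        trajectory.append(rk4_step(lotka_volterra, trajectory[-1], dt, *params))
    return trajectory

def count_peaks(series):
    """Count interior local maxima, a crude signature of cycling."""
    return sum(
        1 for i in range(1, len(series) - 1)
        if series[i - 1] < series[i] >= series[i + 1]
    )

def invariant(x, y, a, b, c, d):
    """Quantity conserved along exact solutions; used to check the integrator."""
    return d * x - c * math.log(x) + b * y - a * math.log(y)

PARAMS = (1.0, 0.1, 1.5, 0.075)  # a, b, c, d: illustrative values only
trajectory = simulate(10.0, 5.0, PARAMS)
prey = [x for x, _ in trajectory]
prey_peaks = count_peaks(prey)
print("prey peaks in 30 time units:", prey_peaks)
```

The test here is the informal one: the output only has to show repeated prey peaks, not match any particular data set. Different parameter values change the period and amplitude, but the cycling itself remains.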
The beauty of simple models is that they help us understand some aspects of a range of real world systems. Volterra built his model to explain cycles in Adriatic fish. But the same model also provides an explanation of cycles in North American lynx. And chemical reactions. And it is also used in economics. But apparently it’s not general.
Evans MR, Grimm V, Johst K, Knuuttila T, de Langhe R, Lessells CM, Merz M, O’Malley MA, Orzack SH, Weisberg M, Wilkinson DJ, Wolkenhauer O, & Benton TG (2013). Do simple models lead to generality in ecology? Trends in Ecology & Evolution. DOI: 10.1016/j.tree.2013.05.022