I talked myself into giving a talk about meta-analyses in a group seminar back in September, but it was shifted to November so that everyone could attend. So, here it is.
I have a couple of thoughts about how meta-analyses in ecology differ from those in medicine, where they are mainly used. I think this should colour how we approach them. There were also a couple of points brought up during the discussion that are worth raising.
Most medical meta-analyses have quite a tight focus: a treatment for some ailment (e.g. a drug) is being investigated in several studies. In contrast, ecological meta-analyses tend to ask questions about effects across species and regions, e.g. the effects of plant litter on vegetation. This, I think, changes the emphasis, and suggests that the classical meta-analysis approach needs to be adapted. In particular, I don’t see much use for fixed effect analyses. These assume that there is a single “true” value for whatever statistic is being examined: a treatment has one effect, across all study populations. But if we’re looking at effects across species, I’m sceptical that this would hold. Thus, it is better to use a random effects model, which assumes that each study has a different effect, and then summarises the distribution of those effects. Using a random effects model changes the interpretation, though. We can’t say that what is estimated is the overall effect (e.g. “doubling plant litter reduces germination by 30%”). Instead, we can say something like “plant litter generally has a negative effect on vegetation, although the magnitude of the litter effect varies”. Indeed, the effect might go in both directions: one can imagine a situation where the variation between studies is so large that, although the average effect is zero, individual studies almost always show an effect, just in different directions.
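To make this concrete, here’s a minimal sketch in R with the metafor package, assuming a hypothetical data frame dat with effect sizes yi and their sampling variances vi (the names are placeholders, not from any particular study):

    library(metafor)
    # Fixed effect model: assumes one true effect shared by all studies
    fe <- rma(yi, vi, data = dat, method = "FE")
    # Random effects model: each study has its own effect, drawn from a distribution
    re <- rma(yi, vi, data = dat, method = "REML")
    re$tau2  # estimated between-study variance: how much the effect varies

If tau2 is large relative to the mean effect, the “one true value” story clearly doesn’t hold, which is exactly what I’d expect across species and regions.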
A related thought is that meta-regression should probably be more important in ecology and evolution. Meta-regressions go beyond summarising the (average) effect to modelling what it depends on (e.g. whether there is a difference between glasshouse and field studies). I’ve been partly responsible for one of the more excessive meta-regressions, where we ended up modelling the standard errors too, and it’s an approach that is asking to be exploited. Of course (as we discussed during my talk), we need enough studies to be able to do this.
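As a sketch (again with metafor, and again with made-up variable names), a moderator is simply added to the model:

    # Meta-regression: does the effect differ between glasshouse and field studies?
    mr <- rma(yi, vi, mods = ~ setting, data = dat)  # setting = "glasshouse" or "field"
    summary(mr)  # the coefficient for setting estimates the difference

The mods argument is all it takes to turn the random effects summary into a meta-regression.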
Another big issue we got onto during the discussion was the problem of making sure the information given in papers is enough for a meta-analysis. This means estimates need to be given (whether significant or not), as well as their standard errors: not having the standard errors leads to terrors like Bayesian analyses. It’s difficult to see how to enforce this: it would need journals to crack down. But our discussion shifted onto a slightly different topic, reproducible research: providing the data and tools to repeat analyses. There are initiatives to encourage researchers to provide their data (e.g. through Dryad), some journals now insist on this being done, and the DFG (the main German funding agency) will soon insist on it too. But it would be useful to also have the precise statistical fiddlings available, e.g. the R code. This would mean that missing statistics could be calculated (and the data also mined in all sorts of other ways).
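As an aside, some missing numbers can be reconstructed if papers report enough. For example (assuming approximate normality, and with hypothetical variable names), a standard error can be backed out of a reported 95% confidence interval or a test statistic:

    se <- (ci.upper - ci.lower) / (2 * 1.96)  # from a reported 95% CI
    se <- estimate / t.value                  # from the estimate and its t (or z) statistic

But this is a poor substitute for just reporting the standard errors in the first place.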
Hm, as I’m mentioning R, one thing I forgot to include in my talk was that there are several R packages for doing meta-analyses: MADAM, meta, metafor and rmeta, as well as a package, copas, for adjusting for publication bias. There are also a couple of packages for meta-regression: metaLik and metatest, plus several more specialised packages. Of these I’ve only looked at meta and rmeta, so I can’t compare them all. Anyone care to chime in?
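For the curious, the packages mostly have a similar feel. With meta, for example, you hand over effect sizes and standard errors and get both fixed and random effects summaries back (hypothetical inputs again):

    library(meta)
    m <- metagen(TE = yi, seTE = sei, studlab = study, data = dat)
    m          # prints fixed and random effects estimates, plus heterogeneity statistics
    forest(m)  # the obligatory forest plot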
Nice, Bob. There’s been a bit of discussion about the relevance of meta-analyses in Ecology recently, with some advocating throwing out babies and bathwater and others being more pragmatic.
Hillebrand and Cardinale (2010) is a nice place to start reading about this.
Now, I’m sure there’s something else I should be doing just now…
Whittaker’s rant that started that off was mentioned yesterday (not by me). I’m not sure if he was suggesting throwing out all meta-analyses, or just the ones looking at productivity-diversity relationships. They do look particularly messy.
Useful post, thanks Bob! And your point about the differences between ecology and medicine is one that always occurs to me when I see ‘systematic reviews’ proposed as a general tool for providing conservation evidence…
Good point, Tom. I’ve heard some eminent British ecologists strongly advocating this approach; I’d never thought about the potential pitfalls, though.
Providing raw data is at least as valuable for reanalysis as for meta-analysis. I know I occasionally go back and reanalyse "old" data (another point – how can data be "old"?!) using new techniques or insights. I would be happy to have someone make use of my raw data in new and interesting analyses. I think one way to encourage open provision of raw data will be to ensure that those who do reanalysis or meta-analysis adequately credit the value of the data (and its collectors), and to ensure that the "system" properly values such provision. Of course, to use (or re-use) the raw data properly means understanding it, and this works better when those who collected the data are involved in characterizing it – a giant database, no matter how well constructed, will have limitations on its ability to provide insight into the context of the data. To be honest, data collection is often the hardest part of science; it seems a shame that that effort is not exploited to its fullest.
Ken – I fully agree. I know there’s often a reluctance in ecology for people to give out their data, which is a shame. There’s an interesting paper in PLoS One suggesting that people are more likely to share their data if it gives more certain results.
I fully agree that the researchers who collect original data should be properly credited with doing so.
However, I have some reservations about the extent to which they should be included in any re-analyses. For one thing, it puts the validity of an ‘independent’ re-analysis in question. This won’t always be a problem, however.
But Bob’s point highlights another issue: field biologists (or other primary data collectors) often have a strong ‘intuition’ about what the data says, before any statistical analysis is carried out. This can have a strong effect on the type of questions asked of the data.
Some colleagues of mine were involved in a collaboration on an existing data set with a very experienced field researcher, who took a lot more convincing than should have been necessary that the data didn’t fit his/her hypothesis. The good news is that they published a very nice paper out of it. But it’s sometimes hard to say who "understands" the data better…
So full credit is an absolute must, but independence is also very important in some cases.
A thoughtful post, with which I’m inclined to agree. I gave you a shout-out over at the Oikos blog.
@Mike Fowler: good point about how some "distance" from the original data can be a good thing.
One thing I’ve often been curious about is that we don’t seem to use log odds ratios in ecological meta-analyses very often (I can think of one exception: Hyatt et al. 2003, Oikos). Part of this is that we don’t like thinking in terms of odds ratios, but it’s so useful! Particularly for questions regarding discrete events, such as success of pollination, survivorship after different disturbances, and more.
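For anyone wanting to try it, here’s a rough sketch with metafor, using hypothetical 2x2 counts (say, flowers pollinated or not, in treatment and control plots):

    library(metafor)
    # ai/bi = successes/failures in treatment; ci/di = successes/failures in control
    dat <- escalc(measure = "OR",
                  ai = poll.trt, bi = fail.trt,
                  ci = poll.ctl, di = fail.ctl, data = counts)
    rma(yi, vi, data = dat)  # random effects summary on the log odds scale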
Thanks for the link love, Jeremy. I actually saw it yesterday, when I was looking for your zombie ideas post, for a student (he liked it, and it also saved him from a lot of work).
Jarret – we’d have to teach biologists to understand their logistic regressions first. I guess, though, that there would usually be a lot of covariates, so the log odds would turn up as regression coefficients in one way or another.
I just stumbled onto this, so I’m a bit late in commenting… In medicine, it is also unlikely that there is a consistent effect across all studies. Genetic and environmental differences among the study populations mean that the effects of treatments are likely to vary. So random effects models should be used frequently there too.
Cheers,
Mick (mickresearch.wordpress.com)
Interesting post, which I have some views on:
1) Fixed vs. random is the same issue in medicine and ecology. Most serious analysts in both fields tend to advocate random effects. The only issue here is that if you are lucky enough to have homogeneous samples in your different studies or strata, then random effects may give more weight to the small studies than would be ideal. This can be an issue particularly where study size and quality are linked. The ecological view of medical meta-analysis as reductionist fixed effect synthesis to increase power is at variance with most medical meta-analysis.
2) Meta-regressions are important. I completely agree that the pooled effect is often of very limited utility in ecology. Exploring heterogeneity and understanding the reasons for variation is much more important. This is often true in medicine too, where the medical community advocates the use of individual patient data to avoid the problems of aggregation bias associated with meta-regression (I’ve written a 2012 PLoS One manuscript about this).
3) The main advantages of the Bayesian approach relate to the ability to build nuanced models which allow different aspects of studies to be considered similar (exchangeable). In other words, it is the adoption of the machinery, not the use of priors (on variance parameters or elsewhere), which is most useful. Convinced Bayesians (like myself) think belief is important too, but it is secondary to just having a method for building complex models that don’t fall over the way maximum likelihood models can (there’s a minimal sketch at the end of this comment).
4) metafor in R is a great general package that includes meta-regression and lots of other tweaks: highly recommended.
I’ll end on a general note: as an ecologist who became an analyst and now synthesizes medical data, I can say that research synthesis is generic. Applications vary, but the underlying issues are the same in both medicine and ecology. The journal Research Synthesis Methods is a good place to go to find articles that reflect this interdisciplinarity. Ecologists could learn an awful lot about meta-analysis, particularly in a decision-theoretic context, from the medical community. Similarly, the medical community could learn a lot from ecologists. Both groups need to communicate, and also have to engage with statisticians and methodologists. A bit more respect and communication all round would go a long way!
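To illustrate point 3, here is a bare-bones random effects meta-analysis in JAGS (via rjags), with hypothetical effect sizes yi and standard errors sei. The point is that every assumption is an explicit line you can change:

    library(rjags)
    model_string <- "
    model {
      for (i in 1:K) {
        yi[i] ~ dnorm(theta[i], 1 / sei[i]^2)  # within-study sampling error
        theta[i] ~ dnorm(mu, prec)             # study effects treated as exchangeable
      }
      mu ~ dnorm(0, 1.0E-4)   # vague prior on the mean effect
      prec <- 1 / tau^2
      tau ~ dunif(0, 10)      # vague prior on the between-study SD
    }"
    jm <- jags.model(textConnection(model_string),
                     data = list(yi = yi, sei = sei, K = length(yi)))
    post <- coda.samples(jm, c("mu", "tau"), n.iter = 10000)
    summary(post)

Want exchangeability only within subgroups, or a different variance structure? Edit the model block; no new estimation theory is needed.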