In which, late as ever, I remember to say something about peer review.
There has been a bit of a flurry of commenting about peer review around OT just recently, with both RPG and Girl, Interrupting weighing in. The general consensus among them, other bloggers here, and most commenters on their posts, would be something like:
“Peer review: imperfect but necessary – and better than the alternatives”.
Now, I guess that would also broadly sum up my own view. I would add, I think, that peer review is more of a problem for grants than for papers. But, of course, this is more a function of the ludicrous success (or rather failure) rates for grant applications than of the refereeing process. More on that sort of stuff from me here, and in Stephen Curry’s recent post here.
Anyway, as I am feeling guilty (as usual) about not posting anything here for a while, I have disinterred what I wrote about peer review for an editorial in Physiology News a couple of years back (Autumn 2008, to be precise). Somewhat pompous and self-consciously “balanced” style, yada yada, but you’re used to that by now.
Peer review, which lies at the heart of the scientific process, has a long history; the style of (‘single blind’) anonymous peer review now in common use was described by the Royal Society of Edinburgh in 1731, though the basic idea is far older. Despite this, widespread application of a system of anonymous expert referees did not become commonplace until later than is often believed, around the middle of the 20th century. Expert peer review in some form is now pretty much universal in scientific journals. So is there anything new to say?
Science vs non-science
One point that scientists sometimes need to remember is that peer review is not just important ‘internally’, but externally too. Properly functioning peer review is a key way to distinguish science from non-science (nonsense?). In a world where we are bombarded by apparently scientific claims – often for things that are being sold to us – it is important to have ways of telling sales talk and science apart, and peer review is one. As Sense About Science put it, peer review is an ‘essential arbiter of scientific quality’(1).
A fly in this ointment, of course, is that there is peer review and peer review. The top couple of dozen, or possibly more, journals in established subject categories – such as ‘physiology’ – maintain rigorous review processes, as we all regularly experience. But there are a lot of journals, and reviewing standards vary widely. As a recent report for the Publishing Research Consortium puts it (2):
‘Because the peer review standards of different journals vary, it is widely believed [by scientists] that almost any genuine academic manuscript, however weak, can find a peer-reviewed journal to publish it if the author is persistent enough.’
So, while publication in a peer-reviewed journal is some kind of quality mark, there is a blur at the edges.
Bibliometrics … inevitably
One way we might seek to gauge the reliability of work published in peer-reviewed journals is journal ‘pecking orders’, nowadays often substituted by bibliometric rankings. The most common of these, journal impact factors (IFs), while derided regularly and with plenty of justification, are clearly here to stay. As The Physiological Society’s journal editors put it a couple of years ago:
‘There is considerable divergence of opinion about the significance of [impact factors as a] measure of journal quality, but author surveys continue to confirm that a journal’s IF is among the most important considerations in choosing where to publish.’
That is, impact factor is a proxy for pecking order, and thus perhaps, to a limited extent, for reviewing standards. This may indeed be the only thing for which journal IFs are useful, since their worthlessness in assessing individual scientists and their work has been attested repeatedly, most trenchantly by David Colquhoun (3).
Of course, journal IFs are only useful even in this restricted fashion within subject categories; their inadequacies when NOT comparing like with like are well described. While general science journals like Nature and Science have far higher IFs than journals like J Physiol and J Gen Physiol, I have yet to meet a single scientist in physiology who thinks that the reviewing at the general journals is more rigorous. To take another example, comparative physiology journals have notoriously low IFs, though the technical quality of the work they publish is high.
Can we improve anything about peer review?
While peer review is clearly vital, it is also imperfect; I suspect there is no scientist alive who does not have some complaint to recount. To their credit, journals try hard to ‘audit’ their review processes, and to tweak them in ways that will help them work better. The latest fashion seems to be for more explicit instructions to reviewers as to how to assess a paper and write a reviewer’s report. David Linden, the editor of the Journal of Neurophysiology, writes about this here (4), and Nature Cell Biology has also weighed in (5).
One thing I was glad to see in the latter was the exhortation to reviewers to ask ‘Are all claims made supported by the data?’ A personal view is that reviewers tend to be harder on whether they think experiments are technically correctly done – as Nature Cell Biol puts it: ‘Are key experiments or crucial controls missing? Are the data significant and definitive?’ – than on the inferences authors subsequently draw from the data. Though this is understandable, the danger here is that if an author repeats the same rather tenuous extrapolation in several published papers, they can then conceivably write a review citing all these papers and boosting tenuous extrapolation to the status of ‘well-tested hypothesis’ – next stop the textbooks. So a personal plea would be for reviewers to pay a bit more attention to what authors say in the Discussion, as well as in the Methods and Results.
This also brings me to a final point – what happens if you see a paper that clearly has something in it that is wrong, but that the journal’s referees have missed? It is certainly possible to write to the journal or its editors, though such letters rarely, in my experience, see the light of day.
My preferred solution to this problem is an electronic response thread following the online version of the article, something favoured by some medical journals and increasingly by exclusively online journals. While it is perfectly possible to pen a short review commenting on a paper or papers, this rarely happens either, mostly for ‘activation barrier’ reasons. Writing a review, even a short one, is hard work. In contrast, penning an e-letter is a lot easier, and in the best cases generates quite an informative online debate. It is a bit like seeing the reviewing happening live, except after publication, and it sometimes shows up things the journal reviewers missed. Another recent Nature Cell Biol editorial reveals that response threads will be coming soon for the Nature Group journals (6) [Note: now in place, since this was written 2 yrs ago], and it will be interesting to see who else follows suit.
So to peer reviewers out there: keep up the reviewing standards, watch the authors’ extrapolations – and see you in an electronic response thread soon. Even after a few hundred years, this is no time to get sloppy.
1 “Peer review”. Sense About Science.
2 Peer Review: Benefits, perceptions and alternatives. Publishing Research Consortium.
3 Colquhoun D (2007). How to get good science. Phys News 69, 12–14. [online on his blog here]
4 Linden DJ (2008). Warm, fuzzy feeling. J Neurophysiol 100,1.
5 Good review (2008). Nature Cell Biol 10, 371.
6 What to publish? (2008). Nature Cell Biol 10, 247.
[More hotlinks later if I can be bothered!]
So what would I add to that today?
Well, one thing is that, like the new President of the Royal Society in his Horizon TV programme the other night, I stand by the second paragraph – we need peer review. To those like me who spend a fair amount of time combating pseudoscience, it is absolutely clear that “serious” peer review, along the lines described above, remains one of the major differences between real science and the cargo cult stuff.
Unfortunately, the practitioners of many pseudosciences have got wise to this, and have set up their own journals. These too practise a kind of peer review – it is just that it is review by one’s real peers, namely other people who have suspended their skepticism and critical faculties. Often this is done unknowingly, with people fooling themselves in their desire to believe – a kind of confirmation bias. But it is nonetheless a major problem, certainly in the area of complementary and alternative medicine (CAM).
There are certainly people who feel traditional pre-publication peer review should be replaced by something else, like ‘crowd-sourced’ peer review, or post-publication review. My friend David Colquhoun occasionally muses along these lines, and there are also vocal proponents like Cameron Neylon. I don’t think I am convinced. I am a fan of the idea of post-publication ‘crowd’ review – see my words above. But I have my doubts about whether it is really workable.
One reason is that journals that do run online comments threads to allow ‘crowd response’ tend to attract rather few comments on most papers. Nature, for instance, only seems to attract comments on articles when the topic is something that gets the trolls and nuts in a frenzy, at which point the ratio of insane to rational comments tends to be about 10 (or more) to 1. For examples, try anything mentioning the fringe types who do not accept that HIV is the cause of AIDS. That is not crowd review – more like spittle-flecked shouting.
The only journal I have seen thus far with an intermittently successful online comments set-up is the British Medical Journal, and even there it is patchy. Some articles attract reasoned critique. Others attract only the green ink brigade. Or, in the case of anything relating to vaccines, they end up where we just were with the Nature threads and HIV.
So in the end, I am back with the analogy that makes Cameron Neylon so tetchy. Yes, that Churchill one, about peer review being imperfect, but less imperfect than the alternatives.
Though we discussed this after RPG’s post on peer review the other day, I hadn’t then tracked down the real source of Churchill’s remark, which I will quote here.
“Indeed, it has been said that democracy is the worst form of government except all those other forms that have been tried from time to time.”
Per Wikiquote, and via Cameron Neylon’s blog, the source is The Official Report, House of Commons – otherwise known as Hansard – (5th Series), 11 November 1947, vol. 444, cc. 206–07.
Cameron Neylon makes the interesting point that the analogy we make between this and peer review is flawed, since alternatives to traditional peer review have not really been tried in the modern scientific era. And I think I would accept that that is a good reason to trial some alternatives, in a limited way. But until there is a really compelling body of evidence suggesting we dump pre-publication peer review, I shall be sticking with it, faute de mieux.