Impact factors are clouding our judgement

Nature has an interesting news feature this week on impact factors. Eugenie Samuel Reich’s article — part of a special supplement covering various aspects of the rather ill-defined notion of impact — explores whether publication in journals such as Nature or Science is a game-changer for scientific careers.

The widely held assumption is that it is. And the stories from young scientists interviewed by Reich*, who had almost all published in Nature or Science or Cell back in 2010, would appear to confirm that. Their papers in prestige journals won jobs or grants or opened doors to clinical trials that had previously been shut. Or at least that’s what they assume or believe or feel; no-one can quite be sure because the rules of the game are unspoken and unwritten.

Nature Cover - Oct 17, 2013


But the trouble is that these unofficial rules appear nigh on universal. I was certainly mindful of them when I embarked on my research career more than twenty years ago (as I mentioned in my contribution to this week’s Nature Podcast). Regular visitors to this blog will be aware that since then I have modified my views and now see the excessive influence of impact factors as a kind of addiction for which the scientific community needs to find a cure.

It can be a hard argument to pitch because the culture of dependence is so embedded. The lure of high-impact journals is strong and underpinned by some rational motivations. As noted by Finnish scientist Annele Virtanen, one of Reich’s interviewees, the competition for publication in Nature or Science acts as a spur for scientists to be ambitious in their research. No bad thing, of course, but the trouble sets in when rewards are tied too closely to the particular achievement of a Nature or Science paper.

While it is certainly true that many of the papers published in these journals are of very high quality — and that on average they garner more citations than papers in journals focused on particular disciplines — it is too often forgotten that the impact factor not only disguises the very real variation in the performance of papers within any one title but also flatters them: the dubious method of calculation skews the measure of average performance significantly towards the higher end of the distribution.
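To see concretely how an average can flatter, here is a toy illustration (the citation counts below are invented for the purpose, not real journal data):

```python
import statistics

# Invented two-year citation counts for 20 papers in a single journal:
# a long tail of rarely cited papers and a handful of big hitters.
citations = [0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 6, 8, 12, 30, 55, 120]

impact_factor = sum(citations) / len(citations)  # the mean, which is what a JIF reports
typical_paper = statistics.median(citations)     # what the middling paper actually achieves

print(impact_factor)  # 13.1
print(typical_paper)  # 3.0
```

Two highly cited papers drag the mean to more than four times the median, which is why publishing the full distribution is so much more informative than quoting the single number.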

The scientific community too often overlooks the granularity of the data and thinks only in terms of impact factors. Thus everyone who gains entry to Nature or Science wins an impact factor prize — irrespective of the actual significance of their work; and of course, those who narrowly fail to make the grade (in an assessment process that is highly stochastic) lose out. Virtanen, the recent beneficiary of this system, says “I can’t see so many bad sides” — a common enough view. But would she prefer to trust her future prospects to the uncertainty of getting her next paper through the narrow doors of the very top journals or would it be better to be able to rely on a system that does a more rigorous and fairer job of assessing what she has actually done?

There are positive moves in this direction. Sandra Schmid, head of cell biology at the University of Texas Southwestern Medical Center, has recently taken steps to move away from the over-reliance on impact factors in her hiring procedures. I am pushing for similar revisions at my own institution. These moves chime with the recent San Francisco Declaration on Research Assessment, which hopes to encourage all stakeholders in the scientific process — universities, funders, publishers and learned societies — to revise and improve their methods of assessment, specifically to eliminate the unhealthy lure of the impact factor. In the UK, the Wellcome Trust has long had a stated policy of not considering where applicants’ work is published when judging grant proposals, a policy now adopted by the UK Research Councils. But it is one thing to formulate a policy; quite another to make it work.

In the editorial accompanying this week’s supplement on impact, Nature restates its long-standing opposition to the misuse of impact factors in judging individuals or individual papers (a position that has been usefully repeated in other Nature-branded titles such as Nature Chemistry and Nature Materials). This is laudable but insufficient. It sits uneasily with the full-page adverts that appear annually with the announcement of yet another incremental rise in impact factor. This recent one, trumpeting Nature’s 2011 IF of 36, is accompanied by the strap-line ‘Energize your scientific career…’. What are researchers to make of that if not a repetition of the unspoken mantra that publication in top-tier journals is vital to real success in science?

The editorial also repeats the line that the obsession with impact factors is a problem for the scientific community to address. And that is true — it is a largely self-inflicted problem and it is primarily our responsibility to sort it out. But I find it odd that Nature appears to see itself as apart from that community, especially when its editors and reviewers are drawn from within it. I don’t think the dividing line is so easy to discern — especially given the supportive commentary of some Nature journal editors.

Where I do agree strongly with the editorial line is in the declared need “for research evaluators to be explicit about the methods they use to measure impact.” In this, Nature, and indeed all scholarly journals, can help — and at negligible cost. I call on them to publish all the data on which their impact factor calculations are based. Every year, when the new impact factor is released and advertised, please also publish the citation numbers and distributions on which it is based. This transparency will help to demystify the magical lure of that one number by revealing a truer picture of the performance of all the papers that contribute to it. It will reveal the variation in granular detail — the big hitters and the damp squibs. Comparison between journals would be enriched because the real overlap in the citation distributions — too often forgotten in the obsession with just one number — would be made evident.

PLOS is already leading the way in making this type of information available. Nature could do a tangible and valuable service to the scientific community by a simple act of transparency. It could blow away some of the clouds that are presently obscuring our judgement.

Now that I come to think of this proposal, I can’t see any bad sides.

Update (20-10-2013, 15:50) — If I’d had the time I would have read all of the articles in Nature’s supplement before publishing this post and made a point of including a link to the piece by David Shotton on efforts to make citation data open.

*Update (22-10-2013, 15:33) — The original version of this post referred to the Nature author as Eugenie Samuel Reich, which is her full name; but she kindly informs me that her surname is simply Reich. The text has been modified to reflect this. Apologies for any confusion.


This entry was posted in Open Access and Scientific Life.

27 Responses to Impact factors are clouding our judgement

  1. Mike Taylor says:

    “While it is certainly true that many of the papers published in these journals are of very high quality …”

    I would dispute this, certainly in my own field of vertebrate palaeontology. It’s simply not possible to squish anything resembling a useful descriptive paper down into the length limits these journals allow — especially not for the spectacular and complete specimens that tend to make it into the tabloids.

    As a result we have papers like Sereno et al. (1999), widely known as the WIJP (Woefully Inadequate Jobaria Paper), which describes two new genera of sauropod and introduces a novel palaeobiological hypothesis (which BTW turned out to be wrong) all in five pages. Those descriptions are essentially of no scientific value whatsoever. Fourteen years on from this paper, we’re still awaiting any further description of Jobaria. (Happily there has been some subsequent work on Nigersaurus, the other sauropod described in that paper, though nothing like the proper descriptive monograph that the sensational material merits.)

    The science of palaeontology would unquestionably have been much better served had this paper been rejected from Science, and a proper description published in a proper journal instead.

    Reference

    Sereno, Paul C., Allison L. Beck, Didier B. Dutheil, Hans C. E. Larsson, Gabrielle H. Lyon, Bourahima Moussa, Rudyard W. Sadleir, Christian A. Sidor, David J. Varricchio, Gregory P. Wilson and Jeffrey A. Wilson. 1999. Cretaceous sauropods from the Sahara and the uneven rate of skeletal evolution among dinosaurs. Science 286:1342-1347.

    • Stephen says:

      It is certainly the case that there are papers of dubious or low value published by the likes of Nature and Science, but that doesn’t conflict with my general contention.

      • Mike Taylor says:

        Well, I’m making a rather stronger claim than that. I’m saying that across the whole of my field, papers in Science and Nature are of little scientific worth. That may not be true in other fields — I could hardly judge. But in palaeo, S&N papers are (in the best case) adverts for full-length papers that will subsequently appear elsewhere; or (worst case) an extended abstract describing a project that would have been good if someone had actually done it.

        • Stephen says:

          “S&N papers are (in the best case) adverts for full-length papers that will subsequently appear elsewhere…”

          That’s how Nature used to operate, of course — short, readable dispatches about the latest results. I’ve been delving into the archives from 1910-1940 in the past few weeks and really enjoying it. Part of me wishes Nature would return to that publishing model.

      • Sorry, I ought to have clicked on ‘reply’ for my comment below but was in a hurry. One need not resort to “I know a bad Nature paper” arguments – we have actual data and they show that whatever people have looked at, you can’t use it to justify the status of the GlamMagz:
        http://www.frontiersin.org/Human_Neuroscience/10.3389/fnhum.2013.00291/full

    • Phillip Lord says:

      A nice and more recent example comes from Science.

      http://www.sciencemag.org/content/337/6102/1628.abstract

      The actual paper is largely devoid of content, and you have to look at the supplementary material to see anything of use. This includes some code written in a PDF, from which it is very hard to extract. Ironic, really, as the code is used to translate data to DNA, from which it is also hard to extract.

      • Pep Pàmies says:

        It is often true that, for papers published in high-impact journals, part of the evidence necessary to support the claims is buried in the supplementary information, which is more often than not treated as a dumping ground for all the stuff that doesn’t fit in the body of the paper.

        Yet the stringent length limits of high-profile journals have a purpose. Those journals offer authors broader exposure, and it is in the authors’ interest to avoid putting off many non-specialists by writing long papers full of details that the non-expert is not able to grasp.

        In fact, in order to increase the reach of their most important work, authors ought to write a comprehensible, shorter story with the main evidence, and relegate the rest of the *necessary* evidence and discussion somewhere else (typically the infamous supplementary information), where interested specialists can find (ideally) all the necessary details.

        That said, there is a lot of room for improvement in the supplementary-information department.

  2. Hakim Meskine says:

    I was enjoying your article up until the end, when you called on publishers to release the data on which “their” impact factor calculations are based. I find that statement surprising. Surely you are aware that the Impact Factor is not based on data collected by the publishers, but on the so-called Journal Citation Reports aggregated by Thomson Reuters through their Web of Knowledge platform. Publishers are in no position to release that data since they do not “own” it.

    As to the data released by PLOS, I assume that you refer to article-level metrics, which, while tremendously useful, differ significantly from the Impact Factor. The IF for any PLOS journal, just like any other, is calculated based on Thomson Reuters’ JCR.

    Nevertheless, your overall point does stand: some transparency is indeed badly needed. At least until the scientific community is able to wean itself off this unhealthy fascination with this most unscientific measure of scientific impact.

    • Stephen says:

      I’m sorry to have spoiled your enjoyment… ;-) However, I am aware that IFs are calculated from data gathered by Thomson Reuters — I have noted it in earlier posts and didn’t bother to repeat it here. But my point was to encourage journals to be more transparent about where their impact factors come from. They are keen to advertise the number when it is released each year but few make the effort to dig beneath it and provide greater (and necessary) transparency — though see this exception by Nature Materials.

      As for PLOS, the data I linked to at the end of the piece includes a plot of citation statistics for PLOS ONE, made using data that (if I understand correctly) they gather using their own ALM API; it doesn’t give the same IF as Thomson-Reuters, but at least the data are there. Of course the question that arises immediately is why don’t other journals follow suit?

      And naturally I am also inclined to ask why Thomson Reuters chooses not to present the distributions on which their JIFs are based. We seem to be badly served.

      Through all the debates on open access publishing I have often heard publishers claiming to be ‘partners’ with the research community. I suspect that is a genuine impulse, albeit one conflicted by commercial interests. I would like to see our partners help out more in finding solutions to the deep-seated problems of impact factors.

  3. Stephen wrote:
    “While it is certainly true that many of the papers published in these journals are of very high quality …”
    Mike replied:
    “I would dispute this, certainly in my own field of vertebrate palaeontology.”

    And we don’t even need anecdotes or arguments to test that statement. The data is available:
    http://www.frontiersin.org/Human_Neuroscience/10.3389/fnhum.2013.00291/full

    Why are people using weak arguments when they have strong data at their disposal? Put differently: why would one weaken their own arguments by omitting the data that backs them up?

    • Pep Pàmies says:

      I can’t find any data in the paper you link to that supports the argument that the majority of papers in high-impact journals are not of high quality.

      With all due respect, I think you make the same mistake that people do when extrapolating the IF of a journal beyond its meaning. That is, because the IF is the average of a skewed (Pareto-type) distribution, it is most often not representative of the typical number of citations per paper.

      Similarly, here you claim that the data show that high-impact journals do not publish higher quality science than the average journal on the basis of a lack of meaningful correlation between journal rank and citations. But the coefficient of determination that you used is calculated for about 30 million publications, the majority of which are, of course, of low impact. So you seem to use the average to make a claim about the tail of the distribution (occupied by the high-impact bunch of journals). This is conceptually the same mistake as using the IF as a proxy for the typical number of citations of the ‘average’ paper (in fact, the mode of a highly skewed distribution is way off from its average).

      An example of the tail of a skewed distribution showing different behaviour than the rest of it is the journal cited half-life versus IF.

      [Note: I work for Nature Materials, yet my comments here are strictly personal.]

      • “Similarly, here you claim that the data show that high-impact journals do not publish higher quality science than the average journal on the basis of a lack of meaningful correlation between journal rank and citations.”

        Now such a silly thing we would never attempt – where did you get this idea from? Certainly not our paper.

  4. Ian Mulvany says:

    Some of these issues were discussed at the ALM meeting hosted by PLOS last week in San Francisco. Of particular interest will be the comments by Amy Brand who works on the tenure process at Harvard. They use citation counts in that process only as a reality check to see if the candidate falls within the normal expected range in terms of academic output on that measure. The bulk of the assessment is based on looking at their contributions in detail, and the recommendations of the person’s peers. You can see my writeup here: http://partiallyattended.com/2013/10/17/plos-alm-13-day2/

    • Stephen says:

      Thanks Ian – I’ll take a look.

      Do you know how technically difficult it would be for other journals to gather their own stats on citations (as PLOS seems to do)? What I would like to see ultimately are the distributions for different journals on the same plot — I suspect there would be a lot of interesting overlap.

      • Ian Mulvany says:

        There are three core sources for this information – CrossRef, Web of Science and Scopus. In addition, for biomedical literature you can also pull data from PubMed and Europe PMC.

        CrossRef can only deliver some of the data because some publishers do not allow their citation data to be used by others. ACS journals, for example, are usually quite tight fisted about this kind of information because on principle they do not tend to be fans of open systems.

        To pull data from Web of Science and CrossRef you need to have an access key from these providers, which could cost money to get, or could be provided for research purposes, depending on your situation. Usually these sources stipulate that the person pulling that data cannot redistribute it programmatically, which means that if a publisher did get their data from these sources, they would probably be prevented from pooling that data into a community resource.

        Data from PMC and Europe PMC is open, and freely distributable.

        Google Scholar also collects this kind of info, but does not allow it off their platform.

        When comparing citation counts, all sources report strong variation. Scopus and Google Scholar seem to be most broad in coverage. My understanding is that Google Scholar has potentially more clearly identified errors, but on the other hand indexes more data.

        For the purposes of calculating the impact factor, only WOS data points should be relevant, but those are of course closed.

        In terms of publishers getting any of this data themselves, at the moment I would recommend one of the following routes:

        - Ask WOS or Scopus directly
        - Run the PLOS ALM tool yourself, with API key integration that you negotiate with the appropriate party. We have just spun up an instance of this tool at eLife and are looking at making this data available via our own instance for sources that we are allowed to redistribute.
        - Pull data from altmetric.com – they have coverage of just under 1.4M articles, so a good snapshot.
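        For what it’s worth, the openly available per-article counts are easy to fetch programmatically. A minimal sketch in Python (the `is-referenced-by-count` field comes from CrossRef’s public REST API; the injectable `fetch` argument is just a convenience for working offline, and the DOI in the comment is illustrative):

```python
import json
import urllib.request

CROSSREF_API = "https://api.crossref.org/works/{doi}"

def crossref_citation_count(doi, fetch=None):
    """Return CrossRef's citation count ('is-referenced-by-count') for a DOI.

    `fetch` maps a URL to a parsed JSON dict; by default it performs a
    live HTTP GET against the public CrossRef REST API.
    """
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url) as response:
                return json.load(response)
    record = fetch(CROSSREF_API.format(doi=doi))
    return record["message"]["is-referenced-by-count"]

# Live call, so the number changes over time (DOI is a placeholder):
# crossref_citation_count("10.1371/journal.pone.xxxxxxx")
```

        Collect those counts across a journal’s papers and you have exactly the citation distribution discussed above.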

        We could put pressure on WOS to make distribution charts available per publisher. We could encourage better reporting across the industry through better mechanisms for displaying citation-level data for journals from whichever source is available. We should push to make citations open – they should not be copyrightable as they are facts.

        Until we have a fully open citation graph this issue is going to continue to be a pain.

        When you have access to this graph you can do amazing things. The folks at Eigenfactor have built great tools for discovery of content, and in a domain where this information is open – the physics arXiv – a couple of people can produce something wonderful such as http://paperscape.org/.

        Hope that helps.

        • Mike Taylor says:

          “Until we have a fully open citation graph …”

          It’s insane that we don’t have this yet.

        • Stephen says:

          Thanks Ian for such an informative comment. It would seem entirely reasonable to ask TR to provide journals with the data on which their IF is calculated so that this information could be published and thereby give a less misleading impression of relative journal performance.

          Do you know what sources of citation the PLOS ALM tool has access to?

          • Ian Mulvany says:

            As currently configured the PLOS ALM tool has access to three sources of citation data, one of which comes in two flavours.

            The PLOS tool can be provided with a private key to access Web Of Science, and Scopus, but this data cannot be distributed.

            It can be configured to get naive citation counts from Crossref using an OpenURL resolver, and if you are a publisher with a Crossref account and access to the full publisher set of Crossref data (access that has to be agreed to by other publishers via the Crossref “cited-by” agreement), you can access a richer data set. Generally most publishers tend to allow this level of access, including Elsevier. My experience while working at Mendeley was that the ACS stable of journals tend not to.

    • Stephen says:

      OK – I like the sound of what Amy Brand is doing at Harvard. This, from your blogpost, sounds very encouraging:

      “They look at citations to papers – journal name is not listed. This citation report is not extremely important, they just want to know if the person is in range, it’s only interesting if it indicates that there are any outliers. I’ll say that again. For the citation record they don’t look at the journal names. They only use citations to help understand whether the candidate is operating within the normal expected range within their discipline, and in comparison to their peers. It’s used as a sanity check, and not insanely used as the key piece of evidence.”

    • Stephen says:

      Thanks also for pointing me in Amy Brand’s direction. Her piece in eLife about appointment procedures at Harvard is great. (I’m partly leaving this comment here so I can always come back and find her article).

  5. Richard Sever says:

    Moving away from the tyranny of Impact Factors is something we all want, but even if IFs were abolished entirely, the ‘branding power’ of top journals may remain.

    It’s worth recalling that most of the Nature and Cell siblings were incredibly successful even before they were awarded Impact Factors. This was not solely attributable to the hard work of their professional editors… As an academic editor of a society journal once said to me, “The single most effective thing we could do to improve our journal would be to add the word ‘Nature’ to the title”.

    Winning the battle against IF misuse is one thing. Changing a culture in which papers are judged by the company they keep (in a journal) is another.

    • Stephen says:

      I think there is always going to be competition of this sort and that’s no bad thing. The problem is that JIF or journal name has come, unjustifiably, to be the dominant mark of achievement in research. We need the tools to facilitate judgements that are based on broader information — some additional metrics, for sure, but crucially also procedures that focus on the research that people have done.

  6. Harvey Kane says:

    I believe T/R explains the criteria and how it compiles the data for the IF of each journal.

    see: http://thomsonreuters.com/journal-citation-reports/

    That being so, why should a Publisher which uses the IF in its advertising republish what T/R makes readily available? Or, do you just want the advertising to state as published in JCR?

    • Stephen says:

      They certainly give more information than just the bald number. What I’d like to see (though it’s not mentioned in the link in your comment) is permission for journals to publish the citation distributions that the JIF is based on. Better yet, T/R would publish these themselves and put them in the public domain, but I don’t think that’s likely since, as I understand it, you have to subscribe to their Journal Citation Reports to be able to see the impact factors of different journals.

      The general response to my post, both here and on Twitter, appears to indicate that accessing the citation data needed to generate the distributions is “problematic”. That’s a great shame — I think there’s a powerful case for more openness on these measures.

  7. Pingback: The Badge of Honour for you Career? | BMS3016 2013-2014

  8. Pingback: The Schekman Manoeuvre | Reciprocal Space

  9. Pingback: Morsels For The Mind – 13/12/2013 › Six Incredible Things Before Breakfast