Data not shown: time to distribute some common sense about impact factors

It’s that time of year when all clear-thinking people die a little inside: the latest set of journal impact factors has just been released.

Although there was an initial flurry of activity on Twitter last week when the 2015 Journal Citation Reports* were published by Thomson Reuters, it had died down by the weekend. You might be forgiven for thinking that the short-lived burst of interest means that the obsession with this damaging metric is on the wane. But this is just the calm before the storm. Soon enough there will be wave upon wave of adverts and emails from journals trumpeting their brand new impact factors all the way to the ridiculous third decimal place. So now is the time to act – and there is something very simple that we can all do.

For journals, promotion of the impact factor makes a kind of sense since the number – a statistically dubious calculation of the mean number of citations that their papers have accumulated in the previous two years – provides an indicator of the average performance of the journal. It’s just good business: higher impact factors attract authors and readers.
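To make the arithmetic concrete, here is a minimal sketch of that calculation – with invented numbers, not any journal's real figures. A journal's 2014 impact factor is the number of 2014 citations to items it published in 2012 and 2013, divided by the number of citable items in those two years:

```python
# Hypothetical totals for an imaginary journal (not real data).
citations_in_2014_to_2012_13_items = 24_567
citable_items_2012_13 = 2_345

# The impact factor is just this ratio of sums, i.e. a mean.
impact_factor = citations_in_2014_to_2012_13_items / citable_items_2012_13

# Reported, of course, to the ridiculous third decimal place.
print(f"{impact_factor:.3f}")
```

As a mean over a heavily skewed distribution, the number says little about what any individual paper in the journal achieved.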

But the invidious effects of the impact factor on the business of science are well-known and widely acknowledged. Its problems have been recounted in detail on this blog and elsewhere. I can particularly recommend Steve Royle’s recent dissection of the statistical deficiencies of this mis-measure of research.

There is no shortage of critiques but the impact factor has burrowed deep into the soul of science and is proving hard to shift. That was a recurrent theme of the recent Royal Society meeting on the Future of Scholarly Scientific Communication which, over four days, repeatedly circled back to the mis-application of impact factors as the perverse incentive that is at the root of problems with the evaluation of science and scientists, with reproducibility, with scientific fraud, and with the speed and cost of publishing research results. I touched on some of these issues in a recent blogpost about the meeting (you can listen to recordings of the sessions or read a summary).

The Royal Society meeting might have considered the impact factor problem from all angles but discovered once again – unfortunately – that there are no revolutionary solutions to be had.

The San Francisco Declaration on Research Assessment (DORA) and the Leiden Manifesto are commendable steps in the right direction. Both are critical of the mis-use of impact factors and foster the adoption of alternative processes for assessment. But they are just steps.

That being said, steps are important. Especially so if the journey seems arduous.

Another important step was made shortly after the Royal Society meeting by the EMBO Journal and is one that gives us all an opportunity to act. Bernd Pulverer, chief editor of EMBO J., announced that the journal will from now on publish its annual citation distributions, which comprise the data on which the impact factor is based. This may appear to be merely a technical development but it marks an important move towards transparency that should help to dethrone the impact factor.

[Figure: EMBO J. – citation distributions]

The citation distribution for EMBO J. is highly skewed. It is dominated by a small number of papers that attract lots of citations and a large number that garner very few. The journal publishes many papers that attract only 0, 1 or 2 citations in a year and a few that have more than 40. This is not unusual – almost all journals will have similarly skewed distributions – but what the distribution makes clear is the huge variation in citations that the papers in any given journal attract. And yet all will be ‘credited’ with the impact factor of the journal – around 10 in the case of EMBO J.
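The effect of that skew is easy to demonstrate with a toy distribution (the counts below are invented for illustration, not EMBO J. data): a handful of heavily cited papers can drag the mean – the impact-factor view – far above what the typical paper achieves.

```python
# A toy skewed citation distribution: 100 hypothetical papers,
# most with 0-2 citations, a few runaway successes.
citations = [0] * 30 + [1] * 25 + [2] * 20 + [5] * 15 + [20] * 7 + [240] * 3

# The impact-factor view: the mean across all papers.
mean = sum(citations) / len(citations)

# The typical paper: the median of the same distribution.
median = sorted(citations)[len(citations) // 2]

print(mean, median)  # mean 10.0, median 1
```

Here the ‘impact factor’ of 10 is credited to every paper, even though the median paper picked up a single citation.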

By publishing these distributions, the EMBO Journal is being commendably transparent about citations to its papers. It is a useful reminder that behind the simplicity of reducing journal performance to a single number lies an enormous spread in the citations attracted by individual pieces of work. As Steve Royle’s excellent analysis reveals, the IF is a poor discriminator between journals and a dreadful one for papers. Publishing citation distributions therefore directs the attention of anyone who cares about doing evaluation properly back where it belongs: to the work itself. The practice ties in nicely with articles 4 and 5 of the Leiden Manifesto.

So what can you do? Simple: if in the next few weeks and months you come across an advert or email bragging about this or that journal’s impact factor, please contact them to ask why they are not showing the data on which the impact factor is based. Ask them why they are not following the example set by the EMBO Journal. Ask them why they think it is appropriate to reduce their journal to a single number, when they could be transparent about the full range of citations that their papers attract. Ask them why they are not showing the data that they rightly insist authors provide to back up the scientific claims in the papers they publish. Ask them why they won’t show the broader picture of journal performance. Ask them to help address the problem of perverse incentives in scientific publishing.

*The title is somewhat confusing since the 2015 JCR contains the impact factors calculated for 2014.

This entry was posted in Open Access.

10 Responses to Data not shown: time to distribute some common sense about impact factors

  1. I’m seriously considering doing just that when I get IF spam again. On the other hand, what data are they supposed to show? What data is it that EMBO is showing? What citations are counted? The ones that are negotiated with TR or all citations? What sources are considered?

    If the journals only show the negotiated data, it’s about as honest and transparent as drawing the curves by hand. If they show other data, which one?

  2. I’m not sure what is learned from publishing the distributions by journals. Firstly, it is reasonably well-known that most papers receive minimal numbers of citations; secondly, the graph is not in a comparative context (unlike Impact Factor where most researchers have some idea of comparators); thirdly, like journal impact factor, there is still no data to say the graph is of any use for decision-making, so what is the use of asking journals to provide these data? Finally, I suspect that the citation data do not belong to the journal or its editor and they are not free to publish these graphs; the data are collected with substantial effort and expense by Scopus and Thomson Reuters buying thousands of journals and entering each citation.

    • Stephen says:

      Thanks for the comment, Pat. To respond to your points:

      1. It may be well known that most papers receive low numbers of citations but I don’t think the skew of the citation distributions is widely understood, so it is good to put that information in the public domain.

      2. If the data are made available, comparisons can readily be performed – as in the revealing analysis by Steve Royle that I linked to in the piece.

      3. I disagree that “there is still no data to say the graph is of any use for decision-making”. I and many others have sat on numerous committees where people (and their grant applications) are assessed on the basis of impact factors. I have had universities ask me to consider JIFs when assessing candidates for promotion. I have seen departmental documents providing JIF data to staff in order to guide their publishing choices. It is widely known that selections of papers for the REF are strongly influenced by JIFS, despite HEFCE’s assurances that they will not be taken into consideration. The whole reason for the emergence of DORA is because of the widespread appreciation that JIFs have an undue influence in the way assessments are made.

      4. The proprietary nature of the data on which JIFs are calculated is a problem. It would be far better if these data were openly available.

      • Thanks for response.
        1) Agreed the skew is not as widely known as impact factor; but what do we do with this information? (Other than say it is yet more evidence IF is not a good metric.) Is there anything wrong with a few papers getting lots of citations, and should there be an aim to change this? I certainly try to avoid writing papers that get no citations, but not entirely successfully!

        2) I’d looked at the link and thought it a nice analysis – more evidence against use of IF was my take-home. (Just what more evidence do academics need …?)

        3) Couldn’t agree more with your comment about the misuse of impact factors. I’ve complained about them being included in a couple of recent promotion cases I’ve reviewed. Given hundreds of paragraphs of REF instructions, what could be more clear than Paragraph 53 of Panel criteria and working methods: “No sub-panel will make use of journal impact factors, rankings or lists, or the perceived standing of the publisher, in assessing the quality of research outputs.”? Certainly this was followed to the letter on the sub-panel where I was a member, and I don’t think anybody on the panel thought IF was any sort of surrogate for research quality.

        4) Here I disagree with you. I would object to using public money to fund large-scale collection and analysis of citation data. The ‘preliminary evidence’ of the commercial data does not suggest there would be value in a more rigorous study. Spending lots on such a study would be likely to encourage even more widespread misuse, and searches for spurious correlations.

  3. Stephen says:

    Pleased to see that PeerJ has taken up the challenge…

  4. Pingback: Skewering the impact factor » Responsible Metrics

  5. Pingback: The curious case of impact factors | Sara Hänzi