Impact factors declared unfit for duty

Regulars at this blog will be familiar with the dim view that I have of impact factors, in particular their mis-appropriation for the evaluation of individual researchers and their work. I have argued for their elimination, in part because they act as a brake on the roll-out of open access publishing but mostly because of the corrosive effect they have on science and scientists.

I came across a particularly dispiriting example of this recently when I was asked by a well-known university in North America to help assess the promotion application of one of their junior faculty. This was someone whose work I knew, and thought well of, so I was happy to agree. However, when the paperwork arrived I was disappointed to read the following statement in the description of their evaluation procedures:

“Some faculty prefer to publish less frequently and publish in higher impact journals. For this reason, the Adjudicating Committee will consider the quality of the journals in which the Candidate has published and give greater weight to papers published in first rate journals.”

Which means of course that they put significant weight on impact factors when assessing their staff. Given the position I had developed in public (and at some length) I felt that this would make it difficult for me to participate. I wrote to the institution to express my reservations:

“…I think basing a judgement on the name or impact factor of the journal rather than the work that the scientist in question has reported is profoundly misguided. I am therefore not willing to participate in an assessment mechanism that perpetuates the corrosive effects of assessing individuals by considering what journals they have published in. I would like to be able to provide support for Dr X’s application but feel I can only do so if I can have the assurance of your head of department that the Committee will work under amended criteria and seek to evaluate the applicant’s science, rather than placing undue weight on where he has published.”

The reply was curt: they respected my decision to decline. And that was it.

I feel bad that I was unable to participate. I certainly wouldn’t want my actions to harm the career opportunities of another but could no longer bring myself to play the game. Others may feel differently. It was frustrating that the university in question did not want to talk about it.

But perhaps things are about to take a turn for the better? Today sees the publication of the San Francisco Declaration on Research Assessment, a document initiated by the American Society for Cell Biology (ASCB) and pulled together with a group of editors and publishers.

Logo of the San Francisco Declaration on Research Assessment

The declaration, which has already been signed by over 75 institutions and 150 senior figures in science and scientific publishing, specifically addresses the problem of evaluating the output of scientific research, highlights the mis-use of impact factors as the central problem in this process and explicitly disavows the use of impact factors. I can hardly believe it. This is the research community, in its broadest sense, taking proper responsibility for how we conduct our affairs. I sincerely hope the declaration becomes a landmark document.

All signatories, whether they be funding agencies, institutions, publishers, organisations that supply metrics or individual researchers, commit themselves to avoiding the use of impact factors as a measure of the quality of published work and to finding alternative and transparent means of assessment that are fit for purpose.

The declaration has 18 recommendations, targeted at the different constituencies. The first one establishes its over-riding objective:

“Do not use journal-based metrics, such as Journal Impact Factors, as a surrogate measure of the quality of individual research articles, to assess an individual scientist’s contributions, or in hiring, promotion, or funding decisions.”

The remainder go into more detail about what each of the different players in the business of science might do to escape the deadening traction of impact factors and develop fairer and more accurate processes of assessment. By no means does this spell the end of ardent competition between scientists for resources and glory. But it might just be a step towards means of evaluation that are not — how shall I put it? — statistically illiterate.
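
To illustrate why applying a journal-level average to an individual paper can mislead, here is a minimal sketch in Python. A journal impact factor is essentially a mean: citations received in one year by the papers a journal published in the previous two years, divided by the number of those papers. The citation counts below are invented purely for illustration (they are not data from any real journal), and the variable names are hypothetical.

```python
from statistics import mean, median

# Hypothetical citation counts for ten papers in one imaginary journal.
# These numbers are made up for illustration only.
citations = [0, 0, 1, 1, 2, 2, 3, 4, 7, 180]

# A journal-level average, in the spirit of an impact factor.
journal_mean = mean(citations)

# The citation count of a "typical" paper in the same journal.
journal_median = median(citations)

# How many individual papers sit below the journal-level average?
below_mean = sum(c < journal_mean for c in citations)

print(f"mean = {journal_mean}, median = {journal_median}")
print(f"{below_mean} of {len(citations)} papers fall below the journal mean")
```

In this toy example a single heavily cited paper pulls the mean far above what most papers in the journal receive, so the journal-level figure says very little about any one of them. Real citation distributions may differ in detail, but this is the kind of skew that makes the average a poor stand-in for the individual paper.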

I urge you to download this document (available as a PDF), read it and circulate it to your colleagues, your peers, your superiors and those junior to you. Tell everyone.

And of course, you should sign it.

 

Update 17th May, 18:28: I have been discussing my decision (mentioned above) not to participate in the review of a promotion candidate over at Drug Monkey’s blog. He is very critical of my stance and I think may have a point (see his comment thread for details). As a result, while I have not changed my view of the reliance that the selection procedure at the institution involved places on the journal names, I emailed them this morning to offer my services as a reviewer (their deadline has not yet passed). I also pointed out this blogpost and Drug Monkey’s reply by way of explanation but also with a view to pursuing a discussion about their selection process. If they take me up on my offer, I think I can provide a review and incorporate into it my concerns about the implicit reliance on journal impact factors.


41 Responses to Impact factors declared unfit for duty

  1. Fantastic news. And of course you did right to opt out of a misguided assessment process.

  2. Pingback: Screwing over a junior colleague to make your point about Impact Factor is stupid | DrugMonkey

  3. stephenemoss says:

    Well done! Impact factors are an abomination, but don’t you think there will still be a hierarchy of journals in people’s minds? For those of us who started publishing before impact factors existed, Nature, Science and Cell were perceived as the ultimate challenge, and it was the challenge rather than the as yet unborn impact factor that fuelled the desire to publish in those journals. So whilst I agree that all papers should be taken on merit with complete disregard for where they are published, it is difficult to see that ignoring impact factors will level the playing field when it comes to the unquantifiable kudos of publishing in what I am sure will continue to be seen as ‘the top journals’.

    • Dave Fernig says:

      Impact factors go a long way back, to the early days of ISI, which Eugene Garfield founded in 1955. What has changed is their use as a proxy for quality.

      The so-called “top journals” now can publish papers that are unreadable. These have a huge number of authors, several PhD theses and postdocs of experimental evidence crammed into 4 pages and 20, 30, 40, 50, 60 pages of supplementary material. The four pages are written in a manner so dense that it has become a foreign language, which I do not recognise. Might as well be in classical Greek, which I cannot read or speak. This is not always ground-breaking science (contrast this with the DNA structure paper, which is a pleasure to read), but brutal trench warfare, won by sheer weight of numbers, for the greater glory of the PI.
      In such instances, we need to think about the training the postdocs and students are receiving. One thing is for sure: when they leave that lab, their CV looks great on impact factor, but they don’t know how to write a paper.
      Having spent yet another afternoon on the Faculty UoA REF subpanel committee ploughing through colleagues’ papers, yes, read it and then make up your mind and do not look too closely at the journal. It is also very interesting to consider senior people in a field who have clearly made a difference in our thinking. There are a significant number who did not publish these papers in “top journals”, instead choosing the journal of their learned Society.

    • Stephen says:

      Stephen – I’ve pondered this too. Even if we were to abolish JIFs tomorrow, I know the community would still want ‘quality destinations’ for their work. Some might even argue that their very existence acts as a spur for scientists to do their very best work. I’m not sure how to unpick that Gordian knot but it really has to be about appreciating what a scientist has done, not where they have reported it.

      If it wasn’t so late, I might be able to think straighter.

      • Mike Taylor says:

        To me, the most fundamental issue of all is to judge the paper, not the journal. If that means using another flawed metric, such as raw citation count, then that would be a huge step in the right direction, even though plainly not really good enough. But it’s insane that a never-cited paper in Nature is worth ten times as much as a 100-times-cited paper in The Obscure Journal Of Whatever, and that’s where judgement-by-journal has got us.

    • Steve Caplan says:

      I completely agree. In my circles, I don’t know anyone who sits and tabulates IF numbers for any of her/his published papers, but there is absolutely no question that the quality of most papers published in certain journals are almost always higher than the quality of those published in others. Irrespective of IF, I do feel that the desire to have manuscripts accepted into such journals (and I’m not talking about Cell/Science/Nature; just highly respected journals) is a motivating force for good science.

      I think the only way (which I would not suggest) to completely level the playing field would be to abandon peer-review and have everyone upload whatever data they see fit. This, of course (the post-publication review) would be a disaster–so I think journal hierarchies will be here to stay. But I also agree that the movement afoot to quantify everything is damaging. We see this not just in lazy attempts to assess scientists’ careers, but in the science itself. Often scientists are forced to rather arbitrarily “quantify” data that really can only properly be evaluated qualitatively.

      Just as an example, cell biologists such as myself can often observe qualitative differences in the distribution of a protein in the cell. While it may be possible to figure out some criteria that will measure a modest difference between cells treated one way and another way, these criteria often don’t do justice to the overall difference. We could, say, measure the decrease in protein X in the Golgi apparatus–and perhaps there will be a quantifiable difference of 10%. However, anyone qualitatively looking at the cells will see that there are other major differences–the protein distribution is more diffuse, the punctae are smaller, their shape differs and so on. This exactly parallels the idea of using IF as a tool for assessment.

      • Stephen says:

        I think we need to be careful with lines such as “there is absolutely no question that the quality of most papers published in certain journals are almost always higher than the quality of those published in others.”

        What has to be emphasised again and again is that such statements are only true on average. The problem with the mis-use of impact factors has arisen because people lazily assume that it is universally true that a Nature paper is always better than a J Mol Biol paper, for example. We need to find ways to celebrate and incentivise publication of good work rather than publication in a certain journal.

        • Mike Taylor says:

          I think we need to be careful with lines such as “there is absolutely no question that the quality of most papers published in certain journals are almost always higher than the quality of those published in others.”

          I would go much much further than this. It’s been shown that citation count of papers correlates only very weakly with the IF of the journals they’re published in, and that retraction rate correlates rather more strongly. Certainly in my own field, length and illustration restrictions mean that papers published in Science and Nature are essentially worthless as actual science to be built on (though of course invaluable to the authors as advertisements).

          Judging papers by the venue that they appear in is not just a conceptual mistake. It’s also a horrible, horrible practical mistake that reliably gives the wrong answers.

        • Steve Caplan says:

          I purposefully left out the C/N/S trio, because I agree that they are over-inflated, and stats show they have higher rates of papers retracted, etc. Having said that, the statement I made that you quoted above: that most of the papers in some journals are of higher quality than most of the papers in other journals–well if pressed, I will use a few names as examples.

          In my field, the journal Molecular Biology of the Cell is considered an excellent journal. There is little or no rejection at the editorial level, and this journal is run by scientists with active scientists as monitoring editors. It is also the journal of the American Society for Cell Biology (the very same ASCB that issued the proposal for research assessment that you discuss). MBoC, as it is known, has had a variable IF over the 15 or so years that I’ve followed it, and to be honest I don’t follow its IF. It could be anywhere between 4 and 10. But ask anyone in the field, and the journal and its standards meet with respect. Obviously there are ranges of quality in published papers, and occasionally poor quality papers in my field creep in. But very infrequently.

          At the same time, there is a relatively new journal known as BMC Cell Biology. Asked to review a manuscript for this journal several years back, I continually met with pressure from the company to accept substandard manuscripts. I suspect this is because the journal is exclusively online, and the journal publication charge is not inexpensive. After going through this experience, I vowed never to review again for this journal. I assume it continues to “thrive” although I never read papers published there. I would be concerned that conclusions might not be trustworthy, if other papers were treated the way the one I reviewed was.

          So YES, I do think that statement is accurate, and I provided a single example but have a wider experience with this sort of thing.

          • DrugMonkey says:

            So there is no possibility of a good and important paper landing in BMC Cell Biology?

            • Steve Caplan says:

              Read my virtual lips: …but there is absolutely no question that the quality of MOST papers published in certain journals are ALMOST ALWAYS HIGHER than the quality of those published in others.

              “So there is no possibility of a good and important paper landing in BMC Cell Biology?”

              Yes, there is, but it’s essentially rare.

  4. Bashir says:

    “I certainly wouldn’t want my actions to harm the career opportunities of another…”

    Do you think that in this case they have?

    • Stephen says:

      It is impossible to say but I think the risk to the candidate was minimal. The field in which our interests overlap is a large one and I imagine there was a large pool of possible reviewers to draw from. Most institutions would, I imagine, draw up a long-list of reviewers in case some were unable, for whatever reason, to provide an evaluation.

      That said, it was not a decision taken lightly. As you might see from the link to Drug Monkey’s blog above, others have different views.

  5. Pingback: Ninth Level Ireland » Blog Archive » Impact factors declared unfit for duty

  6. Mike Taylor says:

    “… asked by a well-known university in North America …”

    I can’t help feeling that you should name and shame in situations like this.

    • Stephen says:

      I don’t think there would be any point. Most institutions probably have a similar policy, even if it’s not written down. I hope the SF declaration might be the start of a movement to turn that situation around.

  7. DrugMonkey says:

    My concern was also that you opted out of your chance to make the covert advance. Participating, without reference to JIF, serves a purpose. It dilutes those of your colleagues that do choose to refer to JIF as if it were meaningful. Similarly, I think anyone that refuses to review for GlamourMags does a similar disservice. Play the game but refuse to play *their* game.

    • Mike Taylor says:

      It is their game. The only winning move is not to play.

      • DrugMonkey says:

        That is incorrect. If you could convince *everyone* not to play then perhaps. But this is unlikely and most certainly is not the case at present. Change and improvement can come from within. Every paper that sneaks in to a high JIF journal on actual merit, while lacking GlamourBait, strikes a blow for progress.

    • Stephen says:

      DrugMonkey, I don’t think it’s necessarily so cut and dried, though, as I commented on your response post, I could have pushed my argument with greater tenacity.

  8. Bob O'H says:

    Which means of course that they put significant weight on impact factors when assessing their staff.

    Really? The passage you quoted didn’t mention impact factors at all. You know that impact factors don’t measure quality, and the text asks you to “consider the quality of the journals”, not their impact factor. But you still conflated them.

    • Mike Taylor says:

      To be fair, the phrase “higher impact journals” is at least suggestive of the use of impact-factors. If the evaluators simply meant “higher quality journals” they would presumably have said so.

      But in any case, impact factor per se is not really the issue here. Although it’s a terribly flawed way of measuring the quality of a journal, it is at least a way of doing so. The problem is measuring one thing (journal quality, by any means) and using it as a metric of a different thing (researcher quality). That is just an abjectly stupid thing to do.

      It’s like judging Ryan Giggs to be a bad footballer because he plays for Wales, which is a bad footballing country. Even if you accept that the FIFA rankings of countries are OK (which many people don’t), judging the individual by which team he appears in is beyond the merely mistaken, and betrays a fundamental incomprehension.

      • Bob O'H says:

        It’s like judging Ryan Giggs to be a bad footballer because he plays for Wales, which is a bad footballing country.

        No, Giggs has little choice over which nation to play for. But one does choose which journal to submit to, and the journal chooses which papers to accept. So perhaps a better analogy might be that Giggs plays for Man U, who he has chosen to play for, and they have chosen to let him play for them.

    • Stephen says:

      Bob – The passage I quoted mentions ‘higher impact journals’, a phrase that I think most readers would interpret as a reference to impact factor. In any case, the broader point is that the name of the journal should be immaterial to the assessment. The important thing to evaluate is the work.

      • Bob O'H says:

        The passage also mentions “quality of the journals” and “first rate journals”. And doesn’t mention impact factors. So, at best you have a “dog whistle” argument, and I suspect it’s not strong unless you’re sensitised to worrying about impact factors.

        The broader point is more interesting, I think, but you only mention it in passing in the post.

        • Stephen says:

          Bob – maybe I’m missing something but you seem to be studiously ignoring the phrase ‘higher impact journals’ which occurs first and sets the tone for the paragraph. The other phrases that you do mention, ‘quality of the journals’ and ‘first rate journals’ re-inforce the impression that the location of publication is the most important factor under consideration, rather than the work itself (which is not mentioned at all). It is interesting to note that the institution in question did not challenge my interpretation; I believe that is because it is a natural one to make in the present circumstances.

          There is no dog whistle in play here — or if there is, it’s one that everyone can now hear. The not-so-subliminal message sent out by documents like this (and practice now common across the world) is that the name of the journal where you publish is more important than what you do. It is for that reason that people concerned about the misuse of impact factors (or their surrogate — journal names) have come together to formulate the SF declaration.

          The broader point — which I agree is more interesting and challenging — I have discussed at length elsewhere.

          • Bob O'H says:

            The other phrases that you do mention, ‘quality of the journals’ and ‘first rate journals’ re-inforce the impression that the location of publication is the most important factor under consideration, rather than the work itself

            Fine, but you’re shifting the goalposts. You wrote “Which means of course that they put significant weight on impact factors when assessing their staff.”, but no mention of impact factor was made: I didn’t ignore “higher impact journals”, and indeed checked to see if it said “higher impact factor journals”. It didn’t.

            Now you’re retreating to arguing against “the location of publication is the most important factor under consideration”, which is not the same issue.

            • Mike Taylor says:

              It is the same issue. Whether journal prestige is assessed by Impact Factor or some other metric is of very little importance here. (Stephen was arguably wrong to assume that IF was intended here, but it doesn’t materially affect the issue.)

              The issue is that papers (and therefore researchers) are routinely assessed not by their content but by their venue. Doing so is a mistake, plain and simple. It gives the wrong results, and rewards the wrong behaviour.

            • Bob O'H says:

              No, Mike, it isn’t the same issue. It is possible to judge journals without using impact factors, and indeed I think most scientists do that. For example, ask ecologists to rank the BES journals by reputation, and I doubt many will give the same rank as impact factor. Journal reputation is a social construct, and is not just based on impact factor.

              It’s astonishing to see the claim “Whether journal prestige is assessed by Impact Factor or some other metric is of very little importance here.” on a post with the title “Impact factors declared unfit for duty”.

            • Mike Taylor says:

              I suppose we will have to wait for Stephen to tell us what he intended. But for what it’s worth, I’d agree that the title “impact factors unfit for duty” is rather misleading.

              Leaving Stephen aside for a moment, my contention at least is that it’s judging papers by venue that is most fundamentally wrongheaded. Whether that judgement-by-venue involved impact factor or something different is a relatively minor issue. Yes, impact factor is a crappy measure even for evaluating journals. But that’s not what exercises me.

  9. Indeed, scientific publications should be assessed on the basis of their content rather than on the basis of the journal in which the publication is published. However, this means that each member of a committee that makes decisions about funding, hiring, tenure, or promotion should read thoroughly all relevant publications of the scientists that are evaluated, and be able to assess objectively their content. This is simply not feasible for a variety of practical reasons, including the limited time that committee members have, and the fact that the people that are available for participating in a particular committee might not be among the most relevant experts in the core field of the publications that are evaluated. In countries with relatively small or developing research communities there simply might not exist unbiased experts in the core field of the evaluated publications, while routinely involving foreign reviewers might not be feasible. This is why the use of the impact factor has thrived: the impact factor allows committee members to delegate part of their evaluation to the assessment performed by the 2-3 reviewers that initially accepted the publication. The problem is that the impact factor is a very weak and indirect estimation of the true relevance of a particular paper.

    Committee members should instead delegate their evaluation to all of the true experts in the core field of the assessed publication, who might have read that publication anyway during their routine research activities. Each scientist reads thoroughly, on average, about 88 scientific articles per year, and the evaluative information that scientists can provide about these articles is currently lost. Aggregating in an online database reviews or ratings on the publications that scientists read anyhow can provide important information that can revolutionize the evaluation processes that support funding or hiring decisions.

    For this to work, scientists should publicly share ratings and reviews of the papers they read anyway. Spending 5 minutes to rate a paper that has just been read would save a couple of hours for each committee member who is later tasked to evaluate that paper, for each of the several committees that assess that paper.

    You may already start sharing ratings and reviews of the papers that you read on Epistemio, a website that I have founded, at http://www.epistemio.com .

    You may read more about this at http://doi.org/mjr (R. V. Florian (2012), Aggregating post-publication peer reviews and ratings. Frontiers in Computational Neuroscience, 6 (31).). You may rate or review this paper at http://www.epistemio.com/p/pj0X34Ek .

  10. Pingback: The end of the impact factor as we know it? | Åse Fixes Science

  11. Stephen says:

    Bob – we’ve run out of nesting room so I’m picking up here.

    Your contention that “It is possible to judge journals without using impact factors, and indeed I think most scientists do that” is the ultimate source of our disagreement. In my view journal names, reputations and impact factors are so entangled in people’s minds that I’d say most scientists automatically (and perhaps even unconsciously) conflate them. Maybe that’s a perspective unique to the molecular life sciences.

    The key point that I suspect we are all agreed on is the need to shift the focus away from the journal, to the paper – the work itself. However you assess a journal’s impact, impact factor or reputation, that overall assessment is always an average of some sort that should not be applied to individual publications. That is the thrust of my piece and the SF declaration.

    • Bob O'H says:

      That’s certainly true. Although one could argue that when assessing an individual’s work one can use the average impact factor of their work. I think this is perfectly admissible, as long as one assumes that the JIF is a good measure of journal quality.
