Bora has a post up about impact factors that links to a discussion in Epidemiology. It’s the usual stuff: how awful they are because they have all sorts of problems. We all know this, but of course it doesn’t stop us all checking the IFs of the journals we want to submit to, and assiduously following our h-index (if anyone would care to cite this paper or this one I’d appreciate it. Thanks).
I mention this for a couple of reasons. First, the literature on impact factors, h-indices etc. is diffuse: everyone in science has an interest, so they get discussed all over the place: in ecology, chemistry, physics, mathematics etc. It would be nice if all this literature were brought together in a single review. Or perhaps it has been, but published somewhere obscure. Web of Science gives almost 1500 hits for the term, which will take a bit of time to go through. Connotea has about 170 entries with the tag “impact factor”, which is easier to handle. But if anyone knows of an overview, please shout! I guess this is also a subtle plea for you to add any other studies you know of to Connotea.
My second reason springs from thinking about Brian Derby’s recent post about the h-index and self-citation. This prodded me to think about what the fundamental problem with impact factors is. My conclusion is (not surprisingly) that we’re all doing it wrong. All the indices that have been proposed are simple to calculate, so it is easy to grasp what they are. But it is less easy, or indeed impossible, to understand what they tell us. A journal with an impact factor of 5 is better than one with an IF of 2. But better in what way? Try answering that question without parroting the definition of an impact factor, and you’ll see the problem. It is difficult to understand what exactly these measures are meant to measure.
The hope is that impact factors will correlate with quality (whatever that is), and although the metrics can be shown to behave in intuitively sensible ways, they can also be shown to behave badly in others. But this is all based on some intuitive understanding of how they should behave, and without making that understanding explicit, it is difficult to see how to make progress.
I think the problem should be tackled from a different direction, by trying to model the phenomenon of citation. Mathematically, we can think of citations as events, and we want to count these events. There is a class of models called, wait for it, counting processes, which are used in (you’ve guessed it) event history analysis. They model the way in which the count of events increases over time. For example, they can be used to model epidemics, where the count is of the number of people (or animals or plants) coming down with the disease. The process can be described by how the rate of events changes over time or between different epidemics. The modelling here would be of the rate of citation, and how it changes over time and between journals and authors.
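As a minimal sketch of what a counting-process view of citation looks like, here is a simulation of one paper’s citation history as a non-homogeneous Poisson process. The particular rate function (rising to a peak and then decaying) and all the numbers are illustrative assumptions, not estimates from real data.

```python
import numpy as np

rng = np.random.default_rng(42)

def citation_rate(t, peak_height=4.0, peak_time=2.0):
    """Illustrative citation rate (citations per year) t years after publication.
    Rises to a peak and then decays; the shape and numbers are assumptions."""
    return peak_height * (t / peak_time) * np.exp(1.0 - t / peak_time)

def simulate_citations(rate, t_max=10.0, rate_max=5.0):
    """Draw citation times from a non-homogeneous Poisson process by thinning
    (Lewis-Shedler): propose events at the constant rate `rate_max`, then keep
    each proposal with probability rate(t) / rate_max."""
    times, t = [], 0.0
    while True:
        t += rng.exponential(1.0 / rate_max)
        if t > t_max:
            break
        if rng.random() < rate(t) / rate_max:
            times.append(t)
    return np.array(times)

cites = simulate_citations(citation_rate)
print(f"{cites.size} citations in 10 years, at years {np.round(cites, 1)}")
```

Everything about a paper’s citation history is then summarised by that rate curve, which is what we would want to compare across papers, authors or journals.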
We would expect a better paper to be cited more often, so the overall rate should be higher. But some papers are “slow burners”, only accumulating citations slowly. Others burn brightly, getting lots of citations for a short period before becoming passé. Judging the relative worth of these two types of paper might be difficult, but we can make progress by modelling the process of citation. At the very least we can reduce the problem to comparing a few parameters.
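To make that concrete, here is a small sketch contrasting two hypothetical papers whose rate curves are simple exponential decays with the same lifetime total but very different time courses. The functional form and the numbers are assumptions chosen purely for illustration.

```python
import numpy as np
from scipy.integrate import quad

def rate(t, total, decay):
    """Exponentially decaying citation rate that integrates to `total` over all time."""
    return total * decay * np.exp(-decay * t)

# Two hypothetical papers with the same lifetime total of 30 citations,
# differing only in how quickly those citations arrive.
papers = {"bright burner": dict(total=30, decay=1.0),   # most citations in the first year or two
          "slow burner":   dict(total=30, decay=0.1)}   # citations trickle in over decades

for name, p in papers.items():
    in_five_years, _ = quad(rate, 0, 5, args=(p["total"], p["decay"]))
    print(f"{name}: about {in_five_years:.0f} citations expected in the first 5 years")
```

Here the whole comparison comes down to two parameters per paper: the lifetime total and the decay rate.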
So what? First, by reducing the citation pattern to a simpler model, we can estimate the parameters of the model and predict what will happen. This is like predicting how many people will go down with ‘flu when the epidemic is just starting: the pattern is known well enough from previous epidemics that we know how the next one will behave. For citations, we have enough past papers to study that we should be able to get a good handle on what may happen to a new paper. There could be considerable parameter uncertainty: a paper could be a slow burner, or it could be destined for obscurity, but this affects the uncertainty in the predictions, not whether predictions can be made.
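A sketch of what that estimation step might look like, again under the assumed exponential-decay rate: fit the two parameters by maximum likelihood to the citations observed so far, then extrapolate. The citation times below are made up for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def rate(t, total, decay):
    """Assumed citation rate: decays exponentially and integrates to `total`."""
    return total * decay * np.exp(-decay * t)

def neg_log_lik(params, times, t_obs):
    """Negative log-likelihood of a non-homogeneous Poisson process
    with the above rate, observed over [0, t_obs]."""
    total, decay = params
    if total <= 0 or decay <= 0:
        return np.inf
    expected_count = total * (1.0 - np.exp(-decay * t_obs))
    return expected_count - np.sum(np.log(rate(times, total, decay)))

# Hypothetical citation times (years since publication) observed over 3 years.
times = np.array([0.1, 0.2, 0.4, 0.5, 0.8, 1.1, 1.6, 2.3, 2.9])
t_obs = 3.0

fit = minimize(neg_log_lik, x0=[10.0, 0.5], args=(times, t_obs), method="Nelder-Mead")
total_hat, decay_hat = fit.x
print(f"estimated lifetime citations: {total_hat:.1f}")
print(f"predicted citations by year 5: {total_hat * (1 - np.exp(-decay_hat * 5)):.1f}")
```

The uncertainty in those estimates (from the likelihood surface, or from a Bayesian fit) is exactly the parameter uncertainty mentioned above: it widens the prediction intervals but does not stop us predicting.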
So, from a partial pattern, we can predict ahead. From this we should be able to estimate, for example, the total number of citations a paper will get (that’s the area under the rate curve), or the number it will have in, say, 5 years. We would also be able to average over all papers in a journal, or over all papers by an author, and hence get scores for journals or for people.
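In symbols, writing lambda(t) for a paper’s citation rate t years after publication (whatever form the model gives it), these quantities are just

```latex
N_\infty = \int_0^\infty \lambda(t)\,\mathrm{d}t ,
\qquad
N_5 = \int_0^5 \lambda(t)\,\mathrm{d}t ,
```

and a score for a journal or an author would then simply be the average of one of these over its papers.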
Once we have this sort of model of citation, we can use it to create measures of quality. We can define precisely what our measure of quality will be: the total number of citations for an average paper, the number in 5 years, etc., and then calculate it from the model. Of course, this can be done even if the paper is still being cited; the standard errors are just larger. The advantage is that the quality measures can be defined so that they are understandable: we create measures of something that is easy to understand, and then apply them to the model. If you want to compare fast and slow citees, a discount can be applied, so that later citations count less. The discount can be chosen as one wants, and questions about its validity can be aimed explicitly at the discounting, i.e. we can see exactly where in the calculation any problem lies.
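A minimal sketch of such a discounted measure, assuming an exponential discount applied to each citation’s age (the discount rate and the citation times are invented for illustration):

```python
import numpy as np

def discounted_score(citation_times, discount_rate=0.2):
    """Discounted citation count: a citation arriving t years after publication
    contributes exp(-discount_rate * t), so later citations count for less.
    Set discount_rate to 0 to recover the raw citation count."""
    t = np.asarray(citation_times, dtype=float)
    return np.exp(-discount_rate * t).sum()

# Two hypothetical papers with ten citations each, arriving early versus late.
early_citee = np.linspace(0.2, 2.0, 10)
late_citee = np.linspace(2.0, 20.0, 10)
print(f"early citee: {discounted_score(early_citee):.1f}")
print(f"late citee:  {discounted_score(late_citee):.1f}")
```

If you think the discount rate is wrong, the argument is about that one number, not about some opaque index.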
The point of Brian’s post was to flag the problems of self-citation in indices. But for the modelling approach this isn’t so much a problem as an opportunity to extend the model and get another (well cited) paper. Self- and non-self citations can be modelled together, so we can separate out their effects, and also clearly see the self-citers for who they are. The term “marked point processes” will appear in the paper somewhere.
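A toy version of that idea: give each citation a mark saying whether it is a self-citation, and let each marked stream have its own rate (here just a constant rate per stream, and a made-up 25% self-citation fraction).

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical citation record for one author: each citation carries a time and
# a mark saying whether it is a self-citation. In a marked point process the
# streams are modelled jointly; here each mark just gets its own constant rate.
t_obs = 5.0                                    # years of observation
times = np.sort(rng.uniform(0, t_obs, size=40))
is_self = rng.random(times.size) < 0.25        # assume ~a quarter are self-citations

self_rate = is_self.sum() / t_obs              # self-citations per year
other_rate = (~is_self).sum() / t_obs          # citations from others per year
print(f"self-citations: {self_rate:.1f}/year, citations from others: {other_rate:.1f}/year")
```

Any of the quality measures above can then be computed from the non-self stream alone, with the self-citation rate reported alongside it.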
I would argue that this approach lets us define quality in a quantitative way (and argue endlessly over the definition, of course), and then measure it explicitly. We will know precisely what we are measuring, and we should be able to understand more easily any unintended effects on the measure. For example, if we define the quality of a researcher by the number of citations per paper they get in 5 years, it will be easier to say “yes, Prof. Trellis’ papers are well cited for a year, but then everyone gives up, whereas Dr. Lyttleton’s papers continue to be cited for decades”. And we can back that up with numbers.
Sounds great, doesn’t it? There are a couple of problems:
- Finding a good model. There are several possibilities, of course, and none will be perfect. They will have different assumptions, and may or may not fit. The challenge is to get something about right but not too complex. But this is true for any statistical modelling.
- Getting the data. Someone has to trawl through Web of Science for a few months.
- Finding the time/money: The big one! If someone gives me the funding, I could try to hire a student or post-doc researcher. This is the usual problem: I also have N other projects I should be working on. Anyone want to collaborate?
- Persuading people to use the methods: We can do the modelling, but if nobody is going to use the methods, does it matter?
- Finding simple approximations: I suspect a big problem would be that the computational time is too long. So finding something that is simple to calculate but strongly correlated with the complex measure would help.
Once all this is done, I will be able to take over the world. Evil laughter is optional.