Measurements: the Good, the Bad and the Ugly

Measuring us seems endemic to academic life now (as indeed to the NHS or local Councils or any other part of our civic society). The Forum for Responsible Research Metrics is charged with coming up with ways of using metrics in our universities that are constructive and relevant. There are far too many potential metrics out there – and a whole set more will be devised when Jo Johnson’s recently announced Knowledge Exchange Framework gets worked up into something concrete. I’m involved with the earliest of the (currently) three academic frameworks, the Research Excellence Framework, through my role as Chair of the Interdisciplinary Advisory Panel for REF2021, and we will have our own part to play in checking that any criteria we use don’t rely on numbers that are precise but meaningless. The Teaching Excellence Framework (I am tempted to refer to it as the ‘middle-aged’ framework, as it was the middle one to be constructed, but since it’s only in its first year of evaluations that is probably misleading) has been criticised by many as designed to come up with numbers that don’t actually address the questions that need answering about our university teaching.

To take another example from this week, ‘Oxford tops research grant table for third year’ said the headline, entirely truthfully if you consider the amount of money awarded. But, being a proud Cantabrigian, I looked further down to see how Cambridge had performed. Lo and behold, the success rate for Cambridge was actually a couple of percentage points higher than for Oxford. Which figure of merit is more important? Oxford had put in a larger number of (potentially larger on average) grant applications, yielding a higher income than Cambridge, but proportionately fewer of them had succeeded. I could claim that Cambridge had been the most successful university, by literally considering the success rate (Leeds had also returned a success rate of 32%, the same as Cambridge). Perhaps what is most interesting, for Cambridge at least, is that its success rate had shot up by 4%: the grant-writing had, apparently, improved significantly. Is that the metric we should be looking at: most improved?

Several universities have been told off this week by the Advertising Standards Authority for using misleading numbers. For instance the University of Leicester must stop claiming to be “a top 1% world university”, although no doubt some league table somewhere would allow them to be so described; and the University of Strathclyde has been told to change the claim “We’re ranked No. 1 in the UK” for physics, but someone, somewhere, had presumably put them at the top of a specific list. Metrics are tricky animals and academics (or university administrators) are incredibly good at finding an appropriate number that can be used to their advantage. The University of Poppleton does this reliably on the back page of the THE each week.

The trouble is, when one moves beyond institutional metrics to individual ones, things can get really nasty. How much grant income did you produce last year? If the answer doesn’t satisfy the senior management, are there consequences? What is your h index, and is it a consideration in whether or not you get appointed in the first place or promoted subsequently? What about the impact factor of the journal in which you published your last paper? Does this matter? If your institution has signed up to DORA (as I have in a personal capacity), it may not matter, as long as the promotion committees remember. Does the management want to have these sticks to beat you with?

It is an interesting irony that the organisation that seems to have been most passionate about removing the ‘individual’ from REF2021 is the Royal Society. It might be seen by many as an elitist organisation, but actually it explicitly stated, in its submission in response to the REF2021 consultation earlier this spring:

The decoupling of individuals from output should reduce pressure on those who take time out of research and on early career researchers, whose recruitment would be based more on their research potential and not their ‘REFability’. It would also begin to remove disincentives to hire, and reverse the demotivation and restore morale to technology specialists, industry collaborators and members of large teams.

And

Using a new volume measure and portfolio approach to assessment means there would be no need to set a maximum or minimum number of outputs per staff member, thus avoiding recoupling outputs and individuals, with all the invidious consequences highlighted by the Stern Report.

However, the collective consultation responses rejected this position. Universities wanted to be able to tie outputs to individuals as, one has to assume, a management tool. Far from seeing the Stern recommendation to move towards an institutional REF as a freeing up of academe, where outputs were valued in the round and not by how they scored individuals, they resisted such an attempt. As a member of the grouping within the Royal Society that helped to produce our response to the consultation, I am dismayed that something that will continue to cause immense problems in the life of the individual has been reinforced in the face of an attempt by the Stern Review and HEFCE to reduce it. As a champion for diversity, it upsets me that there will still be a need for individuals with ‘special circumstances’ – having a baby perhaps or long term sick leave – to produce a justification of why they haven’t produced even a single output within the REF period. The removal of any tie-in of outputs to individuals would have obviated this need.

Metrics in general are designed to fit the ‘norm’, to fit what people collectively believe an ideal academic looks like. One who breaks the mould – for instance by working part-time, by being more interdisciplinary than their colleagues or by preferring to produce fewer but more thorough papers – can be disadvantaged by a standard set of metrics. Using metrics which haven’t been thought through sufficiently to look for inherent biases against such individuals will likely disadvantage them. One obvious such group is women. Research has variously shown that women are likely to win slightly smaller grants (data based on Wellcome awards); are less likely to cite their own papers; publish less; and are less likely to be cited by others when they are the lead author. I don’t intend to argue why these findings might be as they are; I merely want to point out that metrics that don’t consider such matters will be damaging to the individual. This is particularly the case for citations and the slavish use of h indices. I hope the Forum for Responsible Research Metrics very much has these issues in its sights, issues which were highlighted in the original analysis of the possible use of metrics in The Metric Tide report, which I wrote about when it first appeared.

The impact of all these ‘measurements’ on the individual, for their health and well-being, is a crucial part of how we as a sector thrive – or, quite possibly, do not thrive. There are many issues about ‘objectification’ and ‘measurement’ that are deleterious for an academic’s mental health at any stage of the career ladder. I will have more to say about this in a later post.

 

 


2 Responses to Measurements: the Good, the Bad and the Ugly

  1. Brigitte says:

    I can’t help but think that ‘responsible metrics’ is an oxymoron. As soon as metrics are imposed on something, that something can no longer be done responsibly. I might be wrong, of course….

  2. Beppe Battaglia says:

    I agree with every word. However, I realised that metrics are the response to the inflation of universities: there are so many, all competing with each other in a free market for everything. The romantic in me would suggest that each part of the world should have a medic, a school and a university, and that resources should be spread equally. But the realist and cynic in me knows we cannot go back on the free market and, even if I twitch while I write this, perhaps we should embrace other neo-liberal approaches and regulate these metrics to create unambiguous outputs which cannot be manipulated or, if they have to be, only by a regulating body.