A Basketful of Metrics?

For those who were involved with any aspect of REF2014, it had similarities to a slow speed nightmare. For those embroiled in preparing the submissions, not only was it extremely, ridiculously time-consuming, but it was also a heavy burden of responsibility because the potential financial stakes were so high. I have no idea what it was like to be on the receiving end of the submissions this time around, but I do know what it felt like for RAE2008, when I sat on one of the panels: it was most certainly no bed of roses. I cannot imagine this time was any different. The person who commented to me in 2008 that after reading all that paperwork I must have a wonderful overview of the state of physics in the UK had no idea how hard it was to retain any facts at all by the end of the mind-numbing process. Indeed, in the interests of not making any improper disclosure of salient facts about performance, it was rather important not to remember anything.

So, as I wrote about during the preparation for submission, I could find nothing remotely funny to say about REF2014. On the other hand, I also had little conviction that any purely metric-based analysis would be viable, despite Dorothy Bishop’s strongly expressed view supporting the idea. Now, the publication of The Metric Tide, the ‘Report of the independent review of the role of metrics in research assessment and management’ commissioned by HEFCE, supports my belief. Metrics on their own, however convenient, are simply not robust enough to stand in for peer review. That is the bottom line of their findings, as it was in Elsevier’s analysis reported a few weeks ago (and if ever an organisation had a motivation for finding the opposite conclusion, surely a publisher who wants to trumpet impact factors might have been expected to have one).

The problem with metrics is that academic research is simply not monolithic or in the least bit homogeneous, plus academics as a community are smart and so can probably game most (all?) systems devised. Even within a discipline, heterogeneity of approach is the norm, not the exception. In physics, for instance, there is a huge difference between the field of particle physics – where teams of hundreds will be involved in most outputs from CERN – and my own sub-discipline of soft matter physics, where five authors would be a lot. Even in REF2014 this difference caused issues for different panels about when you had to justify your specific contribution to a submitted output and when not. Trying to find a single set of metrics that works across the board is just too challenging a task.

So the report comes out in favour of a ‘basket’ of metrics to be used in a light touch way in conjunction with peer review. Not even a nicely constructed (woven?) basket, in their view, can cover all situations without accompanying peer review. So, in one sense, this report isn’t very interesting. On the other hand, the carefully gathered evidence, and the correlation analysis the group were able to produce in that short time window before the information on output scores was permanently destroyed (provided in detail in an appendix to the main report), mean that we really are in a stronger position to say no to simple metrics. Which in many ways is regrettable. It means, as successive governments require us collectively to be assessed, monitored and scored, huge amounts of time, energy and consequently money will need to continue to be devoted to the exercise. Many senior academics will lose sleep over the forms as they try to second-guess how best to present their department’s case to optimise the cash they receive.

However, there are some really helpful snippets of information to be found in the report and useful recommendations for different parts of the academic ecosystem. Some refer simply to the importance of an open and interoperable data infrastructure, to enable data to be captured robustly, as James Wilsdon, the chair of the steering group, reports here. Naturally I was interested in what they had to say about equality and diversity issues. And what they said seemed very constructive and worthwhile for the community to reflect upon. They brought together evidence on unconscious bias as manifested in the lower number of citations that papers with women as ‘dominant’ author receive. Additionally, they noted the evidence for the lower number of self-citations women tend to make. It is also clear that systems such as h-indices are likely to disadvantage early career researchers, who may anyhow feel under extreme pressure to publish in journals with high impact factors – however much the JIF may be a discredited concept. All these factors indicate why pure metrics would be disastrous for the health of the academic community.

Which leads me into the final aspect where I think the report is really helpful. It highlights what sort of practice institutions may indulge in themselves with regard to metrics that are far from helpful for the wellbeing of individuals and the community. It is becoming increasingly clear that performance-informed research assessment increases the pressure on researchers without necessarily leading to any improvement in the research they do. Any identification of the use of internal metrics can lead to unintended (as well as intended) consequences. Organisations that have signed up to DORA (Declaration on Research Assessment) should ensure all parts of their institution abide by this declaration; it’s not always clear that that happens currently. Clear recommendations are made in the report (I pull out three from a much longer list):

‘At an institutional level, HEI leaders should develop a clear statement of principles on their approach to research management and assessment, including the role of quantitative indicators.

Research managers and administrators should champion these principles and the use of responsible metrics within their institutions.

HR Managers and recruitment or promotion panels in HEIs should be explicit about the criteria used for academic appointment and promotion decisions.’

These recommendations are to be welcomed. Let us hope organisations pay heed.

 


One Response to A Basketful of Metrics?

  1. David Stern says:

    I am about to actually read the report but I think all it will show is that the kind of peer review used in the REF measures something different than the metrics they used. We don’t know from that which is better. I’m rather sceptical of the quality of this type of secondary peer review. It just seems a huge waste of effort to peer review again what was originally peer reviewed when articles were published, grants were awarded, and people were hired.