Are metrics a necessary evil, or can they be a force for good? Anthony Olejniczak sticks up for stats

Scholars have long debated the merits and pitfalls of metric-based evaluation – the pages of Scientometrics and JASIS&T contain dozens of analyses and variations of the h-index (to name just one metric). But in recent years, bibliometric data have gained attention beyond the scope of the scholars who study them.
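For readers who have not met it, the h-index is simply the largest number h for which a scholar has h papers cited at least h times each. A minimal sketch of the calculation, using invented citation counts purely for illustration, might look like this:

```python
# Illustrative only: compute an h-index from a list of citation counts.
# The citation counts below are invented for the example.
def h_index(citations):
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

print(h_index([42, 17, 9, 6, 6, 3, 1]))  # prints 5
```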

As research funding fails to keep up with the growth of institutions and researchers, competition for that funding becomes increasingly fierce and purse-holders turn to metrics to inform their decisions. Those who use or interact with universities also want ways to compare and distinguish between institutions. Governments, students, ranking companies, bibliometric providers, industry and funding bodies are all driving the demand for methods to quantify research outputs.


Academics should suggest suitable metrics before others are imposed on them

Metric system

In short, metrics are here to stay. Nonetheless, their ubiquity brings challenges, and many researchers express misgivings about, or even resent, the growing emphasis on metrics. Perhaps chief among these issues is deciding exactly what to measure. For instance, can we compare two universities in the discipline ‘chemistry’ if one department has mostly analytical chemists and the other, largely biochemists? These two subfields have vastly different rates and modes of obtaining funding and disseminating research, yet most quantitative methods operate at the ‘discipline’ level, insensitive to these variations, creating potentially dubious comparisons.

Another challenge is choosing the time frame. For example, a university administrator seeking to measure the impact of their policies is best served by assessment data that are current and frequently updated. With data extending back three, five or 10 years, they can measure the impact of their strategies after only one or two years; a researcher’s 30-year career citation count is unnecessary for that purpose. Conversely, full career histories are essential if we want to characterise the overall research prowess of an academic unit. Unfortunately, the relative ‘academic age’ of a unit is often overlooked in measurement schemes and metric-based rankings. This disadvantages relatively young departments, and can mask declining performance in formerly productive units.
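To make the ‘academic age’ point concrete: one crude, purely hypothetical adjustment is to divide a unit’s output by the combined post-PhD years of its members, so that a five-year-old department is not judged against a 30-year back catalogue. A sketch, with made-up numbers:

```python
# Hypothetical sketch: compare two departments' publication output per
# post-PhD year of their members, rather than by raw totals.
def output_per_academic_year(papers, years_since_phd):
    """papers: total publications by the unit;
    years_since_phd: each member's years since PhD."""
    academic_years = sum(years_since_phd)
    return papers / academic_years if academic_years else 0.0

young_dept = output_per_academic_year(120, [3, 5, 4, 6])        # ~6.7 papers per academic year
established = output_per_academic_year(900, [28, 31, 25, 30])   # ~7.9 papers per academic year
print(young_dept, established)
```

Even a rough normalisation of this kind stops a young department being punished for its youth, and makes it easier to spot an established unit coasting on historical output.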

Perhaps the most often-cited challenge is that some things simply cannot be measured, or cannot be adequately measured. A research university may be said to have three missions: teaching, research and service. However, only teaching and research are ‘measurable’, and in both cases it is not clear that the ultimate, and arguably the most important, impacts are captured (for example, student success after graduation, or local economic or public health benefits). Moreover, among the measurable aspects of research (articles and citations, awards won, funding obtained, conference proceedings presented, years since PhD etc) there are some disciplines in which these metrics are unfit indicators of a scholar’s activity. For example, performing arts and other disciplines where exhibition is a major means of dissemination are challenging to quantify because the breadth of opportunities and available venues for a performance or exhibition is tremendous. And, given the subjective nature of the material, it is unclear whether collecting records of such exhibitions is even a fruitful pursuit.

Data with discipline

These challenges are substantial, and there are many others not mentioned here; but they should not stop us from using metrics. Moreover, given the wealth of data now available, it is the academy itself – the body of researchers trained in making and improving observations – that is best placed to refine existing metrics and create new ones. The discussion within the ivory tower must change from ‘should we measure?’ to ‘what should we measure?’ Disciplinary experts have a window, possibly open only briefly, to determine the best way to measure themselves before others, less informed, impose metrics upon them.

This will also ensure that metrics are not simply a burden to be endured, but a genuinely useful tool. A refined taxonomy of subfields, taking into account researchers’ academic ages, collaborators, publishing habits and countless other data points, can be extracted from existing sources. These data enable predictive models and allow fair comparisons to be made between disciplines, institutions and countries. Such specific metrics and comparisons, honed by the scholars who know the discipline best, are a positive force. They can be used to promote the merits of a group’s research when applying for funding, to create an objective framework within which to mentor younger scholars and to guide administrative decisions.

But perhaps the greatest promise of metrics lies in cost saving. Assessment exercises such as the UK’s Research Excellence Framework and the US National Research Council’s Assessment of Research-Doctorate Programs require huge investments of time and money, often from the institutions being evaluated. There may never be a metric-based replacement for the judicious evaluations of expert panels and survey respondents, but supplementing them with well-chosen data could vastly reduce the time and expense.

If predictive models can accurately reproduce the results of human deliberation in these studies, then it is difficult to justify the costs of employing scholars in these extracurricular exercises.

Anthony Olejniczak is co-founder and chief knowledge officer at Academic Analytics