Review of the role of metrics in research assessment

Research Councils & HE sector bodies; Excellence Frameworks (REF, TEF and KEF); HE funding & research strategy; 2014

Identifying useful metrics for research assessment

What empirical evidence (qualitative or quantitative) is needed for the evaluation of research, research outputs and career decisions?
What metric indicators are currently useful for the assessment of research outputs, research impacts and research environments?
What new metrics, not readily available currently, might be useful in the future?
Are there aspects of metrics that could be applied to research from different disciplines?
What are the implications of the disciplinary differences in practices and norms of research culture for the use of metrics?
What are the best sources for bibliometric data? What evidence supports the reliability of these sources?
What evidence supports the use of metrics as good indicators of research quality?
Is there evidence for the move to more open access to the research literature to enable new metrics to be used or enhance the usefulness of existing metrics?

The responses of the RGS-IBG to consultations on this topic, dating back to 2008, have been critical of the use of metrics in research assessment, with a strong preference for peer review. Our position, and that of the community whom we consulted in writing this response, remains largely unchanged.

At the core of this is the academic community's trust in the process and the outcome of research assessment. The current REF, despite all its challenges, has largely sustained this trust.

We are aware of no evidence that metrics, established or new, can adequately capture the originality, significance and rigour of academic outputs. We do not believe that citation data, or other variants, quantify ‘quality’ independent of peer review. Analysis and evidence to this effect are presented in Richards, K. et al. (2009). The nature of publishing and assessment in Geography and Environmental Studies: evidence from the Research Assessment Exercise 2008. Area, 41(3), 231-243.

We have concerns about the integrity of the established sources of metrics (Scopus, Web of Knowledge, Google Scholar), in terms of both what gets counted and what can be abused.

In terms of research grant metrics, the bias is heavily in favour of capital-intensive disciplines. In many parts of the social sciences large research grants are not appropriate; research relies on data collected by others (the census is a classic example).

How should metrics be used in research assessment?

What examples are there of the use of metrics in research assessment?
To what extent is it possible to use metrics to capture the quality and significance of research?
Are there disciplines in which metrics could usefully play a greater or lesser role? What evidence is there to support or refute this?
How does the level at which metrics are calculated (nation, institution, research unit, journal, individual) impact on their usefulness and robustness?

There are inherent problems with existing metrics and with the granularity that would be needed for meaningful comparisons. It is well established that citation metrics show significant differences within as well as across disciplines. Geographers are commonly amongst the top 1% of cited social scientists, but their citation counts are several times smaller than those of Physical Geographers in the top 1% of the Geoscience or Environmental Science communities. Journal impact factors offer a different measure of this (recognising fully the difference between journal impact factors and citations of individual papers): 10 journals on the ISI Physical Geography list have an impact factor greater than 5, whereas only 1 Human Geography journal has such a value. Moreover, within the Human Geography list significant differences exist between sub-disciplines, for example between Historical and Economic Geography.

The issue of monographs remains.

In the 2007 report on bibliometrics from the University of Leiden Centre for Science and Technology Studies that underpinned much of the early discussion of metrics in the consultation prior to the development of the 2014 REF, there is a detailed analysis of normalisation. There has been no serious assessment of this very complex issue in the UK (as far as we are aware), and such use of bibliometrics as is permitted in the 2014 REF only involves raw citation counts. Before there is any suggestion of moving towards a metrics-based assessment, there must be proper assessment of this matter. There is no direct mapping that links Units of Assessment with fields of enquiry and thence with journal titles, and so normalisation is exceptionally difficult, but is needed to enable fair comparisons between different fields. The process of normalisation during a research assessment exercise will almost certainly need to be "bespoke", and undertaken and policed during the exercise, and would almost certainly require a technical support unit of some sophistication and scale to work with subject panels. This would need to be costed and evaluated against the costs of the present peer-review style of exercise. One suspects that if normalisation is undertaken in a manner that commands the confidence of the wider academic community, it would not be much less costly.
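By way of illustration only, the simplest form of field normalisation divides each paper's citation count by the average count for its field. The sketch below assumes that every paper has already been assigned to exactly one field; the field labels, paper identifiers and citation counts are invented for the example, and the function name is hypothetical. It is not the method analysed in the Leiden report, nor any procedure used in the REF.

```python
from statistics import mean

# Hypothetical citation counts, invented purely for illustration.
papers = [
    {"id": "A", "field": "human_geography", "citations": 12},
    {"id": "B", "field": "human_geography", "citations": 4},
    {"id": "C", "field": "physical_geography", "citations": 60},
    {"id": "D", "field": "physical_geography", "citations": 20},
]


def field_normalised_scores(papers):
    """Divide each paper's citation count by the mean count of its field."""
    counts_by_field = {}
    for paper in papers:
        counts_by_field.setdefault(paper["field"], []).append(paper["citations"])

    # The field baseline is the mean citation count of the papers in that field.
    baselines = {field: mean(counts) for field, counts in counts_by_field.items()}

    scores = {}
    for paper in papers:
        baseline = baselines[paper["field"]]
        # Assumption: a field with no citations at all scores 0.0 throughout.
        scores[paper["id"]] = paper["citations"] / baseline if baseline else 0.0
    return scores


scores = field_normalised_scores(papers)
for paper in papers:
    print(paper["id"], paper["field"], paper["citations"], scores[paper["id"]])
```

In this toy case papers A and C receive the same normalised score although their raw counts differ fivefold. The result depends entirely on which field each paper is assigned to, which is precisely the mapping that does not exist between Units of Assessment, fields of enquiry and journal titles.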

We note the further complications now introduced into the REF, as many units of assessment do not cover a single discipline but combine several (e.g. C-17 Geography and Archaeology).

Even greater questions are raised around metrics to quantify ‘impact’ (which most expect to be even more significant in the next REF). Moreover, the scale, reach and significance of research impact vary across disciplines – and even across the different faces of a discipline such as geography.

Using bibliometrics as the basis for relative comparison either within or between disciplines is fraught with problems (see further below). Any evidence that is presented should be made open access to give the widest possible scrutiny of it.

Following REF 2014, we suggest that HEFCE select four very different disciplines (one from each of the main panels) and carry out a systematic comparison of bibliometric-based research assessment with peer review: independent and objective research based on current REF outcomes.

‘Gaming’ and strategic use of metrics

What evidence exists around the strategic behaviour of researchers, research managers and publishers responding to specific metrics?
Has strategic behaviour invalidated the use of metrics and/or led to unacceptable effects?
What are the risks that some groups within the academic community might be disproportionately disadvantaged by the use of metrics for research assessment and management?
What can be done to minimise ‘gaming’ and ensure the use of metrics is as objective and fit-for-purpose as possible?

We continue to have concerns about issues of equality, particularly for young or emerging scholars. We know from the citation and readership data for papers in our own journals that ‘established’, ‘recognised’ disciplinary figures are privileged.

There is a strong belief across the community that metrics will affect the behaviour of individual researchers and institutions.

The focus upon bibliometrics will have uncertain consequences for early career researchers, those who take career breaks, those in emerging areas of research, and those whose research has a particular international focus or, because of its nature, takes a long time to come to publication. Such fields may be discouraged if they do not earn citations.
