Why is it so difficult to understand the benefits of research infrastructure? – Digital Science

[ad_1]

Persistent identifiers – or PIDs – are long-lasting references to digital resources. In other words, they are a unique label to an entity: a person, place, or thing. PIDs work by redirecting the user to the online resource, even if the location of that resource changes. They also have associated metadata which contains information about the entity and also provide links to other PIDs. For example, many scholars already populate their ORCID records, linking themselves to their research outputs through https://www.crossref.org/” target=”_blank” rel=”noreferrer noopener”>Crossref and https://datacite.org/” target=”_blank” rel=”noreferrer noopener”>DataCite DOIs. As the PID ecosystem matures, to include PIDs for grants (https://www.crossref.org/community/grants/” target=”_blank” rel=”noreferrer noopener”>Crossref grant IDs), projects (https://www.raid.org.au/” target=”_blank” rel=”noreferrer noopener”>RAiD), and organisations (https://ror.org/” target=”_blank” rel=”noreferrer noopener”>ROR), the connections between PIDs form https://blog.datacite.org/introducing-the-pid-graph/” target=”_blank” rel=”noreferrer noopener”>a graph that describes the research landscape. In this post, Phill Jones talks about the work that the MoreBrains cooperative has been doing to show the value of a connected PID-based infrastructure.

Over the past year or so, we at https://www.morebrains.coop/” target=”_blank” rel=”noreferrer noopener”>MoreBrains have been working with a number of national-level research supporting organisations to develop national persistent identifier (PID) strategies: https://www.jisc.ac.uk/” target=”_blank” rel=”noreferrer noopener”>Jisc in the UK; the Australian Research Data Commons (https://ardc.edu.au/” target=”_blank” rel=”noreferrer noopener”>ARDC) and Australian Access Federation (https://aaf.edu.au/” target=”_blank” rel=”noreferrer noopener”>AAF) in Australia; and the Canadian Research Knowledge Network https://www.crkn-rcdr.ca/en” target=”_blank” rel=”noreferrer noopener”>CRKN, Digital Research Alliance of Canada (https://alliancecan.ca/en” target=”_blank” rel=”noreferrer noopener”>DRAC), and Canadian Persistent Identifier Advisory Committee (https://www.crkn-rcdr.ca/en/persistent-identifier-pid-governance” target=”_blank” rel=”noreferrer noopener”>CPIDAC) in Canada. In all three cases, we’ve been investigating the value of developing PID-based research infrastructures, and using data from various sources, including Dimensions, to quantify that value. In our most recent analysis, we found that investing in five priority PIDs could save the Australian research sector as much as 38,000 person days of work per year, equivalent to $24 million (AUD), purely in direct time savings from rekeying of information into institutional research management systems.

Investing in infrastructure makes a lot of sense, whether you’re building roads, railways, or research infrastructure. But wise investors also want evidence that their investment is worthwhile – that the infrastructure is needed, that it will be used, and, ideally, that there will be a return of some kind on their investment. Sometimes, all of this is easy to measure; sometimes, it’s not.

In the case of PID infrastructure, there has long been a sense that investment would be worthwhile. In 2018, in his advice to the UK government, https://www.gov.uk/government/publications/open-access-to-research-independent-advice-2018″ target=”_blank” rel=”noreferrer noopener”>Adam Tickell recommended:

Jisc to lead on selecting and promoting a range of unique identifiers, including ORCID, in collaboration with sector leaders with relevant partner organisations

More recently, in Australia, the Minister for Education, Jason Clare, wrote a https://www.arc.gov.au/about-arc/our-organisation/statement-expectations-2022″ target=”_blank” rel=”noreferrer noopener”>letter of expectations to the https://www.arc.gov.au/” target=”_blank” rel=”noreferrer noopener”>Australian Research Council in which he stated:

Streamlining the processes undertaken during National Competitive Grant Program funding rounds must be a high priority for the ARC… I ask that the ARC identify ways to minimise administrative burden on researchers

In the same letter, Minister Clare even suggested that preparations for the 2023 ERA be discontinued until a plan to make the process easier has been developed. While he didn’t explicitly mention PIDs in the letter, organisations like ARDC, AAF, and ARC see persistent identifiers as a big part of the solution to this problem.

A problem of chickens and eggs?

With all the modern information technology available to us it seems strange that, in 2022, we’re still hearing calls to develop basic research management infrastructure. Why hasn’t it already been developed? Part of the problem is that very little work has been done to quantify the value of research infrastructure in general, or PID-based infrastructure in particular. Organisations like https://www.crossref.org/” target=”_blank” rel=”noreferrer noopener”>Crossref, https://datacite.org/” target=”_blank” rel=”noreferrer noopener”>Datacite, and https://orcid.org/” target=”_blank” rel=”noreferrer noopener”>ORCID are clear success stories but, other than some notable exceptions https://doi.org/10.5281/zenodo.1445290″ target=”_blank” rel=”noreferrer noopener”>like this, not much has been done to make the benefits of investment clear at a policy level – until now.

It’s very difficult to analyse the costs and benefits of PID adoption without being able to easily measure what’s happening in the scholarly ecosystem. So, in these recent analyses that we were commissioned to do, we asked questions like:

How many research grants were awarded to institutions within a given country?
How many articles have been published based on work funded by those grants?
What proportion of researchers within a given country have ORCID IDs?
How many research projects are active at any given time?

All these questions proved challenging to answer because, fundamentally, it’s extremely difficult to quantify the scale of research activity and the connections between research entities in the absence of universally adopted PIDs. In other words, we need a well-developed network of PIDs in order to easily quantify the benefits of investing in PIDs in the first place! (see Figure 1.)

Luckily, the story doesn’t end there. Thanks to data donated by https://www.digital-science.com/”>Digital Science, and other organisations including ORCID, Crossref, Jisc, ARDC, AAF, and several research institutions in the UK, Canada, and Australia, we were able to piece together estimates for many of our calculations.

Take, for example, the Digital Science https://www.dimensions.ai/” target=”_blank” rel=”noreferrer noopener”>Dimensions database, which provided us with the data we needed for our Australian and UK use cases. It uses advanced computation and sophisticated machine learning approaches to https://doi.org/10.3389/frma.2018.00023″ target=”_blank” rel=”noreferrer noopener”>build a graph of research entities like people, grants, publications, outputs, institutions etc. While other similar graphs exist, some of which are open and free to use – for example, the https://doi.org/10.5438/jwvf-8a66″ target=”_blank” rel=”noreferrer noopener”>DataCite PID graph (accessed through https://commons.datacite.org/” target=”_blank” rel=”noreferrer noopener”>DataCite commons), https://openalex.org/” target=”_blank” rel=”noreferrer noopener”>OpenAlex, and the https://researchgraph.org/” target=”_blank” rel=”noreferrer noopener”>ResearchGraph foundation – the Dimensions graph is the most complete and accessible so far. It enabled us to estimate total research activity in both the UK and Australia.

However, all our estimates are… estimates, because they involve making an automated best guess of the connections between research entities, where those connections are not already explicit. If the metadata associated with PIDs were complete and freely available in https://resources.morebrains.coop/pidcycle/” target=”_blank” rel=”noreferrer noopener”>central PID registries, we could easily and accurately answer questions like ‘How many active researchers are there in a given country?’ or ‘How many research articles were based on funding from a specific funder or grant program?’

The five priority PIDs

As a starting point towards making these types of questions easy to answer, we recommend that policy-makers work with funders, institutions, publishers, PID organisations, and other key stakeholders around the world to support the adoption of https://www.jisc.ac.uk/rd/projects/a-national-persistent-identifier-research-strategy” target=”_blank” rel=”noreferrer noopener”>five priority PIDs:

DOIs for funding grants
DOIs for outputs (eg publications, datasets, etc)
ORCIDs for people
RAiDs for projects
ROR for research-performing organisations

We prioritised these PIDs based on https://doi.org/10.5281/zenodo.5609266″ target=”_blank” rel=”noreferrer noopener”>research done in 2019, sponsored by Jisc and in response to the Tickell report, to identify the key PIDs needed to support open access workflows in institutions. Since then, https://www.morebrains.coop/repository/” target=”_blank” rel=”noreferrer noopener”>thousands of hours of research and validation across a range of countries and research ecosystems have verified that these PIDs are critical not just for open access but also for improving research workflows in general.

Going beyond administrative time savings

In our work, we have focused on direct savings from a reduction in administrative burden because those benefits are the most easily quantifiable; they’re easiest for both researchers and research administrators to relate to, and they align with established policy aims. However, the actual benefits of investing in PID-based infrastructure are likely far greater.

Evidence given to the UK House of Commons Science and Technology Committee in 2017 stated that every £1 spent on Research and Innovation in the UK results in a total benefit of https://publications.parliament.uk/pa/cm201719/cmselect/cmsctech/1453/145305.htm” target=”_blank” rel=”noreferrer noopener”>£7 to the UK economy. The same is likely to be true for other countries, so the benefit to national industrial strategies of increased efficiency in research are potentially huge.

Going a step further, the universal adoption of the five priority PIDs would also enable institutions, companies, funders, and governments to make much better research strategy decisions. At the moment, bibliometric and scientometric analyses to support research strategy decisions are expensive and time-consuming; they rely on piecing together information based on incomplete evidence. By using PIDs for entities like grants, outputs, people, projects, and institutions, and ensuring that the associated metadata links to other PIDs, it’s possible to answer strategically relevant questions by simply extracting and combining data from PID registries.

Final thoughts

According to https://uis.unesco.org/apps/visualisations/research-and-development-spending/” target=”_blank” rel=”noreferrer noopener”>UNESCO, global spending on R&D has reached US$1.7 trillion per year, and with commitments from countries to address the UN sustainable development goals, that figure is set to increase. Given the size of that investment and the urgency of the problems we face, building and maintaining the research infrastructure makes sound sense. It will enable us to track, account for, and make good strategic decisions about how that money is being spent.

About the Author

https://orcid.org/0000-0003-0525-6323″ target=”_blank” rel=”noreferrer noopener”>Phill Jones, Co-founder, Digital and Technology | http://morebrains.coop”>MoreBrains Cooperative

Phill is a product innovator, business strategist, and highly qualified research scientist. He is a co-founder of the http://morebrains.coop”>MoreBrains Cooperative, a consultancy working at the forefront of scholarly infrastructure, and research dissemination. Phill has been the CTO at Emerald Publishing, Director of Publishing Innovation at Digital Science and the Editorial Director at JoVE. In a previous career, he was a bio-physicist at Harvard Medical School and holds a PhD in Physics from Imperial College, London.

The MoreBrains Cooperative is a team of consultants that specialise in and share the values of open research with a focus on scholarly communications, and research information management, policy, and infrastructures. They work with funders, national research supporting organisations, institutions, publishers and startups. Examples of their open reports can be found here: https://morebrains.coop/repository” target=”_blank” rel=”noreferrer noopener”>morebrains.coop/repository

[ad_2]
https://www.digital-science.com/blog/2022/11/benefits-of-research-infrastructure/”>

Why is it so difficult to understand the benefits of research infrastructure? – Digital Science

A problem of chickens and eggs?

The five priority PIDs

Going beyond administrative time savings

Final thoughts

About the Author

Related posts:

Magnetic avalanches on the sun reveal the hidden engine powering solar flares

Video evidence and eye witness accounts: The science behind why people see different things

The Wallace Line Forms an Invisible Barrier That Keeps Tigers and Koalas from Ever Meeting. We Finally Know What Happened

How climate change is burning Kenya's outdoor workers