
Preprint Peer Review: An Indexer’s Perspective

By Michael Parkin (Data Scientist, Content, EMBL-EBI) <parkinm@ebi.ac.uk>

Europe PMC is an open database of life science literature from trusted sources around the globe. The resource is developed by the European Bioinformatics Institute (EMBL-EBI) and provides free access to journal abstracts and full-text articles mirrored from PubMed and PubMed Central, respectively, as well as related content such as preprints, book chapters, clinical guidelines, and patents.

As a strong supporter of open science practices, Europe PMC began indexing preprint abstracts in 2018, collating them from several preprint platforms, including bioRxiv, ChemRxiv and PeerJ Preprints. This provided a single site from which to search for life sciences preprints from multiple platforms alongside the journal article literature, improving their discoverability.1 Europe PMC has also converted over 42,000 preprints to full text and made this content available in multiple formats: HTML for reading on the website; bulk download for manuscript files, figures and supplementary data; and metadata through the API. Further, by taking advantage of the existing infrastructure developed for journal content, Europe PMC facilitates preprint inclusion into workflows such as literature reviews, article citation, and credit and attribution, and enriches the preprint abstract records with links to related data and other useful resources. Over the past few years Europe PMC has regularly added to the list of preprint platforms indexed and, at the time of writing in January 2023, includes over 530,000 preprints from 28 platforms (Fig. 1).

Fig. 1: The cumulative number of preprints indexed in Europe PMC by platform.
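For readers who want to pull this content programmatically, the sketch below shows one way to query the Europe PMC REST API for preprint records. The endpoint, the SRC:PPR source filter, and the JSON field names reflect the public web service as commonly documented; treat them as assumptions and consult the current API documentation for the authoritative syntax.

```python
# Minimal sketch of querying the Europe PMC REST API for preprint records.
# SRC:PPR is the source code used for preprints; the free-text term is just an example.
import requests

BASE_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"

params = {
    "query": 'SRC:PPR AND "covid-19"',  # preprints mentioning COVID-19
    "format": "json",
    "pageSize": 25,
}

response = requests.get(BASE_URL, params=params, timeout=30)
response.raise_for_status()

# Each hit carries basic metadata such as the internal id, DOI and title.
for record in response.json().get("resultList", {}).get("result", []):
    print(record.get("id"), record.get("doi"), record.get("title"))
```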

The inclusion of preprints within Europe PMC has not been without challenges. While Europe PMC considers several criteria when deciding to index content from a preprint platform, the aim is to be inclusive, and this often necessitates working with different systems, formats, and approaches. For example, platforms often have different approaches to versioning and withdrawal policies and some have multilingual content. Europe PMC retrieves the bulk of its preprint metadata and abstracts from Crossref.2 This metadata is supplied by the preprint platforms to Crossref when creating DOIs for their preprints and is used to generate a preprint abstract record in Europe PMC (Fig. 2).

Fig. 2: Europe PMC uses metadata from Crossref to index preprint abstracts and makes this content available to communities such as researchers and text-miners.
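As an illustration of this pipeline, the sketch below retrieves recent preprint metadata from the Crossref REST API. It assumes the standard /works endpoint with a type:posted-content filter and checks the subtype field client-side rather than assuming a server-side subtype filter; the polite-pool mailto address is a placeholder.

```python
# A hedged sketch of retrieving preprint metadata from the Crossref REST API.
# Preprints are deposited as "posted-content" works; other posted-content
# (e.g., working papers) is skipped by inspecting the subtype field.
import requests

resp = requests.get(
    "https://api.crossref.org/works",
    params={
        "filter": "type:posted-content,from-created-date:2023-01-01",
        "rows": 20,
        "mailto": "helpdesk@example.org",  # placeholder contact address for the polite pool
    },
    timeout=30,
)
resp.raise_for_status()

for item in resp.json()["message"]["items"]:
    if item.get("subtype") != "preprint":
        continue  # keep only preprint records
    title = (item.get("title") or ["(no title)"])[0]
    print(item["DOI"], "-", title)
```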

There are some important metadata elements that are not available in Crossref but would be greatly beneficial to indexing services. Of particular note is the omission of a defined preprint version number, which Europe PMC infers based on a common practice of adding, e.g., “.v2” to the end of a version 2 DOI. It would also help to have a machine-readable way to indicate whether a preprint has been withdrawn or removed; the usual approach taken by preprint platforms is to indicate this in the abstract text. Europe PMC has recently participated in a Crossref working group providing recommendations around these concerns.3
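To make the versioning heuristic concrete, here is an illustrative sketch of inferring a version number from a DOI suffix. The “.v2” convention is the common practice described above, not a formal standard, and the example DOIs are invented.

```python
# Illustrative sketch of inferring a preprint version from its DOI, based on
# the common (but not universal) convention of appending ".v2", ".v3", etc.
import re

def infer_version(doi: str) -> int:
    """Return the version implied by a trailing '.v<N>' suffix, defaulting to 1."""
    match = re.search(r"\.v(\d+)$", doi, flags=re.IGNORECASE)
    return int(match.group(1)) if match else 1

# Hypothetical DOIs used purely for illustration.
assert infer_version("10.12345/example.123456.v2") == 2
assert infer_version("10.12345/example.123456") == 1  # no suffix: treat as version 1
```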

Over the years, Europe PMC has also grappled with accurately indexing “open research platforms,” such as Wellcome Open Research powered by F1000 and, more recently, Access Microbiology, a journal that re-launched as an open research platform. Such platforms sit within an ever-broadening continuum between preprint platforms and journals. On these platforms, articles are made publicly available prior to peer review, and peer review reports are solicited directly by the platform and published alongside the manuscript as each report is received. Europe PMC indexes the manuscripts as preprints prior to peer review and links to the “version of record” available in PubMed for those articles that complete the peer-review process. The recent announcement from eLife of their new publishing model4 represents another challenge for indexing services like Europe PMC: how to accurately reflect the various versions of an article generated in this novel workflow. Key to this will be a clear indication of which peer review material is available for a particular version.

While a well-recognised benefit of preprints is speed, the omission of editorial peer review has raised concerns around scientific quality. To counter this, platforms such as Peer Community In, PREreview, and Review Commons have emerged to facilitate peer review of preprints by experts in the field outside of the traditional journal peer-review process. This was particularly important during the early months of the COVID-19 pandemic when the rapid dissemination of research findings was so critical. The existing approach at Europe PMC is to link to such platforms when possible through the use of its External Links mechanism, with “Peer Community In” being the first peer review platform to join in September 2018. This mechanism relies on the peer review platform providing Europe PMC with preprint DOIs and corresponding URLs that direct the user to the peer review(s) on the platform’s website. Following a website redesign in 2019, these are displayed in a dedicated review section on the preprint page on Europe PMC (Fig. 3). Unfortunately, this approach generates a technical burden on the platform and is difficult to scale up and maintain, particularly as preprint peer review platforms and practices continuously evolve.

Fig. 3: An example of a COVID-19 preprint indexed from medRxiv showing the “Reviews” section with a link to peer review reports hosted on the PREreview platform.
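To illustrate the kind of data exchange the External Links mechanism relies on, the sketch below pairs preprint DOIs with the review URLs a peer review platform would supply. The CSV layout, column names, and DOIs are invented for illustration and are not Europe PMC's actual link submission format.

```python
# Hypothetical illustration of a platform-supplied mapping from preprint DOI
# to the URL of its review(s). Field names and DOIs are invented.
import csv
import io

links_csv = """preprint_doi,review_url
10.1101/2020.00.00.000000,https://example.org/reviews/review-1
10.1101/2020.00.00.000001,https://example.org/reviews/review-2
"""

doi_to_review = {
    row["preprint_doi"]: row["review_url"]
    for row in csv.DictReader(io.StringIO(links_csv))
}
print(doi_to_review)
```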

An ideal situation would be to have a single source for preprint peer review, similar to utilising Crossref as a single source of preprint metadata. Happily, this challenge has been taken up by Sciety, which has been aggregating preprint evaluations from a wide variety of sources since 2020.5 The initial ambition is to leverage this significant effort from Sciety and utilise the DocMap format6 to greatly improve the visibility of peer review of preprints indexed in Europe PMC. As this scales, adding DocMap metadata into the Crossref metadata will provide us with a single source to access this information. To start, Europe PMC will work with Review Commons, which is also producing DocMaps for their evaluations, as a proof of concept for this workflow.7
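As a rough picture of how such aggregated metadata could drive a review timeline, the sketch below walks a simplified, docmap-like structure and orders its evaluation events by date. The field names are invented for illustration and should not be mistaken for the actual DocMaps schema, which is defined in the DocMaps specification.

```python
# Simplified, illustrative structure inspired by the DocMaps idea: a
# machine-readable map of evaluation events attached to a preprint.
from datetime import date

docmap_like = {
    "preprint_doi": "10.1101/2021.00.00.000000",  # hypothetical DOI
    "steps": [
        {"action": "review", "provider": "Review Commons",
         "date": "2021-03-01", "output_url": "https://example.org/review/1"},
        {"action": "author-response", "provider": "Review Commons",
         "date": "2021-03-15", "output_url": "https://example.org/response/1"},
    ],
}

# Order the events chronologically to build a display timeline.
timeline = sorted(docmap_like["steps"], key=lambda s: date.fromisoformat(s["date"]))
for step in timeline:
    print(f'{step["date"]}  {step["provider"]}: {step["action"]} -> {step["output_url"]}')
```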

Europe PMC recently conducted user research with the aim of identifying what its users would expect to see on the website in terms of preprint peer review, based on metadata expected to be available in DocMaps. Participants liked that Europe PMC is making this information available and easy to discover, and they appreciated both a timeline approach with dates and linked peer review events and the inclusion of platform logos that clearly indicate where the reviews come from (Fig. 4).

Fig. 4: A mock-up of a new design for a preprint review section in Europe PMC. Entries show the logo of the provider and information such as the date, type, and author of the peer review material, and include a link to the material on the platform’s website.

Care needs to be taken in how this is conveyed to the reader, because the presence of a review should not be treated as synonymous with good scientific quality. While the existence of a journal publication signifies that the peer reviews were ultimately favourable (often after rounds of reviews and revisions), evaluations of a preprint may well be negative. Accordingly, we consider it essential that the reader has easy access to the content of the reviews.

One of the key questions to address, and one that frequently arises in discussions with collaborators, is what exactly qualifies as “preprint peer review”? Or, in other words, what is required for Europe PMC to label a preprint as having been peer reviewed? For example, the ScreenIT pipeline has been used to assess medRxiv and bioRxiv COVID-19 preprints.8 This pipeline combines several automated screening tools to check for issues such as statistical reporting errors and missing ethics or data-availability statements, and it posts a public report on the outcome. Although these reports play a valuable role in assessing the quality of a preprint, our user testing suggests that such reports should be considered distinct from “preprint peer review.” Clearly, care is needed to determine how a broad range of evaluations should be categorised.

The opportunity that preprints in the life sciences offer has become increasingly evident in recent years, allowing for the rapid communication of research results and the potential for a more transparent, inclusive, and responsive peer review structure. In line with its mission and objectives, Europe PMC will continue to work to support this evolving workflow.

Endnotes

1. Levchenko, M. Preprints in Europe PMC: reducing friction for discoverability. http://blog.europepmc.org/2018/07/preprints.html (2018).

2. Wood, C. C. & Parkin, M. Using the Crossref REST API. Part 12 (with Europe PMC). Crossref. https://www.crossref.org/blog/using-the-crossref-rest-api.-part-12-with-europe-pmc/

3. Rittman, M. Better preprint metadata through community participation. Crossref. https://www.crossref.org/blog/better-preprint-metadata-through-community-participation/

4. Urban, L. et al. eLife’s new model and its impact on science communication. eLife 11 (2022).

5. Let Sciety help you navigate the preprint landscape. Sciety https://sciety.org/

6. DocMaps to expand to increase the visibility and machine-readability of preprint evaluations. DocMaps Implementation Group. https://docmaps.knowledgefutures.org/pub/eyp3ckeo/release/1 (2022).

7. Changing the plumbing of scientific publishing. Review Commons. https://www.reviewcommons.org/blog/changing-the-plumbing-of-scientific-publishing/

8. Weissgerber, T. et al. Automated screening of COVID-19 preprints: can we help authors to improve transparency and reproducibility? Nat. Med. 27, 6–7 (2021).
