At least 800 research studies published in crystallography and exotic-chemistry journals originate from a paper mill, a new study suggests.

The preprint identified around 800 research studies, published between 2015 and 2022, that recycle images, contain oddities in their methods sections and cite papers of no relevance. The study suggests that one main purpose of the paper mill is to artificially boost researchers’ performance indicators.

‘I found a few pieces from the same jigsaw,’ says David Bimler, the study’s author and an online sleuth who often comments on questionable research under the pseudonym ‘Smut Clyde’. But when he put the pieces together, he says, ‘it got larger than I was expecting’.

Graph

Source: © David Bimler

Thermogravimetric analysis is a way to measure a number of properties of a material. Plots from eight papers uncovered by the study reveal striking similarities when superimposed.

Bimler frequently comments on problematic and questionable studies on PubPeer, a site where scientists discuss papers after publication. For Bimler, it’s clear that the studies originate from a single source, as they share features such as peculiar phrases and similar images. He thinks the real number could be even higher once papers that are hard to track down or access are counted.

All the authors of the papers in question are doctors based at Chinese hospitals, who need to publish in international journals to be eligible for promotion, notes Bimler. In August 2020, the Beijing municipal health authority released a policy stating that a physician needs two first-author papers to be promoted to deputy chief physician and three such papers to become a chief physician.

‘Although the Chinese authorities have announced that relevant changes are being made to the system as part of a crackdown on research misconduct, these incentives have obviously not been removed entirely,’ says Jana Christopher, a leading image data integrity analyst in Heidelberg, Germany.

Only a handful of the flagged papers have so far been retracted, Bimler adds. That might partly be because he hasn’t actively contacted the affected journals to alert them to the problem.

Sylvain Bernès, a computational crystallographer at the Meritorious Autonomous University of Puebla in Mexico and a keen PubPeer commenter, tells Chemistry World the problem may have been exacerbated by the fact that ‘it’s very easy to cheat with crystal structures’.

‘The building of a new structure from scratch, including the fabrication of experimental data (diffraction intensities) is a bit more tricky and time-consuming, but many software that allow that are available,’ adds Bernès. ‘Indeed, it takes more time to check and detect issues in x-ray structures than to create fake structures. I assume that this is very good for paper mill productivity.’

Previous research has also indicated that physicians in China use paper mills widely. In 2020, microbiologist-turned-scientific integrity expert Elisabeth Bik, based in San Francisco, California, and colleagues identified a paper mill that she thought had churned out hundreds of papers with potentially fabricated images authored by researchers in China.

But Bimler admits that none of these findings can be confirmed, as no one is claiming responsibility for producing papers through paper mills. ‘It gets a bit speculative,’ he says, referring to the inner workings of paper mills. ‘I suspect you’ve got a range of cases from just one person or a couple of people working on a small scale and large industrial-scale ones with quite a few people doing it.’

Bernès thinks paper mills as we know them may disappear in the short term but other strategies to circumvent scientific scrutiny will emerge.

‘I just would like to be sure that not too many invented x-ray structures are landing in the databases, in particular the [Cambridge Structural Database],’ Bernès says. ‘Note that the 800 papers under consideration correspond to approximately 1200 structures deposited in the CSD. That is 0.1% of the whole current knowledge in chemical crystallography. In other words, the database is polluted with wrong data.’