Jennifer Byrne still feels stressed remembering the evening, six years ago, when she recognized the magnitude of the problem. It was a Saturday, and Byrne’s family members “were all out in the living room watching television, like normal people,” she recalls. At her computer in another part of the house, Byrne was on PubMed digging into a handful of suspiciously similar papers investigating an obscure human gene’s connection to various cancers. On a hunch, she plugged a combination of terms into PubMed’s search, including the name of a different gene containing a nucleotide sequence that she’d seen in two studies so far. Up popped around a dozen more of what looked like cookie-cutter papers.
“I just sat there and cried,” remembers the University of Sydney cancer researcher and director of biobanking for New South Wales Health Pathology. “I just went, ‘Oh, my God, like, what have I done? What have I found?’”
I just went, ‘Oh, my God, like, what have I done? What have I found?’
—Jennifer Byrne, University of Sydney
Although there’s little concrete data on the scale of the problem, errors regularly make it into the scientific literature. Manipulated images and data, and, on occasion, completely made-up studies get published, as do the inevitable mistakes that sneak into even carefully conducted, peer-reviewed research. But while publishers uniformly say that they welcome and investigate reports from readers about errors, those readers often find the process of correcting the record to be opaque and frustrating. Even those for whom the process goes smoothly can come away with a somewhat more jaundiced view of the published literature.
Despite this, Byrne says she feels she has a responsibility to keep chipping away at the problem. “Once you’ve seen something, you can’t unsee it. You can’t pretend that it’s not happening.”
If you see something
Byrne’s reluctant entry into science sleuthing began in 2015, some weeks before the realization that left her crying at her computer. She’d looked up a little-known gene that had come up in her past research to see whether any new work had been done on it, and to her surprise, turned up five papers. “They were so similar I actually confused them with each other,” she recalls. “And then when I started looking at them, I just thought, ‘There’s something wrong here.’”
For one thing, there was a common nucleotide sequence—not only would it have been an extreme coincidence for multiple papers to use the same one, but it was used as a reagent for different purposes in different studies, which didn’t make sense. Byrne’s bad feeling about the papers only intensified after she uncovered more on that memorable Saturday evening. She reported the five papers to the journals they appeared in.
Then, Byrne saw a 2013 article in Science on the phenomenon of paper mills—businesses that sell authorship on already-accepted papers to researchers. “It was one of those moments where everything clicked—I just went . . . ‘I think this is what I’m looking at,’” she says. She didn’t have conclusive proof, so she got in touch with Cyril Labbé, a computer scientist at Université Grenoble Alpes in France who had previously published software for detecting duplicate or fake scientific papers. Together, Byrne and Labbé analyzed a growing cohort of suspicious manuscripts Byrne had identified in searches, and in 2017 published a paper on “striking similarities” in 48 of them.
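The core check behind this kind of sleuthing is simple in outline: the same long nucleotide sequence should almost never appear as a reagent in unrelated papers. The snippet below is not Byrne and Labbé’s actual software, just a minimal, hypothetical sketch of the idea; the paper texts, identifiers, and the 18-nucleotide length cutoff are illustrative assumptions.

```python
import re
from collections import defaultdict

# Hypothetical input: paper identifiers mapped to their full text.
papers = {
    "paper_A": "... forward primer 5'-ACGTGCTAGCTAGGCTTACG-3' ...",
    "paper_B": "... siRNA targeting sequence ACGTGCTAGCTAGGCTTACG ...",
    "paper_C": "... primer GGCTTAACCGGTTAACCGGT ...",
}

# Runs of nucleotides long enough that a verbatim repeat across papers
# is unlikely to be coincidence (the cutoff here is an assumption).
SEQ_PATTERN = re.compile(r"[ACGT]{18,}")

papers_by_sequence = defaultdict(set)
for paper_id, text in papers.items():
    for seq in SEQ_PATTERN.findall(text.upper()):
        papers_by_sequence[seq].add(paper_id)

# A sequence reported as a reagent in more than one paper warrants a closer look.
for seq, ids in papers_by_sequence.items():
    if len(ids) > 1:
        print(f"{seq} appears in: {sorted(ids)}")
```

Run on the toy input above, the sketch flags the one sequence shared by paper_A and paper_B; in practice, a flagged match is only a starting point for the kind of manual scrutiny Byrne describes.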
Byrne didn’t stop there: she continued reporting suspect papers to journals and working with Labbé to dig up more examples of dodgy manuscripts in the life sciences literature—although they never did find a smoking gun to implicate paper mills for any of them. One thing she’s learned along the way, she says, is that science depends on trust, “but at the end of the day, none of us know what was done [in a study]—only the authors know that.”
In 2018, Byrne obtained her first funding for the sleuthing work, from the US Office of Research Integrity, enabling her to hire research assistants to contribute to the mission. She has since won funding from the Australian government. These days, she no longer does laboratory research. “I think I’ll make a bigger contribution to science in this way than I would just kind of bashing away . . . taking years to finish manuscripts on laboratory research, when someone using a paper mill can take days to weeks,” she explains. “The paper mills will win, basically, if we don’t do something to stop them, because they can just produce research so much more quickly than a genuine researcher can.”
While Byrne’s experience led to a major career pivot, finding an irregularity in the published literature can affect a researcher’s career trajectory in subtler ways, too. Leon Reteig was far into his PhD at the University of Amsterdam, where he was studying the effects of a brain stimulation method, when he found that the results of his analysis didn’t line up with those of the 2015 study he was trying to replicate. With some digging, Reteig figured out that the older study—performed by another member of the lab under the supervision of his thesis advisor, Heleen Slagter—had used an Excel spreadsheet with an error in its code to perform a statistical correction on the data. “This was a really small error, but with quite large consequences,” he says. In fact, correcting it invalidated the 2015 study’s main result.
That was in 2018. Although Slagter acted quickly to correct the error in the literature by requesting a retraction, Reteig says, that retraction wasn’t posted until earlier this year. In the meantime, he and Slagter held off on submitting Reteig’s own research for publication, because they thought it would be “very confusing” if his paper mentioned earlier work for which the published record was wrong. Reteig, now a postdoc at University Medical Center Utrecht in the Netherlands, says that, in the end, he had to change his original plan for the research he’d complete for his PhD, and while he graduated in 2019, he only recently submitted his original work to a journal.
Say something
The Committee on Publication Ethics (COPE) recommends that journals respond to people who raise concerns about published papers, and that they investigate those concerns, notifying the person who sounded the alarm if the investigation results in an outcome such as a retraction or correction. In response to an inquiry from The Scientist, major publishers and journals—including Springer Nature, AAAS, PNAS, PLOS, Cell Press, and eLife—all said they appreciate being made aware of issues in their published papers and that they investigate these reports; some specified that they rely on COPE’s guidelines for doing so.
“There are a small number of journals that take this stuff very seriously,” says Ivan Oransky, the cofounder of Retraction Watch, a site that covers retractions and related issues and has documented Byrne’s and Reteig’s sleuthing, along with that of others. Some publishers, such as Cell, PNAS, and Springer Nature, have dedicated staff to handle these issues, for example. “But once you get beyond journals that basically have their own research integrity teams, it’s tumbleweeds, and typically what happens at a lot of these journals is editors either don’t respond, or they find some reason to discount the critique,” adds Oransky, who is also the editor-in-chief of Spectrum and a former editor at The Scientist. “And then even if they do something, it often takes forever.”
Byrne says that she and her team often had to follow up with journal editors to get a response to their emails. As she, Labbé, and their colleagues reported earlier this year, the ultimate outcomes were highly variable, ranging from retractions in journals including the International Journal of Clinical and Experimental Medicine and BioMed Research International to a decision by Wiley’s Biotechnology and Applied Biochemistry to take no action on six papers that Byrne’s team flagged. After reporting about 300 papers in total with little success, Byrne says she and her colleagues have moved away from using this tactic in favor of concentrating on publishing their results. “Maybe it’s not a waste of time,” she says of reporting issues to journals. “But . . . it’s very high effort and low reward, and I don’t feel that we can spend the government’s money on that kind of activity.”
Once you get beyond journals that basically have their own research integrity teams, it’s tumbleweeds.
—Ivan Oransky, Retraction Watch
That said, Byrne says she’d still recommend that people who find an issue with a single published paper report it to the journal. If the journal doesn’t respond, says Oransky, one can escalate the complaint to the publisher.
Hampton Gaddy received similar advice last year when he found irregularities in some papers published in Early Human Development, an Elsevier journal. Gaddy says he was doing research for his undergraduate thesis at the University of Oxford when he came across an article on the 1918 flu coauthored by Victor Grech of the University of Malta that seemed to plagiarize another article he’d just read. Gaddy noticed that Grech had many other articles in press at what was supposed to be a pediatric health journal, including one projecting half a billion deaths worldwide from COVID-19 and 30 or 40 articles about the TV show Star Trek.
Gaddy wrote to the journal, but also took to Twitter to ask whether there were additional steps he should take, and followed the advice he received to contact the publisher. Two days later, he received a response thanking him for reporting “these very serious concerns,” and according to Retraction Watch’s database, 93 of Grech’s papers now either have been retracted or carry expressions of concern. Gaddy says he never heard back from the journal, however, so “we don’t know what the problem was” that let so many problematic papers through.
Murkier waters
In Gaddy’s case, the problems with the manuscripts were difficult to argue with. But situations can get sticky when authors disagree with would-be whistleblowers about alleged issues in their papers. Such a situation has shaped up into a years-long saga for teams led by psychology researchers Elizabeth Phelps of Harvard University and Tom Beckers of KU Leuven in Belgium. It started, Beckers says, in 2015 when he got a grant to build on a finding from a 2010 Nature study by Phelps (then at NYU) on a multiday behavioral intervention for erasing the fear associated with a memory.
On trying to replicate the experiment as originally described, Beckers’s group found that the protocol excluded the vast majority of the research subjects based on their outcomes in the first few days of the intervention, leaving few people for the actual study, he says. So he got in touch with Phelps, who sent detailed records on the study, including its raw data. Over the course of their back and forth, Phelps says she became aware for the first time that the research team had initially recruited far more subjects to the study than noted in the paper, and had used a qualitative exclusion process rather than the quantitative criteria laid out in the paper. Phelps and her coauthors posted an addendum to the paper in 2018 correcting those inaccuracies.
Phelps says her interactions with Beckers had been cordial, and that when she was asked to peer review his group’s verification report on her study (published in Cortex last summer), she was “pretty shocked at how nasty it was.” The verification report describes the 2010 study’s results as “unreliable and flawed,” and was published alongside a replication report—also led by Beckers—on the same study, and an accompanying editorial by the journal’s editors that states the verification report “paints an undeniably bleak picture of this influential Nature paper and concludes that the evidence for its claims”—that in humans, the process Phelps’s group proposed can extinguish the fear associated with a memory—“is unreliable.”
The Nature study, and the issues raised in Beckers’s group’s analysis and replication of it, are complex, and Phelps concedes that “it wasn’t the perfect [paper]—there were issues,” but argues in an email to The Scientist that “there were no more issues with the 2 experiments reported in this paper than other studies.” The results hold up, she adds, an argument she and two coauthors lay out in an October 2020 preprint in response to the Cortex papers. They also argue that most studies on the phenomenon they reported in 2010, known as the reactivation-extinction effect, have replicated it. Beckers’s group countered with their own preprint rebuttal in July of this year, in which they reiterate their position that the original study’s results “were entirely based on arbitrary participant exclusions.”
Both sides in this debate are making their arguments in the scientific literature. But in some high-profile cases, authors have responded to those who report perceived problems not with preprints, but with legal action. After pointing out numerous apparent instances of data irregularities in published papers via an anonymous blog, University of Rochester mitochondria researcher Paul Brookes was outed as the blog’s author in January 2013 and was soon threatened with defamation lawsuits by some of the authors of papers he had blogged about. Brookes deleted the blog’s entries and stopped posting new ones, although he sometimes points out irregularities in others’ papers on his lab’s website. (He told Science in 2014 that none of the threatened lawsuits had materialized.)
Brookes’s experience is not unique. In 2016, former Wayne State University pathology researcher Fazlul Sarkar sued PubPeer, a site where people discuss problems in scientific manuscripts, in an attempt to obtain the identities of anonymous commenters who had alleged that he had committed research misconduct; he lost. And researchers in France recently lodged a legal complaint against microbiologist and prominent science sleuth Elisabeth Bik and a neuroscientist involved in running PubPeer for alleged blackmail and harassment, after Bik took to the site to post notes about possible issues in more than 60 studies by microbiologist Didier Raoult of Aix-Marseille Université.
Such episodes seem to be relatively rare. But as an international group of researchers wrote in an open letter in response to the complaint against Bik, “this strategy of harassment and threats creates a chilling effect for whistleblowers and for scholarly criticism more generally.”
A changed outlook
Whatever the outcome of their efforts to report problematic papers to journals and publishers, people who spoke with The Scientist about their experiences say they now have a somewhat more jaded view of the literature. Byrne, whose most recent preprint with Labbé and other colleagues flagged 712 studies with wrongly identified nucleotide sequence reagents, says that even before her initial discovery of the suspicious papers, “you’re trying to be skeptical as a scientist, and . . . you encounter papers where you can’t reproduce results.” What’s different for her now, she says, is her appreciation of how bad the issues are in some cases: sometimes “people just make stuff up.”
This was a really small error, but with quite large consequences.
—Leon Reteig, University Medical Center Utrecht
Gaddy, who is now working toward a master’s degree in demography, says he was already a reader of Retraction Watch before he spotted the Grech papers last year, and was well aware of issues that can occur in the scientific literature, but that he nevertheless finds it “disheartening” that the papers were published in the first place—particularly those about COVID-19. On that topic, he’s realized that “there’s just so much stuff in the literature, and that not a lot of it matters, and not a lot of it has that much consequence.”
Even Reteig, who discovered an honest mistake that was corrected, now wonders about the impact of similar errors that haven’t been caught. “It’s very well possible that in many published studies, there are really small errors like this that are very hard to prevent, but that significantly affect the results of the study,” he says—and that, like the mistake he found, might also be extremely hard to spot. (Some research lends support to this idea—one recent analysis found that more than 30 percent of 11,117 genetics papers contained errors attributable to autocorrecting of gene names by Microsoft Excel.) “One thing that this really speaks for is that it should be much more common for researchers to share not just the paper . . . but really also all the materials used,” including the data and analysis codes, he says.
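The Excel problem Reteig’s point touches on is concrete enough to screen for mechanically: the spreadsheet program is known to turn gene symbols such as SEPT2 or MARCH1 into dates. The snippet below is a minimal sketch, not the method used in the analysis cited above; the CSV layout, column name, and function are hypothetical, and it simply flags gene-column entries that look like the dates such autocorrection tends to produce.

```python
import csv
import re

# Date-shaped strings that commonly result when Excel reformats gene
# symbols, e.g. SEPT2 -> "2-Sep", MARCH1 -> "1-Mar", DEC1 -> "1-Dec".
DATE_LIKE = re.compile(
    r"^\d{1,2}-(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)$",
    re.IGNORECASE,
)

def flag_autocorrected(path, column="gene_symbol"):
    """Return gene-column values that look like dates (hypothetical CSV layout)."""
    suspicious = []
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            value = (row.get(column) or "").strip()
            if DATE_LIKE.match(value):
                suspicious.append(value)
    return suspicious

# Usage on a hypothetical supplementary table:
# flag_autocorrected("supplementary_table1.csv") might return ["2-Sep", "1-Mar"].
```

A check like this only catches one narrow class of error, which is Reteig’s broader point: sharing the underlying data and analysis code makes such slips far easier for others to find.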
For papers where errors are detected, says Byrne, herself the editor of the journal Biomarker Insights, the publishing industry needs “systemic change to deal with the reality that many papers that are published have got issues and need some kind of post-publication attention.”
For now, Oransky cautions that those who report errors may take a while to see any result. “The top bit of advice that I would give to anyone who’s trying to bring an allegation to a journal or to anyone is not to assume that there’s a standard timeline—there just isn’t,” he says. “‘Prepare for frustration’ is what I would say if I was being even more glib about it.”