In the early 1980s, there was growing concern about the quality of peer review at scientific journals. So two researchers at Cornell and the University of North Dakota decided to run a little experiment to test the process.
The idea behind peer review is simple: It’s supposed to weed out bad science. Peer reviewers read over promising studies that have been submitted to a journal to help gauge whether they should be published or need changes. Ideally, reviewers are experts in fields related to the studies in question. They add helpful comments, point out problems and holes, or simply reject flawed papers that shouldn’t see the light of day.
The two researchers, Douglas Peters and Stephen Ceci, wanted to test how reliable and unbiased this process actually is. To do this, they selected 12 papers that had been published about two to three years earlier in extremely selective American psychology journals.
The researchers then altered the names and university affiliations on the manuscripts and resubmitted the papers to the same journals that had originally published them. In theory, these papers should have been high quality — they’d already made it into these prestigious publications. If the process worked well, the studies that were published the first time would be approved for publication again the second time around.
What Peters and Ceci found was surprising. Nearly 90 percent of the peer reviewers who looked at the resubmitted articles recommended against publication this time. In many cases, they said the articles had “serious methodological flaws.”
This raised a number of disquieting possibilities. Were these, in fact, seriously flawed papers that got accepted and published? Can bad papers squeak through depending on who reviews them? Did some papers get in because of the prestige of their authors or affiliations? At the very least, the experiment suggested the peer review process was unnervingly inconsistent.
The finding, though published more than 30 years ago, is still relevant. Since then, other researchers have been uncovering more and more problems with the peer review process, raising the question of why scientists bother with it in the first place.
All too often, peer review misses big problems with studies
Researchers who have examined peer review often find evidence that it works barely better than chance at keeping poor-quality studies out of journals, or that it doesn’t work at all. That conclusion has come from controlled experiments as well as from systematic reviews that bring together all the relevant studies.
The reasons it fails are similar to the reasons any human process falls down. Usually, it’s only a few reviewers who look at an article. Those reviewers aren’t paid for their time, but they participate out of a belief in the scientific process and to contribute to their respective fields. Maybe they’re rushed when reading a manuscript. Maybe they’re poorly matched to the study and unqualified to pick it apart. Maybe they have a bias against the writer or institution behind the paper.
Since the process is usually blinded — at least on the side of the reviewer (with the aim of eliciting frank feedback) — this can also up the snark factor or encourage rushed and unhelpful comments, as the popular #sixwordpeerreview hashtag shows.
The Lancet editor Richard Horton has called the process “unjust, unaccountable … often insulting, usually ignorant, occasionally foolish, and frequently wrong.” Not to mention that finding peer reviewers and collecting their comments slows down the progress of science — papers can be held up for months or years — and costs society a lot of money. Scientists and professors, after all, need to take time away from their research to edit, unpaid, the work of others.
Richard Smith, the former editor of the BMJ, summed up: “We have little or no evidence that peer review ‘works,’ but we have lots of evidence of its downside.” Another former editor of the Lancet, Robbie Fox, used to joke that his journal “had a system of throwing a pile of papers down the stairs and publishing those that reached the bottom.” Not exactly reassuring comments from the editors of the world’s leading medical journals.
Should we abolish peer review?
So should we just abolish peer review? We put the question to Jeff Drazen, the current editor of the top-ranked medical publication the New England Journal of Medicine. He said he knows the process is imperfect — and that’s why he doesn’t rely on it all that much.
At his journal, peer review is only a first step to vetting papers that may be interesting and relevant for readers. After a paper passes peer review, it is then given to a team of staff editors who each have a lot of time and space to go through the submission with a fine-toothed comb. So highly qualified editors, not necessarily peer review, act as the journal’s gatekeepers.
“[Peer review] is like everything else,” Drazen said. “There are lots of things out there — some are high quality, some aren’t.”
Drazen is probably onto something: journal editors, with enough resources, can add real value to scientific publications and give them their “golden glow.” But how many journals actually provide that added value? We’re probably talking about 10 in the world out of the tens of thousands that exist. The New England Journal of Medicine is far more the exception than the rule in that regard.
Even at the best journals, ridiculously flawed and silly articles get through. A few readers can’t possibly catch all the potential problems with a study, or sometimes they don’t have access to all the data that they need to make informed edits.
It can take years, multiple sets of fresh eyes, and people with adversarial views for the truth to come to light. Look no further than the study that linked autism to the measles-mumps-rubella vaccine, published in the Lancet. That paper was retracted after it was found to be not only fraudulent but also deeply flawed.
For some, that’s a reason to get rid of peer review. Brandon Stell, the president of the PubPeer Foundation, favors “post-publication” peer review on websites like his own (Pubpeer.com). There, users from around the world can critique and comment on articles that have already been published. These crowdsourced comments have led to corrections or even retractions of studies.
“There’s no reason why we couldn’t publish everything immediately on the internet and have it peer-reviewed after it’s been published,” Stell said, arguing for abolishing pre-publication peer review. There are already journals that do just this, he added, such as the Winnower.
But replacing one flawed system (traditional pre-publication peer review) with what may be another (post-publication peer review) doesn’t fully solve the problem. Places like PubPeer are a fantastic development, but it’s not yet clear that they catch errors and bad science any more consistently than traditional pre-publication peer review does.
Even with its flaws, peer review seems to work at least a little better than chance. That’s not great, but it may be better than nothing. In a world without a peer review culture, it’s possible even more bad science would sneak through.
A complex solution for a complex problem
Stell pointed to another promising innovation: sites like Biorxiv, which allow researchers to “pre-print” their manuscripts online as soon as they’re ready and get open comment before the work is ever peer-reviewed and published in an academic journal. This adds another step on the path to publication, and another chance to filter out problems before a paper reaches peer review and enters the scientific record.
Ivan Oransky, a medical journalist who tracks retractions in journals at his site Retraction Watch, had a more holistic view. He didn’t think post-publication review should supplant the traditional process, but that it should be an add-on. “Post-publication peer review is nothing new, but in the past it’s happened in private, with no feedback for the authors or larger scientific community,” Oransky said. Sites like PubPeer open up the process and make it more transparent, and should therefore be strengthened.
“Let’s stop pretending that once a paper is published, it’s scientific gospel,” he added.
We think that’s closer to the solution. Science would probably be better off if researchers checked the quality and accuracy of their work in a multi-step process with redundancies built in to weed out errors and bad science. The internet makes that much easier. Traditional peer review would be just one check; pre-print commenting, post-publication peer review, and, wherever possible, highly skilled journal editors would be others.
Before this ideal system is put in place, there’s one thing we can do immediately to make peer review better. We need to adjust our expectations about what peer review does. Right now, many people think peer review means, “This paper is great and trustworthy!” In reality, it should mean something like, “A few scientists have looked at this paper and didn’t find anything wrong with it, but that doesn’t mean you should take it as gospel. Only time will tell.”
Insiders like journal editors have long known that the system is flawed. It’s time the public embraced that, too, and supported ways to make it better.