Critique of landmark study: Psychology may not face replicability crisis after all

A study published last year suggested psychological research was facing a replication crisis, but a new paper says that work was erroneous.

A bronze cast of 'The Thinker' by Auguste Rodin, 1880, is seen outside the gateway to the Rodin Museum in Philadelphia in 2004. (Jacqueline Larma/AP)

Shock waves reverberated through psychology research last year at the suggestion that the field faced a "replicability crisis." But the research that triggered that quake is flawed, a team of psychologists asserted in a comment published Thursday in the journal Science.

The ability to repeat an experiment with the same results is a pillar of productive science. When the study that rocked the field was published in Science in late August, Nature News's Monya Baker wrote, "Don’t trust everything you read in the psychology literature. In fact, two thirds of it should probably be distrusted."

In what's called the Reproducibility Project, a large international team of scientists repeated 100 published experiments to see whether they could get the same results. Only about 40 percent of the replicated experiments yielded the same results.

But now a different team of researchers is saying that there's simply no evidence of a replicability crisis in that study.

The replication paper "provides not a shred of evidence for a replication crisis," Daniel Gilbert, the first author of the new article in Science commenting on the paper from August, tells The Christian Science Monitor in a phone interview.

The Open Science Collaboration, which conducted the initial study, also openly shared all the resulting data sets. So Dr. Gilbert, a psychology professor at Harvard University, and three of his colleagues pored over that information to see whether the conclusions held up.

And the reviewing team, none of whom had papers tested by the original study, found a few crucial errors that could account for the dismal results.

Their gripes start with the way studies were selected for replication. As Gilbert explains, the 100 studies came from just two subfields of psychology, social and cognitive psychology, and were not randomly sampled. Instead, the team drew studies from three prominent psychology journals, and each study had to meet a list of criteria, including how complex its methods were.

"Just from the start, in my opinion," Gilbert says, "They never had a chance of estimating the reproducibility of psychology because they do not have the sample of studies that represents psychology." But, he says, that error could be dismissed, as information could still arise about more focused aspects of the field.

But when it came down to replicating the studies, other errors were made. "You might naïvely think that the word replication, since it contains the word replica, means that these studies were done in exactly the same way as the original studies," Gilbert says. In fact, he points out, some of the studies were conducted using different methods or different sample populations. 

"It doesn't stop there," Gilbert says. It turns out that the researchers made a mathematical error when calculating how many of the studies fail to replicate simply based on chance. Based on their erroneous calculations, the number of studies that failed to replicate far outnumbered those expected to fail by chance. But when that calculation was corrected, says Gilbert, their results could actually be explained by chance alone. 

"Any one of [these mistakes] would cast grave doubt on this article," Gilbert says. "Together, in my view, they utterly eviscerate the conclusion that psychology doesn't replicate."

The journal Science isn't just leaving it at that, though. Published alongside Gilbert and his team's critique of the original paper is a reply from 44 members of the replication team.

Brian Nosek, executive director of the Center for Open Science and leader of the original study, says that his team agrees with Gilbert's team in some ways.

Dr. Nosek tells the Monitor in a phone interview that his team wasn't trying to conclude why the original studies' results only matched the replicated results about 40 percent of the time. It could be that the original studies were wrong or the replications were wrong, either by chance or by inconsistent methods, he says.

Or perhaps the original results depended on conditions the replicating scientists didn't consider, conditions that could themselves further inform the findings, he says.

"We don't have sufficient evidence to draw a conclusion of what combination of these contributed to the results that we observed," he says. 

It could simply come down to how science works. 

"No one study is definitive for anything, neither the replication nor the original," Nosek says. "Anyone that draws a definitive conclusion based on a single study is overstepping what science can provide," and that goes for the Reproducibility Project too. Each study was repeated only once, he says.

"What we offered is that initial piece of evidence that hopefully would, and has, gotten people's theoretical juices flowing, to spur that debate," Nosek says. And spur it has. 

Gilbert agrees that one published scientific paper should not be taken as definitive. "Journals aren't gospel. Journals aren't the place where truth goes to be enshrined forever," he says. "Journals are organs of communication. They're the way that scientists tell each other, hey guys, I did an experiment. Look what I found."

When reproduction follows, that's "how science accumulates knowledge," Nosek says. "A scientific claim becomes credible by the ability to independently reproduce it."
