What Makes Science True?

  • Posted 01.19.17
  • NOVA

What makes science reliable? The ability to reproduce the results of an experiment, known as reproducibility, is one of the hallmarks of a valid scientific finding. But science is facing what many consider a reproducibility crisis, and the stakes are high. Many scientific claims cannot be replicated, and many clinical trials fail as a result.

Running Time: 14:59

This 15-minute video examines the reproducibility crisis and reports on the outcome of five experiments designed to test the reproducibility of cancer biology studies.

While NOVA had no editorial involvement in this video, we believe it is a thoughtful and well-researched exploration of a complex topic, and we are pleased to share it with you. The video was produced by Dakin Henderson and Kelly Thomson, and was funded entirely by the Laura and John Arnold Foundation (LJAF).

In the interest of full transparency: LJAF has an interest in this subject matter, is one of a handful of funders that supports the cost of conducting reproducibility experiments, and provided funding to the organizations featured in this video in order to carry out the reproducibility studies covered here. We have been assured that LJAF had no editorial involvement in the video.

You can also learn more about reproducibility by reviewing the following publications:

Many analysts, one dataset: Making transparent how variations in analytical choices affect results

False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant

The Statistical Crisis in Science

Why Most Published Research Findings Are False

Journals unite for reproducibility

Promoting an open research culture

Estimating the reproducibility of psychological science

Design, power, and interpretation of studies in the standard murine model of ALS

Raise standards for preclinical cancer research

Believe it or not: how much can we rely on published data on potential drug targets?

A Survey on Data Reproducibility in Cancer Research Provides Insights into Our Limited Ability to Translate Findings from the Laboratory to the Clinic

ASCB Member Survey on Reproducibility

1,500 scientists lift the lid on reproducibility

Interpretations of the results of the Reproducibility Project: Psychology and the National Institute for Neurological Disorders and Stroke replications vary depending on how "reproducibility" is defined. Original sources for each reproducibility study are:

Amgen

Bayer

NINDS

ALS TDI

RP:P

RP:CB

We hope that this information on reproducibility will start a conversation. We encourage you to share this video and the publications above, and if you have feedback, let us know in the comments. We’d like to hear from you.

Transcript

BUTTERWORTH: What makes science true? What makes science something we should rely on? If you have a novel finding, you should be able to replicate it, or somebody else in a different lab should be able to replicate it. So like if you get in your car in the morning and you turn the ignition on, you, you pretty much expect the car to start up and you go forward. The laws of internal combustion, they’re gonna work today, they’re gonna work tomorrow. So reproducibility is a way of giving validity to science.

NARRATION: Reports are accumulating that many, if not most, scientific claims can’t be reproduced.

IORNS: Right now the whole literature is of questionable value. Because you don’t know what’s real and what’s not.

IOANNIDIS: Increasingly, many scientists, including myself, were recognizing that maybe much of what we believe is not really true.

NARRATION: Scientists are trying to figure out, how much of what is published is reproducible? Why? And what can be done?

IORNS: And this is the way that science is supposed to work, in theory. That you have all of these individual building blocks that are the publications, and you take those building blocks and then you build the next level of understanding. And so if they can’t reproduce those findings we’re wasting a lot of what we do.

NARRATION: It has been called ‘the reproducibility crisis’. The media and the general public are wondering if science itself is broken.

BUTTERWORTH: There’s a vertiginous feeling of standing on a cliff and realizing that so much of the knowledge we thought we had may not actually be very sound knowledge and it's about to crumble away and we’re going to fall into a pit, and science is not going to save us because science is a mess.

NARRATION: But the foundations of science are solid. After all, we are surrounded by the impacts of very real scientific discoveries. This realization, that we think we know more than we really do, is not new. And it points to a fundamental flaw in how we, as a society, think about science.

NARRATION: Imagine living in a time, long ago, before any scientific knowledge, and looking up into a starry night sky. What is happening up there? What does it mean? And what is it that keeps us bound to earth?

Around the year 350 BC, Aristotle was asking these kinds of fundamental questions about the world. Aristotle came from a time when people didn’t really conduct scientific experiments, but if he stood on a ledge and dropped a rock and a feather at the same time, he would have seen how a feather, being lighter, falls more slowly than the rock. Based on observations like this, Aristotle came to the theory that heavier objects fall faster than lighter objects.

IORNS: When we first started the Reproducibility Initiative, it was actually quite controversial. So there were definitely scientists who felt like it might be something that damaged people’s careers, or was offensive to the scientific community.

NARRATION: Elizabeth Iorns, founder and CEO of Science Exchange, is heading up a project to take the experiments from 35 high-impact studies in the field of cancer biology, and see if they hold up to replication.

IORNS: People tend to be very afraid of saying that something was not reproducible. It’s definitely, you know, this cultural issue with reproducibility being associated with fraud. Even though the vast, vast majority of cases would not be because of fraud.

NARRATION: Iorns says that replication studies are a crucial piece of the scientific process that has somehow been lost.

IORNS: Replication studies are so rarely funded and they're so underappreciated, they never get published, no one wants to do them, there's no reward system there in place that enables it to happen. So you just have all of these exploratory studies out there that are taken as fact, that this is a scientific fact that's never actually been confirmed.

NARRATION: Remember when science told us that taking antioxidant supplements was good for you?

There were molecular studies, animal studies, epidemiological studies, and even this randomized, double-blinded human trial of over 2,000 participants.

CONUS ARCHIVAL: Antioxidant vitamins are the latest rage on grocery store shelves. They include vitamin C, E, and beta-carotene.

CONUS ARCHIVAL: All the experimental evidence points very strongly to, uh, the value of antioxidants as a potential therapeutic tool.

NARRATION: But then the theory completely reversed itself.

NBC ARCHIVAL: Researchers in Texas found, antioxidants that protect healthy cells from the damage caused by free radicals can actually turbocharge the growth of cancerous cells.

ABC7 ARCHIVAL: The idea was, give people more vitamins and it would prevent cancer. But what they find when they do a lot of these studies, is that it can do the exact opposite.

NARRATION: It makes you wonder about the rest of science that gets published.

IORNS: Why are these drugs that are coming out and showing that they can cure diseases in animal models just, like, failing in every clinical trial? It’s fail, fail, fail.

NARRATION: A 2011 analysis reported that, in the field of cancer biology, 95% of potential cancer drugs fail in clinical trials.

IORNS: I would argue that the reason why they have so high a failure rate is because the original studies that were done in the preclinical phase were never properly validated.

NARRATION: Drug companies began to take notice, and in 2011 and 2012, pharmaceutical giants Bayer and Amgen each attempted to validate promising studies on potential drug targets. Bayer was only able to reproduce 21% of the findings. Amgen, 11%. More reproducibility studies trickled in. The National Institute for Neurological Disorders and Stroke: 8%. The ALS Therapy Development Institute: 0%. And the Reproducibility Project: Psychology: 36%. Most of the available data on replication rates is in the fields of biomedical science and psychology. But survey data indicates that the problem is widespread. How could so much science be irreproducible?

NOSEK: One reason for irreproducibility is that the original was wrong. They thought there was something there, that wasn’t there. A second reason is that the replication screwed up. The third is what probably happens very frequently, but is the hardest to unpack, which is, both are true. The original result observed a finding that is there. The replication did not observe a finding, and it isn’t there. How could both of these things be true? Well, because no 2 studies are identical.

NARRATION: Aristotle’s theory, that the speed at which objects fall is proportional to their mass, prevailed for almost 2,000 years. Then around the year 1600, legend has it that Galileo stood on top of the Tower of Pisa and essentially replicated the falling object experiment, with one crucial difference. Instead of using a feather and a rock as the heavy and light objects, he used a cannonball and a musket ball. They hit the ground at the same time. Aristotle was wrong.

NOSEK: When we get a result and then we fail to get it then we have a puzzle. What happened? Why is it that these 2 outcomes were different?

NARRATION: Aristotle’s finding couldn’t be reproduced because of a variable that he didn’t even know existed: air resistance, which is closely linked to the shape of the object. But how many scientific studies fail to replicate because of these kinds of unknown factors, and how many are just flat-out wrong? This happens far more often than we would like to think.
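To make that concrete, here is a minimal Python simulation (an illustration for this page, not part of the video; the masses, height, and drag values are assumed rather than measured) of the same experiment with and without the hidden variable:

    def fall_time(mass_kg, drag_coeff, height_m=10.0, dt=1e-4, g=9.81):
        """Euler-integrate an object falling from height_m with quadratic
        air drag; the drag force is drag_coeff * v**2 (zero in a vacuum)."""
        y, v, t = height_m, 0.0, 0.0
        while y > 0:
            a = g - (drag_coeff / mass_kg) * v * v  # gravity minus drag
            v += a * dt
            y -= v * dt
            t += dt
        return t

    # Assumed, illustrative values: a dense rock and a high-drag feather.
    for label, mass, drag in [("rock", 1.0, 0.001), ("feather", 0.005, 0.01)]:
        print(f"{label}: {fall_time(mass, drag):.2f} s in air, "
              f"{fall_time(mass, 0.0):.2f} s in vacuum")

With the drag term switched off, both objects land in the same 1.43 seconds; with it on, the feather takes several times longer. One unmodeled variable is enough to make two honest versions of the "same" experiment disagree.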

IOANNIDIS: Even with good intentions, even with the best intentions, our first impressions, our first discoveries, they’re likely to be false.

NARRATION: Collecting and interpreting data can be extremely complicated, and it’s hard to tell the difference between a real finding and just statistical noise.

IOANNIDIS: People are trying to find relationships and associations and effects in a sea of, of dust and noise and some real effects and if you test lots of things you can come up with seemingly interesting results that have absolutely no meaning. And at the same time the incentive structure, and the reward system for, for how we do science, it’s not aligned with maximizing the yield of, of true results.

NOSEK: I’m incentivized to find positive results. Relationships that exist. When I do this, this other thing happens. It’s much more interesting than saying, when I do this, nothing happens. Positive results are very publishable, and publishing things is what advances my career. When I am doing my research, I have many choices that I can make along the way.

NARRATION: Is there an effect for one variable, and not the other? Maybe some data needs to be excluded. Perhaps gender differences influence the result. How much data is enough? Should the study end now or later?

NOSEK: When I’m looking at these different ways of analyzing the data, I might see the ones that reveal positive results, and be more convinced by them. And I might not do that intentionally, I’m not looking to falsely create positive results. But I might be leveraging chance, unintentionally, just because I have skin in the game.

NARRATION: These kinds of decisions apply to exploratory studies in every scientific field. And when there are multiple variables at play, it’s really easy to find statistically significant, positive results, purely by chance.
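The arithmetic behind that claim is easy to check. This short Python sketch (our illustration, not from the video; the choice of 20 variables is an assumption) simulates an exploratory study that tests many unrelated, pure-noise variables at the standard p < 0.05 threshold:

    import random

    random.seed(0)

    # Illustrative assumption: an exploratory study checking 20 unrelated variables.
    def chance_of_false_positive(num_variables=20, trials=10_000):
        """Fraction of simulated studies in which at least one of
        num_variables pure-noise tests comes out 'significant',
        given each test's usual 5% false-positive rate (p < 0.05)."""
        hits = sum(
            any(random.random() < 0.05 for _ in range(num_variables))
            for _ in range(trials)
        )
        return hits / trials

    print(f"P(at least one 'positive' finding): {chance_of_false_positive():.0%}")

With 20 independent noise variables, the probability that at least one crosses the threshold is 1 - 0.95^20, roughly 64%, even though nothing real is being measured.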

NOSEK: And we won’t know which ones occurred by chance, unless we do it again.

IORNS: The current incentive system for doing research at an academic level is extremely broken. So much of it is about these one-off findings that are very unlikely to be real. And so, you really want to create a system where, no matter what experiments you did, so long as they’re really good experiments, even if they’re negative results, they’re still valuable. And they’re still valued.

NARRATION: Iorns and a growing number of researchers, journals, funders and institutions are working to create a new paradigm for scientific research.

IOANNIDIS: There’s lots of action. Across almost any field that you can imagine in science.

NARRATION: They’ve proposed a number of systemic changes to make science more accurate.

BUTTERWORTH: Actually, I’m really optimistic. We are realizing better ways of doing science.

NARRATION: New guidelines and incentives encourage researchers to improve transparency and statistical methods. And to conduct more replications.

IORNS: If a really exciting exploratory breakthrough result comes out, there should be a replication study that follows that and confirms that those results are reproducible.

NARRATION: As of January 2017, the Reproducibility Project Cancer Biology has completed 5 of the 35 replication studies. None of them were perfect replications. But there’s still debate about how to interpret what happened. And when the final results do come in, we’ll have to ask ourselves the same question. What does it mean?

NASA ARCHIVAL: In my left hand I have a feather, in my right hand, a hammer. I guess one of the reasons we got here today was because of a gentleman named Galileo a long time ago, who made a rather significant discovery about falling objects and gravity fields. And we thought that, where would be a better place to confirm his findings than on the moon? And I’ll drop the two of them here and hopefully, they’ll hit the ground at the same time. How 'bout that?

NARRATION: Aristotle’s observation of the feather and rock wasn’t wrong. Only his conclusions were. Today, our society sees the peer-reviewed, published literature as if it were an archive of scientifically proven facts. But the frontiers of science will always be uncertain, and the published literature is more of a discussion among scientists.

NOSEK: Progress isn’t made in science by figuring out what’s a fact and what’s not. Science makes progress by reducing the uncertainty of explanations. Of taking what is a world of possibilities and reducing it to a smaller world of possibilities. And so it requires a very different orientation that’s difficult even for scientists to embrace that we are going to be wrong a lot. A lot. Because we’re chasing things that we do not understand. That’s why we’re doing science. We are accumulating evidence to try to reduce that misunderstanding.

Credits

PRODUCTION CREDITS

Produced and Directed By
Dakin Henderson & Kelly Thomson
Animation by
Kent Leon Welling
Original Music by
Louis Weeks
Editing, Cinematography and Narration by
Dakin Henderson
Funding Provided by
Laura and John Arnold Foundation
Featuring
Trevor Butterworth
John Ioannidis
Elizabeth Iorns
Brian Nosek
Additional Music
“6 Ghosts I” and “18 Ghosts II” by Nine Inch Nails
Thank You
Jay Bradner
Peter Gøtzsche
Marcia McNutt
Ben Pender Cudlip
Nicole Perfito
Lawrence Rajendran
Jackson Solway
Deborah Zarin
Archival Material
abc7 News
Atlantic Monthly Group, Inc.
Conus Archive
The Economist Group
Getty Images
KARGER
The Lancet/Elsevier
NASA
Nature Publishing Group
NBC Bay Area News
PLOS One
Pond5
Salon Media Group, Inc.
Science/AAAS
Scientific American
Shutterstock
The Week Publications, Inc.
Wiley
