Living Bits: Information and the Origin of Life

Can information theory shed light on the origin and evolution of life?

Thursday, December 4, 2014 The Nature of Reality

What is life?

When Erwin Schrödinger posed this question in 1944, in a book of the same name, he was 57 years old. He had won the Nobel in Physics eleven years earlier, and was arguably past his glory days. Indeed, at that time he was working mostly on his ill-fated “Unitary Field Theory.” By all accounts, the publication of “What is Life?”—venturing far outside of a theoretical physicist’s field of expertise—raised many eyebrows. How presumptuous for a physicist to take on one of the deepest questions in biology! But Schrödinger argued that science should not be compartmentalized:

“Some of us should venture to embark on a synthesis of facts and theories, albeit with second-hand and incomplete knowledge of some of them—and at the risk of making fools of ourselves.”

Schrödinger’s “What is Life?” has been extraordinarily influential, in part because he was one of the first who dared to ask the question seriously, and in part because the book was read by a good number of physicists and steered them to new careers in biology. Famously, Francis Crick and James Watson each read it independently, as did many members of the “Phage group,” a group of scientists that started the field of bacterial genetics. The book is perhaps less famous for the answers Schrödinger suggested, as almost all of them have turned out to be wrong.

In the 70 years since the book appeared, what have we learned about this question? Perhaps the greatest leap forward was provided by Watson and Crick, who by discovering the structure of DNA ushered in the age of information in biology. Indeed, a glib contemporary answer to Schrödinger’s question is simply: “Life is information that can copy itself.” But this statement offers little insight without a more profound analysis of the concept of information in the context of life. So instead of forging ahead, let’s take a step back and first ask: What is information?

The meaning of information

Information is a buzzword that is used in the press and in everyday conversation all the time, but it also has a very precise meaning in science. The theory of information was crafted by another great scientist, the mathematician and engineer Claude Shannon. Without going into the mathematical details, we can say that information is that which allows the holder of that information to make predictions, with accuracy better than chance.

There are three important concepts in this definition. First: prediction. The colloquial use of “information” suggests “knowing.” But more precisely, information implies the ability to use that knowledge to predict something. The second important aspect of the definition is the focus on “something other,” which reminds us that information must be about something. The third and last part concerns the accuracy of prediction. I can easily make predictions about another system (say, the stock market), but if these predictions are only as good as random guessing, then I did not make these predictions using information.
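One way to make “prediction better than chance” concrete is the mutual information between what the holder knows and the thing being predicted. The article does not name this quantity, so using it here is an assumption; the function name and the example probability tables are likewise illustrative. A minimal sketch in Python:

```python
import math

def mutual_information(joint):
    """Mutual information I(X;Y) in bits from a joint probability table joint[x][y].

    X stands for what the predictor knows, Y for the thing being predicted.
    (Illustrative helper; not taken from the article.)
    """
    px = [sum(row) for row in joint]          # marginal distribution of X
    py = [sum(col) for col in zip(*joint)]    # marginal distribution of Y
    info = 0.0
    for x, row in enumerate(joint):
        for y, pxy in enumerate(row):
            if pxy > 0:                       # terms with zero probability contribute nothing
                info += pxy * math.log2(pxy / (px[x] * py[y]))
    return info

# Hypothetical two-outcome examples (say, "market goes up" vs. "market goes down").
random_guessing = [[0.25, 0.25], [0.25, 0.25]]  # knowledge unrelated to outcome
perfect_knowledge = [[0.5, 0.0], [0.0, 0.5]]    # knowledge always matches outcome
partial_knowledge = [[0.4, 0.1], [0.1, 0.4]]    # right 80% of the time

for name, table in [("random guessing", random_guessing),
                    ("perfect knowledge", perfect_knowledge),
                    ("partial knowledge", partial_knowledge)]:
    print(name, round(mutual_information(table), 3), "bits")
```

In this sketch, predictions that are only as good as random guessing carry zero bits, a perfect predictor of a fair coin flip carries one bit, and a predictor that is right 80 percent of the time lands in between.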

One thing that the stock market example immediately suggests is that information is valuable. It is also valuable for survival: For example, knowledge enabling you to predict the trajectory of a predator so that you can escape it is extremely valuable information. Indeed, it is possible to think of the entirety of the information stored in our genes in terms of the predictions it makes about the world in which we find ourselves: how to make a body that uses the information so that it can be replicated, how to acquire the energy to keep the body going, and how to survive in the world up until replication is accomplished. And while it is gratifying to know that our existence can succinctly be described as “information that can replicate itself,” the immediate follow-up question is, “Where did this information come from?”

The hardest question in science

Through decades of work by legions of scientists, we now know that the process of Darwinian evolution tends to lead to an increase in the information coded in genes. That this must happen on average is not difficult to see. Imagine I start out with a genome encoding n bits of information. In an evolutionary process, mutations occur on the many representatives of this information in a population. The mutations can change the amount of information, or they can leave the information unchanged. If the information changes, it can increase or decrease. But very different fates befall those two different changes. The mutation that caused a decrease in information will generally lead to lower fitness, as the information stored in our genes is used to build the organism and survive. If you know less than your competitors about how to do this, you are unlikely to thrive as well as they do. If, on the other hand, you mutate towards more information—meaning better prediction—you are likely to use that information to have an edge in survival. So, in the long run, more information is preferred to less information, and the amount of information in our genes will tend to increase over time.
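The argument above can be illustrated with a toy simulation. Everything in it is an assumption made for illustration, not the author's model: genomes are short bit strings, fitness counts how many sites correctly “predict” a fixed environment, and information is approximated as one bit per site minus that site's Shannon entropy across the population.

```python
import math
import random

def site_information(population, site):
    """One bit minus the Shannon entropy of a binary site across the population."""
    p = sum(genome[site] for genome in population) / len(population)
    if p in (0.0, 1.0):
        return 1.0                            # site is fixed: zero entropy
    entropy = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return 1.0 - entropy

def population_information(population):
    """Rough proxy for the information stored in the population's genomes, in bits."""
    return sum(site_information(population, s) for s in range(len(population[0])))

def evolve(generations=200, pop_size=200, length=20, mutation_rate=0.005, seed=1):
    """Mutation plus selection on random bit-string genomes (all parameters hypothetical)."""
    rng = random.Random(seed)
    environment = [rng.randint(0, 1) for _ in range(length)]
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    history = [population_information(population)]
    for _ in range(generations):
        # Fitness: one plus the number of sites that match the environment.
        weights = [1 + sum(g == e for g, e in zip(genome, environment))
                   for genome in population]
        # Fitter genomes are more likely to be copied into the next generation.
        parents = rng.choices(population, weights=weights, k=pop_size)
        # Each copied bit flips with probability mutation_rate.
        population = [[bit ^ (rng.random() < mutation_rate) for bit in parent]
                      for parent in parents]
        history.append(population_information(population))
    return history

history = evolve()
print("information at start:", round(history[0], 2), "bits")
print("information at end:  ", round(history[-1], 2), "bits")
```

The proxy is crude: in a small population, random drift alone can also lower per-site entropy, so a careful analysis would separate that effect from selection. The sketch is only meant to show the direction of the trend the paragraph describes.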

However, this insight does not tell us where the first self-replicating piece of information came from. Did it arise spontaneously? Now we find ourselves faced with the question that some have called “The hardest question in science.”

At first glance it might appear that this question cannot possibly be answered, unless the class of molecules that gave rise to the first information replicator has left some traces in today’s biochemistry. Different scientists have different opinions about what these molecules might have been. But there are some things we can say about the probability of spontaneous emergence without knowing anything about the chemistry involved, using the tools of information theory. Indeed, information does not change whether it is encoded in bits, in nucleotides, or is scratched on a rock: Information is substrate-independent.

But information is also, mathematically speaking, extremely rare. The probability of finding a sequence encoding a sizable chunk of information by chance is so small that for practical purposes it is zero. For example, the probability that the information (not the exact sequence) of HIV’s protease (a molecule that cuts proteins to size and is crucial for the virus’s self-replication) would arise by chance is less than 1 in 10⁹⁶. There just aren’t enough particles in the universe (about 10⁸⁰), and not enough time since the Big Bang, to try out all these different sequences. Of course, the information in the protease did not have to emerge by chance; it evolved. But before evolution, we have to rely on chance or assume that the information “fell from the sky” (an alternative hypothesis that assumes that life first occurred somewhere else and hitched a ride on a meteorite to Earth).
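To get a feel for these numbers, one can convert between a chance probability and bits under the simplest assumption that finding n bits of information by chance has probability 2⁻ⁿ. That conversion, the function names, and the deliberately generous “one attempt per particle in the universe” scenario are assumptions made for illustration, not the author's calculation:

```python
import math

def bits_from_probability(p):
    """Bits of information whose chance discovery has probability p, assuming p = 2**(-bits)."""
    return -math.log2(p)

def probability_from_bits(bits):
    """Chance of hitting `bits` bits of information in a single random try (same assumption)."""
    return 2.0 ** (-bits)

def chance_of_any_success(p, trials):
    """Probability that at least one of `trials` independent attempts succeeds."""
    # log1p and expm1 keep precision when p is astronomically small.
    return -math.expm1(trials * math.log1p(-p))

protease_probability = 1e-96   # upper bound quoted in the article
particles_in_universe = 1e80   # rough figure quoted in the article

print(round(bits_from_probability(protease_probability)), "bits")            # about 319 bits
print(chance_of_any_success(protease_probability, particles_in_universe))    # about 1e-16
```

Even if every particle in the universe made one attempt, the chance that any of them succeeds would still be roughly one in 10¹⁶ in this sketch.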

It turns out that scientists have been able to construct self-replicating molecules (based on RNA enzymes) that encode just 84 bits of information, but even such a seemingly small piece of information is still extremely unlikely to emerge by chance (about one chance in 10²⁴). Fortunately, information theory can tell us that there are some circumstances (particular environments) that can very substantially increase these probabilities, so a spontaneous emergence of life on this planet is by no means ruled out by these arguments.

Unfortunately, while given any particular environment we can estimate what the probability of spontaneous emergence might be, we have very little knowledge about the specifics of these environments on the ancient Earth. So while we can be more confident that spontaneous emergence is a possibility, the likelihood that the early Earth harbored just such environments is impossible to ascertain.

The chances that life emerged beyond Earth are at least as good as the chances it emerged here. Indeed, many meteorites that made it to Earth’s surface carry organic molecules with them, and information-theoretic considerations suggest that the environments they arose in are precisely those that are conducive to life.

Even though so many uncertainties about life and information remain, the information-theoretical analysis convincingly highlights the extraordinary power of life: While information is both enormously valuable and exceptionally rare, the simple act of copying (possibly with small modifications) can create information seemingly for free. So, from an information perspective, only the first step in life is difficult. The rest is just a matter of time.

Go Deeper
Author’s picks for further reading

arXiv: Information-theoretic considerations concerning the origins of life
In this pre-print, Chris Adami provides a more technical look at information and life.

Nature Reviews Genetics: Digital genetics: Unravelling the genetic basis of evolution
In this 2006 paper, Chris Adami reviews the emerging science of digital genetics. (Requires subscription.)

PLoS Biology: Bit by bit: the Darwinian basis of life
Gerald Joyce, who studies the origin of life and the role of RNA in Earth’s earliest life, discusses how information theory can be applied to astrobiology and “alternative life.”

Journal of Molecular Evolution: Monomer abundance distribution patterns as a universal biosignature: examples from terrestrial and digital life
In this paper, Chris Adami and his colleagues show how an “evolving digital life system” produces an analog for the chemical signatures of life. (Requires subscription.)

Editor’s picks for further reading

The Nature of Reality: Is Information Fundamental?
Discover why some theorists think that the fundamental “stuff” of the universe isn’t matter or energy, but information.

FQXi: It From Bit or Bit From It?
Read prize-winning essays from the Foundational Questions Institute’s 2013 essay contest on the theme of information and its role in reality.
