
Sequencing Life: The Human Genome Map

February 12, 2001 at 12:00 AM EDT


SUSAN DENTZER: The excitement was palpable last year when scientists announced that they’d virtually completed rough drafts of the human genome. Today at a Washington news conference they called the results even more remarkable than they’d expected.

DR. ERIC LANDER: 3.5 billion years in the making, the text is filled with long-sought answers, some amazing surprises, puzzling mysteries, and lots of useful information for medicine.

SUSAN DENTZER: The results reported today stemmed from two separate efforts to decode the genome. One was carried out by a consortium of researchers known as the Human Genome Project, and headed by Francis Collins; a second was done by a private U.S. company, Celera Genomics, under its CEO, J. Craig Venter. In completing the rough draft of the genome, researchers in effect assembled a map of all human DNA, much of it distributed in the form of genes along 23 pairs of chromosomes. DNA’s chemical structure looks like a twisted ladder or double helix. The rungs are made up of four chemicals whose names begin with the letters “A,” “T,” “G,” and “C.” These chemicals pair up, forming a total of around three billion pairs. Even now scientists don’t know what most of this DNA does, but about 1 percent of it functions like keys that switch on protein-making factories in cells. These proteins carry out most of the body’s vital work. So sequencing the genome means figuring out how all three billion pairs of letters are arrayed, in large part to trigger the manufacture of proteins. At today’s press conference, the two groups noted that, although they used different methods to sequence the genome, they had arrived at surprisingly similar results. These were sometimes simpler and sometimes more complicated than they’d expected. Eric Lander of MIT worked with the Human Genome Project.

DR. ERIC LANDER: There are far fewer genes than we expected, only about 30,000 or so rather than the figure of 100,000 in the textbooks. There is a lesson in humility in this. We only have twice as many genes as a fruit fly or a lowly nematode worm. What a comedown.

SUSAN DENTZER: On the other hand, Lander said, even this limited number of genes triggers production of myriad proteins.

DR. ERIC LANDER: A typical human gene can make twice as many proteins as the gene in a fly or a worm on average. The proteins themselves also appear to be more complex. There are more multifunctional proteins that do double or triple duty in the cell.

SUSAN DENTZER: Scientists also noted that the sequence tells an amazing story of evolution. For example, humans have apparently inherited more than 200 genes from bacteria that invaded the species long ago, and the relatively small number of genetic variations that occur from one person to another sheds new light on our common ancestor, Homo sapiens, which emerged in Africa some 100,000 years ago. Much of today’s excitement focused on understanding how newly discovered genetic irregularities contribute to disease. That will pave the way for scores of new drugs and other treatments that could ultimately be the real legacy of sequencing the genome.

RAY SUAREZ: Now to the two men leading the efforts to decode the genome: Dr. Francis Collins, director of the Human Genome Project; and Craig Venter, the president of Celera Genomics. Well, we’ve heard it called many times the book of life. It turns out to be a lot shorter than anybody figured. Is that significant?

CRAIG VENTER: I think it’s going to be significant in lots of ways. The implications are going to be on everything from pharmaceutical development and how that happens to how we view our place in the biological continuum and the universe.

RAY SUAREZ: Does it make the job from here on out harder, easier?

CRAIG VENTER: It makes it different than a lot of people expected. I think we were being fed this notion, out of the biotech industry, that you get one gene, one protein, one drug. The number of those that’s going to happen is going to be counted on both hands, probably. What’s going to happen is we have to go into the protein world to really understand where the genome is taking the next level of biology. That’s ten times as complex at least. We have fortunately new tools that we’re setting up to be able to do protein sequencing and characterization at equal if not faster paces than we were able to do the human genetic code. But I think the challenge is understanding the complex of all these pieces working together so that you and I can have this conversation.

RAY SUAREZ: Well, Dr. Collins, go ahead.

DR. FRANCIS COLLINS: Well, I think the book is different than we thought. It tells us, because we have only about a third the number of genes that we expected, that those genes must be particularly clever in carrying out their functions. We’re still just as complicated as before we figured this out, right? So it must mean that our genes have a certain elegant way of doing multiple tasks, more so perhaps than simpler organisms do. For me as a physician, as somebody who is really interested in tracking down the genes that contribute to disease, to heart disease, to colon cancer, to diabetes, to Alzheimer’s Disease, it means that the number of genes we have to deal with and sift through is a shorter list. And that’s good news. That means we should be able to find the ones we’re most interested in somewhat more easily. Our haystack isn’t quite as big as we feared it would be. That should advance the rate of progress in the medical consequences of this project, which is really the reason to do it.

RAY SUAREZ: Don’t you have to go down an extra level, an extra layer of complexity to figure out what’s going on, if you have fewer genes that it’s not as Craig Venter suggested one gene, one disease, one protein that has to be modified?

DR. FRANCIS COLLINS: It depends on the circumstance. There are certainly some diseases for which it is one gene, one disease. If you’ve inherited a particular misspelling of the Huntington’s Disease gene, you’re going to get Huntington’s Disease. If you’ve inherited a particular misspelling of the gene for cystic fibrosis of a certain type, you’re going to get that disease. But diabetes and heart disease and mental illnesses, we know and we’ve known long before today, are going to be the effects of multiple genes. And we will now move forward with this book of life in front of us to identify those over the course of the next five to seven years. So I’m actually pretty optimistic that this situation puts us in a much more settled way to move forward over the next few years and find what those causes are and then use those diagnostically and, better yet, therapeutically.

RAY SUAREZ: Help me with a little bit of mechanics. Part of the explanation that we’ve been reading over the years for the 100 to 140,000 gene theory was that human beings are a much more complex organism, more complex nervous systems, more complex systems — a lot of things going on. And now it turns out that we’ve got a third of the genes we thought we did. Where is this complexity hidden now — so few genes more than a round worm?

CRAIG VENTER: It’s a wonderful question. In fact one of the articles that’s in one of the journals compares us to a triple 7 airplane in that we have the same number of parts and therefore they claim it must be solvable but in fact that’s the wrong way. We’re not hard wired like the airplane is. We have 100 trillion cells. If we have 200 to 300,000 different proteins constantly changing, whether they’re phosphorylated or not, we have something like 10 to the 20th different potential combinations in our cells….

RAY SUAREZ: So it’s 10 with 20 zeros?

CRAIG VENTER: That’s right. So instead of the simplicity view of life which we had a large number of genes and there was a gene for everything, in fact, the fact that we have fewer genes means we have to understand these next levels of complexity much more than we would have otherwise.

DR. FRANCIS COLLINS: There’s actually data now to support that. We’ve done a systematic comparison, as have the scientists at Celera, of the proteins that we humans can make. How do they compare to worms and flies? The numbers of genes are not that different but the number of proteins can be significantly different, and you can get more out of a gene if you’re a human. If you look at proteins, they’re interesting. They have acquired additional domains, additional properties along the evolutionary process. Think of it this way: A worm protein that needs to cut another protein, called a protease, will do it really well but may not do a lot of other things. The human counterpart may not only cut the protein but be regulated in some way. If they’ve got the cutting knife, we’ve got the Cuisinart that you can set all sorts of slicing and dicing options to instead of just doing one thing. So our complexity is recovered, our self-pride is recovered. We’re really just as complicated. It comes about in a different way. The simple idea that gene count explains everything has gone out the window.

RAY SUAREZ: It’s only been a couple of months since that first flush rapturous announcement that the whole thing had been decoded in the first place. This is really your first run at it.

CRAIG VENTER: This is our first look now. It’s so much information it’s taken both teams the last seven months since gathering the data and getting it put together to really try and see what it means. Only a little over 1 percent of those 3 billion letters code for proteins. If you’d asked any of us a year ago, I would have said 3 percent. A lot of scientists would have said 5 to 10 percent. I think we’re all stunned that it’s in the 1 to 2 percent range. These are clear surprises but in fact you have to sort through all the rest of this material. They’re all As, Cs, Gs, and Ts to find out the right pattern to interpret and say this looks like a gene. But one of the questions you might ask is what is different in our genome from a fruit fly? We have roughly twice as many genes. Do we just have two of everything or are they more complex like Francis said? In fact, they’re definitely more complicated but we see specific sets of genes that expanded in the last 600 million years. We have an immune system; we have a blood system; a great expansion of the central nervous system. But the most interesting category that fits with all the things we’ve been talking about is we see a huge expansion in the genes that are responsible for regulating the expression of other genes, so more complex networks and interactions with the same basic components.

RAY SUAREZ: Well, I already knew coming in here that I was more complex than a fruit fly, but before today… before your research came out I didn’t realize that humans shared almost every gene with a mouse, for instance. There’s not that much different.

CRAIG VENTER: And with a dog and a cat. In fact studies at NIH a few years ago showed that the gene order on the human X chromosome is identical, as far as scientists could tell at the time, to the gene order on the cat X chromosome. I think we only found one or two differences from the mouse X chromosome. Yes, we have the same parts, and it really puts the emphasis on this difference in regulation, the timing, the rheostats that say these genes should be turned on now. Some people have argued the entire difference between us and chimpanzees is just in the regulation of the gene expression.

DR. FRANCIS COLLINS: And we shouldn’t overstate that there’s a small number of genes between us and other species because if you look at the genes we share they probably do have subtle differences. If there are only a few hundred genes that differ between us and the mouse, it doesn’t mean that you could put those genes back into the mouse and the mouse would start singing opera and playing golf. There would be all sorts of other differences in all of those other 30,000 genes that are subtle enough to have a pretty significant effect.

RAY SUAREZ: So this first round of research, this first set of interpretations, do you feel closer to actually having practical results like drugs and therapies coming out of this than you did last month, two months ago?

DR. FRANCIS COLLINS: I think this is a big step forward. This is a fundamental moment in the history of science where, for the first time, we have the book of life in front of us. We found out it’s not just one book. It’s three books. It’s a history book that tells us a lot about where we came from. It’s a parts manual that tells us about the genes and the proteins and maybe something about how they fit together. But most importantly it’s a textbook of medicine. And having this in front of us, even though right now we can’t read a lot of the sentences and we don’t quite understand the language, now gives us that bounded set of information about heredity which should enable us to come up with diagnostics and preventive medicine strategies and new medicines more rapidly than we could have contemplated in the past without this information. I can’t tell you we’re going to cure all those diseases tomorrow or next week but we are substantially further along now because we have this information.

RAY SUAREZ: So, where do you start? What do you do first?

CRAIG VENTER: The biggest danger is overpromising. That’s happened so many times over and over when there’s a basic science advance like this. One thing is very clear. I spent ten years trying to find one gene. That now can be done in a 15-second computer search on our Web site, and thousands of scientists did that today, maybe saving ten years of research with that 15-second search. So every genome that we’ve published, every genome that has been published is like a catalytic event that changes the baseline for scientists and the world where if they can save ten years, they do it and start at that next stage and build on it. It’s hard to estimate how fast things will change because we’re in one of these rare periods where things are changing catalytically based on the discoveries that the scientists will make with this new information. Cumulatively that will change things even faster. So we’re both I think extremely optimistic.

DR. FRANCIS COLLINS: Think of it this way: Having this information available on the Internet for any scientist with a good idea, which has been the case for the sequence produced by this international consortium every 24 hours for years in the past, allows an empowering of all the brains of the planet to work together now to try to understand what this book is telling us and to move into those medical advances that we all dream of and deserve.

RAY SUAREZ: Are there certain questions that were sort of dammed up behind waiting for this first rush of research?


RAY SUAREZ: Where we may see things moving fairly rapidly now?

DR. FRANCIS COLLINS: You now have the ability not to look at one gene at a time and try to guess what its partners are. You have the whole list in front of you. Of course we need new technologies and they’re being developed all the time to allow us to do that on such a global scale. But we have the foundation. No longer will we have to build a house without being sure of what it’s built on. We understand those very important pillars and bricks that underlie human biology.

CRAIG VENTER: But even more important than that, as Francis and people have been doing gene-at-a-time biology, we’ve been limited in terms of the scope of what we can do to try and measure one protein at a time, one gene at a time and guess how that impacts biology as a whole. We’re not alive one gene or one protein at a time. We’re now going to start, as of today, this era of holistic biology where we have to… if you look at that gene chart of all that information there, you can’t just look at one component without taking the rest into consideration. That’s going to fundamentally change how research is done.

RAY SUAREZ: Craig Venter, Dr. Collins, thank you both.


CRAIG VENTER: Thank you.