Rediscovering Biology: Molecular to Global Perspectives
Genomics Expert Interview Transcript: Jonathan Eisen, Ph.D.
Assistant Investigator, The Institute for Genomic Research
Interview with Jonathan Eisen, Ph.D., Investigator at the Institute for Genomic Research (TIGR). His research interests include DNA repair, extremophiles, and phylogenomics.
Can you talk about how the field of genomics has changed the way genetic research is being conducted?
One of the great things about DNA sequence information and protein sequence information is that it’s very easy to represent on a computer, and what this has allowed people to do is build big databases of sequence information from different organisms and put them all together into one central location. This allows people to look in [an] organism for a similarity to any gene that anybody else around the world has studied, and this only works if people are willing to share their information with everyone else.
Unlike many other areas of scientific research where people keep their results to themselves for many years, if not their entire careers, [people in the] field of molecular biology have been sharing all of these results and sending their results to these central databases. This allows people to build upon the previous research of other people in a very useful way, rather than having to go look up [a] paper and look at the sequence of the gene printed on a piece of paper somewhere and try and guess if [a] gene is similar to that string of characters on page 35 of some journal. All you do is submit a search request to a big database and it stores all of the information from everybody else’s work if they were willing to share it and tells you the result. This sharing of information about sequence data is what has catalyzed the revolution in molecular biology and evolutionary biology, and it’s what makes genomics useful.
The Human Genome Project received a lot of media attention because it sequenced humans, but aren’t there a lot of other organisms that have been sequenced?
Well, just as the human genome is useful for studying humans, it’s equally or more useful in some cases for studying other organisms. There are many very important organisms out there that contribute to disease, that contribute to agriculture, [and] that contribute to ecology of the planet. Many of them are being worked on in the same way as the Human Genome Project. In fact, the first complete genome of any organism to be determined was that of a bacterium, Haemophilus influenzae, a pathogen. Once that genome was determined, it spurred a lot of other people into working on the genomics of their favorite organism.
As it is now, there are genomes of maybe 70 or so prokaryotes and maybe 8 or 9 eukaryotes like humans, that have been completed or nearly completed that are available to the public and to scientists to use. This includes a plant genome, an insect genome, many bacterial pathogens, many bacteria from the environment that do things in carbon cycles and nitrogen cycles, and lots of other organisms. People are furiously working on hundreds more as we speak.
Why are you conducting research on Deinococcus radiodurans?
Deinococcus radiodurans, as the name implies, is a radiation-resistant [bacterium]. It’s the most radiation-resistant organism that we know, and in fact it’s even in the Guinness Book of World Records reflecting that.
If you expose Deinococcus radiodurans cells to gamma radiation, they don’t die very rapidly, and if you compare that to how other organisms die in response to gamma radiation, Deinococcus radiodurans cells survive the best. [After gamma radiation], bacteria cells like E. coli will die very rapidly; human cells will also die very rapidly.
How was this unique organism discovered?
Deinococcus radiodurans, which used to be known as Micrococcus radiodurans, was not discovered for a long time in microbiology. People were trying to culture organisms out of the soil, out of the ocean, out of plants. No one had found Deinococcus radiodurans or anything like Deinococcus until the ’50s when the government started to use gamma radiation to sterilize [canned] meat. When they gamma radiated large amounts of meat and sent the cans all over the place, they actually discovered that occasionally something was left growing in these cans with the meat.
What was left was Deinococcus radiodurans. The gamma radiation did what it was supposed to do: it killed all the pathogens. It killed things that you didn’t want to eat when you opened up that can of meat. But occasionally if there was one little bit of Deinococcus in the can when you sealed it, it survived the gamma radiation.
The Chernobyl tragedy was a horrible example of what radiation can do to living organisms. How does radiation work to kill an organism?
[In Chernobyl], gamma radiation was spread into the soil and into the air and has two major effects on living organisms. The first is that it’s toxic. That is, it can actually kill cells and it does this by damaging the DNA and the membranes and the proteins, the macromolecules that make up a cell. And if those get damaged enough, an individual cell is going to die.
The second effect is that some of the cells that survive the gamma radiation will, when they replicate, have mutations. [These mutations] will then get passed on from generation to generation if it’s in the germline in humans, that is the sperm and the egg cells.
But for Chernobyl and the atomic bombs and for a lot of other forms of gamma radiation, the main effect that we see is the toxic effect. That is what happens in the first few days to weeks after getting exposed to the radiation: people die. Too many of their cells get killed or damaged so severely that the replication of blood doesn’t work, normal cellular functions don’t work, and a lot of metabolism doesn’t work. The people just can’t recover.
What we now see years after things like the atomic bomb is mutations starting to accumulate. So cancer, which is caused by mutations in cells, is a byproduct of getting exposed to gamma radiation.
So Deinococcus doesn’t exhibit these effects of gamma radiation?
We’re interested in Deinococcus and in why is it able to survive these massive doses of gamma radiation. It also survives UV irradiation and other forms of DNA damaging or toxic agents. But human cells aren’t able to survive those same doses. What makes Deinococcus different? What makes it able to do this? And that’s what we’ve been trying to study by looking at Deinococcus and its genome.
[We] compare radiation resistance of different organisms in multiple ways. One way to do [this] is with a “survival curve.” You start with a million cells of that organism, let’s say it’s a bacterium, and you irradiate it with a particular dose of irradiation. [You then ask] how many cells [have] survived.
If you compare Deinococcus to other organisms like the bacterium E. coli, which a lot of people study, it’s about a thousandfold as resistant to gamma radiation as the E. coli bacterium. So if there were ten [cells] of the E. coli left after gamma radiation, there would be 10,000 or more of the Deinococcus radiodurans cells left after gamma radiation.
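The comparison described above can be sketched as a small calculation. This is only an illustration using the round numbers from the interview (a million starting cells, ten surviving E. coli cells, 10,000 surviving Deinococcus cells); the function names are hypothetical, and a real survival curve would repeat this measurement across many doses.

```python
def surviving_fraction(start_cells, surviving_cells):
    """Fraction of the starting population still alive after one dose."""
    if start_cells <= 0:
        raise ValueError("start_cells must be positive")
    return surviving_cells / start_cells


def fold_resistance(survivors_a, survivors_b):
    """How many times more cells of organism A survived than organism B,
    assuming both started from the same number of cells and the same dose."""
    if survivors_b <= 0:
        raise ValueError("survivors_b must be positive")
    return survivors_a / survivors_b


# Illustrative numbers taken from the interview, not measured data.
START_CELLS = 1_000_000
ECOLI_SURVIVORS = 10
DEINOCOCCUS_SURVIVORS = 10_000

ecoli_fraction = surviving_fraction(START_CELLS, ECOLI_SURVIVORS)
deinococcus_fraction = surviving_fraction(START_CELLS, DEINOCOCCUS_SURVIVORS)
fold = fold_resistance(DEINOCOCCUS_SURVIVORS, ECOLI_SURVIVORS)

print(f"E. coli surviving fraction:     {ecoli_fraction:.0e}")
print(f"Deinococcus surviving fraction: {deinococcus_fraction:.0e}")
print(f"Fold difference in survival:    {fold:.0f}")
```

Plotting the surviving fraction on a logarithmic axis against dose is what gives the "survival curve" its usual shape.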
You can basically gamma radiate Deinococcus to levels that we thought, before we discovered Deinococcus, should destroy the entire cell, should shred up the membranes, the proteins, the DNA, and yet somehow it’s able to survive that dose.
I think everybody who studies Deinococcus radiodurans is stunned daily when they do their experiments or when they look at the results of other people’s experiments. It still doesn’t make sense to us how this organism is able to survive such doses and it actually doesn’t just survive them. It carries out its business in the presence of radiation. It can live in ongoing doses of radiation. It doesn’t accumulate mutations as far as we can know. So [when] human cells [are exposed to radiation], not only do they die, the ones that survive frequently get mutations. Deinococcus doesn’t have either and we’re always amazed when we look at this.
Have you figured out why Deinococcus is so good at resisting radiation?
When you look at Deinococcus radiodurans or other radiation-resistant organisms, there are three ways we can think of that it would be able to be resistant to the radiation. The first is that it could protect itself from irradiation and we know that organisms that are resistant to ultraviolet irradiation frequently coat themselves with ultraviolet-absorbing pigments so they protect themselves from ultraviolet irradiation.
A second way that an organism could be resistant to irradiation would be if it just tolerates the damage that occurs. You can imagine a Mercedes might tolerate having its front fender bashed in and still be able to drive, but some other car would not even be able to move if the same [thing occurred]. So that’s toleration of damage.
And then there’s a third process which we think is important in organisms, which is the repair of the damage. So if [the radiation] gets through the protective mechanisms of the cell, [the cell has ways] of actually fixing the damage that occurs. So there are these three processes: protection, toleration, and repair that all could contribute to why an organism is resistant to radiation.
There’s some evidence for Deinococcus that’s been accumulating for many years that it is not protecting itself from the gamma radiation, so we already had ruled that one out before looking at the genome sequence. If you irradiate Deinococcus radiodurans, it gets about the same amount of damage as any other cell.
Our personal theory is based on other evidence that repair is the main mechanism by which this organism is so resistant to radiation. We have our own personal theories about the type of repair that we think is particularly important for Deinococcus radiodurans and we looked at the genome to try and see if there was evidence for repair in general and those types of repair that we thought were important.
How has the field of genomics affected your research on Deinococcus?
People have been trying to study Deinococcus radiodurans since it was discovered in the ’50s and although you can grow it in the lab, it’s not the easiest organism to work with. There’s not a big amount of literature of people studying all of its different processes and it’s not a model organism. There aren’t thousands of scientists working on it.
One way to learn an enormous amount about the organism very rapidly is to study its genome. So what TIGR decided to do was to sequence the entire genome of this organism. When we determine the sequence of all of those genes that allows us to guess a lot more about the biology than had [previously] been discovered about this organism, it allows us to leapfrog into new types of experimental research.
How does knowing the sequence of Deinococcus radiodurans help check your hypotheses?
The genome of Deinococcus radiodurans is of immediate use for understanding some earlier experiments that were done on Deinococcus. In the past people had mapped within the genome the locations of particular genes that they thought were important for radiation resistance, but they didn’t actually have the sequence of those genes. We can immediately take the data from those old experiments and guess which gene they actually had mapped, because we can read the sequence and try to figure out what that gene might be doing.
In addition, we take the complete genome sequence and try to predict the functions of every gene in the genome. We have found many genes that may be involved in aspects of radiation resistance from the complete genome. We also can take the complete genome sequence [to] help us do other biological experiments, [in which] knowing the sequence of the gene itself is a requirement before you can even do the experiment.
Having the complete genome sequence means we can do those experiments on every single gene in the genome: this catalyzes an enormous amount of experimental work in the organism by having the instructions there for the organism and for doing the experiments.
What was the process used to determine the genes in the sequence of Deinococcus?
We use a suite of tools to determine the entire sequence of Deinococcus radiodurans. Of course, that doesn’t tell us what the organism is doing. That’s just a bunch of A’s, T’s, C’s, and G’s in a string. In Deinococcus’ case, it’s a few million of those. But that doesn’t really tell us anything. What we want to do is figure out where the genes are within that genome, where the start of each gene is, where the stop of each gene is, and only then can we try and guess what’s going on with this organism.
The first thing we do is take the complete genome sequence and, using a set of computer programs (bioinformatics tools), we scan the genome to look for where there might be what are called “open reading frames,” which are sections of DNA between a start codon for where a gene is supposed to start and a stop codon for where a gene is supposed to end. That’s what an open reading frame is.
You can get open reading frames in a genome that aren’t real genes, so some of the open reading frames aren’t going to be real. Once you find all the open reading frames your work isn’t done. The problem is that many of these open reading frames aren’t made into genes ever. You expect to have a lot of open reading frames just by chance because the start and stop codons are very common bits of sequence in a complete genome of an organism. What we have to do is try and figure out which of these open reading frames are real and which ones are just statistical noise.
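A minimal sketch of the open-reading-frame scan described above, written in Python. It is not the software TIGR used; it makes several simplifying assumptions that are labeled in the comments: it reads only the forward strand, it treats ATG as the only start codon (bacteria also use alternatives such as GTG and TTG), and it uses a hypothetical `min_codons` length cutoff as a crude stand-in for separating likely genes from chance ORFs.

```python
START_CODON = "ATG"                  # assumption: only the most common start codon
STOP_CODONS = {"TAA", "TAG", "TGA"}  # the three standard stop codons


def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) tuples for ORFs on the forward strand.

    `start` is the index of the start codon, `end` is one past the stop
    codon, and `frame` is 0, 1, or 2. ORFs shorter than `min_codons`
    (counting the start and stop codons) are discarded as likely noise.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence one codon at a time in this reading frame.
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


# Toy sequence: ATG AAA TTT GGG TAA sits in reading frame 2.
toy_dna = "CCATGAAATTTGGGTAACC"
print(find_orfs(toy_dna))
print(find_orfs(toy_dna, min_codons=10))  # a stricter cutoff removes the short ORF
```

Raising `min_codons` removes more chance ORFs but also risks discarding short real genes, which is one reason gene finders look at additional signals and not length alone.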
We use computer programs to look through each of the open reading frames and ask: “Is this a good candidate for a gene or is it a bad candidate for a gene?”
There are other signals in addition to the start and stop codon that can tell the organism that this is a gene to be made into a protein, so we look for those in the promoter regions of the gene. We [also] look for other signals, again using a suite of computer programs. Then at the end of that, we have a list of what we think are all of the likely genes in this organism based on the genome sequence.
What type of data does a BLAST search provide to you?
When you look through thousands of genes, like in Deinococcus radiodurans, there are three different types of results that we get when we [conduct a BLAST search for] these genes. The first is that it has no similarity to anything that’s ever been characterized anywhere. We call this a “hypothetical” protein. It’s hypothetical because we don’t know if the computer program that predicted that it was real is correct, and since we didn’t find that it’s similar to anything else, we have no other evidence that it’s real, so we call it hypothetical.
In many cases, we find our gene is similar to genes that have been found in other species, but none of these genes has an assigned function. Since none has been assigned a function and [have not] been studied experimentally, we’re not really certain that they’re all real genes either. [These are] what we call “conserved hypotheticals.” They’re conserved between species. But no one has any good evidence that they’re real either.
What we’re really looking for in most cases, is our gene to be similar to a gene that has a known function, and [it is in these] cases where we can try and guess the function of our gene based on the functions in other organisms. The problem with a lot of genomics is even with more and more genome sequences, a lot of the genes in most species are either hypothetical or conserved hypothetical. For a large fraction of every genome that we look at we can’t make a good assignment of function for those genes.
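The three outcomes described above can be written down as a simple decision rule. The sketch below is hypothetical: it assumes the search results have already been filtered to statistically significant matches and reduced to a list of dictionaries with a `function` entry (a description, or `None` when no function has been assigned). It does not call BLAST itself.

```python
def classify_gene(significant_hits):
    """Sort a predicted gene into one of the three categories from the text.

    `significant_hits` is an assumed, simplified data structure: one dict
    per database match that is more similar than expected by chance, with
    a "function" key holding a description or None.
    """
    if not significant_hits:
        # No similarity to anything characterized anywhere.
        return "hypothetical"
    if any(hit.get("function") for hit in significant_hits):
        # At least one similar gene has a known function to borrow from.
        return "similar to a gene of known function"
    # Similar genes exist in other species, but none has an assigned function.
    return "conserved hypothetical"


# Made-up inputs for illustration only.
print(classify_gene([]))
print(classify_gene([{"organism": "species A", "function": None}]))
print(classify_gene([{"organism": "species B", "function": "DNA repair"}]))
```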
After using some of these genomics tools, what did you find out about Deinococcus?
When we [ran the gene prediction programs] for Deinococcus, we had about 2,500 genes. [This list represented parts of the genome that] we thought were being made into proteins. Again, this is a prediction, but it’s a useful prediction for us to follow up with other analyses. [Also], just having this list of genes doesn’t tell us the biology of the organism. This is just a bunch of sequences of DNA. They start with a start codon, they stop with a stop codon. They have the right composition to be made into a gene. That’s not enough. We want to know what that gene does in the organism. Is it involved in repairing the DNA? [That] is one of the things we were interested in. So how do we figure that out?
The first step in trying to figure that out is to take each sequence of each gene and compare it to all known genes that have ever been characterized in any other organism. The way we do that comparison is by looking for similarity in the sequence of the gene or the sequence of the protein that the gene is supposed to encode.
We take the sequence of the protein and we use a computer program [BLAST] to compare it to all known sequences of all proteins that have ever been characterized anywhere else. There’s a big database of these called GenBank, that we search against: We ask, “are there any proteins out there that ours is more similar to than we expect by chance?” And if there are, then we can start to look at what the pattern of the similarity is between our protein and something that’s been characterized before.
When we find some similarity that’s more than we expect by chance we can look for different types of similarities. In some cases, it’s just a small piece of the protein that’s similar to a previously characterized protein: These are called “motifs.” They’re small motifs that might have a functional importance.
For example, there are motifs that are involved in binding ATP. These motifs are conserved between species: that is, the sequence in the protein that makes up an ATP binding region in a protein is similar between even distantly related species. That suggests that our protein binds ATP. We can also find much bigger regions of similarity, let’s say, the entire protein is similar to another protein that’s the same size; [it’s] very similar in its entire amino acid sequence.
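As an illustration of a motif search, the Python sketch below scans a protein sequence for one widely used ATP-binding pattern, the Walker A (P-loop) consensus G-x(4)-G-K-[S/T]. The choice of this particular motif is an assumption made for the example; the interview does not name a specific one. The test sequence is invented, and a match is only a hint that a protein might bind ATP, not proof.

```python
import re

# Walker A / P-loop consensus: G, any four residues, G, K, then S or T.
WALKER_A = re.compile(r"G.{4}GK[ST]")


def find_walker_a(protein):
    """Return (position, matched_text) for each non-overlapping Walker A match."""
    return [(m.start(), m.group()) for m in WALKER_A.finditer(protein.upper())]


# Invented toy sequence containing one match, and one with none.
print(find_walker_a("MSTAGPSGSGKTTLAR"))
print(find_walker_a("MKLVAAA"))
```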
This level of similarity is something that we use to infer that the genes are homologous, that is, they have a common ancestry. Those genes came from the same original gene some time over evolutionary time. So, for example, if you took in humans the gene for beta-globin, and you searched it against one of these databases, you would find that it’s similar to the gene for mouse beta-globin. The reason that they’re very similar to each other is that they came from the same ancestral beta-globin that was in the common ancestor of mouse and humans.
We do the same type of thing with Deinococcus radiodurans. We look for whether or not we can find any genes that have been characterized in other organisms similar to the Deinococcus gene. If the gene that it’s similar to has a known function, then we can at least start to get an idea as to the function of the Deinococcus gene.
For example, when we searched it against these databases, one of the Deinococcus genes was very similar to a gene involved in DNA replication in other organisms. This Deinococcus gene is so similar to these other genes involved in DNA replication that we think it’s also the Deinococcus DNA replication gene. We did the same thing for DNA repair genes: we found many genes in Deinococcus that were similar to genes in other bacteria that were known to be involved in DNA repair. So that’s our first pass. We’re going to guess that that gene is possibly involved in DNA repair in Deinococcus as well.
The reason this analysis works at all is that many genes are very similar to each other when they’re found in different species, and the reason for this is that all organisms on the planet have a common ancestry and the more recent their common ancestor, the more similar they’re going to be to each other in their entire genome. Each individual gene within the genomes of two closely related organisms will also be very similar to each other.
Again, beta-globin in humans is very similar to beta-globin in mice and over evolutionary time, many genes are kept similar to each other even as the species diverge. In particular, [the most conserved genes are the ones] that are involved in the core functions of an organism: replicating the DNA, transcribing the DNA into RNA, making the RNA into proteins, or repairing the DNA.
A lot of the core functions between bacteria and humans are very similar and you can use information from humans or from yeast or from the fruit fly to learn about the functions of genes in bacteria. Of course, it probably would be better if you knew the functions of the genes in other bacteria; fortunately, a lot of bacteria have been studied so it works pretty well for Deinococcus.
What were the specific results of your research?
When we finished sequencing the genome and started to analyze the individual genes, we were hoping that the reason for the radiation resistance of Deinococcus would leap out at us from the pages of our results. We were hoping maybe it would have 50 copies of some gene that was involved in radiation resistance in other organisms and therefore the extra copies would explain why Deinococcus was radiation-resistant. Or maybe it had every known DNA repair pathway from every organism that had been ever studied and they were all put together into one organism and that would explain its radiation resistance. Unfortunately, that wasn’t the answer in this case.
When we looked at the complete picture of the entire genome of Deinococcus, even with a little bit of “rose-colored glasses,” we couldn’t really see that it was different in terms of its likely DNA repair capabilities from anything else that was out there. It had a lot of genes that we predicted to be involved in DNA repair but so does the bacterium E. coli, so does the yeast, and so does the human. But those organisms aren’t radiation-resistant. So we didn’t really get the answer we had hoped from the genome sequence.
What the genome sequence told us, however, was that we need to rethink what we’re using genomes for. In this case, we were hoping to immediately discover the reason for a novel process in an organism from looking at each individual gene. The problem with this idea is the way that we were guessing the functions of these genes was by comparing to known genes and other organisms. Well, if Deinococcus invented something new, using some gene that had never been used before for DNA repair, well, we were never going to find it by comparing to other organisms-they’re not radiation-resistant. So maybe Deinococcus took a gene that’s involved in making the membrane and turned it into a phenomenal DNA repair gene. Well, we’re just going to think it’s involved in making the membrane because that’s what it does in other species.
So genomics and genome analysis is an incredibly important first step in studying an organism, but it needs to be followed up by experimental work, in particular for organisms that have novel, unusual processes.
There’s an amazing possibility which we hadn’t really thought about for Deinococcus, which is its radiation resistance may actually be due to [a lack of a] gene that other organisms have. So other organisms might have genes that prevent them from being radiation-resistant, and if you delete these genes, maybe that’s what makes Deinococcus radiation-resistant.
We know that the same type of thing happens for pathogens: there are many pathogenic bacteria in which the nonpathogenic strain actually has a gene that suppresses pathogenesis. Most pathogens don’t want to be pathogens all the time. They don’t want to kill their hosts; then they can’t spread around the world.
So they have evolved genes that suppress their pathogenesis: The pathogens actually are missing these genes and the only way you can figure out what is involved in pathogenesis is by looking at the differences between a pathogen and a non-pathogen. In the same way, radiation resistance might actually be due to the loss of genes that have some other function.
For example, after irradiation by gamma rays, you get a lot of double [and single] strand breaks in the genome, and most organisms that don’t normally get exposed to gamma radiation might think that those double strand breaks are part of another process. Maybe they’re thinking that they’re getting attacked by a virus and they have mechanisms to chew it up and get rid of it.
Well, maybe Deinococcus just got rid of those processes or at least reduced them somewhat. So rather than chewing up its DNA after it gets gamma radiated, maybe Deinococcus doesn’t chew it up quite as much.
If Deinococcus doesn’t do that as much, it might make it radiation resistant. That means that scanning the genome for the “gene” for radiation resistance is the wrong way to go about it. We might have to scan the genome for things that are missing that are radiation resistant suppressors in other organisms.
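The idea of scanning for what is missing, instead of what is present, amounts to a set comparison. The Python sketch below is a hypothetical illustration: the gene-family labels (`famA`, `famB`, and so on) and organism names are placeholders, not real annotations, and it assumes each genome has already been reduced to a set of gene-family identifiers.

```python
def families_missing_from(target_families, other_genomes):
    """List gene families that every comparison organism has but the target lacks.

    `target_families` is a set of family identifiers for the organism of
    interest; `other_genomes` maps organism names to their own sets.
    """
    others = [set(families) for families in other_genomes.values()]
    if not others:
        return []
    shared_by_others = set.intersection(*others)
    return sorted(shared_by_others - set(target_families))


# Placeholder data for illustration only.
deinococcus = {"famA", "famB"}
comparison = {
    "organism_1": {"famA", "famB", "famC"},
    "organism_2": {"famA", "famC", "famD"},
}
candidates = families_missing_from(deinococcus, comparison)
print(candidates)
```

Any family on the resulting list would only be a candidate suppressor; as the interview stresses, it would still need experimental follow-up.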
How has your hypothesis changed in light of this new data?
What we’ve learned in part is that we may have been asking the wrong question in the first place. Rather than looking for the genes that make the organism radiation resistant, we might want to also look for the genes in other organisms that suppress their radiation resistance that are missing in Deinococcus. That’s a possible mechanism.
One of these genes might be what’s called a nuclease. Nucleases are things that degrade [nucleic acids]. If Deinococcus was somehow able to better control its nucleases or even delete them altogether it might become more radiation resistant.
[Another theory, or aspect of our research is] what’s called a “cell cycle control” in Deinococcus radiodurans and we know that this exists in humans. [Cell cycle control is] an important part of eukaryotic cell growth, [and] not studied very much in bacteria. But part of the control of the cell cycle in humans is telling the DNA replicating and degrading enzymes when to work and when not to work.
So maybe Deinococcus has something like that. The genes might even be there but they’re turned off and only turned on in very specific situations; therefore after gamma radiation, they may be off and therefore it allows the organism to stitch everything back together before these degrading enzymes get rid of it. It’s sort of a race between the repair process and the degradation process. All you have to do possibly is just slow down the degradation process in the race and the repair will win more often and therefore you’ll be better able to survive.
What is the next step in your research?
The next step is to do more experimental research. We need to go in and look at the function of the genes in Deinococcus that we don’t think are involved in radiation resistance. There are lots of other genes in the genome, some of which we couldn’t predict their functions because they weren’t similar to anything, or they were similar to things that didn’t have a [known] function. We have to go in and try to experimentally determine the function of all of those genes.
But in addition, we want to start to do more comparisons of Deinococcus with other organisms to see if we can get other organisms that are radiation-resistant or [find] other organisms that are closely related to Deinococcus that aren’t radiation-resistant. [We want to] try and get a better picture of what the differences are between radiation-resistant and non-radiation-resistant organisms.
In addition, we need to follow up on some of the predictions from the genome of which genes we think are involved in DNA repair and start to study them experimentally. [We need to] look at what happens to the cells when we make mutations [in the genes] and to study the whole genome expression of all of the genes when we make those cells defective in particular genes. [This process] is called “functional genomics,” which is studying not just the sequence of the genes but the functions of the genes. Only then will we start to have even a small picture of what makes this organism so radiation-resistant.
How do you feel about your progress with Deinococcus so far?
Well, there are plenty of people who worked on Deinococcus long before the genome sequence came out and they learned quite a few things about its biology. We think that now that we have the genome sequence we’re going to learn a lot more, but it’s not going to be instant. So we weren’t really expecting everything to be solved by the genome sequence. We may have hoped that a little more about the DNA repair would have been solved, but we did learn a lot about the metabolism of the organism, the cell structure, the growth patterns, the degradation of carbon compounds, the amino acid synthesis; all these things are fundamental towards doing further work with the organism. One would hope that maybe in five or ten years, the mechanism of the radiation resistance will really be solved. We didn’t really expect it to be solved immediately.
It’s a very exciting thing to work on and it’s a mystery still. We don’t know how it’s able to do this. That makes it more exciting, not less exciting.
What are the implications of your research on Deinococcus?
I believe it is very important for studying Deinococcus to help us [with the] basic science of understanding the nuts and bolts of DNA repair processes. Certainly understanding the nuts and bolts of something so important as DNA repair will have implications for understanding humans, plants, yeast, and other bacteria. DNA repair is involved in protecting organisms from toxins, in preventing mutations, in influencing evolutionary rates and processes. We just don’t know what the discoveries will lead us to in other organisms.
Of course, we are hoping that there will be practical implications of our studies of Deinococcus as well. New enzymes that might be discovered could be useful in treating DNA damage in humans due to UV radiation or exposure to gamma radiation.
[Another] aspect of Deinococcus that has been of particular interest to the Department of Energy is what’s called “bioremediation.” Bioremediation is cleaning up toxic waste sites using biological organisms and the reason Deinococcus is thought of as being important for this is that there are a lot of toxic waste sites that have both high levels of radiation and high levels of nasty carcinogenic toxins. Very few organisms can survive the high levels of radiation that are present there in order to degrade the toxins. So if you want to use an organism to degrade the toxins, it’s great to start with an organism that’s already radiation-resistant and then teach it, using either genetics or other tools, how to degrade the toxins at the same time. Then you could imagine releasing it into one of these places. It won’t mind that it’s radioactive and then it’ll degrade the toluene and benzene and other nasty chemicals that no other organisms could do because they’re killed by the radiation.
What are the unanswered questions in your research?
The biggest unanswered question right now in genomics is: what do all of those hypothetical and conserved hypothetical proteins do? What are their roles in organisms? Are there new metabolic processes or new functions of cells waiting to be discovered in that list of genes that lots of organisms have, but for which nobody has any idea what they do?
What questions would you want students/teachers to ask?
I think one thing that we’ve learned from genomics that all scientists certainly and all people should be thinking about is the connections among all organisms. We can’t predict what we’re going to learn from studying something like Deinococcus radiodurans, but the fact that organisms are all related to each other through a common ancestry means that studying Deinococcus can have major implications for studying organisms it doesn’t look like or that we don’t think it has much to do with in the real world.
Another thing that I think people should be thinking about in specific relevance to genomics is that the genome sequence is a starting point for biological research. You’re not done. It doesn’t all of a sudden tell you everything about the organism. The Human Genome Project was meant to be the beginning of medicine based on genomics. Many people thought it was the end, that we were going to cure all these diseases by having the genome sequence. It’s not the way it works.
We will use the genome sequence of these organisms to better do our biological research, but it doesn’t [take the place of] having to do biological research. That’s the key thing that everybody working on genomics or teaching about genomics should know.