MDC researchers track the output of an entire mammalian genome from DNA to proteins for the first time
A cell’s functions and behavior depend on the total population of molecules present in it at any given time, and how they respond to changes in the environment. Since Francis Crick declared “DNA makes RNA makes protein” in 1958, scientists have unraveled the mechanisms by which the hereditary information in genes is used to produce messenger RNAs and proteins, but counting them to obtain an accurate picture of cells’ contents has been notoriously difficult. One gene can spawn huge numbers of messenger RNAs (mRNAs) over a cell’s lifetime, and one mRNA can be used to produce vast numbers of proteins – but how many? And how long do they function before being taken apart again? Now scientists at the MDC’s Berlin Institute for Medical Systems Biology (BIMSB) have combined a range of new technologies and a mathematical modeling approach to answer some of these questions. The study, carried out by the groups of Matthias Selbach, Wei Chen, and Jana Wolf, tracks the global output of a mammalian cell for the first time, measuring the quantities and lifetimes of its RNAs and proteins and predicting their rates of synthesis. The concept for the comprehensive project was developed in intensive discussions between experimental and theoretical scientists. Their work appears in the May issue of Nature and offers new insights into the functions and evolution of animal cells.
The scientists tracked the output of more than 5,000 genes in cells obtained from mice. An accurate census of the cell, Matthias says, requires counting the number of mRNAs synthesized from each gene, the number of proteins made from the template of each mRNA, and the rate at which each type of molecule is degraded.
“Earlier experiments about the lifespans and productivity of mRNAs and proteins have produced a very unclear picture,” Matthias says. “They relied on far smaller numbers of genes, and they typically focused on single steps in the process. Levels of messenger RNA molecules measured in one experiment were usually compared to levels of proteins obtained in another experiment in a different lab, under different conditions. Laboratories didn’t have methods to track what was happening in one cell through all the stages of the process. Additionally, most studies have relied on drugs that block single steps in the process, leading cells to behave unnaturally, or artificial molecules that may not behave like their natural counterparts.”
PhD student Björn Schwanhäusser from Matthias’ lab and his colleagues overcame these limitations by combining new technologies in an original way. Matthias’ lab has become a world leader in a method called stable isotope labeling by amino acids in cell culture (SILAC). This approach relies on growing cells in a standard culture medium, and then moving them to another. The amino acids in the new medium – which will be used to build new proteins starting at the time the cells are transferred – are “heavy” because their atoms have extra neutrons.
An instrument called a mass spectrometer can detect the difference between the two types of amino acids. So scientists can track the speed of degradation of proteins from the old medium and the construction of new molecules in the second medium.
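As a rough illustration of how such a measurement could be turned into a number, the sketch below estimates how quickly the old (“light”) form of a protein disappears from the ratio of heavy to light signal. It is not the group's actual analysis: it assumes first-order loss and a constant total amount of the protein, and the function name and the sample values are hypothetical.

```python
import math


def loss_rate_from_heavy_light(ratio_heavy_to_light: float,
                               hours_since_switch: float) -> float:
    """Estimate the first-order loss rate (per hour) of the old, light protein.

    Assumption: the total amount (heavy + light) stays constant, so the
    light fraction 1 / (1 + H/L) equals exp(-k * t).
    """
    if ratio_heavy_to_light < 0 or hours_since_switch <= 0:
        raise ValueError("ratio must be >= 0 and time must be > 0")
    return math.log(1.0 + ratio_heavy_to_light) / hours_since_switch


# Hypothetical measurement: heavy and light forms are equally abundant
# 10 hours after the cells were moved to the heavy medium.
k = loss_rate_from_heavy_light(ratio_heavy_to_light=1.0, hours_since_switch=10.0)
half_time_h = math.log(2) / k  # time for the light form to drop by half
```

In dividing cells this loss rate would also include dilution through growth, which the sketch does not separate from true degradation.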
In parallel, Na Li, a PhD student from Wei Chen’s lab, measured RNA levels using state-of-the-art sequencing technology. “It is not a trivial task to determine the absolute mRNA copy number at the genome-wide scale,” Wei says. “Most high-throughput technologies, such as microarrays, tell us only the relative difference in gene expression between different samples. It is almost impossible to use them for absolute quantification. Thanks to recent developments in novel sequencing technologies, we can now obtain quite a precise estimation of copy numbers for thousands of different mRNA transcripts in one cell.”
The mRNA degradation rate was studied using a strategy similar to that for proteins. Instead of amino acids, one of the nucleotides, the building blocks of RNAs, is supplied in a modified version, so that newly produced mRNAs differ from old ones. Using biochemical methods, new and old RNA populations can be separated and distinguished from each other. The difference in the level of RNAs between the two samples can indicate how fast RNA transcripts are degraded.
The scientists discovered that, on average, proteins were five times more stable than mRNAs. Proteins had an average “half-life” (the time it takes for half of the quantity of a given molecule to be degraded) of 46 hours, compared to nine hours for mRNAs. “We also discovered a wider range of lifetimes – from very short to very long – for proteins than for mRNAs,” Matthias says. “Interestingly, there was no general correlation between the half-life of a protein and the mRNA from which it was made. In other words, a short-lived mRNA could produce long-lasting proteins, and vice versa.”
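To make the half-life figures concrete, here is a minimal sketch that converts a half-life into a decay rate and into the fraction of molecules left after a given time. It assumes simple exponential (first-order) decay, which is an assumption of this example; the function names are illustrative, and only the 46-hour and nine-hour averages come from the study.

```python
import math


def decay_rate(half_life_h: float) -> float:
    """First-order decay constant (per hour) for a half-life given in hours."""
    return math.log(2) / half_life_h


def fraction_remaining(half_life_h: float, elapsed_h: float) -> float:
    """Fraction of the starting molecules still present after elapsed_h hours."""
    return 0.5 ** (elapsed_h / half_life_h)


PROTEIN_HALF_LIFE_H = 46.0  # average reported in the article
MRNA_HALF_LIFE_H = 9.0      # average reported in the article

# How many times longer the average protein lasts than the average mRNA.
stability_ratio = PROTEIN_HALF_LIFE_H / MRNA_HALF_LIFE_H

# Share of molecules that would survive one day without being replaced.
protein_left_after_day = fraction_remaining(PROTEIN_HALF_LIFE_H, 24.0)
mrna_left_after_day = fraction_remaining(MRNA_HALF_LIFE_H, 24.0)
```

Under this assumption, roughly 70 percent of an average protein population would still be present after one day, compared to about 16 percent of an average mRNA population.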
Another fascinating result was that proteins were about 900 times more abundant than the mRNAs used to make them – one way to think of this is that, on average, a single mRNA is used to manufacture about 900 copies of the corresponding protein. Although the exact quantities differed a lot between different genes, there was a clear general correlation between the amounts of mRNAs and their corresponding proteins.
Half-lives and levels of mRNAs and proteins don’t give the complete picture. Obtaining it requires the quantification of transcription and translation rates, but these are difficult to measure. Here, mathematics and modeling again come into play, and the necessary expertise was provided by Jana Wolf’s group. “The transcription and translation rates have been predicted by applying mathematical modeling,” says Dorothea Busse, a postdoc in Jana Wolf’s lab. “Overall, this approach allowed us to fully quantify the gene expression cascade for more than 5,000 genes.”
“We found that the average gene spawned about two mRNA molecules per hour, but of course there is wide variation in individual cases,” Matthias says. Those molecules went on to produce high numbers of proteins – an average of about 40 per hour. But here, too, mRNAs show a wide range of usage; the most productive molecules build 100 times as many proteins as the least. And there seems to be an upper limit: the maximum rate appears to be about 180 proteins per mRNA per hour.
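The article does not spell out the model itself, so the following is only a sketch of how synthesis rates could in principle be inferred from measured levels and half-lives. It assumes a simple two-step scheme at steady state, in which mRNA is made at a constant rate and each mRNA is translated at a constant rate, and it ignores dilution by cell growth. All names and the example gene are hypothetical.

```python
import math
from typing import NamedTuple


class ExpressionRates(NamedTuple):
    transcription_per_h: float         # mRNAs made per hour
    translation_per_mrna_per_h: float  # proteins made per mRNA per hour


def infer_rates(mrna_copies: float, protein_copies: float,
                mrna_half_life_h: float,
                protein_half_life_h: float) -> ExpressionRates:
    """Back out synthesis rates from steady-state levels and half-lives.

    Assumed model (not taken from the article):
        dR/dt = v_sr - k_dr * R          (mRNA level R)
        dP/dt = k_sp * R - k_dp * P      (protein level P)
    At steady state both derivatives are zero, which gives
        v_sr = k_dr * R   and   k_sp = k_dp * P / R.
    """
    k_dr = math.log(2) / mrna_half_life_h     # mRNA decay constant
    k_dp = math.log(2) / protein_half_life_h  # protein decay constant
    v_sr = k_dr * mrna_copies
    k_sp = k_dp * protein_copies / mrna_copies
    return ExpressionRates(v_sr, k_sp)


# Hypothetical gene: 20 mRNA copies and 20,000 protein copies per cell,
# with half-lives of 10 hours (mRNA) and 50 hours (protein).
rates = infer_rates(mrna_copies=20.0, protein_copies=20000.0,
                    mrna_half_life_h=10.0, protein_half_life_h=50.0)
```

For this made-up gene the sketch yields about 1.4 mRNAs per hour and about 14 proteins per mRNA per hour. Because it leaves out cell growth and works with a single gene, it should not be expected to reproduce the averages reported in the study.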
Cells can block protein production at many stages: by not making mRNAs from a gene in the first place; by quickly degrading an mRNA; or by putting the mRNA “on hold” – in other words, keeping it around, but blocking its translation into proteins. mRNAs may be tagged with molecules, for example, that obstruct the protein-synthesis machinery until they are removed again. And finally, the cell may remove proteins from circulation once they have been built.
With so many ways to intervene in the production line, which does the cell use most? Matthias says the major control seems to happen when mRNAs are translated into proteins. “In predicting how many proteins you’ll find,” he says, “the production of new proteins plays a much larger role than breaking them down.” As well as providing key information about this crucial question, the study revealed specific sequences of mRNAs and proteins that can be used to predict how productive they are likely to be.
For example, mRNAs have tails of various lengths called 3′ UTRs, which do not encode protein sequences; instead, they influence how the molecules are handled. The new project showed that a longer 3′ UTR usually results in a shorter lifespan for the mRNA. And the scientists found other signs of short lives: if molecules had a predominance of two particular nucleotides (A and U), or if they permitted a protein called Pumilio2 to bind.
As well as discovering new principles that describe the productivity of particular genes, the scientists hoped to develop a mathematical model that could be used to predict it. When tested in additional, independent experiments, the model developed by Jana’s group successfully predicted values for about 85 percent of the genes.
Do these general principles hold for other types of cells and other organisms? The scientists compared their results to 2,030 genes in a line of human breast cancer cells grown in the lab. “We focused on genes with clear evolutionary relatives to the mouse counterparts we had studied,” Matthias says. “The model gave accurate data for about 60 percent of the genes. It’s a smaller number than if we remain in the mouse, but most of the variation comes from differences in the rates of translation of mRNAs into proteins.”
The data also provide insights into the evolution of cellular processes. Molecules that carry out similar cellular functions often resemble each other in terms of their lifespans and productivity because they have co-evolved through natural selection. For example, the mRNAs and proteins involved in “housekeeping” tasks that all cells need to survive tend to be more stable than molecules that carry out more specialized tasks. This is likely due to differences in the amount of energy that cells need to carry out the transformation of genetic information into proteins.
“Protein synthesis is the most ‘expensive’ step,” Matthias says. “It consumes more than 90 percent of the energy available for the assembly of molecules, whereas building mRNAs from genes requires less than 10 percent. The study showed that most mRNAs – and especially proteins – are stable. The exceptions are usually molecules that help cells respond quickly to stimuli. This reveals what seems to be an optimal evolutionary principle, a trade-off between energy efficiency and a cell’s ability to respond quickly to environmental changes.”
The findings provide a rich resource for the scientific community. “The unique, quantitative data turned up in the study will help researchers search for common features that determine whether molecules are long-lived or short-lived,” Matthias says. “It should also help us understand the very complex regulatory relationships by which thousands of genes are linked to each other and to the molecules that they produce in cells.”
He adds that the work validates the overall “systems” approach being developed within BIMSB at the MDC. “The work is a good example of the way we can gain insights into the systems level of life by combining different high-throughput technologies with mathematical modeling and follow-up experiments in the wet lab. That strategy is most successful at a place like BIMSB, where we are closely linked to groups with expertise in biology, physics, mathematics, chemistry, computational science, and so on. The questions we are asking can only be solved through technological developments and contributions of groups from several ‘classical disciplines’ that bring their expertise to bear on a common problem. This paper – and hopefully many more to come – shows that with the BIMSB, we’ve got a good recipe for doing so.”
– Russ Hodge