Genome-wide identification of transcript start and end sites by transcript isoform sequencing.
Pelechano, V., Wei, W., Jakob, P. & Steinmetz, L.M.
Nat Protoc. 2014 Jun;9(7):1740-59. doi: 10.1038/nprot.2014.121. Epub 2014 Jun 26.
Hundreds of transcript isoforms with varying boundaries and alternative regulatory signals are transcribed from the genome, even in a genetically homogeneous population of cells. To study this transcriptional heterogeneity, we developed transcript isoform sequencing (TIF-seq), a method that allows the genome-wide profiling of full-length transcript isoforms defined by their exact 5' and 3' boundaries. TIF-seq entails the generation of full-length cDNA libraries, followed by their circularization and the sequencing of the junction fragments spanning the 5' and 3' transcript ends. By determining the co-occurrence of start and end sites of individual transcript molecules, TIF-seq can distinguish variations that conventional approaches for mapping single ends cannot, such as short abortive transcripts, bicistronic messages and overlapping transcripts that differ in length. The TIF-seq protocol we describe here can be applied to any eukaryotic organism (e.g., yeast, human), and it requires 6-10 d for generating TIF-seq libraries, 10 d for sequencing and 2-3 d for analysis.
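The idea of defining isoforms by co-occurring start and end sites can be illustrated with a minimal sketch. It is not the published analysis pipeline: the input tuples, coordinates and function names are hypothetical, and real data would need read mapping and clustering of nearby positions first.

```python
from collections import Counter

# Hypothetical input: one (chromosome, strand, 5' end, 3' end) tuple per molecule.
molecules = [
    ("chrI", "+", 100, 900),
    ("chrI", "+", 100, 900),
    ("chrI", "+", 100, 400),   # short transcript sharing the same start site
    ("chrI", "+", 100, 1800),  # long transcript extending past the usual end site
    ("chrI", "+", 350, 900),   # internal start site, same end site
]

def count_isoforms(molecules):
    """Count molecules per isoform, where an isoform is a paired (start, end)."""
    return Counter(molecules)

def count_single_ends(molecules):
    """Count start and end sites independently, as single-end mapping would."""
    starts = Counter((c, s, five) for c, s, five, _ in molecules)
    ends = Counter((c, s, three) for c, s, _, three in molecules)
    return starts, ends

isoforms = count_isoforms(molecules)
starts, ends = count_single_ends(molecules)
print(len(isoforms), "isoforms from", len(starts), "start sites and", len(ends), "end sites")
```

In this toy example the paired view separates four isoforms, whereas the independent start and end counts cannot tell which start site belongs with which end site.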
Heritability and genetic basis of protein level variation in an outbred population.
Parts, L., Liu, Y.C., Tekkedil, M., Steinmetz, L.M., Caudy, A.A., Fraser, A.G., Boone, C., Andrews, B.J. & Rosebrock, A.P.
Genome Res. 2014 May 13. pii: gr.170506.113.
The genetic basis of heritable traits has been studied for decades. Although recent mapping efforts have elucidated genetic determinants of transcript levels, mapping of cellular traits downstream of mRNA levels, such as protein abundance, has lagged. Here, we systematically analyze levels of 4,084 GFP-tagged yeast proteins in the progeny of a cross between a laboratory and a wild strain at single-cell resolution using flow cytometry and high-content microscopy. The genotype of trans variants contributed little to protein level variation between individual cells, but explained over 50% of the variance in the population average protein abundance for half of the GFP-fusions tested. To map trans-acting factors responsible for the heritable expression variation, we performed flow sorting and bulk segregant analysis of twenty-five proteins, finding a median of five protein quantitative trait loci (pQTLs) per GFP-fusion. In our mapping analysis, we find that cis-acting variants predominate; the genotype of a gene and its surrounding region had a large effect on protein level six times more frequently than the rest of the genome combined. We present evidence for both shared and independent genetic control of transcript and protein abundance: over half of the expression QTLs (eQTLs) contribute to changes in protein levels of regulated genes, but several pQTLs do not affect their cognate transcript levels. Allele replacements of genes known to underlie trans eQTL hotspots confirmed correlation of effects on mRNA and protein levels. This study represents the first genome-scale measurement of genetic contribution to protein levels in single cells and populations, identifies over a hundred trans pQTLs, and validates the propagation of effects associated with transcript variation to protein abundance.
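The principle behind bulk segregant analysis of sorted pools can be sketched as a comparison of allele frequencies between a high-signal and a low-signal pool. This is only an illustrative sketch, not the study's method: the marker names, read counts and function name are invented, and no statistical test is applied.

```python
def allele_frequency_difference(high_pool, low_pool):
    """Per-marker difference in parent-A allele frequency between two pools.

    Each pool maps a marker name to (reads supporting parent A, reads supporting parent B).
    Markers without coverage in either pool are skipped.
    """
    diffs = {}
    for marker in high_pool.keys() & low_pool.keys():
        a_hi, b_hi = high_pool[marker]
        a_lo, b_lo = low_pool[marker]
        if a_hi + b_hi == 0 or a_lo + b_lo == 0:
            continue
        diffs[marker] = a_hi / (a_hi + b_hi) - a_lo / (a_lo + b_lo)
    return diffs

# Hypothetical read counts for three markers in cells sorted for high or low GFP signal.
high = {"m1": (50, 50), "m2": (90, 10), "m3": (48, 52)}
low = {"m1": (52, 48), "m2": (15, 85), "m3": (50, 50)}

diffs = allele_frequency_difference(high, low)
candidate = max(diffs, key=lambda m: abs(diffs[m]))
print(candidate, round(diffs[candidate], 2))
```

A marker whose allele frequency diverges strongly between the pools, like the hypothetical m2 here, would be a candidate for linkage to a locus affecting protein level.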
Role of histone modifications and early termination in pervasive transcription and antisense-mediated gene silencing in yeast.
Castelnuovo, M., Zaugg, J.B., Guffanti, E., Maffioletti, A., Camblong, J., Xu, Z., Clauder-Munster, S., Steinmetz, L.M., Luscombe, N.M. & Stutz, F.
Nucleic Acids Res. 2014 Apr;42(7):4348-62. doi: 10.1093/nar/gku100. Epub 2014 Feb 4.
Most genomes, including that of the yeast Saccharomyces cerevisiae, are pervasively transcribed, producing numerous non-coding RNAs, many of which are unstable and eliminated by nuclear or cytoplasmic surveillance pathways. We previously showed that accumulation of PHO84 antisense RNA (asRNA), in cells lacking the nuclear exosome component Rrp6, is paralleled by repression of sense transcription in a process dependent on the Hda1 histone deacetylase (HDAC) and the H3K4 histone methyl transferase Set1. Here we investigate this process genome-wide and measure the whole transcriptome of various histone modification mutants in a Δrrp6 strain using tiling arrays. We confirm widespread occurrence of potentially antisense-dependent gene regulation and identify three functionally distinct classes of genes that accumulate asRNAs in the absence of Rrp6. These classes differ in whether the genes are silenced by the asRNA and whether the silencing depends on HDACs and histone methyl transferases. Among the distinguishing features of asRNAs with regulatory potential, we identify weak early termination by Nrd1/Nab3/Sen1, extension of the asRNA into the open reading frame promoter and dependence of the silencing capacity on Set1 and the HDACs Hda1 and Rpd3, particularly at promoters undergoing extensive chromatin remodelling. Finally, depending on the efficiency of Nrd1/Nab3/Sen1 early termination, asRNA levels are modulated and their capability of silencing is changed.
Control of Cdc28 CDK1 by a stress-induced lncRNA.
Nadal-Ribelles, M., Sole, C., Xu, Z., Steinmetz, L.M., de Nadal, E. & Posas, F.
Mol Cell. 2014 Feb 20;53(4):549-61. doi: 10.1016/j.molcel.2014.01.006. Epub 2014 Feb 6.
Genomic analysis has revealed the existence of a large number of long noncoding RNAs (lncRNAs) with different functions in a variety of organisms, including yeast. Cells display dramatic changes of gene expression upon environmental changes. Upon osmostress, hundreds of stress-responsive genes are induced by the stress-activated protein kinase (SAPK) p38/Hog1. Using whole-genome tiling arrays, we found that Hog1 induces a set of lncRNAs upon stress. One of the genes expressing a Hog1-dependent lncRNA in antisense orientation is CDC28, the cyclin-dependent kinase 1 (CDK1) that controls the cell cycle in yeast. Cdc28 lncRNA mediates the establishment of gene looping and the relocalization of Hog1 and RSC from the 3' UTR to the +1 nucleosome to induce CDC28 expression. The increase in the levels of Cdc28 results in cells able to reenter the cell cycle more efficiently after stress. This may represent a general mechanism to prime expression of genes needed after stresses are alleviated.
Yeast Growth Plasticity Is Regulated by Environment Specific Multi-QTL Interactions.
Bhatia, A., Yadav, A., Gagneur, J., Zhu, C., Steinmetz, L.M., Bhanot, G. & Sinha, H.
G3 (Bethesda). 2014 Jan 28. pii: g3.113.009142v1. doi: 10.1534/g3.113.009142.
For a unicellular, non-motile organism like Saccharomyces cerevisiae, carbon sources act both as nutrients and as signaling molecules, and consequently affect various fitness parameters including growth. It is therefore advantageous for yeast strains to adapt their growth to carbon source variation. The ability of a given genotype to manifest different phenotypes in varying environments is known as phenotypic plasticity. To identify quantitative trait loci (QTL) that drive plasticity in growth, two growth parameters (growth rate and biomass) were measured for a set of meiotic recombinants of two genetically divergent yeast strains grown in different carbon sources. To identify QTLs contributing to plasticity across pairs of environments, gene-environment interaction mapping was performed, which identified several QTLs that have a differential effect across environments, some of which act antagonistically across pairs of environments. Multi-QTL analysis identified loci interacting with previously known growth affecting QTLs as well as novel two-QTL interactions that affect growth. A QTL that had no significant independent effect was found to alter growth rate and biomass for several carbon sources through two-QTL interactions. Our study demonstrates that environment-specific epistatic interactions contribute to the growth plasticity in yeast. We propose that a targeted scan for epistatic interactions, such as the one described here, can help unravel mechanisms regulating phenotypic plasticity.
Alternative polyadenylation diversifies post-transcriptional regulation by selective RNA-protein interactions.
Gupta, I., Clauder-Munster, S., Klaus, B., Jarvelin, A.I., Aiyar, R.S., Benes, V., Wilkening, S., Huber, W., Pelechano, V. & Steinmetz, L.M.
Mol Syst Biol. 2014 Feb 25;10(2):719. doi: 10.1002/msb.135068. Print 2014.
Recent research has uncovered extensive variability in the boundaries of transcript isoforms, yet the functional consequences of this variation remain largely unexplored. Here, we systematically discriminate between the molecular phenotypes of overlapping coding and non-coding transcriptional events from each genic locus using a novel genome-wide, nucleotide-resolution technique to quantify the half-lives of 3' transcript isoforms in yeast. Our results reveal widespread differences in stability among isoforms for hundreds of genes in a single condition, and that variation of even a single nucleotide in the 3' untranslated region (UTR) can affect transcript stability. While previous instances of negative associations between 3' UTR length and transcript stability have been reported, here, we find that shorter isoforms are not necessarily more stable. We demonstrate the role of RNA-protein interactions in conditioning isoform-specific stability, showing that PUF3 binds and destabilizes specific polyadenylation isoforms. Our findings indicate that although the functional elements of a gene are encoded in DNA sequence, the selective incorporation of these elements into RNA through transcript boundary variation allows a single gene to have diverse functional consequences.
An Evaluation of High-Throughput Approaches to QTL Mapping in Saccharomyces cerevisiae.
Wilkening, S., Lin, G., Fritsch, E.S., Tekkedil, M.M., Anders, S., Kuehn, R., Nguyen, M., Aiyar, R.S., Proctor, M., Sakhanenko, N.A., Galas, D.J., Gagneur, J., Deutschbauer, A. & Steinmetz, L.M.
Genetics. 2013 Dec 27.
Dissecting the molecular basis of quantitative traits is a significant challenge, and is essential for understanding complex diseases. Even in model organisms, precisely determining causative genes and their interactions has remained elusive, due in part to difficulty in narrowing intervals to single genes, and in detecting epistasis or linked quantitative trait loci. These difficulties are exacerbated by limitations in experimental design, such as low numbers of analyzed individuals, and polymorphisms between parental genomes. We address these challenges by applying three independent high-throughput approaches for QTL mapping to map the genetic variants underlying eleven phenotypes in two genetically distant Saccharomyces cerevisiae strains, namely: 1) individual analysis of over 700 meiotic segregants, 2) bulk segregant analysis, and 3) reciprocal hemizygosity analysis, a new genome-wide method we developed. We identified differences in the performance of each approach and, by combining them, identified eight polymorphic genes that affect eight different phenotypes: colony shape, flocculation, growth on non-fermentable carbon sources, and resistance to drugs, salt, and heat. Our results demonstrate the power of individual segregant analysis to dissect quantitative trait loci and address the underestimated contribution of interactions between variants. We also reveal confounding factors like mutations and aneuploidy in pooled approaches, providing valuable lessons for future designs of complex trait mapping studies.
Gene regulation by antisense transcription.
Pelechano, V. & Steinmetz, L.M.
Nat Rev Genet. 2013 Dec;14(12):880-93. doi: 10.1038/nrg3594. Epub 2013 Nov 12.
Antisense transcription, which was initially considered by many as transcriptional noise, is increasingly being recognized as an important regulator of gene expression. It is widespread among all kingdoms of life and has been shown to influence - either through the act of transcription or through the non-coding RNA that is produced - almost all stages of gene expression, from transcription and translation to RNA degradation. Antisense transcription can function as a fast evolving regulatory switch and a modular scaffold for protein complexes, and it can 'rewire' regulatory networks. The genomic arrangement of antisense RNAs opposite sense genes indicates that they might be part of self-regulatory circuits that allow genes to regulate their own expression.
Extensive variation in chromatin states across humans.
Kasowski, M., Kyriazopoulou-Panagiotopoulou, S., Grubert, F., Zaugg, J.B., Kundaje, A., Liu, Y., Boyle, A.P., Zhang, Q.C., Zakharia, F., Spacek, D.V., Li, J., Xie, D., Olarerin-George, A., Steinmetz, L.M., Hogenesch, J.B., Kellis, M., Batzoglou, S. & Snyder, M.
Science. 2013 Nov 8;342(6159):750-2. doi: 10.1126/science.1242510. Epub 2013 Oct 17.
The majority of disease-associated variants lie outside protein-coding regions, suggesting a link between variation in regulatory regions and disease predisposition. We studied differences in chromatin states using five histone modifications, cohesin, and CTCF in lymphoblastoid lines from 19 individuals of diverse ancestry. We found extensive signal variation in regulatory regions, which often switch between active and repressed states across individuals. Enhancer activity is particularly diverse among individuals, whereas gene expression remains relatively stable. Chromatin variability shows genetic inheritance in trios, correlates with genetic variation and population divergence, and is associated with disruptions of transcription factor binding motifs. Overall, our results provide insights into chromatin variation among humans.
Genotype-environment interactions reveal causal pathways that mediate genetic effects on phenotype.
Gagneur, J., Stegle, O., Zhu, C., Jakob, P., Tekkedil, M.M., Aiyar, R.S., Schuon, A.K., Pe'er, D. & Steinmetz, L.M.
PLoS Genet. 2013 Sep;9(9):e1003803. doi: 10.1371/journal.pgen.1003803. Epub 2013 Sep 19.
Unraveling the molecular processes that lead from genotype to phenotype is crucial for the understanding and effective treatment of genetic diseases. Knowledge of the causative genetic defect most often does not enable treatment; therefore, causal intermediates between genotype and phenotype constitute valuable candidates for molecular intervention points that can be therapeutically targeted. Mapping genetic determinants of gene expression levels (also known as expression quantitative trait loci or eQTL studies) is frequently used for this purpose, yet distinguishing causation from correlation remains a significant challenge. Here, we address this challenge using extensive, multi-environment gene expression and fitness profiling of hundreds of genetically diverse yeast strains, in order to identify truly causal intermediate genes that condition fitness in a given environment. Using functional genomics assays, we show that the predictive power of eQTL studies for inferring causal intermediate genes is poor unless performed across multiple environments. Surprisingly, although the effects of genotype on fitness depended strongly on environment, causal intermediates could be most reliably predicted from genetic effects on expression present in all environments. Our results indicate a mechanism explaining this apparent paradox, whereby immediate molecular consequences of genetic variation are shared across environments, and environment-dependent phenotypic effects result from downstream integration of environmental signals. We developed a statistical model to predict causal intermediates that leverages this insight, yielding over 400 transcripts, for the majority of which we experimentally validated their role in conditioning fitness. Our findings have implications for the design and analysis of clinical omics studies aimed at discovering personalized targets for molecular intervention, suggesting that inferring causation in a single cellular context can benefit from molecular profiling in multiple contexts.
Drift and conservation of differential exon usage across tissues in primate species.
Reyes, A., Anders, S., Weatheritt, R.J., Gibson, T.J., Steinmetz, L.M. & Huber, W.
Proc Natl Acad Sci U S A. 2013 Sep 17;110(38):15377-82. doi: 10.1073/pnas.1307202110. Epub 2013 Sep 3.
Alternative usage of exons provides genomes with plasticity to produce different transcripts from the same gene, modulating the function, localization, and life cycle of gene products. It affects most human genes. For a limited number of cases, alternative functions and tissue-specific roles are known. However, recent high-throughput sequencing studies have suggested that much alternative isoform usage across tissues is nonconserved, raising the question of the extent of its functional importance. We address this question in a genome-wide manner by analyzing the transcriptomes of five tissues for six primate species, focusing on exons that are 1:1 orthologous in all six species. Our results support a model in which differential usage of exons has two major modes: First, most of the exons show only weak differences, which are dominated by interspecies variability and may reflect neutral drift and noisy splicing. These cases dominate the genome-wide view and explain why conservation appears to be so limited. Second, however, a sizeable minority of exons show strong differences between tissues, which are mostly conserved. We identified a core set of 3,800 exons from 1,643 genes that show conservation of strongly tissue-dependent usage patterns from human at least to macaque. This set is enriched for exons encoding protein-disordered regions and untranslated regions. Our findings support the theory that isoform regulation is an important target of evolution in primates, and our method provides a powerful tool for discovering potentially functional tissue-dependent isoforms.
Polyadenylation site-induced decay of upstream transcripts enforces promoter directionality.
Ntini, E., Jarvelin, A.I., Bornholdt, J., Chen, Y., Boyd, M., Jorgensen, M., Andersson, R., Hoof, I., Schein, A., Andersen, P.R., Andersen, P.K., Preker, P., Valen, E., Zhao, X., Pelechano, V., Steinmetz, L.M., Sandelin, A. & Jensen, T.H.
Nat Struct Mol Biol. 2013 Aug;20(8):923-8. doi: 10.1038/nsmb.2640. Epub 2013 Jul 14.
Active human promoters produce promoter-upstream transcripts (PROMPTs). Why these RNAs are coupled to decay, whereas their neighboring promoter-downstream mRNAs are not, is unknown. Here high-throughput sequencing demonstrates that PROMPTs generally initiate in the antisense direction closely upstream of the transcription start sites (TSSs) of their associated genes. PROMPT TSSs share features with mRNA-producing TSSs, including stalled RNA polymerase II (RNAPII) and the production of small TSS-associated RNAs. Notably, motif analyses around PROMPT 3' ends reveal polyadenylation (pA)-like signals. Mutagenesis studies demonstrate that PROMPT pA signals are functional but linked to RNA degradation. Moreover, pA signals are under-represented in promoter-downstream versus promoter-upstream regions, thus allowing for more efficient RNAPII progress in the sense direction from gene promoters. We conclude that asymmetric sequence distribution around human gene promoters serves to provide a directional RNA output from an otherwise bidirectional transcription process.
The genomic and transcriptomic landscape of a HeLa cell line.
Landry, J.J., Pyl, P.T., Rausch, T., Zichner, T., Tekkedil, M.M., Stutz, A.M., Jauch, A., Aiyar, R.S., Pau, G., Delhomme, N., Gagneur, J., Korbel, J.O., Huber, W. & Steinmetz, L.M.
G3 (Bethesda). 2013 Aug 7;3(8):1213-24. doi: 10.1534/g3.113.005777.
HeLa is the most widely used model cell line for studying human cellular and molecular biology. To date, no genomic reference for this cell line has been released, and experiments have relied on the human reference genome. Effective design and interpretation of molecular genetic studies performed using HeLa cells require accurate genomic information. Here we present a detailed genomic and transcriptomic characterization of a HeLa cell line. We performed DNA and RNA sequencing of a HeLa Kyoto cell line and analyzed its mutational portfolio and gene expression profile. Segmentation of the genome according to copy number revealed a remarkably high level of aneuploidy and numerous large structural variants at unprecedented resolution. Some of the extensive genomic rearrangements are indicative of catastrophic chromosome shattering, known as chromothripsis. Our analysis of the HeLa gene expression profile revealed that several pathways, including cell cycle and DNA repair, exhibit significantly different expression patterns from those in normal human tissues. Our results provide the first detailed account of genomic variants in the HeLa genome, yielding insight into their impact on gene expression and cellular function as well as their origins. This study underscores the importance of accounting for the strikingly aberrant characteristics of HeLa cells when designing and interpreting experiments, and has implications for the use of HeLa as a model of human biology.
Multiple genomic changes associated with reorganization of gene regulation and adaptation in yeast.
David, L., Ben-Harosh, Y., Stolovicki, E., Moore, L.S., Nguyen, M., Tamse, R., Dean, J., Mancera, E., Steinmetz, L.M. & Braun, E.
Mol Biol Evol. 2013 Jul;30(7):1514-26. doi: 10.1093/molbev/mst071. Epub 2013 Apr 14.
Frequently during evolution, new phenotypes evolved due to novelty in gene regulation, such as that caused by genome rewiring. This has been demonstrated by comparing common regulatory sequences among species and by identifying single regulatory mutations that are associated with new phenotypes. However, while a single mutation changes a single element, gene regulation is accomplished by a regulatory network involving multiple interactive elements. Therefore, to better understand regulatory evolution, we have studied how mutations contributed to the adaptation of cells to a regulatory challenge. We created a synthetic genome rewiring in yeast cells, challenged their gene regulation, and studied their adaptation. HIS3, an essential enzyme for histidine biosynthesis, was placed exclusively under a GAL promoter, which is induced by galactose and strongly repressed in glucose. Such rewired cells were faced with significant regulatory challenges in a repressive glucose medium. We identified several independent mutations in elements of the GAL system associated with the rapid adaptation of cells, such as the repressor GAL80 and the binding sites of the activator GAL4. Consistent with the extraordinarily high rate of cell adaptation, new regulation emerged during adaptation via multiple trajectories, including those involving mutations in elements of the GAL system. The new regulation of HIS3 tuned its expression according to histidine requirements with or without these significant mutations, indicating that additional factors participated in this regulation and that the regulatory network could reorganize in multiple ways to accommodate different mutations. This study, therefore, stresses network plasticity as an important property for regulatory adaptation and evolution.
Extensive transcriptional heterogeneity revealed by isoform profiling.
Pelechano, V., Wei, W. & Steinmetz, L.M.
Nature. 2013 May 2;497(7447):127-31. doi: 10.1038/nature12121. Epub 2013 Apr 24.
Transcript function is determined by sequence elements arranged on an individual RNA molecule. Variation in transcripts can affect messenger RNA stability, localization and translation, or produce truncated proteins that differ in localization or function. Given the existence of overlapping, variable transcript isoforms, determining the functional impact of the transcriptome requires identification of full-length transcripts, rather than just the genomic regions that are transcribed. Here, by jointly determining both transcript ends for millions of RNA molecules, we reveal an extensive layer of isoform diversity previously hidden among overlapping RNA molecules. Variation in transcript boundaries seems to be the rule rather than the exception, even within a single population of yeast cells. Over 26 major transcript isoforms per protein-coding gene were expressed in yeast. Hundreds of short coding RNAs and truncated versions of proteins are concomitantly encoded by alternative transcript isoforms, increasing protein diversity. In addition, approximately 70% of genes express alternative isoforms that vary in post-transcriptional regulatory elements, and tandem genes frequently produce overlapping or even bicistronic transcripts. This extensive transcript diversity is generated by a relatively simple eukaryotic genome with limited splicing, and within a genetically homogeneous population of cells. Our findings have implications for genome compaction, evolution and phenotypic diversity between single cells. These data also indicate that isoform diversity as well as RNA abundance should be considered when assessing the functional repertoire of genomes.
An efficient method for genome-wide polyadenylation site mapping and RNA quantification.
Wilkening, S., Pelechano, V., Jarvelin, A.I., Tekkedil, M.M., Anders, S., Benes, V. & Steinmetz, L.M.
Nucleic Acids Res. 2013 Mar 1;41(5):e65. doi: 10.1093/nar/gks1249. Epub 2013 Jan 7.
The use of alternative poly(A) sites is common and affects the post-transcriptional fate of mRNA, including its stability, subcellular localization and translation. Here, we present a method to identify poly(A) sites in a genome-wide and strand-specific manner. This method, termed 3'T-fill, initially fills in the poly(A) stretch with unlabeled dTTPs, allowing sequencing to start directly after the poly(A) tail into the 3'-untranslated regions (UTR). Our comparative analysis demonstrates that it outperforms existing protocols in quality and throughput and accurately quantifies RNA levels as only one read is produced from each transcript. We use this method to characterize the diversity of polyadenylation in Saccharomyces cerevisiae, showing that alternative RNA molecules are present even in a genetically identical cell population. Finally, we observe that overlap of convergent 3'-UTRs is frequent but sharply limited by coding regions, suggesting factors that restrict compression of the yeast genome.
Genotyping 1000 yeast strains by next-generation sequencing.
Wilkening, S., Tekkedil, M.M., Lin, G., Fritsch, E.S., Wei, W., Gagneur, J., Lazinski, D.W., Camilli, A. & Steinmetz, L.M.
BMC Genomics. 2013 Feb 9;14:90. doi: 10.1186/1471-2164-14-90.
BACKGROUND: The throughput of next-generation sequencing machines has increased dramatically over the last few years; yet the cost and time for library preparation have not changed proportionally, thus representing the main bottleneck for sequencing large numbers of samples. Here we present an economical, high-throughput library preparation method for the Illumina platform, comprising a 96-well based method for DNA isolation from yeast cells, a low-cost DNA shearing alternative, and adapter ligation using heat inactivation of enzymes instead of bead cleanups. RESULTS: Up to 384 whole-genome libraries can be prepared from yeast cells in one week using this method, for less than 15 euros per sample. We demonstrate the robustness of this protocol by sequencing over 1000 yeast genomes at ~30x coverage. The sequence information from 768 yeast segregants derived from two divergent S. cerevisiae strains was used to generate a meiotic recombination map at unprecedented resolution. Comparisons to other datasets indicate a high conservation of recombination at a chromosome-wide scale, but differences at the local scale. Additionally, we detected a high degree of aneuploidy (3.6%) by examining the sequencing coverage in these segregants. Differences in allele frequency allowed us to attribute instances of aneuploidy to gains of chromosomes during meiosis or mitosis, both of which showed a strong tendency to missegregate specific chromosomes. CONCLUSIONS: Here we present a high-throughput workflow to sequence the genomes of a large number of yeast strains at a low price. We have used this workflow to obtain recombination and aneuploidy data from hundreds of segregants, which can serve as a foundation for future studies of linkage, recombination, and chromosomal aberrations in yeast and higher eukaryotes.
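Detecting aneuploidy from sequencing coverage can be sketched as comparing each chromosome's read depth with a genome-wide baseline. The sketch below is an assumption-laden illustration rather than the paper's procedure: the depth values, the tolerance cutoff and the function name are all hypothetical.

```python
from statistics import median

def flag_aneuploid_chromosomes(coverage, tolerance=0.3):
    """Return {chromosome: ratio} for chromosomes whose depth ratio deviates from 1.

    coverage: dict mapping chromosome name to mean read depth in one segregant.
    tolerance: hypothetical cutoff on |ratio - 1|; not a value from the paper.
    """
    baseline = median(coverage.values())  # median is robust to one gained chromosome
    flagged = {}
    for chrom, depth in coverage.items():
        ratio = depth / baseline
        if abs(ratio - 1.0) > tolerance:
            flagged[chrom] = ratio
    return flagged

# Hypothetical haploid segregant sequenced at roughly 30x with an extra copy of chrIX.
example = {"chrI": 31.0, "chrII": 29.5, "chrIII": 30.2, "chrIV": 30.0, "chrIX": 61.0}
flagged = flag_aneuploid_chromosomes(example)
print(flagged)
```

In a haploid segregant, a gained chromosome would be expected to show roughly twice the baseline depth, which is why the hypothetical chrIX stands out here.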
System-wide identification of RNA-binding proteins by interactome capture.
Castello, A., Horos, R., Strein, C., Fischer, B., Eichelbaum, K., Steinmetz, L.M., Krijgsveld, J. & Hentze, M.W.
Nat Protoc. 2013 Feb 14;8(3):491-500. doi: 10.1038/nprot.2013.020. Epub 2013 Feb 14.
Owing to their preeminent biological functions, the repertoire of expressed RNA-binding proteins (RBPs) and their activity states are highly informative about cellular systems. We have developed a novel and unbiased technique, called interactome capture, for identifying the active RBPs of cultured cells. By making use of in vivo UV cross-linking of RBPs to polyadenylated RNAs, covalently bound proteins are captured with oligo(dT) magnetic beads. After stringent washes, the mRNA interactome is determined by quantitative mass spectrometry (MS). The protocol takes 3 working days for analysis of single proteins by western blotting, and about 2 weeks for the determination of complete cellular mRNA interactomes by MS. The most important advantage of interactome capture over other in vitro and in silico approaches is that only RBPs bound to RNA in a physiological environment are identified. When applied to HeLa cells, interactome capture revealed hundreds of novel RBPs. Interactome capture can also be broadly used to compare different biological states, including metabolic stress, cell cycle, differentiation, development or the response to drugs.
The Role of Ctk1 Kinase in Termination of Small Non-Coding RNAs.
Lenstra, T.L., Tudek, A., Clauder, S., Xu, Z., Pachis, S.T., van Leenen, D., Kemmeren, P., Steinmetz, L.M., Libri, D. & Holstege, F.C.
PLoS One. 2013 Dec 4;8(12):e80495. doi: 10.1371/journal.pone.0080495.
Transcription termination in Saccharomyces cerevisiae can be performed by at least two distinct pathways and is influenced by the phosphorylation status of the carboxy-terminal domain (CTD) of RNA polymerase II (Pol II). Late termination of mRNAs is performed by the CPF/CF complex, the recruitment of which is dependent on CTD-Ser2 phosphorylation (Ser2P). Early termination of shorter cryptic unstable transcripts (CUTs) and small nucleolar/nuclear RNAs (sno/snRNAs) is performed by the Nrd1-Nab3-Sen1 (NNS) complex that binds phosphorylated CTD-Ser5 (Ser5P) via the CTD-interacting domain (CID) of Nrd1p. In this study, mutants of the different termination pathways were compared by genome-wide expression analysis. Surprisingly, the expression changes observed upon loss of the CTD-Ser2 kinase Ctk1p are more similar to those derived from alterations in the Ser5P-dependent NNS pathway than to those from loss of CTD-Ser2P binding factors. Tiling array analysis of ctk1Δ cells reveals readthrough at snoRNAs, at many cryptic unstable transcripts (CUTs) and stable uncharacterized transcripts (SUTs), but only at some mRNAs. Despite the suggested predominant role in termination of mRNAs, we observed that a CTK1 deletion or a Pol II CTD mutant lacking all Ser2 positions does not result in a global mRNA termination defect. Rather, termination defects in these strains are widely observed at NNS-dependent genes. These results indicate that Ctk1p and Ser2 CTD phosphorylation have a wide impact on termination of small non-coding RNAs but only affect a subset of mRNA coding genes.
Silencing of Genes and Alleles by RNAi in Anopheles gambiae.
Lamacchia, M., Clayton, J.R., Wang-Sattler, R., Steinmetz, L.M., Levashina, E.A. & Blandin, S.A.
Methods Mol Biol. 2013;923:161-76.
Anopheles gambiae mosquitoes are the major vectors of human malaria parasites. However, mosquitoes are not passive hosts for parasites; they actively limit parasite development in vivo. Our current understanding of the mosquito antiparasitic response is mostly based on the phenotypic analysis of gene knockdowns obtained by RNA interference (RNAi), through the injection or transfection of long dsRNAs in adult mosquitoes or cultured cells, respectively. Recently, RNAi has been extended to specifically silence one allele of a given gene in a heterozygous context, thus allowing the contributions of different alleles to a phenotype to be compared in the same genetic background.
Natural sequence variants of yeast environmental sensors confer cell-to-cell expression variability.
Fehrmann, S., Bottin-Duplus, H., Leonidou, A., Mollereau, E., Barthelaix, A., Wei, W., Steinmetz, L.M. & Yvert, G.
Mol Syst Biol. 2013 Oct 8;9:695. doi: 10.1038/msb.2013.53.
Living systems may have evolved probabilistic bet hedging strategies that generate cell-to-cell phenotypic diversity in anticipation of environmental catastrophes, as opposed to adaptation via a deterministic response to environmental changes. Evolution of bet hedging assumes that genotypes segregating in natural populations modulate the level of intraclonal diversity, which so far has largely remained hypothetical. Using a fluorescent Pmet17-GFP reporter, we mapped four genetic loci conferring to a wild yeast strain an elevated cell-to-cell variability in the expression of MET17, a gene regulated by the methionine pathway. A frameshift mutation in the Erc1p transmembrane transporter, probably resulting from a release of laboratory strains from negative selection, reduced Pmet17-GFP expression variability. At a second locus, cis-regulatory polymorphisms increased mean expression of the Mup1p methionine permease, causing increased expression variability in trans. These results demonstrate that an expression quantitative trait locus (eQTL) can simultaneously have a deterministic effect in cis and a probabilistic effect in trans. Our observations indicate that the evolution of transmembrane transporter genes can tune intraclonal variation and may therefore be implicated in both reactive and anticipatory strategies of adaptation.
Gene loops enhance transcriptional directionality.
Tan-Wong, S.M., Zaugg, J.B., Camblong, J., Xu, Z., Zhang, D.W., Mischo, H.E., Ansari, A.Z., Luscombe, N.M., Steinmetz, L.M. & Proudfoot, N.J.
Science. 2012 Nov 2;338(6107):671-5. doi: 10.1126/science.1224350. Epub 2012 Sep 27.
Eukaryotic genomes are extensively transcribed, forming both messenger RNAs (mRNAs) and noncoding RNAs (ncRNAs). ncRNAs made by RNA polymerase II often initiate from bidirectional promoters (nucleosome-depleted chromatin) that synthesize mRNA and ncRNA in opposite directions. We demonstrate that, by adopting a gene-loop conformation, actively transcribed mRNA encoding genes restrict divergent transcription of ncRNAs. Because gene-loop formation depends on a protein factor (Ssu72) that coassociates with both the promoter and the terminator, the inactivation of Ssu72 leads to increased synthesis of promoter-associated divergent ncRNAs, referred to as Ssu72-restricted transcripts (SRTs). Similarly, inactivation of individual gene loops by gene mutation enhances SRT synthesis. We demonstrate that gene-loop conformation enforces transcriptional directionality on otherwise bidirectional promoters.
RNA Polymerase II Collision Interrupts Convergent Transcription.
Hobson, D.J., Wei, W., Steinmetz, L.M. & Svejstrup, J.Q.
Mol Cell. 2012 Oct 2. pii: S1097-2765(12)00770-8. doi: 10.1016/j.molcel.2012.08.027.
Antisense noncoding transcripts, genes-within-genes, and convergent gene pairs are prevalent among eukaryotes. The existence of such transcription units raises the question of what happens when RNA polymerase II (RNAPII) molecules collide head-to-head. Here we use a combination of biochemical and genetic approaches in yeast to show that polymerases transcribing opposite DNA strands cannot bypass each other. RNAPII stops but does not dissociate upon head-to-head collision in vitro, suggesting that opposing polymerases represent insurmountable obstacles for each other. Head-to-head collision in vivo also results in RNAPII stopping, and removal of collided RNAPII from the DNA template can be achieved via ubiquitylation-directed proteolysis. Indeed, in cells lacking efficient RNAPII polyubiquitylation, the half-life of collided polymerases increases, so that they can be detected between convergent genes. These results provide insight into fundamental mechanisms of gene traffic control and point to an unexplored effect of antisense transcription on gene regulation via polymerase collision.
easyRNASeq: a bioconductor package for processing RNA-Seq data.
Delhomme, N., Padioleau, I., Furlong, E.E. & Steinmetz, L.M.
Bioinformatics. 2012 Oct 1;28(19):2532-3. Epub 2012 Jul 30.
MOTIVATION: RNA sequencing is becoming a standard for expression profiling experiments and many tools have been developed in the past few years to analyze RNA-Seq data. Numerous 'Bioconductor' packages are available for next-generation sequencing data loading in R, e.g. ShortRead and Rsamtools as well as to perform differential gene expression analyses, e.g. DESeq and edgeR. However, the processing tasks lying in between these require the precise interplay of many Bioconductor packages, e.g. Biostrings, IRanges or external solutions are to be sought. RESULTS: We developed 'easyRNASeq', an R package that simplifies the processing of RNA sequencing data, hiding the complex interplay of the required packages behind a single functionality. AVAILABILITY: The package is implemented in R (as of version 2.15) and is available from Bioconductor (as of version 2.10) at the URL: http://bioconductor.org/packages/release/bioc/html/easyRNASeq.html, where installation and usage instructions can be found. CONTACT: email@example.com.
Set3 HDAC mediates effects of overlapping noncoding transcription on gene induction kinetics.
Kim, T., Xu, Z., Clauder-Munster, S., Steinmetz, L.M. & Buratowski, S.
Cell. 2012 Sep 14;150(6):1158-69. doi: 10.1016/j.cell.2012.08.016. Epub 2012 Sep 6.
The Set3 histone deacetylase complex (Set3C) binds histone H3 dimethylated at lysine 4 (H3K4me2) to mediate deacetylation of histones in 5'-transcribed regions. To discern how Set3C affects gene expression, genome-wide transcription was analyzed in yeast undergoing a series of carbon source shifts. Deleting SET3 primarily caused changes during transition periods, as genes were induced or repressed. Surprisingly, a majority of Set3-affected genes are overlapped by noncoding RNA (ncRNA) transcription. Many Set3-repressed genes have H3K4me2 instead of me3 over promoter regions, due to either reduced H3K4me3 or ncRNA transcription from distal or antisense promoters. Set3C also represses internal cryptic promoters, but in different regions of genes than the Set2/Rpd3S pathway. Finally, Set3C stimulates some genes by repressing an overlapping antagonistic antisense transcript. These results show that overlapping noncoding transcription can fine-tune gene expression, not via the ncRNA but by depositing H3K4me2 to recruit the Set3C deacetylase.
Extensive Degradation of RNA Precursors by the Exosome in Wild-Type Cells.
Gudipati, R.K., Xu, Z., Lebreton, A., Seraphin, B., Steinmetz, L.M., Jacquier, A. & Libri, D.
Mol Cell. 2012 Sep 19. pii: S1097-2765(12)00736-8. doi: 10.1016/j.molcel.2012.08.018.
The exosome is a complex involved in the maturation of rRNA and sn-snoRNA, in the degradation of short-lived noncoding RNAs, and in the quality control of RNAs produced in mutants. It contains two catalytic subunits, Rrp6p and Dis3p, whose specific functions are not fully understood. We analyzed the transcriptome of combinations of Rrp6p and Dis3p catalytic mutants by high-resolution tiling arrays. We show that Dis3p and Rrp6p have both overlapping and specific roles in degrading distinct classes of substrates. We found that transcripts derived from more than half of intron-containing genes are degraded before splicing. Surprisingly, we also show that the exosome degrades large amounts of tRNA precursors despite the absence of processing defects. These results underscore the notion that large amounts of RNAs produced in wild-type cells are discarded before entering functional pathways and suggest that kinetic competition with degradation proofreads the efficiency and accuracy of processing.
Genetic modifiers of chromatin acetylation antagonize the reprogramming of epi-polymorphisms.
Abraham, A.L., Nagarajan, M., Veyrieras, J.B., Bottin, H., Steinmetz, L.M. & Yvert, G.
PLoS Genet. 2012 Sep;8(9):e1002958. doi: 10.1371/journal.pgen.1002958. Epub 2012 Sep 20.
Natural populations are known to differ not only in DNA but also in their chromatin-associated epigenetic marks. When such inter-individual epigenomic differences (or "epi-polymorphisms") are observed, their stability is usually not known: they may or may not be reprogrammed over time or upon environmental changes. In addition, their origin may be purely epigenetic, or they may result from regulatory variation encoded in the DNA. Studying epi-polymorphisms requires, therefore, an assessment of their nature and stability. Here we estimate the stability of yeast epi-polymorphisms of chromatin acetylation, and we provide a genome-by-epigenome map of their genetic control. A transient epi-drug treatment was able to reprogram acetylation variation at more than one thousand nucleosomes, whereas a similar amount of variation persisted, distinguishing "labile" from "persistent" epi-polymorphisms. Hundreds of genetic loci underlay acetylation variation at 2,418 nucleosomes either locally (in cis) or distantly (in trans), and this genetic control overlapped only partially with the genetic control of gene expression. Trans-acting regulators were not necessarily associated with genes coding for chromatin modifying enzymes. Strikingly, "labile" and "persistent" epi-polymorphisms were associated with poor and strong genetic control, respectively, showing that genetic modifiers contribute to persistence. These results estimate the amount of natural epigenomic variation that can be lost after transient environmental exposures, and they reveal the complex genetic architecture of the DNA-encoded determinism of chromatin epi-polymorphisms. Our observations provide a basis for the development of population epigenetics.
Experimental Relocation of the Mitochondrial ATP9 Gene to the Nucleus Reveals Forces Underlying Mitochondrial Genome Evolution.
Bietenhader, M.*, Martos, A.*, Tetaud, E.*, Aiyar, R.S.*, Sellem, C.H., Kucharczyk, R., Clauder-Munster, S., Giraud, M.F., Godard, F., Salin, B., Sagot, I., Gagneur, J., Dequard-Chablat, M., Contamine, V., Denmat, S.H., Sainsard-Chanet, A., Steinmetz, L.M. & di Rago, J.P.
PLoS Genet. 2012 Aug;8(8):e1002876. Epub 2012 Aug 16.
The few genes that remain in the mitochondrial genome retained by every eukaryotic organism carry out essential functions and are implicated in severe diseases. Experimentally relocating these few genes to the nucleus therefore has both therapeutic and evolutionary implications. Numerous unproductive attempts have been made to do so, with a total of only 5 successes across all organisms. We have taken a novel approach to relocating mitochondrial genes that utilizes naturally nuclear versions from other organisms. We demonstrate this approach on subunit 9/c of ATP synthase, successfully relocating this gene for the first time in any organism by expressing the ATP9 genes from Podospora anserina in Saccharomyces cerevisiae. This study substantiates the role of protein structure in mitochondrial gene transfer: expression of chimeric constructs reveals that the P. anserina proteins can be correctly imported into mitochondria due to reduced hydrophobicity of the first transmembrane segment. Nuclear expression of ATP9, while permitting almost fully functional oxidative phosphorylation, perturbs many cellular properties, including cellular morphology, and activates the heat shock response. Altogether, our study establishes a novel strategy for allotopic expression of mitochondrial genes, demonstrates the complex adaptations required to relocate ATP9, and indicates a reason that this gene was only transferred to the nucleus during the evolution of multicellular organisms.
Rrp6p controls mRNA poly(A) tail length and its decoration with poly(A) binding proteins.
Schmid, M., Poulsen, M.B., Olszewski, P., Pelechano, V., Saguez, C., Gupta, I., Steinmetz, L.M., Moore, C. & Jensen, T.H.
Mol Cell. 2012 Jul 27;47(2):267-80. doi: 10.1016/j.molcel.2012.05.005. Epub 2012 Jun 7.
Poly(A) (pA) tail binding proteins (PABPs) control mRNA polyadenylation, stability, and translation. In a purified system, S. cerevisiae PABPs, Pab1p and Nab2p, are individually sufficient to provide normal pA tail length. However, it is unknown how this occurs in more complex environments. Here we find that the nuclear exosome subunit Rrp6p counteracts the in vitro and in vivo extension of mature pA tails by the noncanonical pA polymerase Trf4p. Moreover, PABP loading onto nascent pA tails is controlled by Rrp6p; while Pab1p is the major PABP, Nab2p only associates in the absence of Rrp6p. This is because Rrp6p can interact with Nab2p and displace it from pA tails, potentially leading to RNA turnover, as evidenced for certain pre-mRNAs. We suggest that a nuclear mRNP surveillance step involves targeting of Rrp6p by Nab2p-bound pA-tailed RNPs and that pre-mRNA abundance is regulated at this level.
Insights into RNA Biology from an Atlas of Mammalian mRNA-Binding Proteins.
Castello, A., Fischer, B., Eichelbaum, K., Horos, R., Beckmann, B.M., Strein, C., Davey, N.E., Humphreys, D.T., Preiss, T., Steinmetz, L.M., Krijgsveld, J. & Hentze, M.W.
Cell. 2012 Jun 8;149(6):1393-406. Epub 2012 May 31.
RNA-binding proteins (RBPs) determine RNA fate from synthesis to decay. Employing two complementary protocols for covalent UV crosslinking of RBPs to RNA, we describe a systematic, unbiased, and comprehensive approach, termed "interactome capture," to define the mRNA interactome of proliferating human HeLa cells. We identify 860 proteins that qualify as RBPs by biochemical and statistical criteria, adding more than 300 RBPs to those previously known and shedding light on RBPs in disease, RNA-binding enzymes of intermediary metabolism, RNA-binding kinases, and RNA-binding architectures. Unexpectedly, we find that many proteins of the HeLa mRNA interactome are highly intrinsically disordered and enriched in short repetitive amino acid motifs. Interactome capture is broadly applicable to study mRNA interactome composition and dynamics in varied biological settings.
Genome-wide H4 K16 acetylation by SAS-I is deposited independently of transcription and histone exchange.
Heise, F., Chung, H.R., Weber, J.M., Xu, Z., Klein-Hitpass, L., Steinmetz, L.M., Vingron, M. & Ehrenhofer-Murray, A.E.
Nucleic Acids Res. 2012 Jan;40(1):65-74. doi: 10.1093/nar/gkr649. Epub 2011 Sep 9.
The MYST HAT Sas2 is part of the SAS-I complex that acetylates histone H4 lysine 16 (H4 K16Ac) and blocks the propagation of heterochromatin at the telomeres of Saccharomyces cerevisiae. In this study, we investigated Sas2-mediated H4 K16Ac on a genome-wide scale. Interestingly, H4 K16Ac loss in sas2Delta cells outside of the telomeric regions showed a distinctive pattern in that there was a pronounced decrease of H4 K16Ac within the majority of open reading frames (ORFs), but little change in intergenic regions. Furthermore, regions of low histone H3 exchange and low H3 K56 acetylation showed the most pronounced loss of H4 K16Ac in sas2Delta, indicating that Sas2 deposited this modification on chromatin independently of histone exchange. In agreement with the effect of Sas2 within ORFs, sas2Delta caused resistance to 6-azauracil, indicating a positive effect on transcription elongation in the absence of H4 K16Ac. In summary, our data suggest that Sas2-dependent H4 K16Ac is deposited into chromatin independently of transcription and histone exchange, and that it has an inhibitory effect on the ability of PolII to travel through the body of the gene.
Genome-wide polyadenylation site mapping.
Pelechano, V., Wilkening, S., Jarvelin, A.I., Tekkedil, M.M. & Steinmetz, L.M.
Methods Enzymol. 2012;513:271-96. doi: 10.1016/B978-0-12-391938-0.00012-4.
Alternative polyadenylation site usage gives rise to variation in 3' ends of transcripts in diverse organisms ranging from yeast to human. Accurate mapping of polyadenylation sites of transcripts is of major biological importance, since the length of the 3'UTR can have a strong influence on transcript stability, localization, and translation. However, reads generated using total mRNA sequencing mostly lack the very 3' end of transcripts. Here, we present a method that allows simultaneous analysis of alternative 3' ends and transcriptome dynamics at high throughput. By using transcripts produced in vitro, the high precision of end mapping during the protocol can be controlled. This method is illustrated here for budding yeast. However, this method can be applied to any natural or artificially polyadenylated RNA.
Accumulation of noncoding RNA due to an RNase P defect in Saccharomyces cerevisiae.
Marvin, M.C., Clauder-Munster, S., Walker, S.C., Sarkeshik, A., Yates, J.R. 3rd, Steinmetz, L.M. & Engelke, D.R.
RNA. 2011 Aug;17(8):1441-50. doi: 10.1261/rna.2737511. Epub 2011 Jun 10.
Ribonuclease P (RNase P) is an essential endoribonuclease that catalyzes the cleavage of the 5' leader of pre-tRNAs. In addition, a growing number of non-tRNA substrates have been identified in various organisms. RNase P varies in composition, as bacterial RNase P contains a catalytic RNA core and one protein subunit, while eukaryotic nuclear RNase P retains the catalytic RNA but has at least nine protein subunits. The additional eukaryotic protein subunits most likely provide additional functionality to RNase P, with one possibility being additional RNA recognition capabilities. To investigate the possible range of additional RNase P substrates in vivo, a strand-specific, high-density microarray was used to analyze what RNA accumulates with a mutation in the catalytic RNA subunit of nuclear RNase P in Saccharomyces cerevisiae. A wide variety of noncoding RNAs were shown to accumulate, suggesting that nuclear RNase P participates in the turnover of normally unstable nuclear RNAs. In some cases, the accumulated noncoding RNAs were shown to be antisense to transcripts that commensurately decreased in abundance. Pre-mRNAs containing introns also accumulated broadly, consistent with either compromised splicing or failure to efficiently turn over pre-mRNAs that do not enter the splicing pathway. Taken together with the high complexity of the nuclear RNase P holoenzyme and its relatively nonspecific capacity to bind and cleave mixed sequence RNAs, these data suggest that nuclear RNase P facilitates turnover of nuclear RNAs in addition to its role in pre-tRNA biogenesis.
A yeast-based assay identifies drugs active against human mitochondrial disorders.
Couplan, E.*, Aiyar, R.S.*, Kucharczyk, R., Kabala, A., Ezkurdia, N., Gagneur, J., St Onge, R.P., Salin, B., Soubigou, F., Le Cann, M., Steinmetz, L.M., di Rago, J.P. & Blondel, M.
Proc Natl Acad Sci U S A. 2011 Jul 19;108(29):11989-94. doi: 10.1073/pnas.1101478108. Epub 2011 Jun 29.
Due to the lack of relevant animal models, development of effective treatments for human mitochondrial diseases has been limited. Here we establish a rapid, yeast-based assay to screen for drugs active against human inherited mitochondrial diseases affecting ATP synthase, in particular NARP (neuropathy, ataxia, and retinitis pigmentosa) syndrome. This method is based on the conservation of mitochondrial function from yeast to human, on the unique ability of yeast to survive without production of ATP by oxidative phosphorylation, and on the amenability of the yeast mitochondrial genome to site-directed mutagenesis. Our method identifies chlorhexidine by screening a chemical library and oleate through a candidate approach. We show that these molecules rescue a number of phenotypes resulting from mutations affecting ATP synthase in yeast. These compounds are also active on human cybrid cells derived from NARP patients. These results validate our method as an effective high-throughput screening approach to identify drugs active in the treatment of human ATP synthase disorders and suggest that this type of method could be applied to other mitochondrial diseases.
Functional consequences of bidirectional promoters.
Wei, W., Pelechano, V., Jarvelin, A.I. & Steinmetz, L.M.
Trends Genet. 2011 Jul;27(7):267-76. doi: 10.1016/j.tig.2011.04.002. Epub 2011 May 24.
Several studies have shown that promoters of protein-coding genes are origins of pervasive non-coding RNA transcription and can initiate transcription in both directions. However, only recently have researchers begun to elucidate the functional implications of this bidirectionality and non-coding RNA production. Increasing evidence indicates that non-coding transcription at promoters influences the expression of protein-coding genes, revealing a new layer of transcriptional regulation. This regulation acts at multiple levels, from modifying local chromatin to enabling regional signal spreading and more distal regulation. Moreover, the bidirectional activity of a promoter is regulated at multiple points during transcription, giving rise to diverse types of transcripts.
Genome-wide survey of post-meiotic segregation during yeast recombination.
Mancera, E., Bourgon, R., Huber, W. & Steinmetz, L.M.
Genome Biol. 2011 Apr 11;12(4):R36.
BACKGROUND: When mismatches in heteroduplex DNA formed during meiotic recombination are left unrepaired, post-meiotic segregation of the two mismatched alleles occurs during the ensuing round of mitosis. This gives rise to somatic mosaicism in multicellular organisms and leads to unexpected allelic combinations among progeny. Despite its implications for inheritance, post-meiotic segregation has been studied at only a few loci. RESULTS: By genotyping tens of thousands of genetic markers in yeast segregants and their clonal progeny, we analyzed post-meiotic segregation at a genome-wide scale. We show that post-meiotic segregation occurs in close to 10% of recombination events. Although the overall number of markers affected in a single meiosis is small, the rate of post-meiotic segregation is more than five orders of magnitude larger than the base substitution mutation rate. Post-meiotic segregation took place with equal relative frequency in crossovers and non-crossovers, and usually at the edges of gene conversion tracts. Furthermore, post-meiotic segregation tended to occur in markers that are isolated from other heterozygosities and preferentially at polymorphism types that are relatively uncommon in the yeast species. CONCLUSIONS: Overall, our survey reveals the genome-wide characteristics of post-meiotic segregation. The results show that post-meiotic segregation is widespread in meiotic recombination and could be a significant determinant of allelic inheritance and allele frequencies at the population level.
Antisense expression increases gene expression variability and locus interdependency.
Xu, Z., Wei, W., Gagneur, J., Clauder-Munster, S., Smolik, M., Huber, W. & Steinmetz, L.M.
Mol Syst Biol. 2011 Feb 15;7:468. doi: 10.1038/msb.2011.1.
Genome-wide transcription profiling has revealed extensive expression of non-coding RNAs antisense to genes, yet their functions, if any, remain to be understood. In this study, we perform a systematic analysis of sense-antisense expression in response to genetic and environmental changes in yeast. We find that antisense expression is associated with genes of larger expression variability. This is characterized by more 'switching off' at low levels of expression for genes with antisense compared to genes without, yet similar expression at maximal induction. By disrupting antisense transcription, we demonstrate that antisense expression confers an on-off switch on gene regulation for the SUR7 gene. Consistent with this, genes that must respond in a switch-like manner, such as stress-response and environment-specific genes, are enriched for antisense expression. In addition, our data provide evidence that antisense expression initiated from bidirectional promoters enables the spreading of regulatory signals from one locus to neighbouring genes. These results indicate a general regulatory effect of antisense expression on sense genes and emphasize the importance of antisense-initiating regions downstream of genes in models of gene regulation.
Execution of the meiotic noncoding RNA expression program and the onset of gametogenesis in yeast require the conserved exosome subunit Rrp6.
Lardenois, A., Liu, Y., Walther, T., Chalmel, F., Evrard, B., Granovskaia, M., Chu, A., Davis, R.W., Steinmetz, L.M. & Primig, M.
Proc Natl Acad Sci U S A. 2011 Jan 18;108(3):1058-63. doi: 10.1073/pnas.1016459108. Epub 2010 Dec 13.
Budding yeast noncoding RNAs (ncRNAs) are pervasively transcribed during mitosis, and some regulate mitotic protein-coding genes. However, little is known about ncRNA expression during meiotic development. Using high-resolution profiling we identified an extensive meiotic ncRNA expression program interlaced with the protein-coding transcriptome via sense/antisense transcript pairs, bidirectional promoters, and ncRNAs that overlap the regulatory regions of genes. Meiotic unannotated transcripts (MUTs) are mitotic targets of the conserved exosome component Rrp6, which itself is degraded after the onset of meiosis when MUTs and other ncRNAs accumulate in successive waves. Diploid cells lacking Rrp6 fail to initiate premeiotic DNA replication normally and cannot undergo efficient meiotic development. The present study demonstrates a unique function for budding yeast Rrp6 in degrading distinct classes of meiotically induced ncRNAs during vegetative growth and the onset of meiosis and thus points to a critical role of differential ncRNA expression in the execution of a conserved developmental program.
Yeast Sen1 helicase protects the genome from transcription-associated instability.
Mischo, H.E., Gomez-Gonzalez, B., Grzechnik, P., Rondon, A.G., Wei, W., Steinmetz, L.M., Aguilera, A. & Proudfoot, N.J.
Mol Cell. 2011 Jan 7;41(1):21-32.
Sen1 of S. cerevisiae is a known component of the NRD complex implicated in transcription termination of nonpolyadenylated as well as some polyadenylated RNA polymerase II transcripts. We now show that Sen1 helicase possesses a wider function by restricting the occurrence of RNA:DNA hybrids that may naturally form during transcription, when nascent RNA hybridizes to DNA prior to its packaging into RNA protein complexes. These hybrids displace the nontranscribed strand and create R loop structures. Loss of Sen1 results in transient R loop accumulation and so elicits transcription-associated recombination. SEN1 genetically interacts with DNA repair genes, suggesting that R loop resolution requires proteins involved in homologous recombination. Based on these findings, we propose that R loop formation is a frequent event during transcription and a key function of Sen1 is to prevent their accumulation and associated genome instability.
Genome-wide transcriptome analysis in yeast using high-density tiling arrays.
David, L., Clauder-Munster, S. & Steinmetz, L.M.
Methods Mol Biol. 2011;759:107-23.
In the last decade, it became clear that transcription goes far beyond that of protein-coding genes. Most RNA molecules are transcribed from intergenic regions or introns and exhibit much variability in size, expression level, secondary structure, and evolutionary conservation. While for several types of non-coding RNAs some cellular functions have been reported, like for micro-RNAs and small nucleolar RNAs, for most others no indications of function or regulation have so far been found. Therefore, the RNA population inside a cell is diverse and cryptic and, thus, demands powerful methods to study its composition, abundance, and structure. DNA oligonucleotide microarrays have proven to be of great utility to study transcription of genes in various organisms. Recently, due to advancement in microarray technology, tiling microarrays that extend transcription measurement to genomic regions beyond protein-coding genes were designed for several species. The Saccharomyces cerevisiae yeast tiling array contains overlapping probes across the full genomic sequence, with consecutive probes starting every 8 bp on average on each strand, enabling strand-specific measurement of transcription from a full eukaryotic genome. Here, we describe the methods used to extract yeast RNA, convert it into first-strand cDNA, fragment, and label it for hybridization to the tiling array. This protocol will enable researchers not only to study which genes are expressed and to what levels, but also to identify non-coding RNAs and to study the structure of transcripts including their untranslated regions, alternative start, stop, and processing sites. This information will allow understanding their roles inside cells.
The baker's yeast diploid genome is remarkably stable in vegetative growth and meiosis.
Nishant, K.T., Wei, W., Mancera, E., Argueso, J.L., Schlattl, A., Delhomme, N., Ma, X., Bustamante, C.D., Korbel, J.O., Gu, Z., Steinmetz, L.M. & Alani, E.
PLoS Genet. 2010 Sep 9;6(9):e1001109. doi: 10.1371/journal.pgen.1001109.
Accurate estimates of mutation rates provide critical information to analyze genome evolution and organism fitness. We used whole-genome DNA sequencing, pulsed-field gel electrophoresis, and comparative genome hybridization to determine mutation rates in diploid vegetative and meiotic mutation accumulation lines of Saccharomyces cerevisiae. The vegetative lines underwent only mitotic divisions while the meiotic lines underwent a meiotic cycle every approximately 20 vegetative divisions. Similar base substitution rates were estimated for both lines. Given our experimental design, these measures indicated that the meiotic mutation rate is within the range of being equal to zero to being 55-fold higher than the vegetative rate. Mutations detected in vegetative lines were all heterozygous while those in meiotic lines were homozygous. A quantitative analysis of intra-tetrad mating events in the meiotic lines showed that inter-spore mating is primarily responsible for rapidly fixing mutations to homozygosity as well as for removing mutations. We did not observe 1-2 nt insertion/deletion (in-del) mutations in any of the sequenced lines and only one structural variant in a non-telomeric location was found. However, a large number of structural variations in subtelomeric sequences were seen in both vegetative and meiotic lines that did not affect viability. Our results indicate that the diploid yeast nuclear genome is remarkably stable during the vegetative and meiotic cell cycles and support the hypothesis that peripheral regions of chromosomes are more dynamic than gene-rich central sections where structural rearrangements could be deleterious. This work also provides an improved estimate for the mutational load carried by diploid organisms.
Antagonistic changes in sensitivity to antifungal drugs by mutations of an important ABC transporter gene in a fungal pathogen.
Guan, W., Jiang, H., Guo, X., Mancera, E., Xu, L., Li, Y., Steinmetz, L.M., Li, Y. & Gu, Z.
PLoS One. 2010 Jun 25;5(6):e11309.
Fungal pathogens can be lethal, especially among immunocompromised populations, such as patients with AIDS and recipients of tissue transplantation or chemotherapy. Prolonged usage of antifungal reagents can lead to drug resistance and treatment failure. Understanding mechanisms that underlie drug resistance by pathogenic microorganisms is thus vital for dealing with this emerging issue. In this study, we show that dramatic sequence changes in PDR5, an ABC (ATP-binding cassette) efflux transporter protein gene in an opportunistic fungal pathogen, caused the organism to become hypersensitive to azole, a widely used antifungal drug. Surprisingly, the same mutations conferred growth advantages to the organism on polyenes, which are also commonly used antimycotics. Our results indicate that Pdr5p might be important for ergosterol homeostasis. The observed remarkable sequence divergence in the PDR5 gene in yeast strain YJM789 may represent an interesting case of adaptive loss of gene function with significant clinical implications.
Genetic analysis of variation in transcription factor binding in yeast.
Zheng, W., Zhao, H., Mancera, E., Steinmetz, L.M. & Snyder, M.
Nature. 2010 Apr 22;464(7292):1187-91. doi: 10.1038/nature08934. Epub 2010 Mar 17.
Variation in transcriptional regulation is thought to be a major cause of phenotypic diversity. Although widespread differences in gene expression among individuals of a species have been observed, studies to examine the variability of transcription factor binding on a global scale have not been performed, and thus the extent and underlying genetic basis of transcription factor binding diversity is unknown. By mapping differences in transcription factor binding among individuals, here we present the genetic basis of such variation on a genome-wide scale. Whole-genome Ste12-binding profiles were determined using chromatin immunoprecipitation coupled with DNA sequencing in pheromone-treated cells of 43 segregants of a cross between two highly diverged yeast strains and their parental lines. We identified extensive Ste12-binding variation among individuals, and mapped underlying cis- and trans-acting loci responsible for such variation. We showed that most transcription factor binding variation is cis-linked, and that many variations are associated with polymorphisms residing in the binding motifs of Ste12 as well as those of several proposed Ste12 cofactors. We also identified two trans-factors, AMN1 and FLO8, that modulate Ste12 binding to promoters of more than ten genes under alpha-factor treatment. Neither of these two genes was previously known to regulate Ste12, and we suggest that they may be mediators of gene activity and phenotypic diversity. Ste12 binding strongly correlates with gene expression for more than 200 genes, indicating that binding variation is functional. Many of the variable-bound genes are involved in cell wall organization and biogenesis. Overall, these studies identified genetic regulators of molecular diversity among individuals and provide new insights into mechanisms of gene regulation.
Natural single-nucleosome epi-polymorphisms in yeast.
Nagarajan, M., Veyrieras, J.B., de Dieuleveult, M., Bottin, H., Fehrmann, S., Abraham, A.L., Croze, S., Steinmetz, L.M., Gidrol, X. & Yvert, G.
PLoS Genet. 2010 Apr 22;6(4):e1000913.
Epigenomes commonly refer to the sequence of presence/absence of specific epigenetic marks along eukaryotic chromatin. Complete histone-borne epigenomes have now been described at single-nucleosome resolution from various organisms, tissues, developmental stages, or diseases, yet their intra-species natural variation has never been investigated. We describe here that the epigenomic sequence of histone H3 acetylation at Lysine 14 (H3K14ac) differs greatly between two unrelated strains of the yeast Saccharomyces cerevisiae. Using single-nucleosome chromatin immunoprecipitation and mapping, we interrogated 58,694 nucleosomes and found that 5,442 of them differed in their level of H3K14 acetylation, at a false discovery rate (FDR) of 0.0001. These Single Nucleosome Epi-Polymorphisms (SNEPs) were enriched at regulatory sites and conserved non-coding DNA sequences. Surprisingly, higher acetylation in one strain did not imply higher expression of the relevant gene. However, SNEPs were enriched in genes of high transcriptional variability and one SNEP was associated with the strength of gene activation upon stimulation. Our observations suggest a high level of inter-individual epigenomic variation in natural populations, with essential questions on the origin of this diversity and its relevance to gene x environment interactions.
High-resolution transcription atlas of the mitotic cell cycle in budding yeast.
Granovskaia, M.V., Jensen, L.J., Ritchie, M.E., Toedling, J., Ning, Y., Bork, P., Huber, W. & Steinmetz, L.M.
Genome Biol. 2010 Mar 1;11(3):R24.
BACKGROUND: Extensive transcription of non-coding RNAs has been detected in eukaryotic genomes and is thought to constitute an additional layer in the regulation of gene expression. Despite this role, their transcription through the cell cycle has not been studied; genome-wide approaches have only focused on protein-coding genes. To explore the complex transcriptome architecture underlying the budding yeast cell cycle, we used 8 bp tiling arrays to generate a 5 minute-resolution, strand-specific expression atlas of the whole genome. RESULTS: We discovered 523 antisense transcripts, of which 80 cycle or are located opposite periodically expressed mRNAs, 135 unannotated intergenic non-coding RNAs, of which 11 cycle, and 109 cell-cycle-regulated protein-coding genes that had not previously been shown to cycle. We detected periodic expression coupling of sense and antisense transcript pairs, including antisense transcripts opposite of key cell-cycle regulators, like FAR1 and TAF2. CONCLUSIONS: Our dataset presents the most comprehensive resource to date on gene expression during the budding yeast cell cycle, revealing both protein-coding and non-coding RNA periodicity of expression and the first that profiles non-annotated RNAs. It enables hypothesis-driven mechanistic studies concerning the functions of non-coding RNAs.
Dissecting the genetic basis of resistance to malaria parasites in Anopheles gambiae.
Blandin, S.A., Wang-Sattler, R., Lamacchia, M., Gagneur, J., Lycett, G., Ning, Y., Levashina, E.A. & Steinmetz, L.M.
Science. 2009 Oct 2;326(5949):147-50. doi: 10.1126/science.1175241.
The ability of Anopheles gambiae mosquitoes to transmit Plasmodium parasites is highly variable between individuals. However, the genetic basis of this variability has remained unknown. We combined genome-wide mapping and reciprocal allele-specific RNA interference (rasRNAi) to identify the genomic locus that confers resistance to malaria parasites and demonstrated that polymorphisms in a single gene encoding the antiparasitic thioester-containing protein 1 (TEP1) explain a substantial part of the variability in parasite killing. The link between TEP1 alleles and resistance to malaria may offer new tools for controlling malaria transmission. The successful application of rasRNAi in Anopheles suggests that it could also be applied to other organisms where RNAi is feasible to dissect complex phenotypes to the level of individual quantitative trait alleles.
Trans-acting antisense RNAs mediate transcriptional gene cosuppression in S. cerevisiae.
Camblong, J., Beyrouthy, N., Guffanti, E., Schlaepfer, G., Steinmetz, L.M. & Stutz, F.
Genes Dev. 2009 Jul 1;23(13):1534-45.
Homology-dependent gene silencing, a phenomenon described as cosuppression in plants, depends on siRNAs. We provide evidence that in Saccharomyces cerevisiae, which is missing the RNAi machinery, protein coding gene cosuppression exists. Indeed, introduction of an additional copy of PHO84 on a plasmid or within the genome results in the cosilencing of both the transgene and the endogenous gene. This repression is transcriptional and position-independent and requires trans-acting antisense RNAs. Antisense RNAs induce transcriptional gene silencing both in cis and in trans, and the two pathways differ by the implication of the Hda1/2/3 complex. We also show that trans-silencing is influenced by the Set1 histone methyltransferase, which promotes antisense RNA production. Finally, we show that although antisense-mediated cis-silencing occurs in other genes, trans-silencing so far depends on features specific to PHO84. Altogether, our data highlight the importance of noncoding RNAs in mediating RNAi-independent transcriptional gene silencing.
Genome-wide allele- and strand-specific expression profiling.
Gagneur, J., Sinha, H., Perocchi, F., Bourgon, R., Huber, W. & Steinmetz, L.M.
Mol Syst Biol. 2009;5:274. Epub 2009 Jun 16.
Recent reports have shown that most of the genome is transcribed and that transcription frequently occurs concurrently on both DNA strands. In diploid genomes, the expression level of each allele conditions the degree to which sequence polymorphisms affect the phenotype. It is thus essential to quantify expression in an allele- and strand-specific manner. Using a custom-designed tiling array and a new computational approach, we piloted measuring allele- and strand-specific expression in yeast. Confident quantitative estimates of allele-specific expression were obtained for about half of the coding and non-coding transcripts of a heterozygous yeast strain, of which 371 transcripts (13%) showed significant allelic differential expression greater than 1.5-fold. The data revealed complex allelic differential expression on opposite strands. Furthermore, combining allele-specific expression with linkage mapping enabled identifying allelic variants that act in cis and in trans to regulate allelic expression in the heterozygous strain. Our results provide the first high-resolution analysis of differential expression on all four strands of a eukaryotic genome.
Array-based genotyping in S. cerevisiae using semi-supervised clustering.
Bourgon, R., Mancera, E., Brozzi, A., Steinmetz, L.M. & Huber, W.
Bioinformatics. 2009 Apr 15;25(8):1056-62. doi: 10.1093/bioinformatics/btp104. Epub 2009 Feb 23.
MOTIVATION: Microarrays provide an accurate and cost-effective method for genotyping large numbers of individuals at high resolution. The resulting data permit the identification of loci at which genetic variation is associated with quantitative traits, or fine mapping of meiotic recombination, which is a key determinant of genetic diversity among individuals. Several issues inherent to short oligonucleotide arrays -- cross-hybridization, or variability in probe response to target -- have the potential to produce genotyping errors. There is a need for improved statistical methods for array-based genotyping. RESULTS: We developed ssGenotyping (ssG), a multivariate, semi-supervised approach for using microarrays to genotype haploid individuals at thousands of polymorphic sites. Using a meiotic recombination dataset, we show that ssG is more accurate than existing supervised classification methods, and that it produces denser marker coverage. The ssG algorithm is able to fit probe-specific affinity differences and to detect and filter spurious signal, permitting high-confidence genotyping at nucleotide resolution. We also demonstrate that oligonucleotide probe response depends significantly on genomic background, even when the probe's specific target sequence is unchanged. As a result, supervised classifiers trained on reference strains may not generalize well to diverged strains; ssG's semi-supervised approach, on the other hand, adapts automatically.
Bidirectional promoters generate pervasive transcription in yeast.
Xu, Z., Wei, W., Gagneur, J., Perocchi, F., Clauder-Munster, S., Camblong, J., Guffanti, E., Stutz, F., Huber, W. & Steinmetz, L.M.
Nature. 2009 Feb 19;457(7232):1033-7. Epub 2009 Jan 25.
Genome-wide pervasive transcription has been reported in many eukaryotic organisms, revealing a highly interleaved transcriptome organization that involves hundreds of previously unknown non-coding RNAs. These recently identified transcripts either exist stably in cells (stable unannotated transcripts, SUTs) or are rapidly degraded by the RNA surveillance pathway (cryptic unstable transcripts, CUTs). One characteristic of pervasive transcription is the extensive overlap of SUTs and CUTs with previously annotated features, which prompts questions regarding how these transcripts are generated, and whether they exert function. Single-gene studies have shown that transcription of SUTs and CUTs can be functional, through mechanisms involving the generated RNAs or their generation itself. So far, a complete transcriptome architecture including SUTs and CUTs has not been described in any organism. Knowledge about the position and genome-wide arrangement of these transcripts will be instrumental in understanding their function. Here we provide a comprehensive analysis of these transcripts in the context of multiple conditions, a mutant of the exosome machinery and different strain backgrounds of Saccharomyces cerevisiae. We show that both SUTs and CUTs display distinct patterns of distribution at specific locations. Most of the newly identified transcripts initiate from nucleosome-free regions (NFRs) associated with the promoters of other transcripts (mostly protein-coding genes), or from NFRs at the 3' ends of protein-coding genes. Likewise, about half of all coding transcripts initiate from NFRs associated with promoters of other transcripts. These data change our view of how a genome is transcribed, indicating that bidirectionality is an inherent feature of promoters. Such an arrangement of divergent and overlapping transcripts may provide a mechanism for local spreading of regulatory signals, that is, coupling the transcriptional regulation of neighbouring genes by means of transcriptional interference or histone modification.
Widespread bidirectional promoters are the major source of cryptic transcripts in yeast.
Neil, H., Malabat, C., d'Aubenton-Carafa, Y., Xu, Z., Steinmetz, L.M. & Jacquier, A.
Nature. 2009 Feb 19;457(7232):1038-42. Epub 2009 Jan 25.
Pervasive and hidden transcription is widespread in eukaryotes, but its global level, the mechanisms from which it originates and its functional significance are unclear. Cryptic unstable transcripts (CUTs) were recently described as a principal class of RNA polymerase II transcripts in Saccharomyces cerevisiae. These transcripts are targeted for degradation immediately after synthesis by the action of the Nrd1-exosome-TRAMP complexes. Although CUT degradation mechanisms have been analysed in detail, the genome-wide distribution at the nucleotide resolution and the prevalence of CUTs are unknown. Here we report the first high-resolution genomic map of CUTs in yeast, revealing a class of potentially functional CUTs and the intrinsic bidirectional nature of eukaryotic promoters. An RNA fraction highly enriched in CUTs was analysed by a 3' Long-SAGE (serial analysis of gene expression) approach adapted to deep sequencing. The resulting detailed genomic map of CUTs revealed that they derive from extremely widespread and very well defined transcription units and do not result from unspecific transcriptional noise. Moreover, the transcription of CUTs predominantly arises within nucleosome-free regions, most of which correspond to promoter regions of bona fide genes. Some of the CUTs start upstream from messenger RNAs and overlap their 5' end. Our study of glycolysis genes, as well as recent results from the literature, indicate that such concurrent transcription is potentially associated with regulatory mechanisms. Our data reveal numerous new CUTs with such a potential regulatory role. However, most of the identified CUTs corresponded to transcripts divergent from the promoter regions of genes, indicating that they represent by-products of divergent transcription occurring at many and possibly most promoters. Eukaryotic promoter regions are thus intrinsically bidirectional, a fundamental property that escaped previous analyses because in most cases divergent transcription generates short-lived unstable transcripts present at very low steady-state levels.
Identification of mitochondrial disease genes through integrative analysis of multiple datasets.
Aiyar, R.S., Gagneur, J. & Steinmetz, L.M.
Methods. 2008 Dec;46(4):248-55. Epub 2008 Oct 16.
Determining the genetic factors in a disease is crucial to elucidating its molecular basis. This task is challenging due to a lack of information on gene function. The integration of large-scale functional genomics data has proven to be an effective strategy to prioritize candidate disease genes. Mitochondrial disorders are a prevalent and heterogeneous class of diseases that are particularly amenable to this approach. Here we explain the application of integrative approaches to the identification of mitochondrial disease genes. We first examine various datasets that can be used to evaluate the involvement of each gene in mitochondrial function. The data integration methodology is then described, accompanied by examples of common implementations. Finally, we discuss how gene networks are constructed using integrative techniques and applied to candidate gene prioritization. Relevant public data resources are indicated. This report highlights the success and potential of data integration as well as its applicability to the search for mitochondrial disease genes.
Sequential elimination of major-effect contributors identifies additional quantitative trait loci conditioning high-temperature growth in yeast.
Sinha, H., David, L., Pascon, R.C., Clauder-Munster, S., Krishnakumar, S., Nguyen, M., Shi, G., Dean, J., Davis, R.W., Oefner, P.J., McCusker, J.H. & Steinmetz, L.M.
Genetics. 2008 Nov;180(3):1661-70. Epub 2008 Sep 9.
Several quantitative trait loci (QTL) mapping strategies can successfully identify major-effect loci, but often have poor success detecting loci with minor effects, potentially due to the confounding effects of major loci, epistasis, and limited sample sizes. To overcome such difficulties, we used a targeted backcross mapping strategy that genetically eliminated the effect of a previously identified major QTL underlying high-temperature growth (Htg) in yeast. This strategy facilitated the mapping of three novel QTL contributing to Htg of a clinically derived yeast strain. One QTL, which is linked to the previously identified major-effect QTL, was dissected, and NCS2 was identified as the causative gene. The interaction of the NCS2 QTL with the first major-effect QTL was background dependent, revealing a complex QTL architecture spanning these two linked loci. Such complex architecture suggests that more genes than can be predicted are likely to contribute to quantitative traits. The targeted backcrossing approach overcomes the difficulties posed by sample size, genetic linkage, and epistatic effects and facilitates identification of additional alleles with smaller contributions to complex traits.
High-resolution mapping of meiotic crossovers and non-crossovers in yeast.
Mancera, E., Bourgon, R., Brozzi, A., Huber, W. & Steinmetz, L.M.
Nature. 2008 Jul 24;454(7203):479-85. Epub 2008 Jul 9.
Meiotic recombination has a central role in the evolution of sexually reproducing organisms. The two recombination outcomes, crossover and non-crossover, increase genetic diversity, but have the potential to homogenize alleles by gene conversion. Whereas crossover rates vary considerably across the genome, non-crossovers and gene conversions have only been identified in a handful of loci. To examine recombination genome wide and at high spatial resolution, we generated maps of crossovers, crossover-associated gene conversion and non-crossover gene conversion using dense genetic marker data collected from all four products of fifty-six yeast (Saccharomyces cerevisiae) meioses. Our maps reveal differences in the distributions of crossovers and non-crossovers, showing more regions where either crossovers or non-crossovers are favoured than expected by chance. Furthermore, we detect evidence for interference between crossovers and non-crossovers, a phenomenon previously only known to occur between crossovers. Up to 1% of the genome of each meiotic product is subject to gene conversion in a single meiosis, with detectable bias towards GC nucleotides. To our knowledge the maps represent the first high-resolution, genome-wide characterization of the multiple outcomes of recombination in any organism. In addition, because non-crossover hotspots create holes of reduced linkage within haplotype blocks, our results stress the need to incorporate non-crossovers into genetic linkage analysis.
Systematic screens for human disease genes, from yeast to human and back.
Perocchi, F., Mancera, E. & Steinmetz, L.M.
Mol Biosyst. 2008 Jan;4(1):18-29. Epub 2007 Sep 25.
Systematic screens for human disease genes have emerged in recent years, due to the wealth of information provided by genome sequences and large scale datasets. Here we review how integration of genomic data in yeast and human is helping to elucidate the genetic basis of mitochondrial diseases. The identification of nearly all yeast mitochondrial proteins and many of their functional interactions provides insight into the role of mitochondria in cellular processes. This information enables prioritization of the candidate genes underlying mitochondrial disorders. In an iterative fashion, the link between predicted human candidate genes and their disease phenotypes can be experimentally tested back in yeast.
Antisense artifacts in transcriptome microarray experiments are resolved by actinomycin D.
Perocchi, F., Xu, Z., Clauder-Munster, S. & Steinmetz, L.M.
Nucleic Acids Res. 2007;35(19):e128. Epub 2007 Sep 26.
Recent transcription profiling studies have revealed an unanticipatedly large proportion of antisense transcription across eukaryotic and bacterial genomes. However, the extent and significance of antisense transcripts is controversial partly because experimental artifacts are suspected. Here, we present a method to generate clean genome-wide transcriptome profiles, using actinomycin D (ActD) during reverse transcription. We show that antisense artifacts appear to be triggered by spurious synthesis of second-strand cDNA during reverse transcription reactions. Strand-specific hybridization signals obtained from Saccharomyces cerevisiae tiling arrays were compared between samples prepared with and without ActD. Use of ActD removed about half of the detectable antisense transcripts, consistent with their being artifacts, while sense expression levels and about 200 antisense transcripts were not affected. Our findings thus facilitate a more accurate assessment of the extent and position of antisense transcription, towards a better understanding of its role in cells.
Genome sequencing and comparative analysis of Saccharomyces cerevisiae strain YJM789.
Wei, W., McCusker, J.H., Hyman, R.W., Jones, T., Ning, Y., Cao, Z., Gu, Z., Bruno, D., Miranda, M., Nguyen, M., Wilhelmy, J., Komp, C., Tamse, R., Wang, X., Jia, P., Luedi, P., Oefner, P.J., David, L., Dietrich, F.S., Li, Y., Davis, R.W. & Steinmetz, L.M.
Proc Natl Acad Sci U S A. 2007 Jul 31;104(31):12825-30. Epub 2007 Jul 25.
We sequenced the genome of Saccharomyces cerevisiae strain YJM789, which was derived from a yeast isolated from the lung of an AIDS patient with pneumonia. The strain is used for studies of fungal infections and quantitative genetics because of its extensive phenotypic differences to the laboratory reference strain, including growth at high temperature and deadly virulence in mouse models. Here we show that the approximately 12-Mb genome of YJM789 contains approximately 60,000 SNPs and approximately 6,000 indels with respect to the reference S288c genome, leading to protein polymorphisms with a few known cases of phenotypic changes. Several ORFs are found to be unique to YJM789, some of which might have been acquired through horizontal transfer. Localized regions of high polymorphism density are scattered over the genome, in some cases spanning multiple ORFs and in others concentrated within single genes. The sequence of YJM789 contains clues to pathogenicity and spurs the development of more powerful approaches to dissecting the genetic basis of complex hereditary traits.
Mosaic Genome Architecture of the Anopheles gambiae Species Complex.
Wang-Sattler, R., Blandin, S., Ning, Y., Blass, C., Dolo, G., Toure, Y.T., Torre, A.D., Lanzaro, G.C., Steinmetz, L.M., Kafatos, F.C. & Zheng, L.
PLoS One. 2007 Nov 28;2(11):e1249.
BACKGROUND: Attempts over the last three decades to reconstruct the phylogenetic history of the Anopheles gambiae species complex have been important for developing better strategies to control malaria transmission. METHODOLOGY: We used fingerprint genotyping data from 414 field-collected female mosquitoes at 42 microsatellite loci to infer the evolutionary relationships of four species in the A. gambiae complex, the two major malaria vectors A. gambiae sensu stricto (A. gambiae s.s.) and A. arabiensis, as well as two minor vectors, A. merus and A. melas. PRINCIPAL FINDINGS: We identify six taxonomic units, including a clear separation of West and East Africa A. gambiae s.s. S molecular forms. We show that the phylogenetic relationships vary widely between different genomic regions, thus demonstrating the mosaic nature of the genome of these species. The two major malaria vectors are closely related and closer to A. merus than to A. melas at the genome-wide level, which is also true if only autosomes are considered. However, within the Xag inversion region of the X chromosome, the M and two S molecular forms are most similar to A. merus. Near the X centromere, outside the Xag region, the two S forms are highly dissimilar to the other taxa. Furthermore, our data suggest that the centromeric region of chromosome 3 is a strong discriminator between the major and minor malaria vectors. CONCLUSIONS: Although further studies are needed to elucidate the basis of the phylogenetic variation among the different regions of the genome, the preponderance of sympatric admixtures among taxa strongly favor introgression of different genomic regions between species, rather than lineage sorting of ancestral polymorphism, as a possible mechanism.
Assessing systems properties of yeast mitochondria through an interaction map of the organelle.
Perocchi, F., Jensen, L.J., Gagneur, J., Ahting, U., von Mering, C., Bork, P., Prokisch, H. & Steinmetz, L.M.
PLoS Genet. 2006 Oct 20;2(10):e170.
Mitochondria carry out specialized functions; compartmentalized, yet integrated into the metabolic and signaling processes of the cell. Although many mitochondrial proteins have been identified, understanding their functional interrelationships has been a challenge. Here we construct a comprehensive network of the mitochondrial system. We integrated genome-wide datasets to generate an accurate and inclusive mitochondrial parts list. Together with benchmarked measures of protein interactions, a network of mitochondria was constructed in their cellular context, including extra-mitochondrial proteins. This network also integrates data from different organisms to expand the known mitochondrial biology beyond the information in the existing databases. Our network brings together annotated and predicted functions into a single framework. This enabled, for the entire system, a survey of mutant phenotypes, gene regulation, evolution, and disease susceptibility. Furthermore, we experimentally validated the localization of several candidate proteins and derived novel functional contexts for hundreds of uncharacterized proteins. Our network thus advances the understanding of the mitochondrial system in yeast and identifies properties of genes underlying human mitochondrial disorders.
Transcript mapping with high-density oligonucleotide tiling arrays.
Huber, W., Toedling, J. & Steinmetz, L.M.
Bioinformatics. 2006 Aug 15;22(16):1963-70. Epub 2006 Jun 20.
MOTIVATION: High-density DNA tiling microarrays are a powerful tool for the characterization of complete transcriptomes. The two major analytical challenges are the segmentation of the hybridization signal along genomic coordinates to accurately determine transcript boundaries and the adjustment of the sequence-dependent response of the oligonucleotide probes to achieve quantitative comparability of the signal between different probes. RESULTS: We describe a dynamic programming algorithm for finding a globally optimal fit of a piecewise constant expression profile along genomic coordinates. We developed a probe-specific background correction and scaling method that employs empirical probe response parameters determined from reference hybridizations with no need for paired mismatch probes. This combined analysis approach allows the accurate determination of dynamical changes in transcription architectures from hybridization data and will help to study the biological significance of complex transcriptional phenomena in eukaryotic genomes. AVAILABILITY: R package tilingArray at http://www.bioconductor.org.
Capturing cellular machines by systematic screens of protein complexes.
Gagneur, J., David, L. & Steinmetz, L.M.
Trends Microbiol. 2006 Aug;14(8):336-9. Epub 2006 Jun 16.
Two recent studies have provided the most complete screen for protein complexes in yeast to date, in which partners were identified for approximately half of the proteome. A comparison shows that these two datasets are complementary. In addition, one of the analyses points to a modular organization of the cellular protein network. These data will prove useful in defining principles and trends that arise when combining large-scale datasets of different natures, and in deriving properties of protein machines in cellular systems.
Complex genetic interactions in a quantitative trait locus.
Sinha, H., Nicholson, B.P., Steinmetz, L.M. & McCusker, J.H.
PLoS Genet. 2006 Feb 3;2(2):e13.
Whether in natural populations or between two unrelated members of a species, most phenotypic variation is quantitative. To analyze such quantitative traits, one must first map the underlying quantitative trait loci. Next, and far more difficult, one must identify the quantitative trait genes (QTGs), characterize QTG interactions, and identify the phenotypically relevant polymorphisms to determine how QTGs contribute to phenotype. In this work, we analyzed three Saccharomyces cerevisiae high-temperature growth (Htg) QTGs (MKT1, END3, and RHO2). We observed a high level of genetic interactions among QTGs and strain background. Interestingly, while the MKT1 and END3 coding polymorphisms contribute to phenotype, it is the RHO2 3'UTR polymorphisms that are phenotypically relevant. Reciprocal hemizygosity analysis of the Htg QTGs in hybrids between S288c and ten unrelated S. cerevisiae strains reveals that the contributions of the Htg QTGs are not conserved in nine other hybrids, which has implications for QTG identification by marker-trait association. Our findings demonstrate the variety and complexity of QTG contributions to phenotype, the impact of genetic background, and the value of quantitative genetic studies in S. cerevisiae.
A high-resolution map of transcription in the yeast genome.
David, L., Huber, W., Granovskaia, M., Toedling, J., Palm, C.J., Bofkin, L., Jones, T., Davis, R.W. & Steinmetz, L.M.
Proc Natl Acad Sci U S A. 2006 Apr 4;103(14):5320-5. Epub 2006 Mar 28.
There is abundant transcription from eukaryotic genomes unaccounted for by protein coding genes. A high-resolution genome-wide survey of transcription in a well annotated genome will help relate transcriptional complexity to function. By quantifying RNA expression on both strands of the complete genome of Saccharomyces cerevisiae using a high-density oligonucleotide tiling array, this study identifies the boundary, structure, and level of coding and noncoding transcripts. A total of 85% of the genome is expressed in rich media. Apart from expected transcripts, we found operon-like transcripts, transcripts from neighboring genes not separated by intergenic regions, and genes with complex transcriptional architecture where different parts of the same gene are expressed at different levels. We mapped the positions of 3' and 5' UTRs of coding genes and identified hundreds of RNA transcripts distinct from annotated genes. These nonannotated transcripts, on average, have lower sequence conservation and lower rates of deletion phenotype than protein coding genes. Many other transcripts overlap known genes in antisense orientation, and for these pairs global correlations were discovered: UTR lengths correlated with gene function, localization, and requirements for regulation; antisense transcripts overlapped 3' UTRs more than 5' UTRs; UTRs with overlapping antisense tended to be longer; and the presence of antisense associated with gene function. These findings may suggest a regulatory role of antisense transcription in S. cerevisiae. Moreover, the data show that even this well studied genome has transcriptional complexity far beyond current annotation.
Re-analysis of data and its integration.
Jensen, L.J. & Steinmetz, L.M.
FEBS Lett 2005 Mar 21;579(8):1802-7.
To understand a biological process, it is clear that a single approach will not be sufficient, just as a single measurement on a protein, such as its expression level, does not describe protein function. Using reference sets of proteins as benchmarks, different approaches can be scaled and integrated. Here, we demonstrate the power of data re-analysis and integration by applying it in a case study to data from deletion phenotype screens and mRNA expression profiling.
Elevated evolutionary rates in the laboratory strain of Saccharomyces cerevisiae.
Gu, Z., David, L., Petrov, D., Jones, T., Davis, R.W. & Steinmetz, L.M.
Proc Natl Acad Sci U S A 2005 Jan 25;102(4):1092-7. Epub 2005 Jan 12.
By using the maximum likelihood method, we made a genome-wide comparison of the evolutionary rates in the lineages leading to the laboratory strain (S288c) and a wild strain (YJM789) of Saccharomyces cerevisiae and found that genes in the laboratory strain tend to evolve faster than in the wild strain. The pattern of elevated evolution suggests that relaxation of selection intensity is the dominant underlying reason, which is consistent with recurrent bottlenecks in the S. cerevisiae laboratory strain population. Supporting this conclusion are the following observations: (i) the increases in nonsynonymous evolutionary rate occur for genes in all functional categories; (ii) most of the synonymous evolutionary rate increases in S288c occur in genes with strong codon usage bias; (iii) genes under stronger negative selection have a larger increase in nonsynonymous evolutionary rate; and (iv) more genes with adaptive evolution were detected in the laboratory strain, but they do not account for the majority of the increased evolution. The present discoveries suggest that experimental and possible industrial manipulations of the laboratory strain of yeast could have had a strong effect on the genetic makeup of this model organism. Furthermore, they imply an evolution of laboratory model organisms away from their wild counterparts, questioning the relevancy of the models especially when extensive laboratory cultivation has occurred. In addition, these results shed light on the evolution of livestock and crop species that have been under human domestication for years.
Integrative analysis of the mitochondrial proteome in yeast.
Prokisch, H., Scharfe, C., Camp, D.G., 2nd, Xiao, W., David, L., Andreoli, C., Monroe, M.E., Moore, R.J., Gritsenko, M.A., Kozany, C., Hixson, K.K., Mottaz, H.M., Zischka, H., Ueffing, M., Herman, Z.S., Davis, R.W., Meitinger, T., Oefner, P.J., Smith, R.D. & Steinmetz, L.M.
PLoS Biol 2004 Jun;2(6):e160. Epub 2004 Jun 15.
In this study yeast mitochondria were used as a model system to apply, evaluate, and integrate different genomic approaches to define the proteins of an organelle. Liquid chromatography mass spectrometry applied to purified mitochondria identified 546 proteins. By expression analysis and comparison to other proteome studies, we demonstrate that the proteomic approach identifies primarily highly abundant proteins. By expanding our evaluation to other types of genomic approaches, including systematic deletion phenotype screening, expression profiling, subcellular localization studies, protein interaction analyses, and computational predictions, we show that an integration of approaches moves beyond the limitations of any single approach. We report the success of each approach by benchmarking it against a reference set of known mitochondrial proteins, and predict approximately 700 proteins associated with the mitochondrial organelle from the integration of 22 datasets. We show that a combination of complementary approaches like deletion phenotype screening and mass spectrometry can identify over 75% of the known mitochondrial proteome. These findings have implications for choosing optimal genome-wide approaches for the study of other cellular systems, including organelles and pathways in various species. Furthermore, our systematic identification of genes involved in mitochondrial function and biogenesis in yeast expands the candidate genes available for mapping Mendelian and complex mitochondrial disorders in humans.
Maximizing the potential of functional genomics.
Steinmetz, L.M. & Davis, R.W.
Nat Rev Genet. 2004 Mar;5(3):190-201.
Role of duplicate genes in genetic robustness against null mutations.
Gu, Z., Steinmetz, L.M., Gu, X., Scharfe, C., Davis, R.W. & Li, W.H.
Nature 2003 Jan 2;421(6918):63-6.
Deleting a gene in an organism often has little phenotypic effect, owing to two mechanisms of compensation. The first is the existence of duplicate genes: that is, the loss of function in one copy can be compensated by the other copy or copies. The second mechanism of compensation stems from alternative metabolic pathways, regulatory networks, and so on. The relative importance of the two mechanisms has not been investigated except for a limited study, which suggested that the role of duplicate genes in compensation is negligible. The availability of fitness data for a nearly complete set of single-gene-deletion mutants of the Saccharomyces cerevisiae genome has enabled us to carry out a genome-wide evaluation of the role of duplicate genes in genetic robustness against null mutations. Here we show that there is a significantly higher probability of functional compensation for a duplicate gene than for a singleton, a high correlation between the frequency of compensation and the sequence similarity of two duplicates, and a higher probability of a severe fitness effect when the duplicate copy that is more highly expressed is deleted. We estimate that in S. cerevisiae at least a quarter of those gene deletions that have no phenotype are compensated by duplicate genes.
Gene function on a genomic scale.
Steinmetz, L.M. & Deutschbauer, A.M.
J Chromatogr B Analyt Technol Biomed Life Sci 2002 Dec 25;782(1-2):151-63.
The ability to obtain experimental measurements for thousands of genes has revolutionized our view of biological systems. While traditional studies of gene function evaluated many different properties for a single gene, genomic approaches can measure a single property for thousands of genes. Over the last years, genomic approaches have been developed to measure many different properties, including gene expression, deletion phenotype, and protein characteristics. The promise of integrating these datasets has made it attractive to test whether genomic approaches can determine gene function with accuracy comparable to single gene approaches.
Systematic screen for human disease genes in yeast.
Steinmetz, L.M., Scharfe, C., Deutschbauer, A.M., Mokranjac, D., Herman, Z.S., Jones, T., Chu, A.M., Giaever, G., Prokisch, H., Oefner, P.J. & Davis, R.W.
Nat Genet 2002 Aug;31(4):400-4.
High similarity between yeast and human mitochondria allows functional genomic study of Saccharomyces cerevisiae to be used to identify human genes involved in disease. So far, 102 heritable disorders have been attributed to defects in a quarter of the known nuclear-encoded mitochondrial proteins in humans. Many mitochondrial diseases remain unexplained, however, in part because only 40-60% of the presumed 700-1,000 proteins involved in mitochondrial function and biogenesis have been identified. Here we apply a systematic functional screen using the pre-existing whole-genome pool of yeast deletion mutants to identify mitochondrial proteins. Three million measurements of strain fitness identified 466 genes whose deletions impaired mitochondrial respiration, of which 265 were new. Our approach gave higher selection than other systematic approaches, including fivefold greater selection than gene expression analysis. To apply these advantages to human disorders involving mitochondria, human orthologs were identified and linked to heritable diseases using genomic map positions.
Evolutionary rate in the protein interaction network.
Fraser, H.B., Hirsh, A.E., Steinmetz, L.M., Scharfe, C. & Feldman, M.W.
Science 2002 Apr 26;296(5568):750-2.
High-throughput screens have begun to reveal the protein interaction network that underpins most cellular functions in the yeast Saccharomyces cerevisiae. How the organization of this network affects the evolution of the proteins that compose it is a fundamental question in molecular evolution. We show that the connectivity of well-conserved proteins in the network is negatively correlated with their rate of evolution. Proteins with more interactors evolve more slowly not because they are more important to the organism, but because a greater proportion of the protein is directly involved in its function. At sites important for interaction between proteins, evolutionary changes may occur largely by coevolution, in which substitutions in one protein result in selection pressure for reciprocal changes in interacting partners. We confirm one predicted outcome of this process, namely that interacting proteins evolve at similar rates.
Dissecting the architecture of a quantitative trait locus in yeast.
Steinmetz, L.M., Sinha, H., Richards, D.R., Spiegelman, J.I., Oefner, P.J., McCusker, J.H. & Davis, R.W.
Nature 2002 Mar 21;416(6878):326-30.
Most phenotypic diversity in natural populations is characterized by differences in degree rather than in kind. Identification of the actual genes underlying these quantitative traits has proved difficult. As a result, little is known about their genetic architecture. The failures are thought to be due to the different contributions of many underlying genes to the phenotype and the ability of different combinations of genes and environmental factors to produce similar phenotypes. This study combined genome-wide mapping and a new genetic technique named reciprocal-hemizygosity analysis to achieve the complete dissection of a quantitative trait locus (QTL) in Saccharomyces cerevisiae. A QTL architecture was uncovered that was more complex than expected. Functional linkages both in cis and in trans were found between three tightly linked quantitative trait genes that are neither necessary nor sufficient in isolation. This arrangement of alleles explains heterosis (hybrid vigour), the increased fitness of the heterozygote compared with homozygotes. It also demonstrates a deficiency in current approaches to QTL dissection with implications extending to traits in other organisms, including human genetic diseases.
Transcriptional regulation and function during the human cell cycle.
Cho, R.J., Huang, M., Campbell, M.J., Dong, H., Steinmetz, L., Sapinoso, L., Hampton, G., Elledge, S.J., Davis, R.W. & Lockhart, D.J.
Nat Genet. 2001 Jan;27(1):48-54.
We report here the transcriptional profiling of the cell cycle on a genome-wide scale in human fibroblasts. We identified approximately 700 genes that display transcriptional fluctuation with a periodicity consistent with that of the cell cycle. Systematic analysis of these genes revealed functional organization within groups of coregulated transcripts. A diverse set of cytoskeletal reorganization genes exhibit cell-cycle-dependent regulation, indicating that biological pathways are redirected for the execution of cell division. Many genes involved in cell motility and remodeling of the extracellular matrix are expressed predominantly in M phase, indicating a mechanism for balancing proliferative and invasive cellular behavior. Transcripts upregulated during S phase displayed extensive overlap with genes induced by DNA damage; cell-cycle-regulated transcripts may therefore constitute coherent programs used in response to external stimuli. Our data also provide clues to biological function for hundreds of previously uncharacterized human genes.
Combining genome sequences and new technologies for dissecting the genetics of complex phenotypes.
Steinmetz, L.M., Mindrinos, M. & Oefner, P.J.
Trends Plant Sci 2000 Sep;5(9):397-401.
High-density arrays and insights into genome function.
Steinmetz, L.M. & Davis, R.W.
Biotechnol Genet Eng Rev 2000;17:109-46.
A genome-wide transcriptional analysis of the mitotic cell cycle.
Cho, R.J., Campbell, M.J., Winzeler, E.A., Steinmetz, L., Conway, A., Wodicka, L., Wolfsberg, T.G., Gabrielian, A.E., Landsman, D., Lockhart, D.J. & Davis, R.W.
Mol Cell. 1998 Jul;2(1):65-73.
Progression through the eukaryotic cell cycle is known to be both regulated and accompanied by periodic fluctuation in the expression levels of numerous genes. We report here the genome-wide characterization of mRNA transcript levels during the cell cycle of the budding yeast S. cerevisiae. Cell cycle-dependent periodicity was found for 416 of the 6220 monitored transcripts. More than 25% of the 416 genes were found directly adjacent to other genes in the genome that displayed induction in the same cell cycle phase, suggesting a mechanism for local chromosomal organization in global mRNA regulation. More than 60% of the characterized genes that displayed mRNA fluctuation have already been implicated in cell cycle period-specific biological roles. Because more than 20% of human proteins display significant homology to yeast proteins, these results also link a range of human genes to cell cycle period-specific biological functions.