Projects - Overview
The astonishing variety in morphology, behavior, and function found in diverse mammalian cell types makes it easy to forget that all cells of an organism share the same underlying "genetic instructions", i.e. a common genome. Indeed, all the different cell types arise from a single fertilized oocyte through cellular differentiation. It is generally accepted that cell-specific patterns of gene expression are key determinants of cellular identity, and that these expression patterns are ultimately controlled at the level of the DNA, in particular through constellations of regulatory sites to which transcription factors bind in a sequence-specific manner. However, in eukaryotes the direct substrate for the action of transcription factors and polymerase is not DNA itself but chromatin, i.e. the complex of DNA, RNA, and proteins that makes up chromosomes. In particular, modifications of histones within the nucleosomes that package DNA, and of the DNA itself (methylation at CpG dinucleotides), modify the state of the chromatin in a dynamic fashion, thereby modulating the accessibility of cognate sites to regulatory factors. The epigenetic state of chromatin is thus a key component of transcription regulatory networks in higher eukaryotes, and one that plays a crucial role in developmental and cell differentiation processes.
The goal of the Cell Plasticity in Health and Disease project is to reveal the general principles that explain how sequence-specific transcription factors and epigenetic modifications of the chromatin interact in order to implement control pathways that drive cellular differentiation in mammals. We will study a set of specific cellular differentiation systems in mouse using a uniform combination of high-throughput measurement and computational modeling approaches. The systems will include four "normal" differentiation systems and two "aberrant" differentiation systems involving transformation of tumor cells. We will focus on modeling the mechanisms by which the sequence-specific binding of transcription factors interacts with the dynamics of the "epigenetic code" along the genome, i.e. the local status of chromatin as determined by histone and DNA modifications. The data from several differentiation systems will be analyzed by common novel mathematical and computational methods. In particular, we will use sophisticated methods for genome-wide annotation of transcription factor binding sites in combination with the time course data to develop quantitative models for the genome-wide interactions between chromatin state and the actions of sequence-specific transcription factors.
For each of the model systems described below we will use deep sequencing and microarray technologies to obtain time-course measurements of genome-wide expression dynamics of both mRNAs and miRNAs, genome-wide chromatin modification dynamics, including DNA methylation and methylation of histone H3 at different lysines, and genome-wide DNA binding profiles of key transcription factors in each system. To ensure complete and quantitative comparability of the data across the differentiation systems, all measurements will be performed using common protocols, and standardized procedures for processing and normalization of the data.
These comprehensive time-course data sets will be analyzed with sophisticated mathematical and computational methods to construct explicit quantitative models for the gene regulatory dynamics in each of these differentiation systems.
The computational modeling will be performed in multiple rounds. In the first round we will focus on identifying the key regulatory factors for each of the differentiation systems, and their inferred modes of action. These predictions will then be used to formulate follow-up experiments for validating and further characterizing the roles of the key regulators in each system. In particular, we will perform perturbation experiments using knock-out, knock-down, and ectopic expression of key transcription factors, miRNAs and/or chromatin-modifying enzymes. Finally, we will use the refined models to predict perturbations that will trans-differentiate or de-differentiate each model system from its differentiated state towards a desired target state. The final goal is to establish a protocol of perturbations that reliably and efficiently trans-differentiates each of our model systems toward its desired target state.
Computational Modeling members: E. van Nimwegen, M. Zavolan (in collaboration with M. Stadler, FMI).
Modeling Assumptions: Our computational models will be based on the following set of basic assumptions, all of which are founded on the current state of knowledge about the mechanisms for regulating gene expression:
In the CellPlasticity project we will translate these basic modeling assumptions into explicit mathematical models of the differentiation processes in a number of phases.
The foundation of our modeling will be the genome-wide annotation of regulatory sites in mouse for a large number of mouse transcription factor regulatory motifs. As part of a recent collaboration with the RIKEN institute in Japan, we have obtained a comprehensive annotation of promoters in mouse using deep sequencing of 5' ends of mRNAs. Using these data and applying a number of computational tools developed in our group, which incorporate comparative genomic information on promoter evolution across mammals, we have obtained an accurate and comprehensive annotation of regulatory sites in all mouse proximal promoters.
In addition, the Zavolan group, which is actively developing methods for miRNA target site prediction, has provided annotations of such sites in the 3' UTRs of all mouse transcripts. In the first round of computational analysis we will model all time-course data in terms of the genome-wide predicted regulatory sites, i.e. both transcription factor binding sites and miRNA target sites. To this end we will use a recently developed Bayesian methodology, called Motif Activity Response Analysis (MARA), which uses generalized linear models to infer the time-dependent activities of regulatory factors that determine the dynamics of gene expression profiles and epigenetic mark profiles.
(a) Sequence reads are mapped to the genome and associated with known TSSs, and their expression is normalized. Vertical lines represent known TSS positions and their height is proportional to the normalized expression. (b) Promoter regions are defined as clusters of nearby known TSSs. (c) A window of -300 to +100 bp flanking each promoter region is extracted and multiply aligned, and the MotEvo algorithm is used to predict binding sites for known motifs. (d) (i) Observed expression of all promoters and (ii) predicted site-counts are used to infer (iii) motif activities. (e) The statistical significance of the regulatory edge from motif to promoter is calculated based on the correlation of the promoter expression and motif activity profiles.
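Steps (b) and (c) of the procedure above can be illustrated with a short sketch. It groups TSS positions on one strand into promoter regions and computes the -300 to +100 window for each region. The 150 bp clustering distance, the class name and the function names are assumptions made for illustration only; the source does not state how "nearby" is defined, and this is not the group's actual pipeline.

```python
from dataclasses import dataclass


@dataclass
class PromoterRegion:
    chrom: str
    strand: str  # "+" or "-"
    start: int   # leftmost TSS coordinate in the cluster
    end: int     # rightmost TSS coordinate in the cluster


def cluster_tss(positions, chrom, strand, max_gap=150):
    """Group TSS positions on one strand into promoter regions.

    max_gap is a hypothetical threshold: a TSS joins the current
    cluster if it lies within max_gap bp of the cluster's last TSS.
    """
    regions = []
    for pos in sorted(positions):
        if regions and pos - regions[-1].end <= max_gap:
            regions[-1].end = pos  # extend the current cluster
        else:
            regions.append(PromoterRegion(chrom, strand, pos, pos))
    return regions


def proximal_window(region, upstream=300, downstream=100):
    """Return (start, end) of the window from -upstream to +downstream.

    On the minus strand, "upstream" lies at higher genomic coordinates,
    so the two flanks are swapped.
    """
    if region.strand == "+":
        return max(0, region.start - upstream), region.end + downstream
    return max(0, region.start - downstream), region.end + upstream
```

The returned coordinates would then be used to fetch the sequence and its orthologous alignments before site prediction, which is outside the scope of this sketch.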
This methodology will allow us to deduce which TFs and miRNAs drive mRNA expression changes, together with their time-dependent activities. Similarly, the methodology also infers which transcription factors are most important in recruiting chromatin modifications. Recently, MARA has been successfully applied to infer key determinants of the transcription regulatory network controlling the differentiation of a human myeloid leukemic cell line (Nat. Genet. 41(5), 553-62, 2009), and we will apply this powerful approach to the systems studied in the CellPlasticity project. One of the key aims of our investigations in the CellPlasticity project is to extend Motif Activity Response Analysis to include the interplay between TF binding and epigenetic marks. In particular, we aim to explicitly model the effects of chromatin structure on the ability of the transcription factors to bind to their targets, as well as the feedback from the binding of transcription factors to the chromatin structure, i.e. through the recruitment of chromatin modifiers. Here our modeling will aim to uncover general principles in the interplay of chromatin state and the actions of transcription factors, and to address conceptual questions such as the reversibility of the epigenetic changes, the mechanism by which robustness and redundancy are provided, and the extent to which the epigenetic state contributes to cell fate decisions. Besides these conceptual issues, our models will predict, for each differentiation system, which transcription factors, chromatin-modifying enzymes, and miRNAs are the key regulators in the differentiation, how these regulatory factors change their activity through time, and what the genome-wide targets of these regulatory factors are.
A crucial part of all the computational modeling is a close collaboration with the experimental groups. In particular, together with the experimental collaborators we will design validation experiments in which the experimental systems will be perturbed in a controlled manner to validate the roles of these key regulators. These perturbations can be either knock-downs or ectopic expression, followed by measurement of the system's response. Finally, we will make and test predictions of perturbations that trans-differentiate each system into a desired state. For each differentiation system we will use the constructed computational model to predict a set of perturbations, i.e. state-specific gene excision or ectopic expression of transcription factors, chromatin-modifying enzymes or miRNAs. In this final stage we expect that there will be multiple rounds of iteration between experimental tests of the results of the predicted perturbations, and updating of the computational predictions.
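The core idea of inferring motif activities from site counts and expression can be sketched as a linear model in which the log-expression of promoter p in sample s is a promoter offset plus a sample offset plus the sum over motifs of site count times motif activity. The code below is a minimal illustration under stated assumptions, not the MARA implementation: it uses a ridge penalty (equivalent to a Gaussian prior on activities) as a simple stand-in for the Bayesian treatment, and the function names, the `ridge` parameter and the matrix layout are hypothetical.

```python
import numpy as np


def fit_motif_activities(E, N, ridge=1.0):
    """Estimate motif activities from expression and site counts.

    E: promoters x samples matrix of log-expression values.
    N: promoters x motifs matrix of predicted site counts.
    Returns a motifs x samples matrix of activities, defined
    relative to each motif's mean activity across samples.
    """
    E = np.asarray(E, dtype=float)
    N = np.asarray(N, dtype=float)
    # Remove promoter-specific offsets (row means), then
    # sample-specific offsets (column means).
    Ec = E - E.mean(axis=1, keepdims=True)
    Ec = Ec - Ec.mean(axis=0, keepdims=True)
    # Center site counts so activities are not confounded with offsets.
    Nc = N - N.mean(axis=0, keepdims=True)
    # Ridge-regularized least squares: (Nc^T Nc + ridge * I) A = Nc^T Ec.
    gram = Nc.T @ Nc + ridge * np.eye(N.shape[1])
    return np.linalg.solve(gram, Nc.T @ Ec)


def edge_correlation(E, A):
    """Pearson correlation of every promoter profile with every
    motif activity profile (promoters x motifs)."""
    Ez = np.asarray(E, dtype=float)
    Az = np.asarray(A, dtype=float)
    Ez = Ez - Ez.mean(axis=1, keepdims=True)
    Az = Az - Az.mean(axis=1, keepdims=True)
    # Small constant avoids division by zero for flat profiles.
    Ez = Ez / (np.linalg.norm(Ez, axis=1, keepdims=True) + 1e-12)
    Az = Az / (np.linalg.norm(Az, axis=1, keepdims=True) + 1e-12)
    return Ez @ Az.T
```

In this sketch the rows of the activity matrix give each factor's activity through time, and the correlation matrix provides one simple way to rank candidate motif-to-promoter edges. A full analysis would also need posterior uncertainties on the activities, which the ridge estimate alone does not provide.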
Project 1: GENETICS AND EPIGENETICS OF NEURONAL CELL TYPE DETERMINATION
Core member: Dirk Schübeler (external advice from Miriam Bibel, Novartis)
Synopsis: Neuronal development starting from pluripotent stem cells provides an experimentally highly accessible model system to investigate transcriptome and epigenome reprogramming during cell differentiation. Within Cell Plasticity this is the most advanced system, since genome-wide proof-of-concept studies and their computational analysis have already been performed. Genetic manipulations of these cells are easily achieved, and experience derived from this model will provide a paradigm for the other experimental systems of Cell Plasticity.
Background: The nervous system displays unique properties and is composed of an enormous number of functionally distinct subpopulations of neurons, which are interconnected to form highly specific networks. Most neurons in the nervous system are generated during development in a multistep process and last for the life of the animal. Understanding the transcriptional and epigenetic pathways underlying the events of neuronal specification and plasticity has been a major challenge. Recent advances in stem cell biology and neuronal cell culture allow multistage neuronal differentiation to be recapitulated in vitro. Importantly, this can be optimized to generate pure populations of progenitor and terminally differentiated cells, allowing for the first time a global and quantitative view of transcriptome and epigenome reprogramming during neurogenesis.
From embryonic stem cells to neuronal progenitors to fully differentiated neurons. Changes in transcriptional activities and epigenetic imprints during the differentiation process can be monitored.
Project 2: CELLULAR DIFFERENTIATION DURING HEMATOPOIESIS
Core members: A. Rolink, P. Matthias, G. Holländer
Synopsis: Throughout the lifetime of an organism, hematopoietic stem cells (HSCs) generate all differentiated blood cells, via a pathway involving multiple branchings and the gradual loss of developmental potential. In addition to its high biological and medical relevance, the hematopoietic system offers a number of significant advantages for quantitative studies of the regulatory networks governing cell identity: (i) well known sets of cell surface markers permit the identification and physical purification of cells at different developmental stages, (ii) culture systems allow the expansion of progenitors in vitro while maintaining their developmental potential, and (iii) a growing number of key transcription factors have been identified that are responsible for the differentiation to and within individual hematopoietic lineages. Within Cell Plasticity, we will focus on well-characterized steps in B- and T-cell development and then explore the promiscuous gene expression of self-antigens by thymic epithelial cells as a paradigm for overriding lineage-specific gene expression.
From hematopoietic stem cells to differentiated blood cells. The lymphoid lineage leading to mature B- and T-cells is highlighted.
Project 3: ABERRANT DIFFERENTIATION IN CANCER
Core members: G. Christofori, A. Peters in collaboration with J. Schwaller
Synopsis: Uncontrolled growth in cancer is a consequence of cellular transformation, which usually coincides with a loss of a more coherent differentiated state. Within Cell Plasticity we will study the block of differentiation that occurs in stem cell leukemia and the de-differentiation and gain of migratory potential that precedes tumor metastasis. Both chosen models provide important disease paradigms in highly controlled systems, allowing us to monitor and perturb cellular transformation in a temporal manner in vitro and in vivo.
Gerhard Christofori
Background: During late stages of tumor progression, epithelial differentiated tumor cells acquire a de-differentiated, migratory and invasive phenotype. This process of epithelial-mesenchymal transition (EMT) is accompanied by changes in cellular morphology, the gain of migratory and invasive capabilities and, finally, the metastatic dissemination of cancer cells throughout the body. Thus, EMT appears to be the basic process underlying the formation of tumor metastasis, the final and deadly stage of cancer.
A key event during EMT is the loss of E-cadherin function, in most cases exerted by the binding of Snail-type transcriptional repressors to E-boxes within the E-cadherin promoter, which leads to repression of E-cadherin expression and subsequent promoter DNA hypermethylation. Notably, the expression and activities of many homeobox and other transcription factors are activated upon loss of E-cadherin, suggesting major changes in signaling pathways, transcriptional control, and epigenetic regulation, which are also manifested by the upregulated expression of histone methyltransferases and DNA methylases during tumor progression and EMT. Along these lines, several pharmacological inhibitors of histone deacetylases, histone methyltransferases and DNA methyltransferases are currently being tested in pre-clinical cancer models and in clinical trials with cancer patients.
Features of Epithelial to Mesenchymal Transition (EMT).
A. Peters in collaboration with J. Schwaller
Background: Blood formation is the result of a highly regulated chain of cellular decisions. Self-renewing hematopoietic stem cells (HSCs) give rise to multipotent progenitor cells that can differentiate into more specified progenitor and terminally differentiated blood cells. In acute leukemia, the malignant blasts are characterized by a block in differentiation, aberrant self-renewal, and increased survival. A hallmark of human leukemias is the presence of mostly balanced chromosomal translocations leading to expression of fusion oncoproteins.
The Mixed-Lineage Leukemia (MLL) (also known as Trithorax) gene encodes a histone methyltransferase that is essential for the maintenance of adult hematopoiesis. MLL is a frequent target of translocations leading to MLL-X fusions involving over 50 different partner genes. In adult AML, MLL-AF4, MLL-AF9 and MLL-ENL account for over 60% of MLL fusions, which generally carry a poor prognosis, and their oncogenic activity has been demonstrated in vitro and in mouse models. Importantly, expression of MLL-X does not transform hematopoietic stem cells (HSCs) but changes the cellular identity of committed progenitors such as granulocyte-macrophage progenitors (GMPs) by blocking differentiation and providing aberrant self-renewal capacity.
Recent expression profiling and ChIP-on-chip analyses of human and murine MLL-rearranged leukemias provided the first insights into the altered genetic and epigenetic program mediated by MLL-X fusion proteins, revealing a number of critical regulators, such as transcription factors and histone-modifying enzymes. These data suggest that the leukemic function of MLL-X is directly linked to aberrant transcriptional control at the genetic and epigenetic level. As part of Cell Plasticity we will take advantage of a conditional model system of MLL-X oncogene-based leukemic transformation in vitro and in vivo.


