Selected Publications

  • Transcriptional heterogeneity due to plasticity of the epigenetic state of chromatin contributes to tumour evolution, metastasis and drug resistance [1,2,3]. However, the mechanisms that cause this epigenetic variation are incompletely understood. Here we identify micronuclei and chromosome bridges, aberrations in the nucleus common in cancer [4,5], as sources of heritable transcriptional suppression. Using a combination of approaches, including long-term live-cell imaging and same-cell single-cell RNA sequencing (Look-Seq2), we identified reductions in gene expression in chromosomes from micronuclei. With heterogeneous penetrance, these changes in gene expression can be heritable even after the chromosome from the micronucleus has been re-incorporated into a normal daughter cell nucleus. Concomitantly, micronuclear chromosomes acquire aberrant epigenetic chromatin marks. These defects may persist as variably reduced chromatin accessibility and reduced gene expression after clonal expansion from single cells. Persistent transcriptional repression is strongly associated with, and may be explained by, markedly long-lived DNA damage. Epigenetic alterations in transcription may therefore be inherently coupled to chromosomal instability and aberrations in nuclear architecture.

  • Papathanasiou et al. Heritable transcriptional defects from aberrations of nuclear architecture. Nature (2023).

  • Inflammation can trigger lasting phenotypes in immune and non-immune cells. Whether and how human infections and associated inflammation can form innate immune memory in hematopoietic stem and progenitor cells (HSPC) has remained unclear. We found that circulating HSPC, enriched from peripheral blood, captured the diversity of bone marrow HSPC, enabling investigation of their epigenomic reprogramming following coronavirus disease 2019 (COVID-19). Alterations in innate immune phenotypes and epigenetic programs of HSPC persisted for months to 1 year following severe COVID-19 and were associated with distinct transcription factor (TF) activities, altered regulation of inflammatory programs, and durable increases in myelopoiesis. HSPC epigenomic alterations were conveyed, through differentiation, to progeny innate immune cells. Early activity of IL-6 contributed to these persistent phenotypes in human COVID-19 and a mouse coronavirus infection model. Epigenetic reprogramming of HSPC may underlie altered immune function following infection and be broadly relevant, especially for millions of COVID-19 survivors.

  • Cheong et al. Cell (2023).

Photoselective sequencing: microscopically guided genomic measurements with subcellular resolution

Mangiameli SM, Chen H, Earl AS, Dobkin JA, Lesman D, Buenrostro JD☨ & Chen F☨. Nature Methods (2023).

  • In biological systems, spatial organization and function are interconnected. Here we present photoselective sequencing, a new method for genomic and epigenomic profiling within morphologically distinct regions. Starting with an intact biological specimen, photoselective sequencing uses targeted illumination to selectively unblock a photocaged fragment library, restricting the sequencing-based readout to microscopically identified spatial regions. We validate photoselective sequencing by measuring the chromatin accessibility profiles of fluorescently labeled cell types within the mouse brain and comparing with published data. Furthermore, by combining photoselective sequencing with a computational strategy for decomposing bulk accessibility profiles, we find that the oligodendrocyte-lineage-cell population is relatively enriched for oligodendrocyte-progenitor cells in the cortex versus the corpus callosum. Finally, we leverage photoselective sequencing at the subcellular scale to identify features of chromatin that are correlated with positioning at the nuclear periphery. These results collectively demonstrate that photoselective sequencing is a flexible and generalizable platform for exploring the interplay of spatial structures with genomic and epigenomic properties.

  • Mangiameli SM, Chen H, Earl AS, Dobkin JA, Lesman D, Buenrostro JD☨ & Chen F☨. Photoselective sequencing: microscopically guided genomic measurements with subcellular resolution. Nature Methods (2023).

  • Transcription factors (TFs) regulate gene programs, thereby controlling diverse cellular processes and cell states. To comprehensively understand TFs and the programs they control, we created a barcoded library of all annotated human TF splice isoforms (>3,500) and applied it to build a TF Atlas charting expression profiles of human embryonic stem cells (hESCs) overexpressing each TF at single-cell resolution. We mapped TF-induced expression profiles to reference cell types and validated candidate TFs for generation of diverse cell types, spanning all three germ layers and trophoblasts. Targeted screens with subsets of the library allowed us to create a tailored cellular disease model and integrate mRNA expression and chromatin accessibility data to identify downstream regulators. Finally, we characterized the effects of combinatorial TF overexpression by developing and validating a strategy for predicting combinations of TFs that produce target expression profiles matching reference cell types to accelerate cellular engineering efforts.

  • Joung et al. A transcription factor atlas of directed differentiation. Cell (2023).

Functional inference of gene regulation using single-cell multi-omics

Kartha VK, Duarte FM, Hu Y, Ma S, Chew JG, Lareau CA, Earl A, Burkett ZD, Kohlway AS, Lebofsky R, Buenrostro JD. Cell Genomics (2022).

  • Cells require coordinated control over gene expression when responding to environmental stimuli. Here we apply scATAC-seq and single-cell RNA sequencing (scRNA-seq) in resting and stimulated human blood cells. Collectively, we generate ∼91,000 single-cell profiles, allowing us to probe the cis-regulatory landscape of the immunological response across cell types, stimuli, and time. Advancing tools to integrate multi-omics data, we develop functional inference of gene regulation (FigR), a framework to computationally pair scATAC-seq with scRNA-seq cells, connect distal cis-regulatory elements to genes, and infer gene-regulatory networks (GRNs) to identify candidate transcription factor (TF) regulators. Utilizing these paired multi-omics data, we define domains of regulatory chromatin (DORCs) of immune stimulation and find that cells alter chromatin accessibility and gene expression at timescales of minutes. Construction of the stimulation GRN elucidates TF activity at disease-associated DORCs. Overall, FigR enables elucidation of regulatory interactions across single-cell data, providing new opportunities to understand the function of cells within tissues.

  • Kartha VK, Duarte FM, Hu Y, Ma S, Chew JG, Lareau CA, Earl A, Burkett ZD, Kohlway AS, Lebofsky R, Buenrostro JD. Functional inference of gene regulation using single-cell multi-omics. Cell Genomics (2022).

  • Realizing the full utility of brain organoids to study human development requires understanding whether organoids precisely replicate endogenous cellular and molecular events, particularly since acquisition of cell identity in organoids can be impaired by abnormal metabolic states. We present a comprehensive single-cell transcriptomic, epigenetic, and spatial atlas of human cortical organoid development, comprising over 610,000 cells, from generation of neural progenitors through production of differentiated neuronal and glial subtypes. We show that processes of cellular diversification correlate closely to endogenous ones, irrespective of metabolic state, empowering the use of this atlas to study human fate specification. We define longitudinal molecular trajectories of cortical cell types during organoid development, identify genes with predicted human-specific roles in lineage establishment, and uncover early transcriptional diversity of human callosal neurons. The findings validate this comprehensive atlas of human corticogenesis in vitro as a resource to prime investigation into the mechanisms of human cortical development.

  • Uzquiano et al. Proper acquisition of cell class identity in organoids allows definition of fate specification programs of the human cerebral cortex. Cell (2022).

Single-cell epigenomics reveals mechanisms of cancer progression

LaFave LM, Savage RE, Buenrostro JD. Annual Review of Cancer Biology (2022).

  • Cancer initiation is driven by the cooperation between genetic and epigenetic aberrations that disrupt gene regulatory programs critical to maintaining specialized cellular functions. After initiation, cells acquire additional genetic and epigenetic alterations influenced by tumor-intrinsic and -extrinsic mechanisms, which increase intratumoral heterogeneity, reshape the cell's underlying gene regulatory networks and promote cancer evolution. Furthermore, environmental or therapeutic insults drive the selection of heterogeneous cell states, with implications for cancer initiation, maintenance, and drug resistance. The advancement of single-cell genomics has begun to uncover the full repertoire of chromatin and gene expression states (cell states) that exist within individual tumors. These single-cell analyses suggest that cells diversify in their regulatory states upon transformation by co-opting damage-induced and nonlineage regulatory programs that can lead to epigenomic plasticity. Here, we review these recent studies related to regulatory state changes in cancer progression and highlight the growing single-cell epigenomics toolkit poised to address unresolved questions in the field.

  • LaFave LM, Savage RE, Buenrostro JD. Single-cell epigenomics reveals mechanisms of cancer progression. Annual Review of Cancer Biology (2022).

Spatial genomics enables multi-modal study of clonal heterogeneity in tissues

Zhao T*, Chiang ZD*, Morriss JW, LaFave LM, Murray EM, Priore ID, Meli K, Lareau CA, Nadaf NM, Li J, Earl AS, Macosko EZ, Jacks T, Buenrostro JD☨ & Chen F☨. Nature (2021).

  • The state and behaviour of a cell can be influenced by both genetic and environmental factors. In particular, tumour progression is determined by underlying genetic aberrations as well as the makeup of the tumour microenvironment. Quantifying the contributions of these factors requires new technologies that can accurately measure the spatial location of genomic sequence together with phenotypic readouts. Here we developed slide-DNA-seq, a method for capturing spatially resolved DNA sequences from intact tissue sections. We demonstrate that this method accurately preserves local tumour architecture and enables the de novo discovery of distinct tumour clones and their copy number alterations. We then apply slide-DNA-seq to a mouse model of metastasis and a primary human cancer, revealing that clonal populations are confined to distinct spatial regions. Moreover, through integration with spatial transcriptomics, we uncover distinct sets of genes that are associated with clone-specific genetic aberrations, the local tumour microenvironment, or both. Together, this multi-modal spatial genomics approach provides a versatile platform for quantifying how cell-intrinsic and cell-extrinsic factors contribute to gene expression, protein abundance and other cellular phenotypes.

  • Zhao T*, Chiang ZD*, Morriss JW, LaFave LM, Murray EM, Priore ID, Meli K, Lareau CA, Nadaf NM, Li J, Earl AS, Macosko EZ, Jacks T, Buenrostro JD☨ & Chen F☨. Spatial genomics enables multi-modal study of clonal heterogeneity in tissues. Nature (2021).

  • Chronic, sustained exposure to stressors can profoundly affect tissue homeostasis, although the mechanisms by which these changes occur are largely unknown. Here we report that the stress hormone corticosterone—which is derived from the adrenal gland and is the rodent equivalent of cortisol in humans—regulates hair follicle stem cell (HFSC) quiescence and hair growth in mice. In the absence of systemic corticosterone, HFSCs enter substantially more rounds of the regeneration cycle throughout life. Conversely, under chronic stress, increased levels of corticosterone prolong HFSC quiescence and maintain hair follicles in an extended resting phase. Mechanistically, corticosterone acts on the dermal papillae to suppress the expression of Gas6, a gene that encodes the secreted factor growth arrest specific 6. Restoring Gas6 expression overcomes the stress-induced inhibition of HFSC activation and hair growth. Our work identifies corticosterone as a systemic inhibitor of HFSC activity through its effect on the niche, and demonstrates that the removal of such inhibition drives HFSCs into frequent regeneration cycles, with no observable defects in the long term.

  • Choi et al. Corticosterone inhibits GAS6 to govern hair follicle stem-cell quiescence. Nature (2021).

Deep learning-based enhancement of epigenomics data with AtacWorks

Lal A*, Chiang ZD*, Yakovenko N, Duarte FM, Israeli J☨, Buenrostro JD☨. Nature Communications (2021).

  • ATAC-seq is a widely applied assay used to measure genome-wide chromatin accessibility; however, its ability to detect active regulatory regions can depend on the depth of sequencing coverage and the signal-to-noise ratio. Here we introduce AtacWorks, a deep learning toolkit to denoise sequencing coverage and identify regulatory peaks at base-pair resolution from low cell count, low-coverage, or low-quality ATAC-seq data. Models trained by AtacWorks can detect peaks from cell types not seen in the training data, and are generalizable across diverse sample preparations and experimental platforms. We demonstrate that AtacWorks enhances the sensitivity of single-cell experiments by producing results on par with those of conventional methods using ~10 times as many cells, and further show that this framework can be adapted to enable cross-modality inference of protein-DNA interactions. Finally, we establish that AtacWorks can enable new biological discoveries by identifying active regulatory regions associated with lineage priming in rare subpopulations of hematopoietic stem cells.

  • Lal A*, Chiang ZD*, Yakovenko N, Duarte FM, Israeli J☨, Buenrostro JD☨. Deep learning-based enhancement of epigenomics data with AtacWorks. Nature Communications (2021).

  • Charting an organ’s biological atlas requires us to spatially resolve the entire single-cell transcriptome, and to relate such cellular features to the anatomical scale. Single-cell and single-nucleus RNA-seq (sc/snRNA-seq) can profile cells comprehensively, but lose spatial information. Spatial transcriptomics allows for spatial measurements, but at lower resolution and with limited sensitivity. Targeted in situ technologies solve both issues, but are limited in gene throughput. To overcome these limitations, we present Tangram, a method that aligns sc/snRNA-seq data to various forms of spatial data collected from the same region, including MERFISH, STARmap, smFISH, Spatial Transcriptomics (Visium) and histological images. Tangram can map any type of sc/snRNA-seq data, including multimodal data such as those from SHARE-seq, which we used to reveal spatial patterns of chromatin accessibility. We demonstrate Tangram on healthy mouse brain tissue, by reconstructing a genome-wide anatomically integrated spatial map at single-cell resolution of the visual and somatomotor areas.

  • Biancalani et al. Deep learning and alignment of spatially resolved single-cell transcriptomes with Tangram. Nature Methods (2021).

In situ genome sequencing resolves DNA sequence and structure in intact biological samples

Payne AC*, Chiang ZD*, Reginato PL*, Mangiameli SM, Murray EM, Yao CC, Markoulaki S, Earl AS, Labade AS, Jaenisch R, Church GM, Boyden ES☨, Buenrostro JD☨, Chen F☨. Science (2020).

  • Understanding genome organization requires integration of DNA sequence and three-dimensional spatial context; however, existing genome-wide methods lack either base pair sequence resolution or direct spatial localization. Here, we describe in situ genome sequencing (IGS), a method for simultaneously sequencing and imaging genomes within intact biological samples. We applied IGS to human fibroblasts and early mouse embryos, spatially localizing thousands of genomic loci in individual nuclei. Using these data, we characterized parent-specific changes in genome structure across embryonic stages, revealed single-cell chromatin domains in zygotes, and uncovered epigenetic memory of global chromosome positioning within individual embryos. These results demonstrate how IGS can directly connect sequence and structure across length scales from single base pairs to whole organisms.

  • Payne AC*, Chiang ZD*, Reginato PL*, Mangiameli SM, Murray EM, Yao CC, Markoulaki S, Earl AS, Labade AS, Jaenisch R, Church GM, Boyden ES☨, Buenrostro JD☨, Chen F☨. In situ genome sequencing resolves DNA sequence and structure in intact biological samples. Science (2020).

  • Empirical and anecdotal evidence has associated stress with accelerated hair greying (formation of unpigmented hairs) [1,2], but so far there has been little scientific validation of this link. Here we report that, in mice, acute stress leads to hair greying through the fast depletion of melanocyte stem cells. Using a combination of adrenalectomy, denervation, chemogenetics [3,4], cell ablation and knockout of the adrenergic receptor specifically in melanocyte stem cells, we find that the stress-induced loss of melanocyte stem cells is independent of immune attack or adrenal stress hormones. Instead, hair greying results from activation of the sympathetic nerves that innervate the melanocyte stem-cell niche. Under conditions of stress, the activation of these sympathetic nerves leads to burst release of the neurotransmitter noradrenaline (also known as norepinephrine). This causes quiescent melanocyte stem cells to proliferate rapidly, and is followed by their differentiation, migration and permanent depletion from the niche. Transient suppression of the proliferation of melanocyte stem cells prevents stress-induced hair greying. Our study demonstrates that neuronal activity that is induced by acute stress can drive a rapid and permanent loss of somatic stem cells, and illustrates an example in which the maintenance of somatic stem cells is directly influenced by the overall physiological state of the organism.

  • Zhang et al. Hyperactivation of sympathetic nerves drives depletion of melanocyte stem cells. Nature (2020).

Chromatin potential identified by shared single cell profiling of RNA and chromatin

Ma S, Zhang B, LaFave L, Chiang Z, Hu Y, Ding J, Brack A, Kartha VK, Law T, Lareau C, Hsu Y, Regev A☨, Buenrostro JD☨. Cell (2020).

  • Cell differentiation and function are regulated across multiple layers of gene regulation, including modulation of gene expression by changes in chromatin accessibility. However, differentiation is an asynchronous process precluding a temporal understanding of regulatory events leading to cell fate commitment. Here we developed simultaneous high-throughput ATAC and RNA expression with sequencing (SHARE-seq), a highly scalable approach for measurement of chromatin accessibility and gene expression in the same single cell, applicable to different tissues. Using 34,774 joint profiles from mouse skin, we develop a computational strategy to identify cis-regulatory interactions and define domains of regulatory chromatin (DORCs) that significantly overlap with super-enhancers. During lineage commitment, chromatin accessibility at DORCs precedes gene expression, suggesting that changes in chromatin accessibility may prime cells for lineage commitment. We computationally infer chromatin potential as a quantitative measure of chromatin lineage-priming and use it to predict cell fate outcomes. SHARE-seq is an extensible platform to study regulatory circuitry across diverse cells in tissues.

  • Ma S, Zhang B, LaFave L, Chiang Z, Hu Y, Ding J, Brack A, Kartha VK, Law T, Lareau C, Hsu Y, Regev A☨, Buenrostro JD☨. Chromatin potential identified by shared single cell profiling of RNA and chromatin. Cell (2020).

Epigenomic State Transitions Characterize Tumor Progression in Mouse Lung Adenocarcinoma

LaFave LM, Kartha VK*, Ma S*, Meli K, Priore ID, Lareau C, Naranjo S, Westcott P, Duarte FM, Sankar V, Chiang Z, Brack A, Law T, Hauck H, Okimoto A, Regev A, Buenrostro JD☨, Jacks T☨. Cancer Cell (2020).

  • Regulatory networks that maintain functional, differentiated cell states are often dysregulated in tumor development. Here, we use single-cell epigenomics to profile chromatin state transitions in a mouse model of lung adenocarcinoma (LUAD). We identify an epigenomic continuum representing loss of cellular identity and progression toward a metastatic state. We define co-accessible regulatory programs and infer key activating and repressive chromatin regulators of these cell states. Among these co-accessibility programs, we identify a pre-metastatic transition, characterized by activation of RUNX transcription factors, which mediates extracellular matrix remodeling to promote metastasis and is predictive of survival across human LUAD patients. Together, these results demonstrate the power of single-cell epigenomics to identify regulatory programs to uncover mechanisms and key biomarkers of tumor progression.

  • LaFave LM, Kartha VK*, Ma S*, Meli K, Priore ID, Lareau C, Naranjo S, Westcott P, Duarte FM, Sankar V, Chiang Z, Brack A, Law T, Hauck H, Okimoto A, Regev A, Buenrostro JD☨, Jacks T☨. Epigenomic State Transitions Characterize Tumor Progression in Mouse Lung Adenocarcinoma. Cancer Cell (2020).

Inference and effects of barcode multiplets in droplet-based single-cell assays

Lareau C☨, Ma S, Duarte F, Buenrostro JD☨. Nature Communications (2020).

  • A widespread assumption for single-cell analyses specifies that one cell’s nucleic acids are predominantly captured by one oligonucleotide barcode. Here, we show that ~13–21% of cell barcodes from the 10x Chromium scATAC-seq assay may have been derived from a droplet with more than one oligonucleotide sequence, which we call “barcode multiplets”. We demonstrate that barcode multiplets can be derived from at least two different sources. First, we confirm that approximately 4% of droplets from the 10x platform may contain multiple beads. Additionally, we find that approximately 5% of beads may contain detectable levels of multiple oligonucleotide barcodes. We show that this artifact can confound single-cell analyses, including the interpretation of clonal diversity and proliferation of intra-tumor lymphocytes. Overall, our work provides a conceptual and computational framework to identify and assess the impacts of barcode multiplets in single-cell data.

  • Lareau C☨, Ma S, Duarte F, Buenrostro JD☨. Inference and effects of barcode multiplets in droplet-based single-cell assays. Nature Communications (2020).

  • Lineage tracing provides key insights into the fate of individual cells in complex organisms. Although effective genetic labeling approaches are available in model systems, in humans, most approaches require detection of nuclear somatic mutations, which have high error rates, limited scale, and do not capture cell state information. Here, we show that somatic mutations in mtDNA can be tracked by single-cell RNA or assay for transposase accessible chromatin (ATAC) sequencing. We leverage somatic mtDNA mutations as natural genetic barcodes and demonstrate their utility as highly accurate clonal markers to infer cellular relationships. We track native human cells both in vitro and in vivo and relate clonal dynamics to gene expression and chromatin accessibility. Our approach should allow clonal tracking at a 1,000-fold greater scale than with nuclear genome sequencing, with simultaneous information on cell state, opening the way to chart cellular dynamics in human health and disease.

  • Ludwig et al. Lineage Tracing in Humans Enabled by Mitochondrial Mutations and Single-Cell Genomics. Cell (2019).

Droplet-based combinatorial indexing for massive-scale single-cell chromatin accessibility

Lareau CA*, Duarte FM*, Chew JG*, Kartha VK, Burkett ZD, Kohlway AS, Pokholok D, Aryee MJ, Steemers FJ, Lebofsky R☨, Buenrostro JD☨. Nature Biotechnology (2019).

  • Recent technical advancements have facilitated the mapping of epigenomes at single-cell resolution; however, the throughput and quality of these methods have limited their widespread adoption. Here we describe a high-quality (10⁵ nuclear fragments per cell) droplet-microfluidics-based method for single-cell profiling of chromatin accessibility. We use this approach, named ‘droplet single-cell assay for transposase-accessible chromatin using sequencing’ (dscATAC-seq), to assay 46,653 cells for the unbiased discovery of cell types and regulatory elements in adult mouse brain. We further increase the throughput of this platform by combining it with combinatorial indexing (dsciATAC-seq), enabling single-cell studies at a massive scale. We demonstrate the utility of this approach by measuring chromatin accessibility across 136,463 resting and stimulated human bone marrow-derived cells to reveal changes in the cis- and trans-regulatory landscape across cell types and under stimulatory conditions at single-cell resolution. Altogether, we describe a total of 510,123 single-cell profiles, demonstrating the scalability and flexibility of this droplet-based platform.

  • Lareau CA*, Duarte FM*, Chew JG*, Kartha VK, Burkett ZD, Kohlway AS, Pokholok D, Aryee MJ, Steemers FJ, Lebofsky R☨, Buenrostro JD☨. Droplet-based combinatorial indexing for massive-scale single-cell chromatin accessibility. Nature Biotechnology (2019).

Interrogation of human hematopoiesis at single-cell and single-variant resolution

Ulirsch JC*, Lareau CA*, Bao EL*, Ludwig LS, Guo MH, Benner C, Satpathy AT, Kartha VK, Salem R, Hirschhorn JN, Finucane HK, Aryee MJ, Buenrostro JD☨, Sankaran VG☨. Nature Genetics (2019).

  • Widespread linkage disequilibrium and incomplete annotation of cell-to-cell state variation represent substantial challenges to elucidating mechanisms of trait-associated genetic variation. Here we perform genetic fine-mapping for blood cell traits in the UK Biobank to identify putative causal variants. These variants are enriched in genes encoding proteins in trait-relevant biological pathways and in accessible chromatin of hematopoietic progenitors. For regulatory variants, we explore patterns of developmental enhancer activity, predict molecular mechanisms, and identify likely target genes. In several instances, we localize multiple independent variants to the same regulatory element or gene. We further observe that variants with pleiotropic effects preferentially act in common progenitor populations to direct the production of distinct lineages. Finally, we leverage fine-mapped variants in conjunction with continuous epigenomic annotations to identify trait–cell type enrichments within closely related populations and in single cells. Our study provides a comprehensive framework for single-variant and single-cell analyses of genetic associations.

  • Ulirsch JC*, Lareau CA*, Bao EL*, Ludwig LS, Guo MH, Benner C, Satpathy AT, Kartha VK, Salem R, Hirschhorn JN, Finucane HK, Aryee MJ, Buenrostro JD☨, Sankaran VG☨. Interrogation of human hematopoiesis at single-cell and single-variant resolution. Nature Genetics (2019).

The cis-Regulatory Atlas of the Mouse Immune System

Yoshida H, Lareau CA, Ramirez RN, Rose SA, Maier B, Wroblewska A, Desland F, Chudnovskiy A, Mortha A, Dominguez C, Tellier J, Kim E, Dwyer D, Shinton S, Nabekura T, Qi Y, Yu B, Robinette M, Kim K, Wagers A, Rhoads A, Nutt SL, Brown BD, Mostafavi S☨, Buenrostro JD☨, Benoist C☨, the Immunological Genome Project. Cell (2019).

  • A complete chart of cis-regulatory elements and their dynamic activity is necessary to understand the transcriptional basis of differentiation and function of an organ system. We generated matched epigenome and transcriptome measurements in 86 primary cell types that span the mouse immune system and its differentiation cascades. This breadth of data enables variance components analysis that suggests that genes fall into two distinct classes, controlled by either enhancer- or promoter-driven logic, and multiple regression that connects genes to the enhancers that regulate them. Relating transcription factor (TF) expression to the genome-wide accessibility of their binding motifs classifies them as predominantly openers or closers of local chromatin accessibility, pinpointing specific cis-regulatory elements where binding of given TFs is likely functionally relevant, validated by chromatin immunoprecipitation sequencing (ChIP-seq). Overall, this cis-regulatory atlas provides a trove of information on transcriptional regulation through immune differentiation and a foundational scaffold to define key regulatory events throughout the immunological genome.

  • Yoshida H, Lareau CA, Ramirez RN, Rose SA, Maier B, Wroblewska A, Desland F, Chudnovskiy A, Mortha A, Dominguez C, Tellier J, Kim E, Dwyer D, Shinton S, Nabekura T, Qi Y, Yu B, Robinette M, Kim K, Wagers A, Rhoads A, Nutt SL, Brown BD, Mostafavi S☨, Buenrostro JD☨, Benoist C☨, the Immunological Genome Project. The cis-Regulatory Atlas of the Mouse Immune System. Cell (2019).

  • Recent advances in single-cell and single-molecule epigenomic technologies now enable the study of genome regulation and dynamics at unprecedented resolution. In this Perspective, we highlight some of these transformative technologies and discuss how they have been used to identify new modes of gene regulation. We also contrast these assays with recent advances in single-cell transcriptomics and argue for the essential role of epigenomic technologies in both understanding cellular diversity and discovering gene regulatory mechanisms. In addition, we provide our view on the next generation of biological tools that we expect will open new avenues for elucidating the fundamental principles of gene regulation. Overall, this Perspective motivates the use of these high-resolution epigenomic technologies for mapping cell states and understanding regulatory diversity at single-molecule resolution within single cells.

  • Shema E, Bernstein BE, Buenrostro JD. Single-cell and single-molecule epigenomics to uncover genome regulation at unprecedented resolution. Nature Genetics (2018).

Integrated Single-Cell Analysis Maps the Continuous Regulatory Landscape of Human Hematopoietic Differentiation

Buenrostro JD☨, Corces R, Lareau C, Wu B, Schep AN, Aryee MJ, Majeti R, Chang HY, Greenleaf WJ☨. Cell (2018).

  • Human hematopoiesis involves cellular differentiation of multipotent cells into progressively more lineage-restricted states. While the chromatin accessibility landscape of this process has been explored in defined populations, single-cell regulatory variation has been hidden by ensemble averaging. We collected single-cell chromatin accessibility profiles across 10 populations of immunophenotypically defined human hematopoietic cell types and constructed a chromatin accessibility landscape of human hematopoiesis to characterize differentiation trajectories. We find variation consistent with lineage bias toward different developmental branches in multipotent cell types. We observe heterogeneity within common myeloid progenitors (CMPs) and granulocyte-macrophage progenitors (GMPs) and develop a strategy to partition GMPs along their differentiation trajectory. Furthermore, we integrated single-cell RNA sequencing (scRNA-seq) data to associate transcription factors to chromatin accessibility changes and regulatory elements to target genes through correlations of expression and regulatory element accessibility. Overall, this work provides a framework for integrative exploration of complex regulatory dynamics in a primary human tissue at single-cell resolution.

  • Buenrostro JD☨, Corces R, Lareau C, Wu B, Schep AN, Aryee MJ, Majeti R, Chang HY, Greenleaf WJ☨. Integrated Single-Cell Analysis Maps the Continuous Regulatory Landscape of Human Hematopoietic Differentiation. Cell (2018).
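
The abstract above mentions linking regulatory elements to target genes through correlations of expression and regulatory element accessibility. As a rough illustration of that idea only (not the authors' pipeline), the sketch below computes a Pearson correlation between each element's accessibility and one gene's expression across matched cell populations. The function names, the `min_r` threshold, and the toy values are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def link_elements_to_gene(element_accessibility, gene_expression, min_r=0.8):
    """Return elements whose accessibility correlates with expression at r >= min_r.

    element_accessibility: dict of element name -> accessibility per population
    gene_expression: expression of one gene in the same populations, same order
    min_r: hypothetical cutoff chosen for illustration only
    """
    links = {}
    for element, accessibility in element_accessibility.items():
        r = pearson(accessibility, gene_expression)
        if r >= min_r:
            links[element] = r
    return links


# Toy data: four populations, three candidate elements near one gene.
toy_accessibility = {
    "element_1": [1.0, 2.0, 3.0, 4.0],  # rises with expression
    "element_2": [4.0, 3.0, 2.0, 1.0],  # falls as expression rises
    "element_3": [1.0, 1.0, 1.0, 1.0],  # constant
}
toy_expression = [2.0, 4.0, 6.0, 8.0]
toy_links = link_elements_to_gene(toy_accessibility, toy_expression)
```

A real analysis would also need a null model or permutation test to judge significance; that step is omitted here.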

  • The expression of co-inhibitory receptors, such as CTLA-4 and PD-1, on effector T cells is a key mechanism for ensuring immune homeostasis. Dysregulated expression of co-inhibitory receptors on CD4+ T cells promotes autoimmunity, whereas sustained overexpression on CD8+ T cells promotes T cell dysfunction or exhaustion, leading to impaired ability to clear chronic viral infections and diseases such as cancer. Here, using RNA and protein expression profiling at single-cell resolution in mouse cells, we identify a module of co-inhibitory receptors that includes not only several known co-inhibitory receptors (PD-1, TIM-3, LAG-3 and TIGIT) but also many new surface receptors. We functionally validated two new co-inhibitory receptors, activated protein C receptor (PROCR) and podoplanin (PDPN). The module of co-inhibitory receptors is co-expressed in both CD4+ and CD8+ T cells and is part of a larger co-inhibitory gene program that is shared by non-responsive T cells in several physiological contexts and is driven by the immunoregulatory cytokine IL-27. Computational analysis identified the transcription factors PRDM1 and c-MAF as cooperative regulators of the co-inhibitory module, and this was validated experimentally. This molecular circuit underlies the co-expression of co-inhibitory receptors in T cells and identifies regulators of T cell function with the potential to control autoimmunity and tumour immunity.

  • Chihara et al. Induction and transcriptional regulation of the co-inhibitory gene module in T cells. Nature (2018).

  • T cells create vast amounts of diversity in the genes that encode their T cell receptors (TCRs), which enables individual clones to recognize specific peptide–major histocompatibility complex (MHC) ligands. Here we combined sequencing of the TCR-encoding genes with assay for transposase-accessible chromatin with sequencing (ATAC-seq) analysis at the single-cell level to provide information on the TCR specificity and epigenomic state of individual T cells. By using this approach, termed transcript-indexed ATAC-seq (T-ATAC-seq), we identified epigenomic signatures in immortalized leukemic T cells, primary human T cells from healthy volunteers and primary leukemic T cells from patient samples. In peripheral blood CD4+ T cells from healthy individuals, we identified cis and trans regulators of naive and memory T cell states and found substantial heterogeneity in surface-marker-defined T cell populations. In patients with a leukemic form of cutaneous T cell lymphoma, T-ATAC-seq enabled identification of leukemic and nonleukemic regulatory pathways in T cells from the same individual by allowing separation of the signals that arose from the malignant clone from the background T cell noise. Thus, T-ATAC-seq is a new tool that enables analysis of epigenomic landscapes in clonal T cells and should be valuable for studies of T cell malignancy, immunity and immunotherapy.

  • Satpathy AT*, Saligrama N*, Buenrostro JD*, Wei Y, Wu B, Rubin AJ, Granja JM, Li R, Mumbach MR, Lareau CA, Serratelli WS, Gennert DG, Schep AN, Corces MR, Kim YH, Khavari PA, Greenleaf WJ, Davis MM, Chang HY. Transcript-indexed ATAC-seq for precision immune profiling. Nature Medicine (2018).

  • We introduce an approach to identify disease-relevant tissues and cell types by analyzing gene expression data together with genome-wide association study (GWAS) summary statistics. Our approach uses stratified linkage disequilibrium (LD) score regression to test whether disease heritability is enriched in regions surrounding genes with the highest specific expression in a given tissue. We applied our approach to gene expression data from several sources together with GWAS summary statistics for 48 diseases and traits (average N = 169,331) and found significant tissue-specific enrichments (false discovery rate (FDR) < 5%) for 34 traits. In our analysis of multiple tissues, we detected a broad range of enrichments that recapitulated known biology. In our brain-specific analysis, significant enrichments included an enrichment of inhibitory over excitatory neurons for bipolar disorder, and excitatory over inhibitory neurons for schizophrenia and body mass index. Our results demonstrate that our polygenic approach is a powerful way to leverage gene expression data for interpreting GWAS signals.

  • Finucane et al. Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types. Nature Genetics (2018).

  • Single-cell ATAC-seq (scATAC) yields sparse data that make conventional analysis challenging. We developed chromVAR (http://www.github.com/GreenleafLab/chromVAR), an R package for analyzing sparse chromatin-accessibility data by estimating gain or loss of accessibility within peaks sharing the same motif or annotation while controlling for technical biases. chromVAR enables accurate clustering of scATAC-seq profiles and characterization of known and de novo sequence motifs associated with variation in chromatin accessibility.

  • Schep AN, Wu B, Buenrostro JD☨, Greenleaf WJ☨. chromVAR: inferring transcription-factor-associated accessibility from single-cell epigenomic data. Nature Methods (2017).
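
The abstract above describes estimating gain or loss of accessibility within peaks that share a motif. The sketch below illustrates that general idea in simplified form: for each cell, it compares the reads observed in a motif's peaks with the number expected from the cell's sequencing depth and the population-wide share of reads in those peaks. It is not the chromVAR implementation (chromVAR is an R package) and it deliberately omits the control for technical biases that the abstract mentions. The function name and toy data are hypothetical.

```python
def raw_motif_deviations(counts, motif_peaks):
    """Simplified per-cell, per-motif accessibility deviation.

    counts: list of rows, one per cell, each a list of read counts per peak
    motif_peaks: dict of motif name -> list of peak indices containing that motif
    Returns one dict per cell mapping motif -> (observed - expected) / expected.
    NOTE: no background-peak or GC-bias correction is applied (assumption of
    this sketch; a real analysis needs such a correction).
    """
    n_peaks = len(counts[0])
    total_reads = sum(sum(row) for row in counts)
    # Share of all reads (across every cell) that falls in each peak.
    peak_fraction = [
        sum(row[j] for row in counts) / total_reads for j in range(n_peaks)
    ]

    deviations = []
    for row in counts:
        depth = sum(row)  # total reads in this cell
        cell_scores = {}
        for motif, peaks in motif_peaks.items():
            observed = sum(row[j] for j in peaks)
            expected = depth * sum(peak_fraction[j] for j in peaks)
            if expected > 0:
                cell_scores[motif] = (observed - expected) / expected
            else:
                cell_scores[motif] = float("nan")  # motif peaks have no reads
        deviations.append(cell_scores)
    return deviations


# Toy data: two cells, four peaks, two motifs with two peaks each.
toy_counts = [
    [3, 1, 0, 0],  # cell 0: reads concentrated in motif_A peaks
    [0, 0, 1, 3],  # cell 1: reads concentrated in motif_B peaks
]
toy_motifs = {"motif_A": [0, 1], "motif_B": [2, 3]}
toy_deviations = raw_motif_deviations(toy_counts, toy_motifs)
```

In this toy case each cell has twice the expected reads in one motif's peaks and none in the other's, so the scores are +1 and -1. With sparse single-cell data, aggregating over all peaks that share a motif is what makes the per-cell signal usable.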

  • We define the chromatin accessibility and transcriptional landscapes in 13 human primary blood cell types that span the hematopoietic hierarchy. Exploiting the finding that the enhancer landscape better reflects cell identity than mRNA levels, we enable 'enhancer cytometry' for enumeration of pure cell types from complex populations. We identify regulators governing hematopoietic differentiation and further show the lineage ontogeny of genetic elements linked to diverse human diseases. In acute myeloid leukemia (AML), chromatin accessibility uncovers unique regulatory evolution in cancer cells with a progressively increasing mutation burden. Single AML cells exhibit distinctive mixed regulome profiles corresponding to disparate developmental stages. A method to account for this regulatory heterogeneity identified cancer-specific deviations and implicated HOX factors as key regulators of preleukemic hematopoietic stem cell characteristics. Thus, regulome dynamics can provide diverse insights into hematopoietic development and disease.

  • Corces MR* & Buenrostro JD*☨, Wu B, Greenside PG, Chan SM, Koenig JL, Snyder MP, Pritchard JK, Kundaje A, Greenleaf WJ, Majeti R☨, Chang HY☨. Lineage-specific and single cell chromatin accessibility charts human hematopoiesis and leukemia evolution. Nature Genetics (2016).

  • Cell-to-cell variation is a universal feature of life that affects a wide range of biological phenomena, from developmental plasticity to tumour heterogeneity. Although recent advances have improved our ability to document cellular phenotypic variation, the fundamental mechanisms that generate variability from identical DNA sequences remain elusive. Here we reveal the landscape and principles of mammalian DNA regulatory variation by developing a robust method for mapping the accessible genome of individual cells by assay for transposase-accessible chromatin using sequencing (ATAC-seq) integrated into a programmable microfluidics platform. Single-cell ATAC-seq (scATAC-seq) maps from hundreds of single cells in aggregate closely resemble accessibility profiles from tens of millions of cells and provide insights into cell-to-cell variation. Accessibility variance is systematically associated with specific trans-factors and cis-elements, and we discover combinations of trans-factors associated with either induction or suppression of cell-to-cell variability. We further identify sets of trans-factors associated with cell-type-specific accessibility variance across eight cell types. Targeted perturbations of cell cycle or transcription factor signalling evoke stimulus-specific changes in this observed variability. The pattern of accessibility variation in cis across the genome recapitulates chromosome compartments de novo, linking single-cell accessibility variation to three-dimensional genome organization. Single-cell analysis of DNA accessibility provides new insight into cellular variation of the ‘regulome’.

  • Buenrostro JD, Wu B, Litzenburger U, Gonzales M, Ruff D, Snyder M, Chang HY, Greenleaf WJ. Single-cell chromatin accessibility reveals principles of regulatory variation. Nature (2015).

  • RNA-protein interactions drive fundamental biological processes and are targets for molecular engineering, yet quantitative and comprehensive understanding of the sequence determinants of affinity remains limited. Here we repurpose a high-throughput sequencing instrument to quantitatively measure binding and dissociation of a fluorescently labeled protein to >10⁷ RNA targets generated on a flow cell surface by in situ transcription and intermolecular tethering of RNA to DNA. Studying the MS2 coat protein, we decompose the binding energy contributions from primary and secondary RNA structure, and observe that differences in affinity are often driven by sequence-specific changes in both association and dissociation rates. By analyzing the biophysical constraints and modeling mutational paths describing the molecular evolution of MS2 from low- to high-affinity hairpins, we quantify widespread molecular epistasis and a long-hypothesized, structure-dependent preference for G:U base pairs over C:A intermediates in evolutionary trajectories. Our results suggest that quantitative analysis of RNA on a massively parallel array (RNA-MaP) provides generalizable insight into the biophysical basis and evolutionary consequences of sequence-function relationships.

  • Buenrostro JD* & Araya CL*, Chircus LM, et al. Quantitative analysis of RNA-protein interactions on a massively parallel array reveals biophysical and evolutionary landscapes. Nature Biotechnology (2014).

  • We describe an assay for transposase-accessible chromatin using sequencing (ATAC-seq), based on direct in vitro transposition of sequencing adaptors into native chromatin, as a rapid and sensitive method for integrative epigenomic analysis. ATAC-seq captures open chromatin sites using a simple two-step protocol with 500–50,000 cells and reveals the interplay between genomic locations of open chromatin, DNA-binding proteins, individual nucleosomes and chromatin compaction at nucleotide resolution. We discovered classes of DNA-binding factors that strictly avoided, could tolerate or tended to overlap with nucleosomes. Using ATAC-seq maps of human CD4+ T cells from a proband obtained on consecutive days, we demonstrated the feasibility of analyzing an individual's epigenome on a timescale compatible with clinical decision-making.

  • Buenrostro JD, Giresi PG, Zaba LC, Chang HY, Greenleaf WJ. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nature Methods (2013).

  • We describe an approach for targeted genome resequencing, called oligonucleotide-selective sequencing (OS-Seq), in which we modify the immobilized lawn of oligonucleotide primers of a next-generation DNA sequencer to function as both a capture and sequencing substrate. We apply OS-Seq to resequence the exons of either 10 or 344 cancer genes from human DNA samples. In our assessment of capture performance, >87% of the captured sequence originated from the intended target region with sequencing coverage falling within a tenfold range for a majority of all targets. Single nucleotide variants (SNVs) called from OS-Seq data agreed with >95% of variants obtained from whole-genome sequencing of the same individual. We also demonstrate mutation discovery from a colorectal cancer tumor sample matched with normal tissue. Overall, we show the robust performance and utility of OS-Seq for the resequencing analysis of human germline and cancer genomes.

  • Myllykangas S* & Buenrostro JD*, Natsoulis G, Bell JM, Ji HP. Efficient targeted resequencing of human germline and cancer genomes by oligonucleotide-selective sequencing. Nature Biotechnology (2011).