“Reading” the Epigenome
A central theme of our lab’s work is the pursuit of mechanistic insights into DNA-modifying enzymes and their application in the development of next-generation biotechnological tools. Early in the lab’s time at Penn, we explored the biology of DNA demethylation and tackled the question of whether AID/APOBEC family DNA deaminases might be involved in DNA demethylation by creating genomic mismatches. We helped rule out deamination-mediated DNA demethylation pathways by studying enzymatic selectivity with modified cytosine bases (Nat Chem Biol, 2012). As the field uncovered bona fide pathways for DNA demethylation involving TET family enzymes, we have also helped to dissect their enzymatic mechanisms, building new biochemical tools that reveal how TET enzymes can regulate gene expression.
In the course of our work on DNA deaminases, we realized that these enzymes had a powerful ability to discriminate between cytosine modification states, a feature that could potentially be leveraged to usher in a new approach to epigenetic sequencing. Building on these mechanistic insights, our lab has pioneered DNA deaminase-based sequencing approaches to “read” the epigenome. This work was motivated by the limitations of prior methods and has since been adopted in epigenetic sequencing approaches developed by others across the field. Modifications to cytosine bases in the form of 5-methylcytosine (5mC) or 5-hydroxymethylcytosine (5hmC) play critical gene regulatory roles that shape cell identity. For decades, researchers relied upon bisulfite sequencing (BS-Seq) to discriminate modified from unmodified cytosine bases. While BS-Seq has been foundational, the approach is destructive to input DNA and cannot distinguish 5mC from 5hmC. Recognizing this challenge, we developed APOBEC-Coupled Epigenetic Sequencing (ACE-Seq), the first method to use a DNA deaminase in base-resolution sequencing. In ACE-Seq, selective conversion by the DNA deaminase APOBEC3A (A3A) allowed us to map 5hmC in excitatory neurons using 1000-fold less DNA than required for an analogous bisulfite-based approach (Nat Biotechnol, 2018). As the ‘first-in-class’ method, ACE-Seq laid the groundwork for others in the field to derive all-enzymatic approaches for joint 5mC/5hmC mapping (EM-Seq) and for simultaneous genetic and epigenetic sequencing (5-letter-Seq). Enzymatic epigenetic sequencing methods are now broadly applied, particularly for the study of sparse DNA samples, such as those from early development or circulating cell-free DNA, the latter being explored for early cancer diagnostics.
With the advent of these enzymatic methods, we recognized that major gaps remained, including the absence of base-resolution enzymatic approaches to specifically and accurately detect only 5mC. In more recent work, we imagined that the combined use of DNA-modifying enzymes, including engineered enzymes, could address these challenges. We speculated that DNA methyltransferases (MTases), enzymes that ‘write’ methylation marks, could be incorporated into sequencing pipelines to also ‘read’ these marks. Through a combination of biochemical and genetic approaches, we made the unexpected discovery of a mutant MTase with the ability to use a sparse natural metabolite, carboxy-SAM, in place of SAM as a cofactor (Cell Chem Biol, 2021). Returning to the central theme of our research, in which mechanistic insights lead to new biotechnological tools, we then applied this neomorphic DNA carboxymethyltransferase (CxMTase) in conjunction with A3A to construct Direct Methylation Sequencing (DM-Seq), the first all-enzymatic approach for ‘reading’ only 5mC. DM-Seq is precise and non-destructive, allowing for identification of clinically important 5mC sites otherwise masked by BS-Seq (Nat Chem Biol, 2023). More broadly, through productive collaborations, we have applied enzymatic methods to dissect single-cell 5hmC landscapes (Nat Biotechnol, 2024) and to study the bona fide DNA demethylation pathways in diverse processes, including stem cell reprogramming (Mol Cell, 2021) and imprinting (Dev Cell, 2024).
With ongoing support from NHGRI, NIGMS and other sources, the vision of our research program is now to employ all-enzymatic approaches to simultaneously profile genetic and epigenetic information, to access long-read and nearly complete single-cell epigenomes, to continue to deconstruct and engineer DNA-modifying enzymes, including TET family enzymes, and together to address major open questions regarding gene regulation and cellular identity.