Diversified, miniaturized and ancestral parts for mammalian genome engineering and molecular recording
Analysis and visualization code for "Diversified, miniaturized and ancestral parts for mammalian genome engineering and molecular recording"
Troy A. McDiarmid1,2,†, Megan L. Taylor1,2,†, Wei Chen1,2, Florence M. Chardon1,2, Junhong Choi1,2,3, Hanna Liao1,2, Xiaoyi Li1,2, Haedong Kim1,2, Jean-Benoît Lalanne1, Tony Li1, Jenny F. Nathans1,2, Beth K. Martin1,2, Jordan Knuth2, Alessandro L.V. Coradini2, Jesse M. Gray2, Sudarshan Pinglay1,2,4, and Jay Shendure1,2,4,5,6
Affiliations:
1. Department of Genome Sciences, University of Washington, Seattle, WA, USA
2. Seattle Hub for Synthetic Biology, Seattle, WA, USA
3. Developmental Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
4. Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
5. Howard Hughes Medical Institute, Seattle, WA, USA
6. Allen Discovery Center for Cell Lineage Tracing, Seattle, WA, USA
† These authors contributed equally to this work
Correspondence to T.A.M. ([email protected]) or J.S. ([email protected])
Abstract
As the synthetic biology and genome engineering fields mature and converge, there is a clear need for a “parts list” of components that are diversified with respect to both functional activity (to facilitate design) and primary sequence (to facilitate assembly). Here we designed libraries composed of extant, ancestral, mutagenized or miniaturized variants of Pol III promoters or guide RNA (gRNA) scaffolds and quantified their ability to mediate precise edits to the mammalian genome via multiplex prime editing. We identified thousands of parts that reproducibly drive a range of editing activities in human and mouse stem cells and cancer cell lines, including hundreds exhibiting similar or greater activity than the sequences used in conventional genome engineering constructs. We further conducted saturation mutagenesis screens of canonical Pol III promoters (U6p, 7SKp, H1p) and the prime editing guide RNA (pegRNA) scaffold, which identified tolerated variants that can be superimposed on baseline parts to further enhance sequence diversity. While characterizing thousands of orthologous promoters from hundreds of extant or ancestral genomes, we incidentally mapped the functional landscape of mammalian Pol III promoter evolution. Finally, to showcase the usefulness of these parts, we designed a “ten key” molecular recording array that lacks repetitive subsequences in order to facilitate its one-step assembly in yeast. Upon delivering this 15.8 kb tandem array of promoters and guides to mammalian cells, individual pegRNAs exhibited balanced activities as predicted by the activity of component parts, despite their relocation to a single locus.
Looking forward, we anticipate that the diversified parts and variant effect maps reported here can be leveraged for the design, assembly and deployment of synthetic loci encoding arrays of gRNAs exhibiting predictable, differentiated levels of activity, which will be useful for multiplex perturbation, advanced biological recorders and complex genetic circuits.
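One way to sanity-check the repeat-free design constraint described in the abstract is to scan candidate parts for shared k-mers before assembly. The sketch below is illustrative only and is not taken from the analysis code in this repository: the function name `shared_kmers`, the part names, the toy sequences and the window size `k` are all assumptions, and it inspects the forward strand only (reverse-complement repeats are not considered).

```python
from collections import defaultdict


def shared_kmers(parts, k=20):
    """Return every k-mer that occurs more than once, within or across parts.

    `parts` maps a (hypothetical) part name to its DNA sequence.
    The result maps each repeated k-mer to a list of (part_name, offset) hits.
    """
    seen = defaultdict(list)
    for name, seq in parts.items():
        seq = seq.upper()
        # Slide a window of width k along the sequence and record each position.
        for i in range(len(seq) - k + 1):
            seen[seq[i:i + k]].append((name, i))
    # Keep only k-mers observed at two or more positions.
    return {kmer: locs for kmer, locs in seen.items() if len(locs) > 1}


# Toy sequences, not real promoters: the two parts share a short stretch.
toy_parts = {
    "promoter_A": "ACGTACGTTTGACCA",
    "promoter_B": "GGGTTTGACCATTT",
}
hits = shared_kmers(toy_parts, k=8)
for kmer, locs in sorted(hits.items()):
    print(kmer, locs)
```

An empty result for a chosen `k` suggests that no exact repeat of that length exists among the candidate parts; a smaller `k` gives a stricter check at the cost of more chance matches.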