+ : equally contributing authors

* : corresponding authors

42. Fontana, J.+, Sparkman-Yager, D.+, Faulkner, I.+, Cardiff, R., Kiattisewee, C., Walls, A., Primo, T., Kinnunen, P.C., Garcia Martin, H., Zalatan, J.G.*, Carothers, J.M.*  Guide RNA structure design enables combinatorial CRISPRa programs for biosynthetic profiling. Submitted. DOI:

41. Cardiff, R., Faulkner, I., Beall, J., Carothers, J.M.*, Zalatan, J.G.*. CRISPR-Cas tools for simultaneous transcription & translation control in bacteria. Submitted. DOI:

40. Alba Burbano, D.+, Kiattisewee, C.+, Karanjia, A.V., Cardiff, R.A.L., Faulkner, I.D., Sugianto, W., Carothers, J.M.* CRISPR Tools for Engineering Prokaryotic Systems: Recent Advances and New Applications. Submitted. 

39. Chowdhury, S., Cardiff, R., Westenberg, R., Beliaev, A.S., Noireaux, V., Carothers, J.M.*, Peralta-Yahya, P.* De novo cell free expression-based biocatalytic synthesis of serine from formate. Submitted.

38. Burbano, D.A.+, Cardiff, R.+, Tickman, B.I., Kiattisewee, C., Maranas, C., Zalatan, J.G.*, Carothers, J.M.* Engineering activatable promoters for scalable and multi-input CRISPRa/i circuits. Proc. Natl. Acad. Sci. USA. 2023. 120 (30) e2220358120. DOI: LINK

SIGNIFICANCE: Gene regulatory networks (GRNs) expressed in cell-free systems hold great promise for investigating the limits of biological information processing and developing platforms for molecular biosensing and chemical bioproduction. We address the challenge of engineering GRNs that can dynamically activate many targets. The work described here enables classes of deep, wide, and multi-input CRISPR-based genetic circuits. This study represents an important step toward engineered GRNs with complexities approaching those found in nature.

ABSTRACT: Dynamic, multi-input gene regulatory networks (GRNs) are ubiquitous in nature. Multilayer CRISPR-based genetic circuits hold great promise for building GRNs akin to those found in naturally occurring biological systems. We develop an approach for creating high-performing activatable promoters that can be assembled into deep, wide, and multi-input CRISPR-activation and -interference (CRISPRa/i) GRNs. By integrating sequence-based design and in vivo screening, we engineer activatable promoters that achieve up to 1,000-fold dynamic range in an Escherichia coli-based cell-free system. These components enable CRISPRa GRNs that are six layers deep and four branches wide. We show the generalizability of the promoter engineering workflow by improving the dynamic range of the light-dependent EL222 optogenetic system from 6-fold to 34-fold. Additionally, high dynamic range promoters enable CRISPRa systems mediated by small molecules and protein–protein interactions. We apply these tools to build input-responsive CRISPRa/i GRNs, including feedback loops, logic gates, multilayer cascades, and dynamic pulse modulators. Our work provides a generalizable approach for the design of high dynamic range activatable promoters and enables classes of gene regulatory functions in cell-free systems.

37. Sugianto, W., Altin-Yavuzarslan, G., Tickman, B.I., Kiattisewee, C., Yuan, S-F., Brooks, S.M., Wong, J., Alper, H.S., Nelson, A.*, Carothers, J.M.* Gene expression dynamics in input-responsive engineered living materials programmed for bioproduction. Materials Today Bio. 2023. 100677. DOI: LINK

ABSTRACT: Engineered living materials (ELMs) fabricated by encapsulating microbes in hydrogels have great potential as bioreactors for sustained bioproduction. While long-term metabolic activity has been demonstrated in these systems, the capacity and dynamics of gene expression over time is not well understood. Thus, we investigate the long-term gene expression dynamics in microbial ELMs constructed using different microbes and hydrogel matrices. Through direct gene expression measurements of engineered E. coli in F127-bisurethane methacrylate (F127-BUM) hydrogels, we show that inducible, input-responsive genetic programs in ELMs can be activated multiple times and maintained for multiple weeks. Interestingly, the encapsulated bacteria sustain inducible gene expression almost 10 times longer than free-floating, planktonic cells. These ELMs exhibit dynamic responsiveness to repeated induction cycles, with up to 97% of the initial gene expression capacity retained following a subsequent induction event. We demonstrate multi-week bioproduction cycling by implementing inducible CRISPR transcriptional activation (CRISPRa) programs that regulate the expression of enzymes in a pteridine biosynthesis pathway. ELMs fabricated from engineered S. cerevisiae in bovine serum albumin (BSA) - polyethylene glycol diacrylate (PEGDA) hydrogels were programmed to express two different proteins, each under the control of a different chemical inducer. We observed scheduled bioproduction switching between betaxanthin pigment molecules and proteinase A in S. cerevisiae ELMs over the course of 27 days under continuous cultivation. Overall, these results suggest that the capacity for long-term genetic expression may be a general property of microbial ELMs. This work establishes approaches for implementing dynamic, input-responsive genetic programs to tailor ELM functions for a wide range of advanced applications.

36. Shin, J., Porubsky, V., Carothers, J., Sauro, H.* Standards, Dissemination, and Best Practices in Systems Biology. Curr. Opin. Biotechnol. 2023. 81, 102922. DOI: LINK

ABSTRACT: The reproducibility of scientific research is crucial to the success of the scientific method. Here, we review the current best practices when publishing mechanistic models in systems biology. We recommend, where possible, to use software engineering strategies such as testing, verification, validation, documentation, versioning, iterative development, and continuous integration. In addition, adhering to the Findable, Accessible, Interoperable, and Reusable modeling principles allows other scientists to collaborate and build off of each other’s work. Existing standards such as Systems Biology Markup Language, CellML, or Simulation Experiment Description Markup Language can greatly improve the likelihood that a published model is reproducible, especially if such models are deposited in well-established model repositories. Where models are published in executable programming languages, the source code and their data should be published as open-source in public code repositories together with any documentation and testing code. For complex models, we recommend container-based solutions where any software dependencies and the run-time context can be easily replicated.

35. Garcia Martin, H.* et al. Perspectives for self-driving labs in synthetic biology. Curr. Opin. Biotechnol. 2023. 79, 102881. DOI: LINK

ABSTRACT: Self-driving labs (SDLs) combine fully automated experiments with artificial intelligence (AI) that decides the next set of experiments. Taken to their ultimate expression, SDLs could usher a new paradigm of scientific research, where the world is probed, interpreted, and explained by machines for human benefit. While there are functioning SDLs in the fields of chemistry and materials science, we contend that synthetic biology provides a unique opportunity since the genome provides a single target for affecting the incredibly wide repertoire of biological cell behavior. However, the level of investment required for the creation of biological SDLs is only warranted if directed toward solving difficult and enabling biological questions. Here, we discuss challenges and opportunities in creating SDLs for synthetic biology.

34. Kiattisewee, C.+, Karanjia, A.V.+, Legut, M., Daniloski, Z., Koplik, S.E., Nelson, J., Kleinstiver, B.P., Sanjana, N.E., Carothers, J.M.*, Zalatan, J.G.* Expanding the scope of bacterial CRISPR activation with PAM-flexible dCas9 variants. ACS Synth. Biol. 2022. 11, 12, 4103–4112. DOI: LINK

ABSTRACT: CRISPR-Cas transcriptional tools have been widely applied for programmable regulation of complex biological networks. In comparison to eukaryotic systems, bacterial CRISPR activation (CRISPRa) has stringent target site requirements for effective gene activation. While genes may not always have an NGG PAM at the appropriate position, PAM-flexible dCas9 variants can expand the range of targetable sites. Here we systematically evaluate a panel of PAM-flexible dCas9 variants for their ability to activate bacterial genes. We observe that dxCas9-NG provides a high dynamic range of gene activation for sites with NGN PAMs while dSpRY permits modest activity across almost any PAM. Similar trends were observed for heterologous and endogenous promoters. For all variants tested, improved PAM-flexibility comes with the tradeoff that CRISPRi-mediated gene repression becomes less effective. Weaker CRISPRi gene repression can be partially rescued by expressing multiple sgRNAs to target many sites in the gene of interest. Our work provides a framework to choose the most effective dCas9 variant for a given set of gene targets, which will further expand the utility of CRISPRa/i gene regulation in bacterial systems.

33. Tickman, B.I.+, Alba Burbano, D.+, Chavali, V.P., Kiattisewee, C., Fontana, J., Khakimzhan, A., Noireaux, V., Zalatan, J.G.*, and Carothers, J.M.* Multi-layer CRISPRa/i circuits for dynamic genetic programs in cell-free and bacterial systems. Cell Systems. 2022. DOI: LINK

ABSTRACT: CRISPR-Cas transcriptional circuits hold great promise as platforms for engineering metabolic networks and information processing circuits. Historically, prokaryotic CRISPR control systems have been limited to CRISPRi. Creating approaches to integrate CRISPRa for transcriptional activation with existing CRISPRi-based systems would greatly expand CRISPR circuit design space. Here, we develop design principles for engineering prokaryotic CRISPRa/i genetic circuits with network topologies specified by guide RNAs. We demonstrate that multi-layer CRISPRa/i cascades and feedforward loops can operate through the regulated expression of guide RNAs in cell-free expression systems and E. coli. We show that CRISPRa/i circuits can program complex functions by designing type 1 incoherent feedforward loops acting as fold-change detectors and tunable pulse-generators. By investigating how component characteristics relate to network properties such as depth, width, and speed, this work establishes a framework for building scalable CRISPRa/i circuits as regulatory programs in cell-free expression systems and bacterial hosts.

32. Khakimzhan, A., Garenne, D., Tickman, B., Fontana, J., Carothers, J., Noireaux, V.* Complex Dependence of CRISPR-Cas9 binding strength on guide RNA spacer lengths. Phys. Biol. 2021. 18: 056003. DOI: LINK  

ABSTRACT: It is established that CRISPR-Cas9 guide RNAs with 17-20 bp long spacer sequences are optimal for accurate target binding and cleavage. In this work, we perform cell-free CRISPRa (CRISPR activation) and CRISPRi (CRISPR inhibition) experiments to demonstrate the existence of a complex dependence of CRISPR-Cas9 binding as a function of the spacer length and complementarity. Our results show that significantly truncated or mismatched spacer sequences can form stronger guide-target bonds than the conventional 18-20 bp long spacers. To explain this phenomenon, we take into consideration previous structural and single-molecule CRISPR-Cas9 experiments and develop a novel thermodynamic model of CRISPR-Cas9 target recognition.

31. Kiattisewee, C., Dong, C., Fontana, J., Sugianto, W., Peralta-Yahya, P., Carothers, J.M.*, Zalatan, J.G.* Portable bacterial CRISPR transcriptional activation enables metabolic engineering in Pseudomonas putida. Metab. Eng. 2021. 66: 283-295. DOI: LINK

ABSTRACT: CRISPR-Cas transcriptional programming in bacteria is an emerging tool to regulate gene expression for metabolic pathway engineering. Here we implement CRISPR-Cas transcriptional activation (CRISPRa) in P. putida using a system previously developed in E. coli. We provide a methodology to transfer CRISPRa to a new host by first optimizing expression levels for the CRISPRa system components, and then applying rules for effective CRISPRa based on a systematic characterization of promoter features. Using this optimized system, we regulate biosynthesis in the biopterin and mevalonate pathways. We demonstrate that multiple genes can be activated simultaneously by targeting multiple promoters or by targeting a single promoter in a multi-gene operon. This work will enable new metabolic engineering strategies in P. putida and pave the way for CRISPR-Cas transcriptional programming in other bacterial species.

30.  Kruyer, N.S., Sugianto, W., Tickman, B.I., Alba Burbano, D., Noireaux, V., Carothers, J.M., Peralta-Yahya, P.* Membrane Augmented Cell-Free Systems: A New Frontier in Biotechnology. ACS Synth. Biol. 2021. 10: 670-681. LINK

ABSTRACT:  Membrane proteins are present in a wide array of cellular processes from primary and secondary metabolite synthesis to electron transport and single carbon metabolism. A key barrier to applying membrane proteins industrially is their difficult functional production. Beyond expression, folding, and membrane insertion, membrane protein activity is influenced by the physicochemical properties of the associated membrane, making it difficult to achieve optimal membrane protein performance outside the endogenous host. In this review, we highlight recent work on production of membrane proteins in membrane augmented cell-free systems (CFSs) and applications thereof. CFSs lack membranes and can thus be augmented with user-specified, tunable, mimetic membranes to generate customized environments for production of functional membrane proteins of interest. Membrane augmented CFSs would enable the synthesis of more complex plant secondary metabolites, the growth and division of synthetic cells for drug delivery and cell therapeutic applications, as well as enable green energy applications including methane capture and artificial photosynthesis.

29. Fontana, J.+, Sparkman-Yager, D.+, Zalatan, J.G.*, Carothers, J.M.* Challenges and opportunities in bacterial CRISPRa for data-driven metabolic engineering. Curr. Opin. Biotechnol. 2020. 64: 190-198. DOI: LINK

ABSTRACT: Creating CRISPR gene activation (CRISPRa) technologies in industrially-promising bacteria could be transformative for accelerating data-driven metabolic engineering and strain design. CRISPRa has been widely used in eukaryotes, but applications in bacterial systems have remained limited. Recent work, including from our own laboratories, shows that multiple features of bacterial promoters impose stringent requirements on CRISPRa-mediated gene activation. However, by systematically defining rules for effective bacterial CRISPRa sites and developing new approaches for encoding complex functions in engineered guide RNAs, there are now clear routes to generalize synthetic gene regulation in bacteria. When combined with multi-omics data collection and machine learning, the full development of bacterial CRISPRa will dramatically improve the ability to rapidly engineer bacteria for bioproduction through accelerated design-build-test-learn cycles.

28.  Fontana, J.+, Dong, C.+, Kiattisewee, C., Chavali, V.P., Tickman, B.I., Carothers, J.M.*, Zalatan, J.G.* Effective CRISPRa-mediated control of gene expression in bacteria must overcome strict target site requirements. Nature Commun. 2020. 11: 1618.  DOI: LINK

ABSTRACT: In bacterial systems, CRISPR-Cas transcriptional activation (CRISPRa) has the potential to dramatically expand our ability to regulate gene expression, but we currently lack a complete understanding of the rules for designing effective guide RNA target sites. We have identified multiple features of bacterial promoters that impose stringent requirements on bacterial CRISPRa target sites. Most importantly, we found that shifting a gRNA target site by 2-4 bases along the DNA target can cause a nearly complete loss in activity. The loss in activity can be rescued by shifting the target site 10-11 bases, corresponding to one full helical turn. Practically, our results suggest that it will be challenging to find a gRNA target site with an appropriate PAM sequence at precisely the right position at arbitrary genes of interest. To overcome this limitation, we demonstrate that a dCas9 variant with expanded PAM specificity allows activation of promoters that cannot be activated by S. pyogenes dCas9. These results provide a roadmap for future engineering efforts to further expand and generalize the scope of bacterial CRISPRa.

27. Barajas, et al. Isolation and characterization of bacterial cellulase producers for biomass deconstruction: A microbiology laboratory course. J. Microbiol. & Biol. Edu. 2019. 20: 50.  DOI: LINK

ABSTRACT: The conversion of biomass to biofuels presents a solution to one of the largest global challenges of our era, climate change. A critical part of this pipeline is the process of breaking down cellulosic sugars from plant matter to be used by microbes containing biosynthetic pathways that produce biofuels or bioproducts. In this inquiry-based course, students complete a research project that isolates cellulase-producing bacteria from samples collected from the environment. After obtaining isolates, the students characterize the production of cellulases. Students then amplify and sequence the 16S rRNA genes of confirmed cellulase producers and use bioinformatic methods to identify the bacterial isolates. Throughout the course, students learn about the process of generating biofuels and bioproducts through the deconstruction of cellulosic biomass to form monosaccharides from the biopolymers in plant matter. The program relies heavily on active learning and enables students to connect microbiology with issues of sustainability. In addition, it provides exposure to basic microbiology, molecular biology, and biotechnology laboratory techniques and concepts. The described activity was initially developed for the Introductory College Level Experience in Microbiology (iCLEM) program, a research-based immersive laboratory course at the US Department of Energy Joint BioEnergy Institute. Originally designed as an accelerated program for high-potential, low-income, high school students (11th–12th grade), this curriculum could also be implemented for undergraduate coursework in a research-intensive laboratory course at a two- or four-year college or university.

26. Aurand, E., Keasling, J., Friedman, D., Salis, H., Liu, C., Peralta-Yahya, P., Carothers, J., Arkin, A., Collins, C., Galm, U., Cizauskas, C., Haynes, K., Lu, A., Savage, D., Annaluru, V., Bovenberg, R., Carlson, P., Contreras, L., Freemont, P., Hamazato, F., Jewett, M., Khalil, A., Plassmeier, J., Roubos, H., Sampson, J., Wook Chang, M. "Engineering Biology: a research roadmap for the next-generation bioeconomy." Engineering Biology Research Consortium, Emeryville, CA (2019) DOI: LINK

ABSTRACT: Engineering Biology: A Research Roadmap for the Next-Generation Bioeconomy is a critical assessment of the current status and potential of engineering biology. It is intended to provide researchers and other stakeholders (including government funders) with a compelling set of technical challenges and opportunities in the near and long term. The matrixed framework of the roadmap considers challenges, bottlenecks, and other limitations observed or predicted in the research, development, and application of advancements in engineering biology tools and technologies toward addressing broad societal challenges. The roadmap’s four technical themes form the foundation of engineering biology research and technology and illustrate where our current abilities lie and what we might achieve in the next 20 years. Complementarily, the five roadmap application and impact sectors demonstrate the breadth and impact of technical advancements in real-world application areas and exemplify how engineering biology tools and products could be oriented towards some of the most complex problems we face as a society. The technical themes represent a “bottom-up” approach focusing on tool and technology innovations to move the field forward, while the five application and impact sectors are a “top-down” look at how engineering biology could contribute toward addressing and overcoming national and global challenges.

25.  Burke, C.R., Sparkman-Yager, D., Carothers, J.M.*  Multi-state design of kinetically-controlled RNA aptamer ribosensors. BioRxiv DOI: LINK 

ABSTRACT: Metabolite-responsive RNA regulators with kinetically-controlled responses are widespread in nature. By comparison, very limited success has been achieved creating kinetic control mechanisms for synthetic RNA aptamer devices. Here, we show that kinetically-controlled RNA aptamer ribosensors can be engineered using a novel approach for multi-state, co-transcriptional folding design. The design approach was developed through investigation of 29 candidate p-aminophenylalanine-responsive ribosensors. We show that ribosensors can be transcribed in situ and used to analyze metabolic production directly from engineered microbial cultures, establishing a new class of cell-free biosensors. We found that kinetically-controlled ribosensors exhibited 5-10 fold greater ligand sensitivity than a thermodynamically-controlled device. And, we further demonstrated that a second aptamer, promiscuous for aromatic amino acid binding, could be assembled into kinetic  ribosensors with 45-fold improvements in ligand selectivity. These results have broad implications for engineering RNA aptamer devices and overcoming thermodynamic constraints on molecular recognition through the design of kinetically-controlled responses.

24.    Dong, C., Fontana, J., Patel, A., Carothers, J.M.*, Zalatan, J.G.* Synthetic CRISPR-Cas gene activators for transcriptional reprogramming in bacteria. Nature Commun. 2018. 9:2489. DOI: LINK

ABSTRACT: Methods to regulate gene expression programs in bacterial cells are limited by the absence of effective gene activators. To address this challenge, we have developed new synthetic bacterial transcriptional activators in E. coli by linking activation domains to programmable CRISPR-Cas DNA binding domains. Effective gene activation requires target sites situated in a narrow region just upstream of the transcription start site, in sharp contrast to the relatively flexible target site requirements for gene activation in eukaryotic cells. Together with existing tools for CRISPRi gene repression, these bacterial activators enable programmable control over multiple genes with simultaneous activation and repression. Further, the entire gene expression program can be switched on by inducing expression of the CRISPR-Cas system. This work will provide a foundation for engineering synthetic bacterial cellular devices with applications including diagnostics, therapeutics, and industrial biosynthesis.

HIGHLIGHTED IN: Nature Communications Editors’ Highlights

23.   Fontana, J., Voje, W.E., Zalatan, J.G.*, Carothers, J.M.* Prospects for engineering dynamic CRISPR-Cas transcriptional circuits to improve bioproduction.  J. Indust. Microbiol. & Biotechnol. 2018.  DOI: 10.1007/s10295-018-2039-z. LINK

ABSTRACT: Dynamic control of gene expression is emerging as an important strategy for controlling flux in metabolic pathways and improving bioproduction of valuable compounds. Integrating dynamic genetic control tools with CRISPR–Cas transcriptional regulation could significantly improve our ability to fine-tune the expression of multiple endogenous and heterologous genes according to the state of the cell. In this mini-review, we combine an analysis of recent literature with examples from our own work to discuss the prospects and challenges of developing dynamically regulated CRISPR–Cas transcriptional control systems for applications in synthetic biology and metabolic engineering.

APPEARS IN:  Special Issue on "Synthetic Biology for the Biotechnology Industry", eds, Leonard Katz, Richard Balz, Huimin Zhao, Ramon Gonzalez, Todd Petersen

22. Fontana, J., Dong, C., Ham, J.Y., Zalatan, J.G.*, Carothers, J.M.* Regulated expression of sgRNAs tunes CRISPRi in E. coli. Biotechnol. J. 2018. e1800069. doi: 10.1002/biot.201800069. LINK

ABSTRACT: Methods for implementing dynamically-controlled multi-gene programs could expand our ability to engineer metabolism for efficiently producing high-value compounds. Working toward this goal, we explored whether CRISPRi repression can be tuned in E. coli through the regulated expression of the CRISPRi machinery. We find when dCas9 is not limiting, variations in sgRNA expression alone can lead to CRISPRi repression levels ranging from 5- to 300-fold. We show that titrating sgRNA expression over a 2.5-fold range can lead to 16-fold changes in reporter gene expression. Many different classes of genetic controllers can generate 2.5-fold differences in transcription, indicating they could be integrated in dynamically-regulated CRISPRi circuits. Finally, we observed that CRISPRi cannot be reversed for up to 12 hours by expressing a competing sgRNA later in the growth phase, indicating that CRISPR-Cas:DNA interactions can be persistent in vivo. Collectively, our results identify genetic architectures for tuning CRISPRi repression through regulated sgRNA expression and suggest that dynamically-regulated CRISPRi systems targeting multiple genes may be within reach.

HIGHLIGHTED IN: Commentary in Special Issue on "CRISPR Technologies for Metabolic and Microbial Engineering"

21. Stevens, J.T., and Carothers, J.M.* Advanced Review: Programming gene expression by engineering transcript stability control and processing in bacteria. Synthetic Biology: Parts, Devices and Applications. 1st ed. Wiley-Blackwell Biotechnology Series, ed. C. Smolke. 2018. 189-215. LINK

ABSTRACT: Through control of messenger RNA stability, bacteria are able to process information, respond to changing conditions, and maintain homeostasis. Many of the naturally occurring mechanisms for Transcript Stability Control (TSC) have been elucidated, and a number of studies have leveraged this understanding to demonstrate that transcript stability can be engineered to control static and dynamic gene expression. Collectively, that body of work represents a foundation for developing new forward-engineering approaches that harness mechanistic understanding to build predictive computational models to guide the development of large-scale genetic devices based on TSC and other means. Further increasing our understanding of RNA degradation pathways and mechanisms will also improve the ability to anticipate how undesired variations in transcript stability may confound device output goals and frustrate engineering efforts. Here, we discuss the current state of the art and identify routes for using TSC to design increasingly large and complex synthetic biological systems.

20.  Gander, M.W.*, Vrana, J.D., Voje, W.E., Carothers, J.M., Klavins, E. Digital logic circuits in yeast with CRISPR-dCas9 NOR gates.  Nature Commun. 2017. e15459. LINK

ABSTRACT: Natural genetic circuits enable cells to make sophisticated digital decisions. Building equally complex synthetic circuits in eukaryotes remains difficult, however, because commonly used genetic components leak transcriptionally, do not allow arbitrary interconnections, or do not have digital responses. Here, we designed a new dCas9-Mxi1 based NOR gate architecture in S. cerevisiae that allows arbitrary connectivity and large genetic circuits. Because we used the strong chromatin remodeler Mxi1, our system showed very little leak and exhibits a highly digital response. In particular, we built a combinatorial library of NOR gates that each directly convert guide RNA (gRNA) input signals into gRNA output signals, enabling NOR gates to be “wired” together. We constructed and characterized logic circuits with up to seven independent gRNAs, including repression cascades with up to seven layers. Modeling predicted that the NOR gates have Hill Coefficients of approximately 1.71±0.09, explaining the minimal signal degradation we observed in these deeply layered circuits. Our approach enables the construction of the largest, eukaryotic gene circuits to date and will form the basis for large, synthetic, decision making systems in living cells.

HIGHLIGHTED IN: UW Press Release

19.  Hwang, C., and Carothers, J.M.*  Label-free selections of aptamers for metabolic engineering. Methods. 2016. LINK PDF

ABSTRACT: RNA aptamers can be assembled into genetic regulatory devices that sense and respond to levels of specific cellular metabolites and thus serve an integral part of designing dynamic control into engineered metabolic pathways. Here, we describe a practical method for generating specific and high affinity aptamers to enable the wider use of in vitro selection and a broader application of aptamers for metabolic engineering. Conventional selection methods involving either radioactive labeling of RNA or the use of label-free methods such as SPR to track aptamer enrichment require resources that are not widely accessible to research groups. We present a label-free selection method that uses small volume spectrophotometers to track RNA enrichment paired with previously characterized affinity chromatography methods. Borrowing techniques used in solid phase peptide synthesis, we present an approach for immobilizing a wide range of metabolites to an amino PEGA matrix. As an illustration, we detail laboratory techniques employed to generate aptamers that bind p-aminophenylalanine, a metabolic precursor for bio-based production of plastics and the pristinamycin family of antibiotics. We focused on the development of methods for ligand immobilization, selection via affinity chromatography, and nucleic acid quantification that can be performed with common laboratory equipment.

18. Beck, D.A.C.*, Carothers, J.M., Subramanian, V., Pfaendtner, J.* Data Science: Accelerating innovation and discovery in chemical engineering. AIChE J. 2016. 62, 1402-1416. (Cover) LINK PDF

ABSTRACT: All of science and engineering, including chemical engineering, is being transformed by new sources of data from high-throughput experiments, observational studies, and simulation. In this new era of data-enabled science and engineering, discovery is no longer limited by the collection and processing of data but by data management, knowledge extraction, and the visualization of information. The term data science has become increasingly popular across industry and academic disciplines to refer to the combination of strategies and tools for addressing the oncoming deluge of data. The term data scientist is a common descriptor of an engineer or scientist from any disciplinary background who is equipped to seamlessly process, analyze, and communicate in this data-intensive context. The core areas of data science are often identified as data management, statistical and machine learning, and visualization. In this Perspective, we present an overview of these core areas, discuss application areas from within chemical engineering research, and conclude with perspectives on how data science principles can be included in our training.

17.  Sparkman-Yager, D., Correa-Rojas, D., Carothers, J.M.* Kinetic folding design of aptazyme-regulated expression devices as riboswitches for metabolic engineering. Methods in Enzymol. 2015. 550, 321-340.  LINK PDF

ABSTRACT: Recent developments in the fields of synthetic biology and metabolic engineering have opened the doors for the microbial production of biofuels and other valuable organic compounds. There remain, however, significant metabolic hurdles to the production of these compounds in cost-effective quantities. This is due, in part, to mismatches between the metabolic engineer's desire for high yields and the microbe's desire to survive. Many valuable compounds, or the intermediates necessary for their biosynthesis, prove deleterious at the desired production concentrations. One potential solution to these toxicity-related issues is the implementation of nonnative dynamic genetic control mechanisms that sense excessively high concentrations of metabolic intermediates and respond accordingly to alleviate their impact. One potential class of dynamic regulator is the riboswitch: cis-acting RNA elements that regulate the expression of downstream genes based on the presence of an effector molecule. Here, we present combined methods for constructing aptazyme-regulated expression devices (aREDs) through computational cotranscriptional kinetic folding design and experimental validation. These approaches can be used to engineer aREDs within novel genetic contexts for the predictable, dynamic regulation of gene expression in vivo.

16.  Stevens, J.T., and Carothers, J.M.* Designing RNA-based genetic control systems for efficient production from engineered metabolic pathways. ACS Synth. Biol. 2015. 4, 107-115. LINK PDF

ABSTRACT: Engineered metabolic pathways can be augmented with dynamic regulatory controllers to increase production titers by minimizing toxicity and helping cells maintain homeostasis. We investigated the potential for dynamic RNA-based genetic control systems to increase production through simulation analysis of an engineered p-aminostyrene (p-AS) pathway in E. coli. To map the entire design space, we formulated 729 unique mechanistic models corresponding to all of the possible control topologies and mechanistic implementations in the system under study. 2,000 sampled simulations were performed for each of the 729 system designs to relate the potential effects of dynamic control to increases in p-AS production (total of 3×10^6 simulations). Our analysis indicates that dynamic control strategies employing aptazyme-regulated expression devices (aREDs) can yield >10-fold improvements over static control. We uncovered generalizable trends in successful control architectures and found that highly performing RNA-based control systems are experimentally tractable. Analyzing the metabolic control state space to predict optimal genetic control strategies promises to enhance the design of metabolic pathways.

15.  Thimmaiah, T., Voje, Jr., W.E., and Carothers, J.M.*  Computational design of RNA parts, devices, and transcripts with kinetic folding algorithms implemented on multiprocessor clusters. Computational Methods in Synthetic Biology, Methods in Mol. Biol., Marchisio (ed.). 2015, 1244. DOI: 10.1007/978-1-4939-1878-2_3. LINK PDF 

ABSTRACT: With progress toward inexpensive, large-scale DNA assembly, the demand for simulation tools that allow the rapid construction of synthetic biological devices with predictable behaviors continues to increase. By combining engineered transcript components, such as ribosome binding sites, transcriptional terminators, ligand-binding aptamers, catalytic ribozymes, and aptamer-controlled ribozymes (aptazymes), gene expression in bacteria can be fine-tuned, with many corollaries and applications in yeast and mammalian cells. The successful design of genetic constructs that implement these kinds of RNA-based control mechanisms requires modeling and analyzing kinetically-determined co-transcriptional folding pathways. Transcript design methods using stochastic kinetic folding simulations to search spacer sequence libraries for motifs enabling the assembly of RNA component parts into static ribozyme- and dynamic aptazyme-regulated expression devices with quantitatively predictable functions (rREDs and aREDs, respectively) have been described (Science 2011, 334, 1716-1719). Here, we provide a detailed practical procedure for computational transcript design by illustrating a high throughput, multiprocessor approach for evaluating spacer sequences and generating functional rREDs. This chapter is written as a tutorial, complete with pseudo-code and step-by-step instructions for setting up a computational cluster with an Amazon, Inc. web server and performing the large numbers of kinefold-based stochastic kinetic co-transcriptional folding simulations needed to design functional rREDs and aREDs. The method described here should be broadly applicable for designing and analyzing a variety of synthetic RNA parts, devices and transcripts.

14. Goler, J.A., Carothers, J.M., and Keasling, J.D.* Dual-selection for evolution of in vivo functional aptazymes as riboswitch parts. Methods Mol. Biol. 2014. 1111, 221-35.  LINK   PDF

ABSTRACT: Synthetic biology and metabolic engineering both are aided by the development of genetic control parts. One class of riboswitch parts that has great potential for sensing and regulation of protein levels is aptamer-coupled ribozymes (aptazymes). These devices are comprised of an aptamer domain selected to bind a particular ligand, a ribozyme domain, and a communication module that regulates the ribozyme activity based on the state of the aptamer. We describe a broadly-applicable method for coupling a novel, newly selected aptamer to a ribozyme to generate functional aptazymes via in vitro and in vivo selection. To illustrate this approach, we describe experimental procedures for selecting aptazymes assembled from aptamers that bind p-amino-phenylalanine and a hammerhead ribozyme. Because this method uses selection, it does not rely on sequence-specific design and thus should be generalizable for the generation of in vivo operational aptazymes that respond to any targeted molecules.

13. Carothers, J.M.* Design-driven, multi-use research agendas to enable applied synthetic biology for global health. Syst. Synth. Biol. 2013.  7, 79-86. LINK   PDF

ABSTRACT: Many of the synthetic biological devices, pathways and systems that can be engineered are multi-use, in the sense that they could be used both for commercially-important applications and to help meet global health needs. The on-going development of models and simulation tools for assembling component parts into functionally-complex devices and systems will enable successful engineering with much less trial-and-error experimentation and laboratory infrastructure. As illustrations, I draw upon recent examples from my own work and the broader Keasling research group at the University of California Berkeley and the Joint BioEnergy Institute, of which I was formerly a part. By combining multi-use synthetic biology research agendas with advanced computer-aided design tool creation, it may be possible to more rapidly engineer safe and effective synthetic biology technologies that help address a wide range of global health problems.

12. Cambray, G., Guimaraes J., Mutalik V., Lam C., May Q.A., Thimmaiah T., Carothers J.M., Arkin A.P., and Endy D.* Quantification and prediction of intrinsic transcription termination efficiency. Nucl. Acids Res. 2013. 41, 5139-5148. LINK 

ABSTRACT: The reliable forward engineering of genetic systems remains limited by the ad hoc reuse of many types of basic genetic elements. Although a few intrinsic prokaryotic transcription terminators are used routinely, termination efficiencies have not been studied systematically. Here, we developed and validated a genetic architecture that enables reliable measurement of termination efficiencies. We then assembled a collection of 61 natural and synthetic terminators that collectively encode termination efficiencies across an ∼800-fold dynamic range within Escherichia coli. We simulated co-transcriptional RNA folding dynamics to identify competing secondary structures that might interfere with terminator folding kinetics or impact termination activity. We found that structures extending beyond the core terminator stem are likely to increase terminator activity. By excluding terminators encoding such context-confounding elements, we were able to develop a linear sequence-function model that can be used to estimate termination efficiencies (r = 0.9, n = 31) better than models trained on all terminators (r = 0.67, n = 54). The resulting systematically measured collection of terminators should improve the engineering of synthetic genetic systems and also advance quantitative modeling of transcription termination.

11. Zhang, F., Carothers, J.M., and Keasling, J.D.* Design of a dynamic sensor-regulator system for production of fatty acid-based chemicals and fuels. Nature Biotechnol. 2012. 30, 354-359. LINK

ABSTRACT: Microbial production of chemicals is now an attractive alternative to chemical synthesis. Current efforts focus mainly on constructing pathways to produce different types of molecules. However, there are few strategies for engineering regulatory components to improve product titers and conversion yields of heterologous pathways. Here we developed a dynamic sensor-regulator system (DSRS) to produce fatty acid–based products in Escherichia coli, and demonstrated its use for biodiesel production. The DSRS uses a transcription factor that senses a key intermediate and dynamically regulates the expression of genes involved in biodiesel production. This DSRS substantially improved the stability of biodiesel-producing strains and increased the titer to 1.5 g/l and the yield threefold to 28% of the theoretical maximum. Given the large number of natural sensors available, this DSRS strategy can be extended to many other biosynthetic pathways to balance metabolism, thereby increasing product titers and conversion yields and stabilizing production hosts.

HIGHLIGHTED IN: LBNL Press Release | Scientific American Online | Genetic Engineering & Biotechnology News (GEN)

10. Carothers, J.M., Goler, J.A., Juminaga, D, and Keasling, J.D.* Model-driven engineering of RNA devices to quantitatively-program gene expression. Science. 2011. 334, 1716-1719. LINK

ABSTRACT:  The models and simulation tools available to design functionally complex synthetic biological devices are very limited. We formulated a design-driven approach that used mechanistic modeling and kinetic RNA folding simulations to engineer RNA-regulated genetic devices that control gene expression. Ribozyme and metabolite-controlled, aptazyme-regulated expression devices with quantitatively predictable functions were assembled from components characterized in vitro, in vivo, and in silico. The models and design strategy were verified by constructing 28 Escherichia coli expression devices that gave excellent quantitative agreement between the predicted and measured gene expression levels (r = 0.94). These technologies were applied to engineer RNA-regulated controls in metabolic pathways. More broadly, we provide a framework for studying RNA functions and illustrate the potential for the use of biochemical and biophysical modeling to develop biological design methods.

HIGHLIGHTED IN: DOE Press Release (w/ quote from Secretary of Energy Steve Chu) | LBNL Press Release | Chemical & Engineering News (C&EN) | Scientific American Online | Nature Reviews Genetics | Genetic Engineering & Biotechnology News (GEN)

9. Carothers, J.M.+, Goler, J.A.+, Kapoor, R., Lara, L., and Keasling, J.D.* Selecting aptamers for synthetic biology: investigating magnesium dependence and predicting binding affinity. Nucl. Acids Res. 2010. 38, 2736-2747. LINK 

ABSTRACT:  The ability to generate RNA aptamers for synthetic biology using in vitro selection depends on the informational complexity (IC) needed to specify functional structures that bind target ligands with desired affinities in physiological concentrations of magnesium. We investigate how selection for high-affinity aptamers is constrained by chemical properties of the ligand and the need to bind in low magnesium. We select two sets of RNA aptamers that bind planar ligands with dissociation constants (Kds) ranging from 65 nM to 100 μM in physiological buffer conditions. Aptamers selected to bind the non-proteinogenic amino acid, p-amino phenylalanine (pAF), are larger and more informationally complex (i.e., rarer in a pool of random sequences) than aptamers selected to bind a larger fluorescent dye, tetramethylrhodamine (TMR). Interestingly, tighter binding aptamers show less dependence on magnesium than weaker-binding aptamers. Thus, selection for high-affinity binding may automatically lead to structures that are functional in physiological conditions (1–2.5 mM Mg2+). We hypothesize that selection for high-affinity binding in physiological conditions is primarily constrained by ligand characteristics such as molecular weight (MW) and the number of rotatable bonds. We suggest that it may be possible to estimate aptamer–ligand affinities and predict whether a particular aptamer-based design goal is achievable before performing the selection.

HIGHLIGHTED IN:  BioTechniques

8. Carothers, J.M., Goler, J.A., and Keasling, J.D.* Chemical synthesis using synthetic biology. Curr. Opin. Biotechnol. 2009. 20, 498-503. LINK

ABSTRACT:  An immense array of naturally occurring biological systems have evolved that convert simple substrates into the products that cells need for growth and persistence. Through the careful application of metabolic engineering and synthetic biology, this biotransformation potential can be harnessed to produce chemicals that address unmet clinical and industrial needs. Developing the capacity to utilize biology to perform chemistry is a matter of increasing control over both the function of synthetic biological systems and the engineering of those systems. Recent efforts have improved general techniques and yielded successes in the use of synthetic biology for the production of drugs, bulk chemicals, and fuels in microbial platform hosts. Synthetic promoter systems and novel RNA-based, or riboregulator, mechanisms give more control over gene expression. Improved methods for isolating, engineering, and evolving enzymes give more control over substrate and product specificity and better catalysis inside the cell. New computational tools and methods for high-throughput system assembly and analysis may lead to more rapid forward engineering. We highlight research that reduces reliance upon natural biological components and point to future work that may enable more rational design and assembly of synthetic biological systems for synthetic chemistry.

7. Hazen, R.M., Griffin, P.L., Carothers, J.M., and Szostak, J.W.* Functional information and the emergence of biocomplexity. Proc. Natl. Acad. Sci. 2007. 104, 8574-8581. LINK

ABSTRACT: Complex emergent systems of many interacting components, including complex biological systems, have the potential to perform quantifiable functions. Accordingly, we define “functional information,” I(E_x), as a measure of system complexity. For a given system and function, x (e.g., a folded RNA sequence that binds to GTP), and degree of function, E_x (e.g., the RNA–GTP binding energy), I(E_x) = −log2[F(E_x)], where F(E_x) is the fraction of all possible configurations of the system that possess a degree of function ≥ E_x. Functional information, which we illustrate with letter sequences, artificial life, and biopolymers, thus represents the probability that an arbitrary configuration of a system will achieve a specific function to a specified degree. In each case we observe evidence for several distinct solutions with different maximum degrees of function, features that lead to steps in plots of information versus degree of function.
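As a rough illustration of the definition in this abstract, the sketch below computes I(E_x) = -log2[F(E_x)] for a small, exhaustively enumerated system. The function name, the toy list of degrees of function, and the choice to return infinity when no configuration meets the threshold are assumptions made for this example; they are not taken from the paper.

```python
import math

def functional_information(degrees_of_function, threshold):
    """Return I(E_x) = -log2[F(E_x)] for an enumerated set of configurations.

    degrees_of_function: one value per possible configuration of the system.
    threshold: the degree of function E_x that a configuration must meet or exceed.
    """
    n_total = len(degrees_of_function)
    n_meeting = sum(1 for e in degrees_of_function if e >= threshold)
    if n_meeting == 0:
        # Assumption for this sketch: no configuration qualifies, so F = 0
        # and the information is treated as unbounded.
        return math.inf
    fraction = n_meeting / n_total  # F(E_x)
    return -math.log2(fraction)

# Hypothetical toy system: 8 configurations with made-up degrees of function.
toy_system = [0.0, 0.1, 0.1, 0.2, 0.5, 0.5, 0.9, 1.0]

for e_x in (0.5, 1.0):
    print(e_x, functional_information(toy_system, e_x))
```

In this toy system, half of the configurations reach a degree of function of 0.5 (1 bit), while only one in eight reaches 1.0 (3 bits), so rarer, higher-function configurations carry more functional information.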

6. Carothers, J.M., and Szostak, J.W.* In vitro selection of functional oligonucleotides and the origins of biochemical activity. In The Aptamer Handbook: Functional Oligonucleotides and Their Applications (ed. S. Klussmann). Springer-Verlag Press, Berlin: 2006, 3-28. LINK Available free under "Read Excerpt: Chapter (PDF)"

INTRODUCTION: In vitro selection is an experimental method for searching oligonucleotide sequence spaces for synthetic structures and activities. Oligonucleotide sequence spaces are very large – they contain the ensemble of all possible sequences of a given length separated by point mutations. For example, the sequence space of an RNA the length of a small tRNA (74 nucleotides) encompasses 10^43 different molecules. The largest libraries typically synthesized in the laboratory, approximately 10^16 different sequences, represent only a minute fraction of the total number of possible sequences for any nucleic acid molecule of even modest size. How can such necessarily sparse samplings of sequence space produce so many different aptamers, ribozymes, and deoxyribozymes? In this chapter, we focus on the technology of in vitro selection and what its application teaches us about the quantity and quality of functional structures in nucleic acid sequence spaces.

5. Carothers, J.M., Oestreich, S.C., and Szostak, J.W.* Aptamers selected for higher-affinity binding are not more specific for the target ligand. J. Am. Chem. Soc. 2006. 128, 7929-7937. LINK

ABSTRACT: Previous study of eleven different in vitro-selected RNA aptamers that bind guanosine triphosphate (GTP) with Kds ranging from 8 μM to 9 nM showed that more information is required to specify the structures of the higher-affinity aptamers. We are interested in understanding how the more complex aptamers achieve higher affinities for the ligand. In vitro selection produces structural solutions to a functional problem that are as simple as possible in terms of the information content needed to define them. It has long been assumed that the simplest way to improve the affinity of an aptamer is to increase the shape and functional group complementarity of the RNA binding pocket for the ligand. This argument underlies the hypothesis that selection for higher-affinity aptamers automatically leads to structures that bind more specifically to the target molecule. Here, we examined the binding specificities of the eleven GTP aptamers by carrying out competition binding studies with sixteen different chemical analogues of GTP. The aptamers have distinct patterns of specificity, implying that each RNA is a structurally unique solution to the problem of GTP binding. However, these experiments failed to provide evidence that higher-affinity aptamers bind more specifically to GTP. We suggest that the simplest way to improve aptamer Kds may be to increase the stability of the RNA tertiary structure with additional intramolecular RNA−RNA interactions; increasingly specific ligand binding may emerge only in response to direct selection for specificity.

4. Carothers, J.M., Davis, J.H., Chou, J.J., and Szostak, J.W.* Solution structure of an informationally-complex high-affinity RNA aptamer to GTP. RNA. 2006. 12, 567-579. LINK

ABSTRACT: Higher-affinity RNA aptamers to GTP are more informationally complex than lower-affinity aptamers. Analog binding studies have shown that the additional information needed to improve affinity does not specify more interactions with the ligand. In light of those observations, we would like to understand the structural characteristics that enable complex aptamers to bind their ligands with higher affinity. Here we present the solution structure of the 41-nt Class I GTP aptamer (Kd = 75 nM) as determined by NMR. The backbone of the aptamer forms a reverse-S that shapes the binding pocket. The ligand nucleobase stacks between purine platforms and makes hydrogen bonds with the edge of another base. Interestingly, the local modes of interaction for the Class I aptamer and an RNA aptamer that binds ATP with a Kd of 6 μM are very much alike. The aptamers exhibit nearly identical levels of binding specificity and fraction of ligand sequestered from the solvent (81%–85%). However, the GTP aptamer is more informationally complex (~45 vs. 35 bits) and has a larger recognition bulge (15 vs. 12 nucleotides) with many more stabilizing base–base interactions. Because the aptamers have similar modes of ligand binding, we conclude that the stabilizing structural elements in the Class I aptamer are responsible for much of the difference in Kd. These results are consistent with the hypothesis that increasing the number of intra-RNA interactions, rather than adding specific contacts to the ligand, is the simplest way to improve binding affinity.

3. Plummer, K.A., Carothers, J.M., Yoshimura, M., Szostak, J.W., and Verdine, G.L.* In vitro selection of RNA aptamers against a composite small molecule-protein surface. Nucl. Acids Res. 2005, 33, 5602-5610. LINK

ABSTRACT:  A particularly challenging problem in chemical biology entails developing systems for modulating the activity of RNA using small molecules. One promising new approach towards this problem exploits the phenomenon of ‘surface borrowing,’ in which the small molecule is presented to the RNA in complex with a protein, thereby expanding the overall surface area available for interaction with RNA. To extend the utility of surface borrowing to include potential applications in synthetic biology, we set out to create an ‘orthogonal’ RNA-targeting system, one in which all components are foreign to the cell. Here we report the identification of small RNA modules selected in vitro to bind a surface-engineered protein, but only when the two macromolecules are bound to a synthetic bifunctional small molecule.

2. Carothers, J.M., Oestreich, S.C., Davis, J.H., and Szostak, J.W.* Informational complexity and functional activity of RNA structures. J. Am. Chem. Soc. 2004. 126, 5130-5137. LINK

ABSTRACT: Very little is known about the distribution of functional DNA, RNA, and protein molecules in sequence space. The question of how the number and complexity of distinct solutions to a particular biochemical problem varies with activity is an important aspect of this general problem. Here we present a comparison of the structures and activities of eleven distinct GTP-binding RNAs (aptamers). By experimentally measuring the amount of information required to specify each optimal binding structure, we show that defining a structure capable of 10-fold tighter binding requires approximately 10 additional bits of information. This increase in information content is equivalent to specifying the identity of five additional nucleotide positions and corresponds to a 1000-fold decrease in abundance in a sample of random sequences. We observe a similar relationship between structural complexity and activity in a comparison of two catalytic RNAs (ribozyme ligases), raising the possibility of a general relationship between the complexity of RNA structures and their functional activity. Describing how information varies with activity in other heteropolymers, both biological and synthetic, may lead to an objective means of comparing their functional properties. This approach could be useful in predicting the functional utility of novel heteropolymers.
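The arithmetic behind the figures quoted in this abstract can be checked with a short sketch. It assumes that each fully specified nucleotide position contributes log2(4) = 2 bits and that every additional bit halves the abundance of matching sequences in a uniformly random pool; the function names are hypothetical and chosen only for this example.

```python
import math

# Assumption: four equally likely nucleotides, so one fully specified
# position carries log2(4) = 2 bits of information.
BITS_PER_NUCLEOTIDE = math.log2(4)

def positions_specified(extra_bits):
    """Number of fully specified nucleotide positions equivalent to extra_bits."""
    return extra_bits / BITS_PER_NUCLEOTIDE

def fold_decrease_in_abundance(extra_bits):
    """Fold reduction in abundance among random sequences for extra_bits of information."""
    return 2 ** extra_bits

print(positions_specified(10))         # 5.0 positions
print(fold_decrease_in_abundance(10))  # 1024, i.e. roughly 1000-fold
```

Under these assumptions, 10 additional bits correspond to five nucleotide positions and a 1024-fold drop in abundance, which is consistent with the approximately 1000-fold figure stated in the abstract.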

1. Kopp, E., Medzhitov, R., Carothers, J., Xiao, C., Douglas, I., Janeway, C.A., and Ghosh, S.* ECSIT is an evolutionarily conserved intermediate in the Toll/IL-1 signal transduction pathway. Genes Develop. 1999, 13: 2059-2071.  LINK

ABSTRACT: Activation of NF-κB as a consequence of signaling through the Toll and IL-1 receptors is a major element of innate immune responses. We report the identification and characterization of a novel intermediate in these signaling pathways that bridges TRAF6 to MEKK-1. This adapter protein, which we have named ECSIT (evolutionarily conserved signaling intermediate in Toll pathways), is specific for the Toll/IL-1 pathways and is a regulator of MEKK-1 processing. Expression of wild-type ECSIT accelerates processing of MEKK-1, whereas a dominant-negative fragment of ECSIT blocks MEKK-1 processing and activation of NF-κB. These results indicate an important role for ECSIT in signaling to NF-κB and suggest that processing of MEKK-1 is required for its function in the Toll/IL-1 pathway.

* denotes corresponding authors