The Convergence of Machine Learning and Synthetic Biology
Chapter 1: Intersection of Machine Learning and Synthetic Biology
This week's edition delves into the merging realms of machine learning and synthetic biology. Notably, two articles recently published in Nature Communications (see the press release) by researchers from the Lawrence Berkeley National Laboratory and the Department of Energy's Joint BioEnergy Institute introduce a machine learning framework designed to assist metabolic engineers in optimizing the production of target molecules.
While I found the studies impressive, I want to stress how important mechanistic understanding remains for scientists. New machine learning tools can predict an "optimized" metabolic pathway by analyzing large datasets from many experiments, but they may not tell metabolic engineers why specific combinations of proteins or promoters yield the desired results.
The researchers employed a machine learning framework known as the Automated Recommendation Tool (ART) to expedite the "Learning" phase of the Design-Build-Test Cycle. Remarkably, this tool functions effectively "without requiring a complete mechanistic understanding of the biological system."
One of my least favorite articles is a 2008 op-ed by Chris Anderson in WIRED, in which he claimed that, in the face of vast data, the traditional scientific method of hypothesizing, modeling, and testing was becoming obsolete. That prediction has proven incorrect, even as machine learning has become a pivotal topic in biology and shows great promise in addressing many experimental challenges.
While this issue of This Week in Synthetic Biology pays tribute to the collaborative potential of machine learning and synthetic biology, I hope that synthetic biologists remain attuned to the essence of being a biologist—appreciating the wonders and complexities of life. Or as my undergraduate research advisor often reminded me, "That's fascinating, but what’s the underlying mechanism?"
Chapter 2: Advancements in Metabolic Engineering through ART
The first study from Berkeley-based researchers describes ART, which was built primarily on Python's widely used scikit-learn library. The framework helps researchers maximize the production of a target molecule, reduce cellular toxicity, or tune metabolite levels to precise concentrations, and it often works well. For instance, researchers used it to engineer E. coli and S. cerevisiae to produce limonene, to synthesize metabolites that give beer hoppy flavors, and to manufacture dodecanol from fatty acids. In the dodecanol case, however, the predictions were not particularly meaningful, despite drawing on data from 50 engineering cycles. Even so, the tool appears useful and is worth exploring further. This study was published in Nature Communications.
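To make the "learn, then recommend" idea concrete, here is a minimal sketch in plain Python. It is not ART's actual method or API: it swaps in a toy additive model (the average effect of each part at each position) for ART's machine learning models, and the promoter names, titers, and function names are all hypothetical.

```python
from collections import defaultdict
from itertools import product


def fit_additive_model(tested):
    """Fit a toy additive model (an assumption, not ART's approach).

    tested maps a design tuple (one part per gene) to a measured titer.
    Returns the overall mean titer and, for each (position, part) pair,
    the mean deviation from that overall mean.
    """
    overall = sum(tested.values()) / len(tested)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for design, titer in tested.items():
        for position, part in enumerate(design):
            sums[(position, part)] += titer - overall
            counts[(position, part)] += 1
    effects = {key: sums[key] / counts[key] for key in sums}
    return overall, effects


def predict(model, design):
    """Predict a titer as the overall mean plus each part's effect."""
    overall, effects = model
    return overall + sum(
        effects.get((position, part), 0.0)
        for position, part in enumerate(design)
    )


def recommend(model, candidates, tested, n=2):
    """Rank untested designs by predicted titer, highest first."""
    untested = [d for d in candidates if d not in tested]
    return sorted(untested, key=lambda d: predict(model, d), reverse=True)[:n]


# Hypothetical "Test" data: two genes, three promoters, four strains measured.
promoters = ["p1", "p2", "p3"]
tested = {
    ("p1", "p1"): 1.0,
    ("p2", "p1"): 2.0,
    ("p1", "p2"): 3.0,
    ("p3", "p3"): 2.0,
}
candidates = list(product(promoters, repeat=2))

model = fit_additive_model(tested)
next_round = recommend(model, candidates, tested, n=2)
print(next_round)
```

The recommended designs would be built and measured in the next cycle, then added to `tested` before refitting. A model this simple also illustrates the concern raised in Chapter 1: it can rank designs without explaining why any of them work.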
In a follow-up paper, also published in Nature Communications, ART was used to boost tryptophan production in engineered S. cerevisiae. This time, it performed remarkably well, identifying "designs exhibiting up to 74% higher tryptophan titers than the best designs used for training the models" from a pool of 7776 combinatorial options (five genes, each controlled by one of six promoters, with the promoters drawn from a set of thirty). Despite the success, the researchers had to gather a vast amount of data to train their predictive models: more than 120,000 time series data points and more than 500 different yeast strains.
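The figure of 7776 follows from five genes with six promoter choices each, or 6^5. A short sketch, using placeholder gene names and treating 500 strains as a lower bound taken from the text, shows how small a slice of that design space was actually built:

```python
from itertools import product

# Placeholder labels; the actual gene and promoter names are not given here.
genes = ["gene_1", "gene_2", "gene_3", "gene_4", "gene_5"]
promoters_per_gene = 6

# Each design assigns one of six promoter choices to each of the five genes.
designs = list(product(range(promoters_per_gene), repeat=len(genes)))
n_designs = len(designs)

# "More than 500" strains were generated; 500 is used as a lower bound.
strains_built = 500
fraction_sampled = strains_built / n_designs

print(f"{n_designs} possible designs")
print(f"at least {fraction_sampled:.1%} of the design space was built")
```

Sampling only a small fraction of the combinations is what makes a predictive model attractive: the rest of the space has to be ranked without being built.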
Chapter 3: Innovative Techniques in Genetic Engineering
In a recent study published in Nature Protocols, researchers demonstrated that "cellulose-based dipsticks" can extract nucleic acids in just 30 seconds. Each dipstick can be produced in less than 30 minutes, and the method requires three buffers: an extraction buffer that binds nucleic acids, a washing buffer to eliminate contaminants, and an amplification buffer to elute the nucleic acids. Interestingly, they employed a low-cost pasta maker, comparable to an Avanti pasta maker, to facilitate the extraction process.
Another groundbreaking study in Nature Microbiology revealed that E. coli, engineered with synthetic pathways for carbon dioxide and formic acid assimilation, can grow using solely these compounds. This research, led by Sang Yup Lee's team at the Korea Advanced Institute of Science and Technology, builds on previous work from 2019 where E. coli was engineered to utilize carbon exclusively from carbon dioxide.
Finally, researchers have made strides in viral gene drives. Marius Walter and Eric Verdin from the Buck Institute for Research on Aging developed a gene drive using human cytomegalovirus, which propagates through virus populations. When two viruses co-infect a host cell—one carrying the gene drive and the other not—the Cas9 from the gene drive cleaves the wildtype sequence, using the gene drive sequence as a repair template to convert the wildtype locus into a new gene drive sequence. This effective approach for propagating genetic elements in viruses was also published in Nature Communications.
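The conversion mechanism can be illustrated with a toy model of how a drive might spread through a viral population. This is only a sketch under strong assumptions, not the model used in the study: it assumes a well-mixed population, no fitness cost, no resistant alleles, and hypothetical values for the co-infection rate and conversion efficiency.

```python
def simulate_drive(p0, coinfection_rate, conversion_efficiency, generations):
    """Track the gene drive frequency across viral generations.

    p0: starting fraction of genomes carrying the drive.
    coinfection_rate: chance a genome shares its host cell with another one.
    conversion_efficiency: chance that Cas9 cleavage and repair convert a
        wild-type genome that is co-infecting with a drive genome.
    All parameter values used below are hypothetical.
    """
    frequencies = [p0]
    p = p0
    for _ in range(generations):
        # Fraction of wild-type genomes sharing a cell with a drive genome.
        exposed = coinfection_rate * p
        # Exposed wild-type genomes are converted into drive genomes.
        converted = (1 - p) * exposed * conversion_efficiency
        p = p + converted
        frequencies.append(p)
    return frequencies


freqs = simulate_drive(
    p0=0.05, coinfection_rate=0.5, conversion_efficiency=0.9, generations=40
)
for generation in (0, 10, 20, 40):
    print(generation, round(freqs[generation], 3))
```

Under these assumptions the drive frequency never decreases, because every conversion turns a wild-type genome into a drive genome and nothing converts it back.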
Rapid-Fire Highlights
- A novel, functional material composed entirely of living cells was engineered by the Joshi lab, showcasing a significant breakthrough in synthetic biology.
- DNA nanoswitches were designed to change shape in response to various viral RNAs, including Zika and SARS-CoV-2, allowing for detection via gel electrophoresis.
- An evolved strain of E. coli that uses acetate as its only carbon and energy source was engineered to produce mevalonate and n-butanol.
- A mutant enzyme called thioglycoligase was developed to hydrolyze D-xylose sugars and form various glycosides, demonstrating the versatility of enzyme engineering.
- A web tool was introduced for designing pegRNAs for prime-editing experiments, enhancing research capabilities in genetic engineering.
Thank you for reading this edition of This Week in Synthetic Biology, part of Bioeconomy.XYZ. If you find this newsletter valuable, please consider sharing it with a friend. A version of this newsletter is also available on bioeconomy.xyz and my personal website, nikomccarty.com. Connect with me on Twitter @NikoMcCarty for tips and feedback—I'm open to constructive criticism!