Introduction
Many engineering fields rely on repeated design-build-test-learn (DBTL) cycles to converge on
optimal results. Biofoundries are the infrastructure that allows synthetic biology and
biotechnology to use the DBTL cycle as the main workhorse for organism engineering. Automation is
the key element of biofoundries: it lets them run a wide range of designs, experiments, and tests at
high throughput and then generate integrated reports that show whether the desired goal has been
achieved. However, what should be the final objective for software in biofoundries? From our
perspective, the future depends directly on the creation of autonomous biofoundries, especially
frugal ones.
Autonomy is a concept that people are familiar with when talking about cars. Fully autonomous
vehicles are the pinnacle of automation by transforming a very human-dependent activity into a
completely automatic one. What if synthetic biology could be similarly automated? What if protocols
could be executed by an integrated infrastructure? What if they could be adapted to each specific
laboratory setup? What if the design of genetic parts and experiments could be corrected while being
produced?
We understand that frugal biofoundries will need open software that allows for these types of
integrations, built by a community that actively communicates and develops its own non-proprietary,
free, easy-to-use, high-quality software solutions capable of handling high-throughput work and
high volumes of data. Beyond that, we need synthetic biology developers who will build the next
generation of tools for biotech infrastructure.
To do that, we decided to use Poly, an open-source Go package for organism engineering. As a Go
package, Poly has intrinsic properties that allow easy reusability, compatibility, and good
performance. Poly also has a lively and fun community, guided by a very active maintainer and
creator of the package, Timothy Stiles. The commitment to good-quality code, combined with the
ambition to become the most complete and open collection of computational synthetic biology tools,
is what makes Poly a very attractive option for what we’re trying to create. In fact, most of the
Friendzymes software team already was or has since become a Poly contributor.
For this MVP (Minimum Viable Product) of the Friendzymes project, we decided to set two main
objectives:
-
Create software that is easy to learn and adapt for people interested in becoming SynBio
developers, so they are empowered to solve their own and their community’s problems;
and,
-
Create software that demonstrates how processes in the DBTL cycle can be automated.
The main goal of our project is the democratization of biotechnology. Thus, with people of
different backgrounds and levels of programming knowledge in mind, we created 1) the Friendzymes
Cookbook, a collection of Jupyter Notebooks with the scripts we developed this iGEM season to
support the Design team’s work, and 2) the Friendzymes Actions, a collection of GitHub Actions that
bring Continuous Integration to Synthetic Biology.
The Friendzymes Cookbook
During the iGEM season, Friendzymes’ software and design teams worked together to automate steps
that are complicated, time-consuming, and error-prone when done by hand (e.g. a single typo can
compromise your sequence). In this process, we created many scripts to meet our specific needs and
shared them as Colab notebooks so others could copy, modify, and reuse them.
However, many of these tasks are common to any biological circuit design: codon optimization,
primer design, and searching for forbidden sequences (e.g. an EcoRI restriction site outside the
BioBrick standard prefix and suffix), among others. Hence, we thought it prudent to make tutorials
that could help people beyond our own project, so we created the Friendzymes Cookbook. It is not
only a collection of scripts for design automation but also an educational tool (see the Education
section), so that newcomers to the Software Team, interested Friendzymes members, teams from
iGEM/iGEM Design League, and others in the SynBio community all have a way to learn more about
Poly, common problems, and how to design new tools!
The Cookbook is defined as the collection of Colab notebooks, currently comprising:
-
Understanding Poly
Poly is our key tool for the software. It was a planned decision to build workflows that
integrate with Poly, to show ways to use the package, as well as create some new features;
therefore, it is very important that you understand how the Poly package works and what its
structure is in general before you begin manipulating it. Thus, we created this brief
overview of Poly, its sub-packages, and a collection of use cases. We strongly recommend
that you do the tutorials in the order they appear.
-
Codon Optimization
A very common task in part design is codon optimization, so here we show how you can create
customized codon tables and how you can use them to codon-optimize a given Coding Sequence
(CDS).
-
Annotation of problematic sequences
Have you designed your sequence? Now it is time to remove small forbidden parts that can
hinder you, not only when sequencing (e.g. hairpins, repetitive regions), but also when
cloning (e.g. restriction sites). This tutorial shows how to annotate these problems
automatically. It gives you a GenBank file (with these annotations attached) that you can
drop into your favorite viewer, like Benchling or SnapGene.
-
CDS fix
In this notebook, you input your CDS sequence(s) and receive them back corrected, without
the problematic sequences. This is done by replacing codons with synonymous ones, thus
keeping the same amino acid sequence at the end. A kind reminder: this tutorial was NOT
written for non-coding sequences such as promoters, RBSs, and terminators. If you find
problematic sequences in those, review them case by case and be careful not to lose
biological meaning.
-
Automatically create parts with correct overhangs
How about designing your final plasmid without worrying about each separate part, and using a
script to add the restriction sites, spacers, and overhangs? That’s what you will find
here!
-
Golden Gate Simulation
In this notebook, you will run a simulation of a Golden Gate reaction and see if everything
is theoretically acceptable before physically synthesizing your parts.
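The annotation and CDS-fix recipes above can be illustrated with a minimal sketch. This is not the Cookbook's code and it does not use Poly's API; it is plain standard-library Python written only to show the idea. The function names (`find_sites`, `translate`, `fix_cds`), the greedy strategy, and the choice of the four BioBrick prefix/suffix enzymes as the forbidden list are assumptions made for this example. All four sites are palindromic, so scanning a single strand is enough here; a real tool would also need to handle non-palindromic sites on the reverse strand.

```python
from itertools import product

# Assumed forbidden list: the BioBrick prefix/suffix enzymes (all palindromic sites).
FORBIDDEN_SITES = {"EcoRI": "GAATTC", "XbaI": "TCTAGA", "SpeI": "ACTAGT", "PstI": "CTGCAG"}

# Standard genetic code, built in the conventional TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
SYNONYMS = {}
for codon, aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def find_sites(seq, sites=FORBIDDEN_SITES):
    """Return (enzyme, start, end) for every forbidden site found, sorted by position."""
    hits = []
    for name, site in sites.items():
        start = seq.find(site)
        while start != -1:
            hits.append((name, start, start + len(site)))
            start = seq.find(site, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


def translate(cds):
    """Translate a CDS codon by codon (a trailing partial codon is ignored)."""
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


def fix_cds(cds, sites=FORBIDDEN_SITES):
    """Greedy fix: swap one codon overlapping a site for a synonym that lowers the hit count."""
    cds = cds.upper()
    hits = find_sites(cds, sites)
    while hits:
        _, start, end = hits[0]
        replacement = None
        # Walk over the codons (in frame 0) that overlap the first remaining site.
        for pos in range(start - start % 3, end, 3):
            codon = cds[pos:pos + 3]
            if codon not in CODON_TO_AA:
                continue
            for alt in SYNONYMS[CODON_TO_AA[codon]]:
                candidate = cds[:pos] + alt + cds[pos + 3:]
                if alt != codon and len(find_sites(candidate, sites)) < len(hits):
                    replacement = candidate
                    break
            if replacement:
                break
        if replacement is None:
            raise ValueError(f"no synonymous codon removes the site at {start}-{end}")
        cds = replacement
        hits = find_sites(cds, sites)
    return cds


cds = "ATGGAATTCAAATAA"            # M E F K stop, with an EcoRI site at 3-9
fixed = fix_cds(cds)
print(find_sites(cds), "->", find_sites(fixed))
print(translate(cds) == translate(fixed))
```

Accepting a swap only when the total number of hits goes down keeps the loop from trading one forbidden site for another, and guarantees that it terminates.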
We made all these ‘recipes’ using Jupyter Notebook, with Google Colab in mind, a free platform for
running notebooks using the Google Infrastructure. This way people don’t need to install or
configure anything to run, adapt and develop their own tools.
We also made this repository, where people can contribute by proposing new chapters of the
cookbook, fixing bugs, and maintaining this whole collection of tools. Feel free to visit
our repo and suggest anything you’d like!
Friendzymes Actions
While we write this text, a script behind the scenes is checking whether the words are spelled
correctly, and this integration is so seamless that people take it for granted. This isn’t magic;
it is an automated process. In software engineering, an entire field of study is dedicated to
automating processes that were previously manual. Using a pipeline makes this automation
simpler.
A pipeline can be understood as a chain of steps where each output is used as the input of the
next, so the output of the last step is your final result. Current pipeline managers try to make
these workflows context-independent (most often by using containers), so developers can easily
migrate and scale pipelines from a local computer to a cluster or a cloud server.
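As a minimal sketch of this idea, the snippet below chains three small steps so that each output becomes the next input. It is plain standard-library Python; `run_pipeline` and the three step functions are hypothetical names chosen for this example and do not correspond to any particular pipeline manager.

```python
from functools import reduce


def run_pipeline(steps, data):
    """Feed data through each step in order; every output is the next step's input."""
    return reduce(lambda value, step: step(value), steps, data)


def clean(seq):
    """Strip whitespace and normalise to upper case."""
    return "".join(seq.split()).upper()


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


result = run_pipeline([clean, reverse_complement, gc_content], " atg cgt\n")
print(result)
```

Because each step only depends on the value it receives, steps can be reordered, replaced, or run on another machine without touching the others, which is the property pipeline managers build on.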
To meet the high-throughput demand inside biofoundries, software engineers implement pipelines
that process thousands upon thousands of designs, experiments, and data analyses every week. We
believe Jupyter Notebooks are good for some scenarios, but they can’t provide a framework scalable
enough for this demand. With this in mind, we tried to avoid a solution that is tied to a specific
cloud service provider or a tool that needs too many configuration steps. For this reason, we
decided to use GitHub Actions.
GitHub Actions is a free-to-use feature of GitHub that allows you to automate tasks inside your
repository. In essence, it is a way to run pipelines that may or may not be related to the code
that you share on the platform. For us, this means a free pipeline manager, with minimal
configuration steps, where people can automate processes for the DBTL cycle in an open-source
environment.
To demonstrate the potential of this tool, we created three GitHub Actions:
-
DNA Annotator
This action allows users to process GenBank files and reannotate them with problematic
regions such as hairpins, repetitive sequences, and forbidden restriction sites, so DNA
designers can easily find these regions and make a well-informed decision about which
subsequences are best to correct.
-
Is this DNA Synthesizable?
Instead of copying and pasting each of your sequences into the IDT gBlock Analyzer page to
see if it is synthesizable, we created this action to check this for you. It has some
optional features, such as ‘break the pipeline’, which halts the process if the software
finds a non-synthesizable sequence. The action then exports a JSON file with the score of
each sequence and the problems found.
-
Codon Optimization
With this action, you can automatically codon-optimize a list of sequences for different
organisms based on the codon tables you share. This is useful if you’re working with multiple
organisms at the same time.
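To make the ‘break the pipeline’ option and the JSON export of the synthesizability action more concrete, here is a rough sketch of how such a check could be structured. It is not the action's implementation and it does not call IDT's service: the two checks (GC content and longest homopolymer run), their thresholds, and the names `check_sequence`, `build_report`, and `exit_code` are all assumptions made purely for illustration.

```python
import json
import re

# Hypothetical thresholds, chosen only for this example (not IDT's rules).
GC_MIN, GC_MAX = 0.25, 0.75
MAX_HOMOPOLYMER = 9


def check_sequence(name, seq):
    """Run the illustrative checks on one sequence and list any problems found."""
    seq = seq.upper()
    problems = []
    gc = (seq.count("G") + seq.count("C")) / len(seq)
    if not GC_MIN <= gc <= GC_MAX:
        problems.append(f"GC content {gc:.2f} outside {GC_MIN}-{GC_MAX}")
    longest = max((len(m.group()) for m in re.finditer(r"A+|C+|G+|T+", seq)), default=0)
    if longest > MAX_HOMOPOLYMER:
        problems.append(f"homopolymer run of {longest} bases")
    return {"name": name, "problems": problems, "synthesizable": not problems}


def build_report(sequences):
    """Check every named sequence and return a JSON-serialisable report."""
    return [check_sequence(name, seq) for name, seq in sequences.items()]


def exit_code(report, break_pipeline=False):
    """Non-zero only when break_pipeline is set and at least one sequence failed."""
    failed = any(not entry["synthesizable"] for entry in report)
    return 1 if break_pipeline and failed else 0


sequences = {"ok": "ATGCGTACGTTAGC", "bad": "ATG" + "A" * 12 + "TTT"}
report = build_report(sequences)
print(json.dumps(report, indent=2))           # what would be written to the JSON file
print(exit_code(report, break_pipeline=True))
```

In a CI step, the script would write the report to disk and then call `sys.exit(exit_code(report, break_pipeline))`; a non-zero exit status is what makes a workflow step fail and stops the steps that follow it.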
In our development roadmap, we envision new actions that automatically generate Opentrons
protocol files that already contain the parameters to assemble (BUILD), amplify, and validate
(TEST) your sequences as a whole experiment; additionally, we’re creating more and more modular
tools for automating the design of genetic parts. We believe that with these tools, synthetic
biologists can start automating labor-intensive manual tasks and take advantage of the benefits
that software development already offers their field.
For the future
For us, this feels like a beginning. We are just starting to implement these ideas but, as
previously stated, we know where we want to go. We want frugal biofoundries to be on par with
full-size biofoundries, including in automation. Furthermore, we don’t want software that is merely
‘good enough’ for being open-source; we want software that is majestic for individuals and big
companies alike. We want to integrate hardware and software, so that frugal biofoundries can
automate sequencing from end to end, for example, the processing of SARS-CoV-2 genome sequences. We
want to make oligo pool assembly easy and less error-prone, so people can synthesize parts 10x or
20x more cheaply. We want living protocols that run and show whether the experiment already has
inconsistencies, so you don’t have to waste more reagents to realize you made a mistake.
How could software improve synthetic biology? How much impact could software have in moving
humanity toward a carbon-negative economy and making a (why not?) solarpunk future a reality?
Software is a big piece of this puzzle. Together we could build it.
If you’re interested, don’t hesitate: leave a message, send an e-mail, or open a GitHub issue, and
we will be glad to present what we’re doing right now and show ways to contribute to the
Friendzymes project.