Description
Our project consists of two phases. The first encapsulates the circuit and focuses on the fulfillment of specific requirements through an in-depth architectural design along with a thorough assessment of the circuit’s behavior and characteristics. The second focuses on a conceptual framework: a complete guide for the assessment of sequential logic in synthetic biology, together with its implementation as a model for our software tool’s design.
For the design of said circuit we defined three essential needs:
1. A way to determine the lifetime of our engineered bacteria
2. A mechanism to prevent the life span from running out during laboratory manipulation
3. A multilayered killswitch to ensure a safe death
We tackled our first design requirement by exhaustive brainstorming and research. Our
main idea involved a counter system, hence, multiple mechanisms to perform this
count were discussed, from binary linear systems dependent on inducible promoters to our
final implementation, a sequential logic based recombinase device. Some of the
counter systems we assessed and used as inspiration are in the comparison section.
Despite the value of the previously mentioned proposals, none of the counters fit
our more specific needs, which centered on the complexity, count limit and length of
the device. Since we envisioned our circuit being used as an auxiliary to other
applications, it was essential that the available count be as high as possible without
the length growing exponentially. This left out counters such as the ones proposed by
the Groningen Team (2011) and the Paris Liliane Bettencourt Team (2010), since they behave
linearly and “in situations where the same signal is being recorded over multiple
occurrences (for example, a series of cell division events), reliably rewritable
elements are needed to realize geometric increases in data storage capacity (for
example, combinatorial counters capable of recording 2^N events given N storage
elements)” (ref). The approach of Team UT-Tokyo (2016), in turn, was not only lengthy but also
complex and imposed a high metabolic burden, which were undesired qualities as well.
Other proposals, even with 2^n capability, had drawbacks such as the
need for very specific regulatory systems, for example the delay system presented by Zhao et
al. (2019). Also, many of the counters rested on the theoretical assumption that
n protein-induced promoters or n serine recombinases without crosstalk issues
are available, which sets the limit at the number of orthogonal
characterized recombinases (Yang et al., 2014). While these works were a good basis for
our counter, for instance by showing the viability of placing several recombinases next to
each other and having them behave correctly, we wanted to go even further. This led us to propose our
own counter system, which went through several changes and iterations as explained in
{engineering success} until we arrived at a suitable approach, as validated by Dr.
Pakpoom Subsoontorn and Dr. Jérôme Bonnet.
It’s important to note that our counter is a practical representation and
proof of concept for the sequential-logic based development of genetic circuits
that we propose in the {conceptual framework}.
The final genetic suicide circuit design consists of different modules, whose functions
are explained next.
01
Repress-Initialize
The first device was designed to keep the other three inactive until the user is ready to trigger the countdown, so that the engineered organism can be manipulated, cultivated, etc. without the counter being activated. If the circuit weren’t under strict control, the growing times and uses of these microorganisms would be limited. The proposed mechanism regulates transcription from the main promoter via repression triggered by an inducible promoter.
02
Count
Autonomous genetic counter whose purpose is to keep track of the number of times a cell goes through cell division (from the moment the counter is activated by the first device). The counter is based on a cell-cycle coupled promoter, recombinases and unidirectional terminators, as detailed in the {architecture section}. This device is able to count up to 2^n, where n is the number of recombinases.
03
Expand Count
The third device’s purpose is to enable a higher count, hence it’s also regulated by the second device’s promoter and it’s based on a self-excising DNA segment that resets the counter state of the previous device. This system is CRISPR Cas-9 reliant and amplifies the count by x times where x is the number of repeats of self-excising segments.
04
Kill Switch
Multi-layered containment system designed with the objective of killing the host cells leaving no trace and whose initiation depends on previous devices. Several toxins are involved in this device, including RelE and NucA1, along with self-targeting single guides to ensure its safety.
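The counting capacity described for the second and third devices can be put into numbers with a minimal sketch. It assumes that "count up to 2^n" is an upper bound on the number of states and that "amplifies the count by x times" means a plain multiplication; the function names are hypothetical and the real achievable count may be lower.

```python
def linear_capacity(n_elements: int) -> int:
    """Events a linear counter can record: one event per storage element."""
    return n_elements


def counter_capacity(n_recombinases: int, n_segments: int = 1) -> int:
    """Upper bound on the count, assuming 2**n states per pass of the counter
    and one full pass per self-excising segment (assumption, see lead-in)."""
    if n_recombinases < 1 or n_segments < 1:
        raise ValueError("both arguments must be at least 1")
    return (2 ** n_recombinases) * n_segments


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, linear_capacity(n), counter_capacity(n), counter_capacity(n, 3))
```

With four recombinases the bound is 16 for a single pass, against 4 for a linear design with the same number of elements.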
In the following image you can see the design of our whole device and, by clicking on the boxes below, you can access our insight on each of the different parts and modules.
Promoters
Promoters are the main regulatory units for transcription; they define where transcription starts and under which conditions (iGEM, s.f.a).
Promoter 1: pBAD {BBa_I0500}
This promoter is induced by arabinose and is one of the best characterized for heterologous protein expression (Széliová et al., 2016). It is commonly used in iGEM and in synthetic biology in general (http://parts.igem.org/PBAD_Promoter_Family http://parts.igem.org/Part:BBa_I0500). Many of the other counter systems discussed in this project were developed and characterized with this promoter as their regulatory element (Roquet et al., 2016; Bonnet et al., 2012; Zhao et al., 2019; Yang et al., 2014).
Promoter 2: pnrd {BBa_K2070012}
This promoter is cell cycle dependent according to Sun (1992): it is a highly regulated promoter region that contains several DnaA boxes, indicating that its activity is partly affected by the initiation of DNA replication (Messer 2002). The nrd operon in E. coli expresses a ribonucleotide reductase, an enzyme that reduces ribonucleotides to deoxyribonucleotides and is involved in the bacterial cell cycle (Sun 1994). This part was previously used by Stanford-Brown in 2012 (https://2012.igem.org/Team:Stanford-Brown), who reported that the promoter turned on once every cell cycle. A variation of this promoter was also used by UT-Tokyo 2016 (link) to build a regulatory system in which gene expression changes after each cell division, which is very similar to our own purpose. We use this promoter so that our system is activated only once every cell cycle.
Recombinases
DNA site-specific recombinases are enzymes that catalyse the breaking and rejoining of DNA strands at specific points, thereby bringing about precise genetic rearrangements. Serine integrases are a group of recombinases with unusual properties such as recombination directionality and simple site requirements (Stark, 2017). According to Zhao et al. (2019), when inverting, serine recombinases recognize specific genetic sites, attB and attP (PB state), and invert the sequence between these sites, generating the new sites attL and attR (LR state). This inversion is unidirectional, meaning that another component must be present for the inversion to be reversed. This component is the RDF (recombination directionality factor), which, when co-expressed with the recombinase, allows the LR state to be switched back to PB (Zhao et al., 2019). Depending on the DNA arrangement, integrases allow control over the flow of RNA polymerase along the DNA, hence regulating gene expression (Merrick et al., 2018). The selected recombinases have been tested together and display orthogonality (Guiziou et al., 2019). They have also been characterized in several other chassis, making our circuit easier to extrapolate (Tomimatsu et al., 2017; Xu & Brown, 2016). Bxb1 and TP901 in particular have been used in similar proposals such as Hsiao et al. (2016) and Roquet et al. (2016). It’s important to highlight that the order assigned to these recombinases is not random, and was supported by our mathematical model.
Recombinase 1: Bxb1
We chose Bxb1 as the first integrase because it displayed the best recombination efficiency, yielding products with no residual substrate remaining, as reported by Wang et al. (2017), who found the activity of Bxb1 integrase to be clearly superior to that of the other integrases tested. In that study, Bxb1 integrase mediated efficient site-specific recombination in vitro.
Recombinase 2, 3 and 4: Represent TP901, Int5 and Int7 respectively.
There are several versions of each recombinase and other recombinases available in the registry http://parts.igem.org/Recombination/Other
Terminator
Terminators define the end of transcription. They can be uni- or bidirectional and are relevant to correct protein expression (iGEM, s.f.b).
Terminator 1: {BBa_B1002}
Bidirectional terminators to ensure the separation of the different modules, i.e.
Terminator 2: {BBa_B0010}
Only functions in its forward direction and can be used to regulate transcription of polycistronic elements, similarly to the proposal of Friedland et al. (2009), in which they built a DNA Invertase Cascade (DIC).
CRISPR+ligase
The CRISPR-Cas9 system, a gene editing system, depends on two main elements: a small RNA fragment containing the guide sequence that binds to the target DNA sequence, and the Cas9 enzyme, which is joined to the guide RNA. When the guide is transcribed, it binds to Cas9 and directs it to the target DNA sequence, which the enzyme then cuts, creating a double-strand break (DSB). This break can be used to delete, insert or replace DNA when coupled with the cell’s own repair machinery (MedlinePlus, s.f.). In our case, E. coli doesn’t have a non-homologous end joining (NHEJ) repair system, so a double-strand break can only be fixed by homologous recombination (HR). Since we don’t want to require the insertion of a template, we searched for an alternative and found that a T4 ligase can fix the break, creating indels that knock out the gene and prevent the cell from dying because of the broken DNA (Su et al., 2019). Su et al. (2019) report using a ligase to repair a DSB made by Cas9 in the LacZ gene in E. coli, leading to the survival of the cells. Su et al. (2016), in turn, report a different strategy, which involved designing the sgRNAs to avoid HR and using conserved prokaryotic NHEJ proteins from Mycobacterium tuberculosis H37Rv to repair complete gene deletions, proving that this is in fact possible even though in their case it had only 36% efficiency. We believe that by joining the best of each strategy, the ligase and the sgRNA design, and having a much smaller deletion target (which in turn is closer to DNA deletion consistency at 200bp), we can achieve a much better efficiency in both deletion and repair.
Cas9
http://parts.igem.org/Part:BBa_K1774001:Experience
Sg
Our single guides are to be designed to fulfill several requirements for an increased specificity and efficiency and decreased off-target effects as proposed by Liu et al. (2020).
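As a first step toward single guide selection, candidate target sites can be listed by scanning a sequence for a protospacer followed by a PAM. The sketch below is only an illustration: it assumes a Streptococcus pyogenes Cas9 with an NGG PAM and 20-nt protospacers (the text above does not state which Cas9 variant is used), it scans the forward strand only, and it does not implement the specificity and efficiency criteria of Liu et al. (2020). The function name is hypothetical.

```python
import re

# 20-nt protospacer followed by an NGG PAM (assumed SpCas9 convention).
# The lookahead lets overlapping candidates be reported.
PAM_SITE = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")


def find_candidate_targets(sequence: str):
    """Return (start, protospacer, pam) tuples found on the forward strand."""
    seq = sequence.upper()
    return [(m.start(), m.group(1), m.group(2)) for m in PAM_SITE.finditer(seq)]


if __name__ == "__main__":
    toy = "ttgacagctagctcagtcctaggtataatgctagcagg"  # arbitrary toy sequence
    for start, protospacer, pam in find_candidate_targets(toy):
        print(start, protospacer, pam)
```

Any candidate listed this way would still need to be screened for off-target matches in the host genome before use.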
Ligase
{BBa_K3917004}, based on Su et al. (2019).
Operator sites
“A common method of repressing gene expression in prokaryotes involves the binding of a protein to a DNA sequence (a.k.a. operator) near or overlapping its promoter” (Politz et al., 2013). In our design these operators were purposely left unspecified, since their identity can vary depending on available technology. The common requirement is that each operator be bound by a specific DNA-binding protein. A readily available way to do this is to take advantage of existing repressors such as LacI from the Lac operon or the phage-encoded lambda repressor (cI) (Lewis, 2011). The deactivated version of the Cas9 protein is also an alternative, but its location would have to be rethought, since it would need to be active and functional for all three cases (Xu & Qi, 2019). Finally, other synthetically designed DNA-binding proteins can be used, TALEs being among the most widely used of this category (Politz et al., 2013).
Multi-layered containment system
A multi-layered safeguard system is a series of barriers that can decrease the
escape frequency of synthetically modified microorganisms (Gallagher et al.,
2015). In this device, the main objective is to develop a condition-independent
biocontainment kill switch.
RelE: modified {BBa_K2449029}, removing the double terminator.
The toxin RelE, an mRNA interferase, inhibits translation by cutting mRNAs positioned at the A site of the ribosome. It cleaves the stop codons of the mRNA between the 2nd and 3rd nucleotide (https://www.uniprot.org/uniprot/P0C077), which blocks processes such as the binding of release factors, the release of formed polypeptides and the recycling of ribosomes (SJTU-BioX-Shanghai, 2009). This ultimately hinders correct translation and decreases cell growth. However, since mRNA interferases play a role in bacterial persistence to antibiotics (https://www.uniprot.org/uniprot/P0C077), other death mechanisms must be applied alongside it.
Nuclease A1: modified {BBa_K3027000}, removing its terminator.
The nuclease A1 is in charge of eliminating the genomic DNA without being associated with cell lysis (GO_Paris-Saclay, 2009), which is important because the genetic material remains contained during DNA excision. When the cell is left without DNA, some remnants, like ribosomes and housekeeping proteins, could still be transferred when interacting with other microorganisms and provide them with new functions, including and beyond our specific modification. DNA-less cells are still equivalent to cells with DNA (GO_Paris-Saclay, 2009) with the minimal genome to maintain metabolic homeostasis, reproduce, and evolve (Gil et al., 2004).
CRISPR: GMK target
The mechanism of action is described above (see CRISPR+ligase). In this module, it includes an sgRNA targeting the nucleoside monophosphate kinase Gmk, a multimeric enzyme whose knockout has been shown to reduce the nucleotide pool and to decrease growth and/or respiration rate, due to its specificity for the phosphate acceptor substrate (EcoCyc, 2012). We also aim to use another set of sgRNAs to generate breaks in our synthetic construct, so that no synthetic genomic remnants can be released. Single guides are to be designed as specified previously, targeting the Gmk gene, the synthetic circuit’s main promoter and its recombinases.
Plasmid
Heterologous expression in E. coli may be achieved through several vectors such as plasmids or bacterial artificial chromosomes, as well as by inserting DNA directly into the bacterial chromosome. The ideal implementation for our proposal is chromosomal insertion, in order to prevent cells from carrying mixed expression (Zhao et al, 2019), that is, several counters in different states. It also allows stable expression of foreign DNA without the need for antibiotic selection, with increased protein expression and decreased metabolic burden (Englaender et al., 2017). This strategy also decreases the likelihood of uptake of the synthetic DNA by other organisms. Nonetheless, if the circuit is to be implemented in a plasmid, the plasmid should be low-copy to reduce the noise caused by divergent expression between plasmid copies, and the selection mechanism should be auxotrophy, to allow for more diverse implementations.
Degradation tags and RBS optimization
For our system to be as efficient as possible, the expression of all the recombinases must be fine-tuned. One of the most common ways to do this is RBS selection, which balances the recombinases and their expression rates by regulating their translation (Jin et al., 2017). This is also how the 2:1 RDF:recombinase expression ratio needed for correct function can be achieved. Another relevant point for our system is that recombinases should do their work and then be degraded as quickly as possible so as not to interfere with other recombinases. This is particularly true for RDFs, since they are produced several generations before they participate. Hence, we researched ways to fast-track the degradation of proteins, particularly degradation tags, a mechanism for targeted proteolysis mediated by adding a tag to the coding sequence (McGinness et al., 2006). These have been used to improve the efficiency of counter systems such as that of the SYSU China Team (2015), and can similarly be applied to our circuit.
Evolutionary Stability
Evolutionary stability of recombinase based memory circuits
According to Fernandez-Rodriguez et al. (2015) one of the design principles for
long-term stability is the avoidance of constitutive expression, which we
have applied in our circuit. They also mention that “Placing the circuit in the
genome could also improve stability, as this is a common approach to improve
the robustness of strains in metabolic engineering” (Fernandez-Rodriguez et al.,
2015). Another consideration that must be made is the prevention of leaky
expression, since it can generate a drift into a different state and even
disable the ability to hold a state. In our circuit the expression of all
recombinases is regulated by the same promoter, so this issue is not expected to be
relevant to its performance. Canton et al. (2008) note that the evolutionary
stability is also dependent on the state in which the circuit is carried; the
authors note that a circuit in the OFF state can easily be carried over many
generations, whereas in the ON state it quickly breaks. This specific quality can be
problematic for our circuit, since one way or another it’s always turned on because
of the dependence on the cell-cycle promoter. Nonetheless, since different
activities are carried out in each specific state and there is no constant
expression of regulatory proteins, this might not be relevant to the circuit’s
stability, since “the burden is a function of regulator expression, the evolutionary
stability depends on the state in which the circuit is carried” (Fernandez-Rodriguez
et al., 2015).
This was seconded by Sleight et al. (2010), who state that “Evolutionary stability
is a problem in genetic circuits if there is no selective pressure to maintain
function of the circuit. The current belief is that this loss-of-function occurs
because any cell in the population that acquires a mutation in the genetic circuit
often has a growth advantage and can outcompete the cells in the population with all
functional plasmids”; hence, if there is no significant metabolic burden,
there is no advantage. Finally, we weren’t able to carry out an evolutionary
stability experiment of our own; however, extrapolating the results obtained by
Fernandez-Rodriguez et al. (2015) shows great promise for our circuit. They report a
stability of over 400 hours for their memory switch, and credit its breaking to
continuous invertase expression and the host’s evolution towards reducing its
expression. More importantly, Fernandez-Rodriguez et al. (2015) indicate that during
the 400 hour period, 6 cycles of their circuit, “there was no loss of performance,
reduction in dynamic range, or increase in population variability”. The main risks to
our circuit’s stability are transposable insertion elements and even homologous
recombination, which can be avoided by using diverse parts in the
construction of the circuit and using strains where insertion elements have been
deleted from the genome; nevertheless, both these strategies involve challenges of
their own, particularly the need for more robust strains for many applications, as
stated by Fernandez-Rodriguez et al. (2015). Finally, although evolutionary
stability must be considered when designing a circuit, the main focus should be on
the specific end application. The required robustness of a circuit will be directly
related to the duration of its application. “For example, circuits do not
need to be robust for months if designed for a one week
fermentation”(Fernandez-Rodriguez et al., 2015).
The time for the loss of function of a genetic circuit can also be predicted by
applying the function developed by Arkin & Fletcher (2006) through a
simulation study.
Evolutionary stability of kill switches
According to Stirling et al. (2017), “any kill switch with leaky, low-level
expression of a toxin in permissive conditions may be quickly disabled in rapidly
growing microbes.” In our circuit this is highly unlikely, since the expression of
the toxins is regulated by the same promoter that controls the expression of
recombinases, and between them stand several terminators. Furthermore, the toxins
from our kill-switch are to be expressed only once, and induced by the change of
state from the previous module (counter circuit), hence avoiding the complexities
of an environmentally induced kill switch. The necessity for induction is
another decisive factor in the stability of the circuit, since, while uninduced, there
is no selective pressure for it to be dismantled. This was shown by Pasotti
et al. (2011), who evaluated the stability of an inducible killswitch, finding it to
be stable for a maximum of 100 generations and strain dependent.
Further optimization for the final application can be done by applying tools such as
“Predicting the Genetic Stability of Engineered DNA Sequences with the EFM
Calculator” and making the necessary alterations regarding the sequence such as
altering codon usage in the different CDSs or swapping out promoters/RBSs of equal
strength (Jack et al., 2015).
For further characterization and application examples, check out our Proposed
Implementation page.
Modularity
Each of these modules can be applied separately to other proposals, modified accordingly, and even exchanged for others. For example, the final module can be changed to execute a specific function instead of inducing the cell’s death. Even the toxins from the multilayered containment system can be changed if a different death approach is desired, such as one triggered by specific conditions (like pH). As for our crucial module, the counter, its number of recombinases can easily be decreased or increased by redesigning it according to the general architecture of the circuit: a single promoter at the beginning, then the first recombination site of all the recombinases, then as many basic modules as required (RBS + recombinase + unidirectional terminator + second recombination site + RDF). *The final recombinase’s RDF must only be included if the reset is to be part of the sequential process.
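The general architecture described above can be written down as a small part-list builder. This is a minimal sketch: the part labels (such as `site1:Bxb1`), the function name and the default promoter label are hypothetical placeholders rather than registry identifiers, and the order of parts simply follows the sentence above.

```python
from typing import List, Sequence


def build_counter(recombinases: Sequence[str],
                  include_final_rdf: bool = False,
                  promoter: str = "pnrd") -> List[str]:
    """List the counter's parts in order for the given recombinases."""
    if not recombinases:
        raise ValueError("at least one recombinase is required")
    # Single promoter, then the first recombination site of every recombinase.
    parts = [f"promoter:{promoter}"]
    parts += [f"site1:{name}" for name in recombinases]
    last = len(recombinases) - 1
    # One basic module per recombinase.
    for index, name in enumerate(recombinases):
        parts += [f"RBS:{name}", f"recombinase:{name}",
                  f"terminator:{name}", f"site2:{name}"]
        # The final RDF is only included when the reset is part of the process.
        if index < last or include_final_rdf:
            parts.append(f"RDF:{name}")
    return parts


if __name__ == "__main__":
    print(build_counter(["Bxb1", "TP901", "Int5", "Int7"]))
```

Adding or removing a recombinase then amounts to changing the input list, which mirrors the modularity argument made in the text.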
The functioning is based on four main premises based on the architecture:
- The promoter will work one time per cell cycle.
- The terminator will only work in its forward direction.
- Each recombinase (when expressed) will irreversibly invert the DNA sequence when it is flanked by PB sites.
- Each recombinase together with its recombination directionality factor (RDF) (when both are expressed) will irreversibly invert the DNA sequence when it is flanked by LR sites.
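The last two premises can be illustrated with a small model of a single invertible segment. It is a sketch under assumptions: parts are represented as (name, strand) pairs, an inversion reverses their order and flips their strand, and any combination not covered by the premises (for example recombinase plus RDF on PB sites) is assumed to leave the segment unchanged. All names are hypothetical.

```python
from typing import List, Tuple

Part = Tuple[str, str]  # (name, strand), with strand "+" (forward) or "-" (reverse)


def invert(segment: List[Part]) -> List[Part]:
    """Reverse the order of the parts and flip each part's strand."""
    flip = {"+": "-", "-": "+"}
    return [(name, flip[strand]) for name, strand in reversed(segment)]


def recombine(sites: str, segment: List[Part],
              recombinase: bool, rdf: bool) -> Tuple[str, List[Part]]:
    """Apply one generation's expression to a segment flanked by `sites`."""
    if sites == "PB" and recombinase and not rdf:
        return "LR", invert(segment)      # premise 3: PB -> LR
    if sites == "LR" and recombinase and rdf:
        return "PB", invert(segment)      # premise 4: LR -> PB
    # Assumption: every other combination has no effect
    # (an RDF alone, or a recombinase alone on LR sites).
    return sites, list(segment)


if __name__ == "__main__":
    segment = [("recombinase1", "+"), ("terminator1", "+")]
    state, segment = recombine("PB", segment, recombinase=True, rdf=False)
    print(state, segment)
```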
Repress-Initialize
During lab work with the cloned bacteria, the inducer must be provided so that the counter stays repressed. Once the bacteria are ready to be used in their specific application, the inducer must be withdrawn from, or no longer supplied to, the culture. The repressor will then be degraded and the counter will start.
Counter Circuit
A
First generation
Only the first recombinase is transcribed and hence expressed. It irreversibly inverts the sequence between its recognition sites, switching them from the PB state to the LR state. This leaves the recombinase and RDF coding sequences in the same orientation (reverse).
B
Second generation
Because of the previous inversion, recombinase 1 and its RDF are in reverse orientation and the first terminator is now inactive, which allows the second recombinase to be expressed. Recombinase 2 inverts the sequence between its recognition sites, switching them from the PB state to the LR state. This leaves the recombinase 2 and RDF 2 coding sequences in the same orientation (reverse).
C
Third generation
Recombinase 1, its RDF and the terminator are now on the correct strand. Only recombinase 1 and its RDF are expressed, which allows the sites in the LR state to be inverted back into the PB state. This leaves the RDF 1 coding sequence in the forward orientation and recombinase 1 in the reverse.
D
Fourth generation
Recombinase 1, and recombinase 2 with RDF 2, are on the reverse strand and the first two terminators are inactive; thus, only the RDF of recombinase 1 and the third recombinase are expressed. The RDF has no effect without its corresponding recombinase. Recombinase 3 irreversibly inverts the sequence between its recognition sites, switching them from the PB state to the LR state. This leaves the recombinase 3 and RDF 3 coding sequences in the same orientation (reverse).
E
Fifth generation
RDF 2 and recombinase 1 are expressed. RDF 2 has no effect without its corresponding recombinase. Recombinase 1 irreversibly inverts the sequence between its recognition sites, switching them from the PB state to the LR state. This leaves the recombinase 1 and RDF 1 coding sequences in the same orientation (reverse).
F
Sixth generation
Only recombinase 2 and RDF 2 are expressed, which allows the sites in the LR state to be inverted back into the PB state. This leaves the RDF 2 coding sequence in the forward orientation and recombinase 2 in the reverse.
G
Seventh generation
The RDF of recombinase 2 and the first recombinase with its corresponding RDF are expressed. RDF 2 has no effect without its corresponding recombinase. Recombinase 1, together with RDF 1, inverts the sequence between its recognition sites from the LR state back into the PB state. This leaves the RDF 1 coding sequence in the forward orientation and recombinase 1 in the reverse.
H
Eighth generation
The fourth recombinase is expressed and inverts the complete circuit, which is located between its PB sites, switching them to the LR conformation.
I
Ninth generation
The RDF of recombinase 3 and recombinase 1 are expressed, and step 1 is repeated. RDF 3 has no effect without its corresponding recombinase. Recombinase 1 irreversibly inverts the sequence between its PB recognition sites, which are now in the LR conformation. This leaves the recombinase 1 and RDF 1 coding sequences in the same orientation (reverse).
J
Tenth generation
RDF 3 and recombinase 2 are expressed. RDF 3 has no effect without its corresponding recombinase. Recombinase 2 irreversibly inverts the sequence between its PB recognition sites, which are now in the LR conformation. This leaves the recombinase 2 and RDF 2 coding sequences in the same orientation (reverse).
K
Eleventh generation
RDF 3, recombinase 1 and RDF 1 are expressed. RDF 3 has no effect without its corresponding recombinase. Recombinase 1 and RDF 1 allow the sites in the LR conformation to be inverted back into the PB conformation. This leaves the RDF 1 coding sequence in the forward orientation and recombinase 1 in the reverse.
L
Twelfth generation
RDF 1, recombinase 3 and RDF 3 are expressed. RDF 1 has no effect without its corresponding recombinase. Recombinase 3 and RDF 3 allow the sites in the LR conformation to be inverted back into the PB conformation. This leaves the RDF 3 coding sequence in the forward orientation and recombinase 3 in the reverse.
M
Thirteenth generation
RDF 3, recombinase 2 and RDF 2 are expressed. RDF 3 has no effect without its corresponding recombinase. Recombinase 2 and RDF 2 allow the sites in the LR conformation to be inverted back into the PB conformation. This leaves the RDF 2 coding sequence in the forward orientation and recombinase 2 in the reverse.
N
Fourteenth generation
RDF 2, RDF 3, recombinase 1 and RDF 1 are expressed. RDF 2 and RDF 3 have no effect without their corresponding recombinases. Recombinase 1 and RDF 1 allow the sites in the LR conformation to be inverted back into the PB conformation. This leaves the RDF 1 coding sequence in the forward orientation and recombinase 1 in the reverse.
O
Fifteenth generation
The counter reaches its end and transcription is able to carry on to the next device.
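In the first eight generations above, the recombinase whose sites change state follows the order 1, 2, 1, 3, 1, 2, 1, 4, which is the order in which bits change in a reflected binary Gray code. The sketch below makes that observation checkable. It is our abstraction, not a model taken from the cited works: each recombinase's site pair is reduced to one bit (0 = PB, 1 = LR), part orientation and expression dynamics are ignored, and no claim is made about generations after the eighth. The function names are hypothetical.

```python
from typing import List


def gray(n: int) -> int:
    """Reflected binary Gray code of n."""
    return n ^ (n >> 1)


def flipped_site(generation: int) -> int:
    """1-based index of the site pair that changes state in this generation."""
    changed = gray(generation) ^ gray(generation - 1)  # exactly one bit is set
    return changed.bit_length()


def site_states(generation: int, n_recombinases: int = 4) -> List[str]:
    """PB/LR state of each site pair after the given generation."""
    code = gray(generation)
    return ["LR" if (code >> i) & 1 else "PB" for i in range(n_recombinases)]


if __name__ == "__main__":
    for g in range(1, 9):
        print(g, "recombinase", flipped_site(g), site_states(g))
```

Because only one site pair changes per generation in this scheme, each state differs from the previous one by a single inversion, matching the one-event-per-cell-cycle premise.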
Expand Count
A single segment is analyzed in this animation. It consists of an inducer for the inducible promoter that controls the fourth recombinase, a Cas9 and a ligase, the recombinase’s directionality factor, which allows it to invert the sequence from LR to PB, and two single guides that enable the excision of the segment through Cas9’s action. Hence, one by one, the auto-excisable segments are expressed and deleted until the last one is reached, giving way for the last module to be expressed. It’s important to note that only a slight change in terminator sequence is needed to yield dysfunctional activity for each module.
Kill Switch
When the RelE toxin is expressed, translation of proteins slowly comes to a halt. Simultaneously genomic DNA will be destroyed by the NucA toxin, the Gmk gene will be targeted by the Cas9 and our construct will be destroyed by this very same enzyme. The coupled action of these enzymes will ultimately kill the cell without releasing its genetic contents.
Overcount Prevention
“A potential challenge in designing a robust biological counter is the ability to count at completion of the event. The existing designs of the counters are sensitive to the pulse duration–a brief pulse will be ignored and a lengthy pulse can cause the counter to count ahead” (Noman et al., 2016). Given the concerns over the regulation of the count expressed by Dr. Pakpoom and Dr. Bonnet, further explained in engineering success, we designed a module to prevent overcount. This system works by placing a protein inducer for an external promoter, together with an operator site, after the main promoter. When active, the external promoter produces a repressor, which in turn binds to the operator site and prevents transcription. The central concept is affinity: both the protein inducer’s and the operator’s affinities are tunable, the former more than the latter. Once the main promoter receives a pulse, the protein inducer is expressed along with the respective recombinase, depending on the circuit’s state. If designed correctly, the affinities of the protein inducer and the repressor will be such that exactly the right amount of recombinase is produced to change state once, without changing it twice or being shut off before it can act. We designed a construct to test this hypothesis; you can check our work on it in Wet Lab.
Another possible solution to the problem of responding to a continuous pulse would be to include an incoherent feedforward loop to regulate the type of pulse the system responds to, as suggested by Rodrigo Mora, PhD. This type of standardized regulation consists of an activator which “regulates both a gene and a repressor of the gene” (Goentoro et al., 2009), similar to the manner in which we proposed our overcount system.
Promoters
Additionally, we propose the use of inducible promoters (such as pBAD) instead of pnrd for user-controlled counter systems. The counter could even be associated with specific environmental conditions such as pH, temperature or the presence of certain molecules, which can easily be done by changing the initiating module to one dependent on these conditions.
Containment
Plasmid specific
A toxin-antitoxin system could also be used for regulation and for the actual kill (fourth device). In our proposal it would be counterproductive to have another regulatory device of this magnitude, but it could be useful for systems in which a plasmid needs to be added and contained. For example, the toxin could be placed on the plasmid and the respective antitoxin on the genome, which hinders horizontal gene transfer because a microorganism that receives the plasmid gets the toxin but not the antitoxin it would need to persist (Torres et al., 2016). However, toxin-antitoxin systems are usually complemented by a synthetic auxotrophy, in which cells can only grow in the presence of an exogenously supplied metabolite; this limits cell survival to the desired conditions but carries some risks (Chan et al., 2016). Toxin-antitoxin systems have been reported to move readily between genomes through horizontal gene transfer, and their stability in the genome is still uncertain because of the accumulation of mutations through genetic drift (Van Melderen & Saavedra De Bast, 2009). Auxotrophy, in turn, could be satisfied by metabolite cross-feeding or by essential small molecules present in the environment (Gallagher et al., 2015); besides, it requires extensive genome-wide engineering, and reprogramming it for different environmental conditions is complicated (Chan et al., 2016). Nevertheless, we propose Gmk (https://www.uniprot.org/uniprot/P60546#sequences) as an auxotrophy gene, since it has been reported as essential using the single-gene knockout technique (Baba et al., 2006). It is important to note that some toxin-antitoxin systems are able to lyse cells (Van Melderen & Saavedra De Bast, 2009), which is not beneficial for the timing of the whole system we propose.
Specific toxins
If a toxin could be designed to target our strain of E. coli and only our strain, it could be released from the cells that first reach the end of the counter, reducing the noise of signal processing from each cell and ensuring that even cells with flaws in the system are killed.
Growth phase association: quorum sensing
Suggested by our users, namely M.Sc. Alexander Schmidt, this proposal can be achieved in the same manner as the one described in the previous “Promoters” section. Nonetheless, its implications are much more extensive, since “quorum sensing allows groups of bacteria to synchronously alter behaviour in response to changes in the population density and species composition of the vicinal community” (Mukherjee & Bassler, 2019). “QS is likely to control behaviours that are crucial for the development and success of these communities in diverse environments such as the human gut and chronic infections in humans” (Whiteley et al., 2017). According to Wu et al. (2020), “Dynamic control of bacterial populations usually includes population size control, dynamic metabolic engineering for desirable products and the regulation of various physiological activities”. This regulation mechanism has been used in the development “of various genetic circuits such as genetic oscillators, toggle switches and logic gates with AHL-based QS devices in Gram-negative bacteria” (Wu et al., 2020).
Adan, A., Alizada, G., Kiraz, Y., Baran, Y., & Nalbant, A. (2017). Flow cytometry: basic
principles and applications. Critical reviews in biotechnology, 37(2), 163-176.
https://doi.org/10.3109/07388551.2015.1128876
Arkin, A. P., & Fletcher, D. A. (2006). Fast, cheap and somewhat in control. Genome
biology, 7(8), 1-6. https://doi.org/10.1186/gb-2006-7-8-114
Baba, T., Ara, T., Hasegawa, M., Takai, Y., Okumura, Y., Baba, M., & Mori, H. (2006).
Construction of Escherichia coli K‐12 in‐frame, single‐gene knockout mutants: the Keio
collection. Molecular systems biology, 2(1), 2006-0008.
https://doi.org/10.1038/msb4100050
Beal, J., Baldwin, G. S., Farny, N. G., Gershater, M., Haddock-Angelli, T.,
Buckley-Taylor, R., & iGEM Interlab Study Contributors. (2021). Comparative analysis of
three studies measuring fluorescence from engineered bacterial genetic constructs. PloS
one, 16(6), e0252263. https://doi.org/10.1371/journal.pone.0252263
Beal, J., Haddock-Angelli, T., Farny, N., & Rettberg, R. (2018). Time to get serious
about measurement in synthetic biology. Trends in biotechnology, 36(9), 869-871.
https://doi.org/10.1016/j.tibtech.2018.05.003
Bonnet, J., Subsoontorn, P., & Endy, D. (2012). Rewritable digital data storage in live
cells via engineered control of recombination directionality. Proceedings of the
National Academy of Sciences, 109(23), 8884-8889.
https://doi.org/10.1073/pnas.1202344109
Canton, B., Labno, A., & Endy, D. (2008). Refinement and standardization of synthetic
biological parts and devices. Nature biotechnology, 26(7), 787-793.
https://doi.org/10.1038/nbt1413
Chan, C. T., Lee, J. W., Cameron, D. E., Bashor, C. J., & Collins, J. J. (2016).
'Deadman' and 'Passcode' microbial kill switches for bacterial containment. Nature
chemical biology, 12(2), 82-86. https://doi.org/10.1038/nchembio.1979
EcoCyc. [Database]. Escherichia coli K-12 substr. MG1655 reference genome (EcoCyc).
https://biocyc.org/gene?orgid=ECOLI&id=GUANYL-KIN-MONOMER#
Englaender, J. A., Jones, J. A., Cress, B. F., Kuhlman, T. E., Linhardt, R. J., &
Koffas, M. A. (2017). Effect of genomic integration location on heterologous protein
expression and metabolic engineering in E. coli. ACS synthetic biology, 6(4), 710-720.
https://doi.org/10.1021/acssynbio.6b00350
Fedorec, A. J., Robinson, C. M., Wen, K. Y., & Barnes, C. P. (2020). FlopR: an open
source software package for calibration and normalization of plate reader and flow
cytometry data. ACS synthetic biology, 9(9), 2258-2266.
https://doi.org/10.1021/acssynbio.0c00296
Fernandez-Rodriguez, J., Yang, L., Gorochowski, T. E., Gordon, D. B., & Voigt, C. A.
(2015). Memory and combinatorial logic based on DNA inversions: dynamics and
evolutionary stability. ACS synthetic biology, 4(12), 1361-1372.
https://doi.org/10.1021/acssynbio.5b00170
Fredrickson, J. K., Bentjen, S. A., Bolton Jr, H., Li, S. W., Ligotke, M. W., McFadden,
K. M., & Van Voris, P. (1989). Evaluation of terrestrial microcosms for assessing the
fate and effects of genetically engineered microorganisms on ecological processes (No.
PNL-6850). Pacific Northwest Lab., Richland, WA (USA). https://doi.org/10.2172/6294057
Friedland, A. E., Lu, T. K., Wang, X., Shi, D., Church, G., & Collins, J. J. (2009).
Synthetic gene networks that count. Science, 324(5931), 1199-1202.
https://doi.org/10.1126/science.1172005
Gallagher, R. R., Patel, J. R., Interiano, A. L., Rovner, A. J., & Isaacs, F. J. (2015).
Multilayered genetic safeguards limit growth of microorganisms to defined environments.
Nucleic acids research, 43(3), 1945-1954. https://doi.org/10.1093/nar/gku1378
Gil, R., Silva, F. J., Peretó, J., & Moya, A. (2004). Determination of the core of a
minimal bacterial gene set. Microbiology and Molecular Biology Reviews, 68(3), 518-537.
https://doi.org/10.1128/MMBR.68.3.518-537.2004
Goentoro, L., Shoval, O., Kirschner, M. W., & Alon, U. (2009). The incoherent
feedforward loop can provide fold-change detection in gene regulation. Molecular cell,
36(5), 894-899. https://doi.org/10.1016/j.molcel.2009.11.018
Guiziou, S., Mayonove, P., & Bonnet, J. (2019). Hierarchical composition of reliable
recombinase logic devices. Nature communications, 10(1), 1-7.
https://doi.org/10.1038/s41467-019-08391-y
Heyde, K. C., & Ruder, W. C. (2015). Exploring host-microbiome interactions using an in
silico model of biomimetic robots and engineered living cells. Scientific reports, 5(1),
1-12. https://doi.org/10.1038/srep11988
Howell, M., Daniel, J. J., & Brown, P. J. (2017). Live cell fluorescence microscopy to
observe essential processes during microbial cell growth. Journal of visualized
experiments: JoVE, (129). https://doi.org/10.3791/56497
Hsiao, V., Hori, Y., Rothemund, P. W., & Murray, R. M. (2016). A population‐based
temporal logic gate for timing and recording chemical events. Molecular systems biology,
12(5), 869. https://doi.org/10.15252/msb.20156663
Inamori, Y., Murakami, K., Sudo, R., Kurihara, Y., & Tanaka, N. (1992). Environmental
assessment method for field release of genetically engineered microorganisms using
microcosm systems. Water Science and Technology, 26(9-11), 2161-2164.
https://doi.org/10.2166/wst.1992.0686
Jack, B. R., Leonard, S. P., Mishler, D. M., Renda, B. A., Leon, D., Suárez, G. A., &
Barrick, J. E. (2015). Predicting the genetic stability of engineered DNA sequences with
the EFM calculator. ACS synthetic biology, 4(8), 939-943.
https://doi.org/10.1021/acssynbio.5b00068
Jin, E., Wong, L., Jiao, Y., Engel, J., Holdridge, B., & Xu, P. (2017). Rapid evolution
of regulatory element libraries for tunable transcriptional and translational control of
gene expression. Synthetic and systems biotechnology, 2(4), 295-301.
https://doi.org/10.1016/j.synbio.2017.10.003
Kim, S., Jeong, H., Kim, E. Y., Kim, J. F., Lee, S. Y., & Yoon, S. H. (2017). Genomic
and transcriptomic landscape of Escherichia coli BL21 (DE3). Nucleic acids research,
45(9), 5285-5293. https://doi.org/10.1093/nar/gkx228
Lewis, M. (2011). A tale of two repressors. Journal of molecular biology, 409(1), 14-27.
https://doi.org/10.1016/j.jmb.2011.02.023
Liu, G., Zhang, Y., & Zhang, T. (2020). Computational approaches for effective CRISPR
guide RNA design and evaluation. Computational and structural biotechnology journal, 18,
35-44. https://doi.org/10.1016/j.csbj.2019.11.006
de Lorenzo, V. (2009). Recombinant bacteria for environmental release: what went wrong
and what we have learnt from it. Clinical Microbiology and Infection, 15, 63-65.
https://doi.org/10.1111/j.1469-0691.2008.02683.x
McGinness, K. E., Baker, T. A., & Sauer, R. T. (2006). Engineering controllable protein
degradation. Molecular cell, 22(5), 701-707.
https://doi.org/10.1016/j.molcel.2006.04.027
McKinnon, K. M. (2018). Flow cytometry: an overview. Current protocols in immunology,
120(1), 5-1. https://doi.org/10.1002/cpim.40
MedlinePlus. (n.d.). What are genome editing and CRISPR-Cas9?
https://medlineplus.gov/genetics/understanding/genomicresearch/genomeediting/
Merrick, C. A., Zhao, J., & Rosser, S. J. (2018). Serine integrases: advancing synthetic
biology. ACS synthetic biology, 7(2), 299-310. https://doi.org/10.1021/acssynbio.7b00308
Mukherjee, S., & Bassler, B. L. (2019). Bacterial quorum sensing in complex and
dynamically changing environments. Nature Reviews Microbiology, 17(6), 371-382.
https://doi.org/10.1038/s41579-019-0186-5
Noman, N., Inniss, M., Iba, H., & Way, J. C. (2016). Pulse detecting genetic circuit–a
new design approach. PLoS One, 11(12), e0167162.
https://doi.org/10.1371/journal.pone.0167162
Pasotti, L., Zucca, S., Lupotto, M., De Angelis, M. G. C., & Magni, P. (2011).
Characterization of a synthetic bacterial self-destruction device for programmed cell
death and for recombinant proteins release. Journal of biological engineering, 5(1),
1-12. https://doi.org/10.1186/1754-1611-5-8
Perez-Garcia, O., Lear, G., & Singhal, N. (2016). Metabolic network modeling of
microbial interactions in natural and engineered environmental systems. Frontiers in
microbiology, 7, 673. https://doi.org/10.3389/fmicb.2016.00673
Politz, M. C., Copeland, M. F., & Pfleger, B. F. (2013). Artificial repressors for
controlling gene expression in bacteria. Chemical communications, 49(39), 4325-4327.
https://doi.org/10.1039/c2cc37107c
Roquet, N., Soleimany, A. P., Ferris, A. C., Aaronson, S., & Lu, T. K. (2016). Synthetic
recombinase-based state machines in living cells. Science, 353(6297).
https://doi.org/10.1126/science.aad8559
Sleight, S. C., Bartley, B. A., Lieviant, J. A., & Sauro, H. M. (2010). Designing and
engineering evolutionary robust genetic circuits. Journal of biological engineering,
4(1), 1-20. https://doi.org/10.1186/1754-1611-4-12
Stark, W. M. (2017). Making serine integrases work for us. Current opinion in
microbiology, 38, 130-136. https://doi.org/10.1016/j.mib.2017.04.006
Stirling, F., Bitzan, L., O’Keefe, S., Redfield, E., Oliver, J. W., Way, J., & Silver,
P. A. (2017). Rational design of evolutionarily stable microbial kill switches.
Molecular cell, 68(4), 686-697. https://doi.org/10.1016/j.molcel.2017.10.033
Su, T., Liu, F., Gu, P., Jin, H., Chang, Y., Wang, Q., & Qi, Q. (2016). A CRISPR-Cas9
assisted non-homologous end-joining strategy for one-step engineering of bacterial
genome. Scientific reports, 6(1), 1-11. https://doi.org/10.1038/srep37895
Su, T., Liu, F., Chang, Y., Guo, Q., Wang, J., Wang, Q., & Qi, Q. (2019). The phage T4
DNA ligase mediates bacterial chromosome DSBs repair as single component non-homologous
end joining. Synthetic and systems biotechnology, 4(2), 107-112.
https://doi.org/10.1016/j.synbio.2019.04.001
Széliová, D., Krahulec, J., Šafránek, M., Lišková, V., & Turňa, J. (2016). Modulation of
heterologous expression from PBAD promoter in Escherichia coli production strains.
Journal of biotechnology, 236, 1-9. https://doi.org/10.1016/j.jbiotec.2016.08.004
Team: Cambridge. (2009). E.chromi. The International Genetically Engineered Machine
iGEM. https://2009.igem.org/Team:Cambridge/Project/Amplification/Characterisation
Team: Groningen. (2011). Count coli. The International Genetically Engineered Machine
iGEM. https://2011.igem.org/Team:Groningen/project_description
Team: Paris Liliane Bettencourt. (2010). Memo-cell. The International Genetically
Engineered Machine iGEM.
https://2010.igem.org/Team:Paris_Liliane_Bettencourt/Project/Memo-cell
Team: SJTU-BioX-Shanghai. (2009). E.coli the napper. The International Genetically
Engineered Machine iGEM.
https://2009.igem.org/Team:SJTU-BioX-Shanghai/Project_design#Overview
Team: Stanford-Brown. (2012). The transit of synthetic astrobiology. The International
Genetically Engineered Machine iGEM. https://2012.igem.org/Team:Stanford-Brown
Team: SYSU China. (2015). Micro-time system. The International Genetically Engineered
Machine iGEM. https://2015.igem.org/Team:SYSU_CHINA/Design
Team: UT-Tokyo. (2016). Like does not beget like. The way to make a 100-stage cycle. The
International Genetically Engineered Machine iGEM.
https://2016.igem.org/Team:UT-Tokyo/Project#The_way_to_make_a_100-stage_cycle
The International Genetically Engineered Machine iGEM. (n.d.a). Promoters/Catalog.
https://parts.igem.org/Promoters/Catalog
The International Genetically Engineered Machine iGEM. (n.d.b). Terminators.
https://parts.igem.org/Terminators
Tomimatsu, K., Kokura, K., Nishida, T., Yoshimura, Y., Kazuki, Y., Narita, M., &
Ohbayashi, T. (2017). Multiple expression cassette exchange via TP 901‐1, R4, and Bxb1
integrase systems on a mouse artificial chromosome. FEBS open bio, 7(3), 306-317.
https://doi.org/10.1002/2211-5463.12169
Torres, L., Krüger, A., Csibra, E., Gianni, E., & Pinheiro, V. B. (2016). Synthetic
biology approaches to biological containment: pre-emptively tackling potential risks.
Essays in Biochemistry, 60(4), 393-410. https://doi.org/10.1042/EBC20160013
Van Melderen, L., & Saavedra De Bast, M. (2009). Bacterial toxin–antitoxin systems: more
than selfish entities? PLoS genetics, 5(3), e1000437.
https://doi.org/10.1371/journal.pgen.1000437
Wang, X., Tang, B., Ye, Y., Mao, Y., Lei, X., Zhao, G., & Ding, X. (2017). Bxb1
integrase serves as a highly efficient DNA recombinase in rapid metabolite pathway
assembly. Acta biochimica et biophysica Sinica, 49(1), 44-50.
https://doi.org/10.1093/abbs/gmw115
Whiteley, M., Diggle, S. P., & Greenberg, E. P. (2017). Progress in and promise of
bacterial quorum sensing research. Nature, 551(7680), 313-320.
https://doi.org/10.1038/nature24624
Wu, S., Liu, J., Liu, C., Yang, A., & Qiao, J. (2020). Quorum sensing for
population-level control of bacteria and potential therapeutic applications. Cellular
and Molecular Life Sciences, 77(7), 1319-1343.
https://doi.org/10.1007/s00018-019-03326-8
Xu, Z., & Brown, W. R. (2016). Comparison and optimization of ten phage encoded serine
integrases for genome engineering in Saccharomyces cerevisiae. BMC biotechnology, 16(1),
1-10. https://doi.org/10.1186/s12896-016-0241-5
Xu, X., & Qi, L. S. (2019). A CRISPR–dCas toolbox for genetic engineering and synthetic
biology. Journal of molecular biology, 431(1), 34-47.
https://doi.org/10.1016/j.jmb.2018.06.037
Yang, L., Nielsen, A. A., Fernandez-Rodriguez, J., McClune, C. J., Laub, M. T., Lu, T.
K., & Voigt, C. A. (2014). Permanent genetic memory with >1-byte capacity. Nature
methods, 11(12), 1261-1266. https://doi.org/10.1038/nmeth.3147
Zhao, J., Pokhilko, A., Ebenhöh, O., Rosser, S. J., & Colloms, S. D. (2019). A
single-input binary counting module based on serine integrase site-specific
recombination. Nucleic acids research, 47(9), 4896-4909.
https://doi.org/10.1093/nar/gkz245
Our motivation
When we decided to explore sequential logic in biology through our software, we
encountered two decisive situations from which this conceptual framework was born.
First, after several rounds of interviews, we found that most experts we talked to did
not know about the relationship between sequential logic and biology. Second, as
this is a rather unexplored approach, there is minimal literature and guidance on
how to assess a sequentially dependent biological system or how to conceptualize a
circuit as a state machine. Due to the lack of information on the topic, we first had to
discuss extensively and define foundational information in order to establish the right
approach for the software's design. Therefore, we chose to compile what we had gathered
and expand on it to generate a guide that would help not only us as the
developers of the software, but also its immediate users and even those
who wish to implement another tool or just better understand sequential
logic in biology.
Our inspiration
Even though the application of sequential logic in synthetic biology is rather
unexplored, the area has a lot of potential. As Roquet et al. (2016) remarked,
despite the potential of functional state machines to be transformative tools in the
understanding and engineering of complex biological systems, their implementation
has been hindered by the absence of a framework that would offer a scalable
and generalizable approach. The proposal Roquet et al. (2016) developed
was a recombinase-based framework for the implementation of state machines in cells
that encodes the state in a particular DNA sequence; they expanded on it with a large,
searchable database of such state machine registers to help researchers design
circuits. Earlier, Oishi and Klavins (2014) proposed a framework for building
finite state machines with engineered regulatory networks based on repressing
transcription factors. However, these approaches pursue highly specific methods and do
not focus on a conceptual guide for a wide range of functionalities. Therefore, a more
general and modular toolset for the exploration of sequential logic based
synthetic biology would greatly benefit scientists and professionals with different
levels of knowledge.
Our goal
We set as our objective the proposal of a framework composed of a set of rules and
concepts for this approach, which would be the base for the exploration of an
incipient field and for the development of novel tools. Such a guide
would serve as a structure upon which to design, construct, model and analyze, either
computationally or manually, synthetic gene circuits from a sequential logic standpoint.
Both the purpose of this framework and the process by which we built it consisted of
asking the questions that represented our inquiries and offering the corresponding answers
to help and guide others. Such questions and answers would begin a discussion to further
advance and understand this area. Beyond the applications that can be given to this
framework and the work it can support, it also offers a comprehensive and
paradigm-shifting guide in itself; therefore anyone who comes across it can dive
into the analysis and advantages of biological systems as sequential systems, providing
them with a new perspective.
Here we propose a conceptual framework: an efficient and modular approach to
tackling sequentially dependent biological tasks, specifically for
synthetic genetic circuits and their design. It is a proposal on how to describe, at a
high level of abstraction, the biological elements involved in these types of
sequential applications, such as parts, circuits, their environment and their
interactions, and how to assemble them to simulate their behavior as a finite state
machine.
01
Objective
The objective of this framework is to conceptualize, in an abstract and modular
manner, biological processes as state machines; specifically synthetic
genetic circuits for the exploration of their structure and behavior from a
sequential logic point of view.
02
Scope and Applications
Even though we expound the characteristics of the biological problems that match
this approach, it is important to note from the beginning that this is a
proposal for complex, time- and memory-dependent, or regulatorily intricate
systems. Therefore, other circuits and systems exist whose behavior does not
need to be abstracted as sequential logic. For more specific examples of
applications, go to implementation, where we expand on the uses proposed by us and
by the potential users we talked to.
03
Foundational Concepts
To better understand our approach, it is important to know that the framework is
supported by three key concepts: sequential logic, abstraction and
modularity. These foundational concepts are core to our project as they
guide the development of the framework and are aligned with our purpose.
Logic
The conceptualization of a genetic circuit's function as sequential logic can be
daunting and hard to replicate consistently. There are a number of elements that must be
considered to properly represent the circuit's structure and behavior. In a sequential
system, the output depends not only on the present inputs but also on the current
state. For synthetic biology, a circuit and its environment can be
portrayed as a sequential system or state machine when the behavior can be described
by states whose transitions are controlled by time-ordered inputs and are affected
by past behavior and inputs. The modification and behavior of circuits can be
explored through the sequential progress of states. Exploring the potential states of a
circuit requires repeated simulations of its changes and behavior; hence, there must
exist an iteration-ready approach to tackle these types of problems. A biological
process, natural or engineered, that is time-dependent, has highly interconnected
regulatory networks, behaves differently with repeating inputs, or requires
information to be saved stably (Letsou & Cai, 2016; Madec, Rosati, &
Lallement, 2021; Magdevska et al., 2017; Oishi & Klavins, 2014; Roquet et
al., 2016) would greatly benefit from being conceptualized and analyzed as a
sequential logic problem. Moreover, the ability to sequentially modulate the system's
response offers high scalability as it enables larger or more complex applications
(Zúñiga et al., 2020).
Abstraction
To perform a thorough exploration of the design space (i.e. possible
architectures), the framework follows a highly abstract approach. This is necessary
since there is a direct relationship between biological accuracy and modeling
complexity; therefore, by abstracting most of the biological details we are able to
focus on performance, simplicity, modularity and expandability. This tradeoff is
valuable from a design perspective where it is more efficient to perform a coarse
analysis unlimited and unbiased by biological technicalities and then focus on
fine-tuning an oriented and pre-processed set of circuits, based on a specific use-case.
The framework operates within the bounds of what is biologically viable, but the more
complex biological challenges that arise when implementing a circuit, such as
transcription rates or residual protein concentration, are meant to be addressed at
later stages and solved through other existing solutions or tools. This
approach allows for a general-purpose framework.
Modularity
Every biological system is different and should not be represented by a static model or
relation. Conceptualizations and models such as truth tables work with the coupling of
multiple modules but focus on small scale inputs and outputs rather than an overall
behaviour and functionality. When we build a device we focus on processes rather than
specific responses, which is why we propose an execution through atomic functions that
can be stacked to achieve more complex behavior while maintaining simplicity and
modularity. The abstract representation we propose uses a modular approach in
terms of composition, allowing the user to adjust, append or remove concepts
according to the application. Everything is built upon a primary building block
to enable modularity and scalability.
The framework is divided into three stages to address all relevant aspects. The first
stage, biological state machines, expands on how a circuit, cell or gene regulatory
network can be a sequential system, and dives into sequential
logic applied in synthetic biology and its characteristics. The second, the
description phase, defines the concepts involved in the rest of the framework: an
approach to conceptualize biological elements and their interactions. Lastly, the
execution stage describes how to assess and develop circuits as state machines; it
consists of the coupling of the first two, to fully describe and simulate, from a
sequential logic standpoint, biological processes and regulatory networks from synthetic
circuits.
When?
In order to conceptualize a circuit as a state machine, its behavior must
depend on both past and present inputs.
How?
The system’s input is the composition of the circuit's environment.
Meanwhile, the output is an expression profile related to the state.
A state is defined by two elements: the circuit architecture, i.e.
the arrangement of genetic parts or base pairs, and the gene expression profile.
These elements were inspired by previous literature (Fritz et al., 2007; Oishi &
Klavins, 2014; Roquet et al., 2016), as we considered multiple perspectives
both necessary and advantageous. Therefore, a change in state is caused by a change in
the DNA sequence, by a change in a gene's expression, or by a change in both. The first
element, the circuit's architecture, corresponds to the order, orientation and sequence
of parts; thus a DNA modification caused by insertion, deletion, substitution, inversion
or mutation causes a change in state. For this, enzymes that modify the DNA strand
are essential. This is a highly evolutionarily stable way to store the state, as it is
encoded in the DNA and does not require constant gene expression (with the corresponding
metabolic burden) (Roquet et al., 2016). On the other hand, the expression
profile refers to which genes are transcribed and their expression levels, which can be
reflected in the environment. By extension, it considers gene regulatory networks and
epigenetic mechanisms that affect the functionality of the circuit. Although a state
defined by this aspect is not as stable, the aim is to treat
relevant alterations as state changes, since they will affect the response to future
inputs. If a change in expression causes a change in architecture, or a change in
architecture causes a change in expression, both can be grouped
into the same state change.
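The two-element definition of a state given above can be written down as a small data structure. The following Python sketch is one possible encoding, not the representation used by our software: the class name `State`, its fields and the helper `is_state_change` are hypothetical, and the parts listed are placeholders.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class State:
    """A state as the pair (circuit architecture, expression profile)."""
    # Ordered parts, each written as (part name, orientation "+" or "-").
    architecture: Tuple[Tuple[str, str], ...]
    # Names of the genes that are currently expressed.
    expression: FrozenSet[str]


def is_state_change(before: State, after: State) -> bool:
    # A change in architecture, in expression, or in both counts as one change.
    return before != after


# Hypothetical example: an inversion flips gfp so that it becomes expressed.
before = State(architecture=(("promoter", "+"), ("gfp", "-")),
               expression=frozenset())
after_inversion = State(architecture=(("promoter", "+"), ("gfp", "+")),
                        expression=frozenset({"gfp"}))
```

Because both fields are compared together, an inversion that also switches a gene on is registered as a single state change, in line with the grouping rule described in the text.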
To exemplify, we can consider the following system.
A circuit with two different inducible promoters that achieves five states depending on
the order of inputs.
State A is the initial state without any induction.
If inducer 1 is the first input, promoter 1 is activated and rfp and recombinase 1
are expressed. Recombinase 1 inverts the sequence between its sites, which changes the
circuit architecture and generates state B.
If inducer 1 is added again, rfp and recombinase 1 are expressed again, but there is no
relevant change and therefore no change of state.
If inducer 2 follows instead, recombinase 2 is expressed and the sequence between its
sites is excised, which generates state C.
In state C, if inducer 1 comes next, gfp is expressed; if inducer 2 is added instead,
recombinase 2 is expressed but has no effect.
If, instead of this order of inputs, the system in state A starts with inducer 2,
recombinase 2 is expressed, which excises the sequence between its sites and generates
state D.
In state D, if inducer 2 is added, recombinase 2 is expressed but nothing
happens.
If inducer 1 is added next instead, yfp and the repressor are expressed. The repressor
binds to the operator site, which modifies the expression profile and generates state
E.
In state E, with time as an input, the repressor degrades, which releases the
operator site and returns the system to state D.
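The example above can be summarized as a transition table. The Python sketch below encodes only the transitions stated in the text; the input labels (`"inducer1"`, `"inducer2"`, `"time"`) and the function name `run` are hypothetical, and any state and input pair not described in the example is assumed to leave the state unchanged.

```python
# (current state, input) -> next state, taken from the worked example.
TRANSITIONS = {
    ("A", "inducer1"): "B",  # recombinase 1 inverts the sequence
    ("B", "inducer1"): "B",  # rfp and recombinase 1 expressed, no relevant change
    ("B", "inducer2"): "C",  # recombinase 2 excises the sequence
    ("C", "inducer1"): "C",  # gfp is expressed
    ("C", "inducer2"): "C",  # recombinase 2 has no effect
    ("A", "inducer2"): "D",  # recombinase 2 excises the sequence
    ("D", "inducer2"): "D",  # nothing happens
    ("D", "inducer1"): "E",  # yfp and repressor expressed
    ("E", "time"): "D",      # repressor degrades
}


def run(inputs, state="A"):
    """Return the history of states visited for a time-ordered list of inputs."""
    history = [state]
    for signal in inputs:
        # Assumption: pairs missing from the table leave the state unchanged.
        state = TRANSITIONS.get((state, signal), state)
        history.append(state)
    return history


first_order = run(["inducer1", "inducer2"])   # visits A, B, C
second_order = run(["inducer2", "inducer1"])  # visits A, D, E
```

The same two inducers lead to state C or state E depending only on their order, which is the time-ordered behavior the example is meant to show.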
Characteristics
The application of sequential logic to biological systems has certain characteristics
that help to better understand its applicability.
01
Traceability
Sequential logic and state machines generate highly traceable systems, since present
actions respond to the previous history. In addition, a state machine records the
complete behavior of the circuit, including the progression of states, inputs and
outputs, instead of only a DNA sequence. Furthermore, depending on the
structure the state machine follows, the history of states can be known from
a single state.
02
Memory-based
Since the current state of the system affects the output and future states, such
states can be used for saving information even when the input is gone, rather
than losing the response to the input. Therefore, such systems have a memory and can
save data.
03
Time-ordered
Since a sequential system responds differently to the same input depending on the
current state, the specific point at which the input is present is decisive
(Roquet et al., 2016). Due to this ability to modulate the response
according to a time-ordered induction, state machines offer the capacity to
carry out complex functions (Zúñiga et al., 2020) and sequentially
regulate targets which, if using combinatorial logic, would get co-activated
(Letsou & Cai, 2016). This enables applications such as counting the number of
stimuli (as in our counter) or modifying the behavior depending on the
order of inputs (Madec, Rosati, & Lallement, 2021). The latter is why cells
accomplish a vast number of events and processes with limited genes (Letsou & Cai,
2016). State changes make it possible to consider how a circuit, and especially its
environment, changes and evolves over time.
04
Alternative scenarios
Inherently, sequential logic is well fitted for the consideration of alternative
routes as they can be naturally integrated as multiple states that can potentially
follow the previous one. This is beneficial when dealing with multiple
alternative scenarios, exploring unknown routes, and considering error
routes and potential mutations. This allows the proposal of control
plans, and a better and more thorough exploration, especially when the inputs or
their order is not completely straightforward, as in biology. This can be coupled to
probabilities for each state change, such as in a Markov Model. For example,
a state machine could be built for a circuit in state 1 that after an input moves
into desired state 2 but has a lesser probability of changing into state 3, the
alternative scenario. Overall, this characteristic suits biology due to the
stochastic element in biochemical reactions (Stelling, Szallasi, & Periwal,
2019).
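The example of a circuit in state 1 with a likely and a less likely successor can be sketched as a small Markov model. The probabilities (0.9 and 0.1) are assumed values for illustration only, not measurements.

```python
# Hypothetical transition probabilities: from state 1, an input usually leads
# to the desired state 2 and less often to the alternative state 3.
# States 2 and 3 are treated as absorbing in this sketch.
TRANSITION_PROBS = {
    1: {2: 0.9, 3: 0.1},
    2: {2: 1.0},
    3: {3: 1.0},
}


def propagate(distribution):
    """Advance a probability distribution over states by one input."""
    new = {}
    for state, p in distribution.items():
        for nxt, q in TRANSITION_PROBS[state].items():
            new[nxt] = new.get(nxt, 0.0) + p * q
    return new


# Start with certainty in state 1 and apply a single input.
after_one_input = propagate({1: 1.0})
```

Keeping the alternative state in the model, rather than discarding it, is what allows error routes to be weighed and control plans to be proposed.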
05
Multipurpose system
The use of logic in biology enables complex response patterns and overall processes.
Since a single input can lead to a different output depending on the state,
state machines facilitate multipurpose systems, also tightly related to alternative
scenarios and a timed order response. With sequentiality, from a single
circuit or state machine, a great number of responses and fates can be
derived, as it naturally occurs in living cells.
06
Easier troubleshooting
A state machine facilitates an emphasis on the states' progression rather
than on a single final output, contrary to a combinatorial system composed of a
truth table and logic gates whose structure is highly focused on the output as a 0/1
(on/off) response. Therefore, troubleshooting becomes more straightforward when using
sequential logic because a failure can be detected at an intermediate state, without
waiting for the circuit to reach its final state. In addition, by recording the circuit's history, the
specific input that went wrong can be easily distinguished from the rest.
This is possible given the system's traceability (due to the relation between
the state and the output) together with the capacity to measure the current
state, especially when the state is saved in the DNA sequence and can be
known by sequencing it. In this case, the last state before the system failed can be
known by looking at the sequence, instead of only having a failed/succeeded outcome
without a way to know when and where the problem occurred.
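As a minimal sketch of this idea, the expected progression of states can be compared against an observed one to locate the first point of failure. The state names and the observed readout are hypothetical; in practice the readout could come from sequencing, as described above.

```python
def first_divergence(expected, observed):
    """Return (index, last_good_state) for the first mismatch, or None."""
    for i, (exp, obs) in enumerate(zip(expected, observed)):
        if exp != obs:
            last_good = expected[i - 1] if i > 0 else None
            return i, last_good
    return None


expected_states = ["S0", "S1", "S2", "S3"]
# Hypothetical readout: the circuit stopped advancing after reaching S1.
observed_states = ["S0", "S1", "S1", "S1"]

failure = first_divergence(expected_states, observed_states)
```

The result points to the transition that failed and the last state that was reached correctly, instead of a single failed/succeeded outcome.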
07
Concept extrapolation
Sequential logic can be explained as decision making based on previous
situations. This is a concept analogous to human behavior: making decisions
based on past experiences and current situations and opportunities. Furthermore, as
a logic circuit, it is based on electronics. Therefore, the concept is easily
comprehensible across, even beyond, biological sciences.
08
Biologically suitable
Sequential logic is an approach distinctly similar to nature, as observed in highly
interconnected gene regulatory networks, where regulators affect multiple targets
and the targets are affected by multiple regulators (Letsou & Cai, 2016). These
systems are not limited by a specific input-output response nor by the number of
inducers, but rather respond to the temporally ordered presence of inducers (Letsou
& Cai, 2016; Madec, Rosati, & Lallement, 2021; Roquet et al., 2016). By
following sequential logic, the design of circuits emulates and expands on the
advantages nature offers. This allows encompassing more flexible
systems and proposing highly scalable designs (Roquet et al.,
2016).
Furthermore, sequentiality reduces the necessity of orthogonal inducers, a
relevant limitation in synthetic biology, as an induction directly linked to a point
in time (or state) practically eliminates the possibility of crosstalk.
Therefore, the number of inducers can be minimized if a system does not demand a
different one for each desired output. This facilitates the design of circuits by
excluding the biological limitation of quantity of highly specific inducers.
Also related to the inducers is the metabolic burden, which can be reduced by
storing information in the DNA sequence rather than requiring a constant gene
expression or complex inducer combinations.
Another limitation that is minimized is the size of the circuit since the
number of base pairs can be reduced by modifying the DNA sequence rather than
requiring multiple linearly positioned logic gates. This is especially important in
synthetic circuits that must conform to a specific number of base pairs.
Lastly, saving the state in the sequence itself leads to an increase in
evolutionary stability (Zúñiga et al., 2020). As we mentioned
previously, mutations and stochasticity are core elements of biology. By increasing
the stability and offering an approach to consider alternative scenarios, state
machines are perfectly designed to describe biological genetic systems.
09
Guided design
Sequentially dependent circuits applied in experimental characterization offer an
efficient structure that facilitates a guided design of better and more efficient
methodologies. By having a comprehensible map of the complete circuit behavior
(parts’ interaction, inputs, outputs, etc.) and sequentially activating target genes
or products, it is possible to reduce the characterization time, resources and
metabolic burden of the constructs, increase the scalability, and
propose smarter architectures.
Structural
Biological elements can have multiple representations. For example, SBOL is a standard
for the representation of biological designs, specifically for the electronic exchange
of information on their structural and functional aspects (Synthetic biology open
language). Electrical engineering iconography is used for biological systems to
illustrate logic gates as a function of inputs (Densmore & Anderson, 2009), and even
further, strings of the letters ATCG are used to represent DNA sequences, common in DNA
manipulation and design tools. Each representation best suits the purpose of its
application, whether that is design, mathematical modeling, detailed description,
synthesis, bioinformatics analysis, or databases. For our framework, the structural segment
encompasses the basic representations, as conceptual units, of biological elements that
best suit our goal and seek to be abstract and modular to be used for further
applications.
Parts
A part is the framework's abstraction of a coding or non-coding DNA sequence with a
function. Each part is a conceptual unit with a biological role, e.g.
coding, recognizing, expressing, or promoting/repressing transcription. This abstraction is
based on the function rather than the sequence itself, thus the biological role is the
essential element of a part, regardless of the molecular structure it is directly
related to: DNA, RNA or protein. Besides the biological role, parts can represent more
specific information related to them, such as the corresponding DNA sequence. Parts have
properties, interact with other parts or elements, and execute functions;
and they are the building blocks for genetic circuits.
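One possible way to express this abstraction in code is a small record type whose essential field is the biological role. This is only a sketch: the class name, field names and example values (`P1`, `int1`, the `inducible` property and the placeholder sequence) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Part:
    """Conceptual unit defined by its biological role, not by its sequence."""
    name: str
    role: str                        # e.g. "promoter", "coding", "terminator"
    sequence: Optional[str] = None   # optional, more specific information
    properties: dict = field(default_factory=dict)


# Hypothetical parts; "ATG" is only a placeholder sequence.
promoter = Part("P1", "promoter", properties={"inducible": True})
recombinase = Part("int1", "coding", sequence="ATG")
```

The sequence is optional on purpose, since the role remains the defining element of a part whether or not a concrete sequence is attached.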
Circuit
The circuit abstraction is what provides cohesion to a set of parts. Circuits are not
static; they can modify themselves or be affected by other elements. In its simplest
version, a circuit is just an organized set of parts. The organization method
defines the circuit architecture and provides functionality. This organization can be
established by any structure; whether it is a drawing or a fully-fledged software data
structure.
The circuit structure to be used depends on the desired functionality, but
at the same time conditions the behavior and relationships between the parts. The most
standard structure is a simple one dimensional array and is what probably works best for
most applications, since it is the closest to the biological organization of DNA. In
this structure, circuits would be limited to adjacent parts, while other non-adjacent
parts, circuits or genetic material somehow functionally related would be addressed as
part of the environment. This helps set a clear limit to what a single circuit
encompasses, since functional interactions with other genetic material can be infinite,
and it would separate the circuit (synthetic DNA) from the organism's endogenous genetic
material.
If a simple one dimensional structure doesn’t fit the functionality, a circuit can be
defined using other structures, such as circular lists or hierarchical tree structures
and graphs, if that suits the application best. Remember that the only requirement is
that the circuit must be an organized container; the structure can vary.
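A minimal sketch of the one-dimensional structure, with an optional circular reading, could look as follows. The part names and roles are hypothetical, and each part is reduced to a (name, role) pair for brevity.

```python
from typing import List, Tuple

# A part is reduced to a (name, role) pair; all names are hypothetical.
Circuit = List[Tuple[str, str]]

circuit: Circuit = [
    ("P1", "promoter"),
    ("siteA", "recognition_site"),
    ("gfp", "coding"),
    ("siteA'", "recognition_site"),
    ("T1", "terminator"),
]


def downstream_of(circuit: Circuit, index: int, circular: bool = False) -> Circuit:
    """Parts after `index`; with circular=True, wrap around to the start."""
    tail = circuit[index + 1:]
    return tail + circuit[:index] if circular else tail


linear = downstream_of(circuit, 2)
wrapped = downstream_of(circuit, 2, circular=True)
```

The same container serves both readings; only the traversal changes, which is the sense in which the structure can vary while the circuit stays an organized set of parts.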
Environment
A circuit can interact with many different elements apart from itself. Such
elements can modify the circuit structurally, behaviorally or both. The environment
describes both the circuit and these simultaneous conditions that potentially
interact with or could affect it, but are not part of the circuit itself. The
environment can be subdivided into circuits, internal (plasmid, genome,
intracellular proteins) and external (media, environmental conditions).
Since this framework is intended to be applied in a wide range of areas, including some whose
specific requirements we may not have considered yet, it is especially important
for the environment to be a broad concept that can modularly accommodate factors we
have not yet anticipated.
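The subdivision into circuits, internal and external elements can be sketched as a simple container. The class, its field names and the example inducer (`arabinose`) are assumptions for illustration; quantities are stored as plain numbers without any claim about units.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Environment:
    """The circuits plus everything that can interact with them."""
    circuits: List[list] = field(default_factory=list)
    internal: Dict[str, float] = field(default_factory=dict)  # e.g. intracellular proteins
    external: Dict[str, float] = field(default_factory=dict)  # e.g. media components

    def is_present(self, element: str) -> bool:
        """True if the element exists internally or externally above zero."""
        return self.internal.get(element, 0) > 0 or self.external.get(element, 0) > 0


# Hypothetical environment: one circuit and one inducer in the media.
env = Environment(circuits=[["P1", "gfp", "T1"]], external={"arabinose": 1.0})
```

Using open dictionaries rather than fixed fields keeps the concept broad, so that new kinds of factors can be added without changing the structure.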
Behavioral
Since circuit state changes are allowed by the function of the circuit's parts and the
environment, it is required to conceptualize how a part, multiple parts and the complete
circuit act and interact with each other. The multiple ways in which parts relate with
each other and the environment enable them to act, carry out their biological role,
expand it, and inhibit other parts. These relationships can achieve high levels of
complexity, especially since circuits can modify themselves, the environment and thus
change states, requiring their constant validation and integrating dependent networks of
interactions. The behavioral stage is still part of the description phase, thus, these
are only the definitions of the actions; the process of carrying them out relates to the
second phase, the execution.
Interactions
Interactions describe how multiple parts and even other elements of the
environment influence each other, affecting their transcription, translation or
activity. These interactions are not mutually exclusive and may even overlap. Some of
these relationships are:
Parts can have a directionality relative to each other in the circuit.
E.g. Two recombinase recognition sites facing each other, facing away, or
facing the same direction determine the action carried out; a unidirectional
terminator's direction relative to a coding part.
Parts have a location in the circuit relative to each other.
E.g. a promoter upstream or downstream of a coding part.
Parts can potentially interact with each other when coexisting in the same
environment.
E.g. A recombinase and a RDF (recombinase directionality factor) can
potentially interact to perform the inversion.
When two or more elements require each other to function there is a
codependency between them, which requires the parts not only to be present
but also to be functional.
E.g. A sgRNA requires a Cas protein for it to act; a recombinase requires
at least two recognition sites.
This relationship describes any kind of recognition between two or more
elements, such as DNA-protein, protein-protein, DNA-RNA or RNA-protein.
E.g. An inhibitor and an operator site; an RNA polymerase and the
promoter; a protein and an associable subunit.
This relationship takes place when elements inhibit one another's transcription, action or overall presence.
E.g. A terminator directly upstream of a coding sequence prevents its transcription.
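The directionality example can be sketched as a small lookup from the relative orientation of two recognition sites to the action carried out. The function name and the "+"/"-" encoding are assumptions. The sketch also simplifies by treating sites that face each other and sites that face away from each other identically, as opposite orientations.

```python
def recombination_action(orientation_a: str, orientation_b: str) -> str:
    """Map the relative orientation of two recognition sites to an action.

    Orientations are "+" or "-". Sites in opposite orientation lead to
    inversion of the flanked segment; sites in the same orientation lead
    to its excision.
    """
    valid = {"+", "-"}
    if orientation_a not in valid or orientation_b not in valid:
        raise ValueError("orientation must be '+' or '-'")
    return "excision" if orientation_a == orientation_b else "inversion"


facing_each_other = recombination_action("+", "-")
same_direction = recombination_action("+", "+")
```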
Functions
A function is an action carried out by a part or group of parts determined by a
set of interactions. Some functions are: inversion, excision, edition,
insertion, repression, expression.
For example, the function of a recombinase is determined by an interaction of
alignment with a promoter so that the coding sequence can be expressed, an interaction
of codependency with the recognition sites, and an interaction of recognition of the
sites.
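The recombinase example can be sketched as a check that all required interactions are met before the function is allowed. This is a simplified illustration: the dictionary keys (`name`, `role`, `strand`, `recognized_by`) and part names are hypothetical, and the alignment test only looks for an upstream promoter on the same strand, ignoring terminators in between.

```python
def recombinase_function_enabled(circuit, recombinase_name):
    """Check the interactions required for a recombinase's function."""
    idx = next((i for i, p in enumerate(circuit)
                if p["name"] == recombinase_name), None)
    if idx is None:
        return False
    gene = circuit[idx]
    # Alignment: an upstream promoter on the same strand, so the gene is expressed.
    aligned = any(p["role"] == "promoter" and p["strand"] == gene["strand"]
                  for p in circuit[:idx])
    # Codependency and recognition: at least two sites recognized by this recombinase.
    sites = [p for p in circuit
             if p["role"] == "site" and p.get("recognized_by") == recombinase_name]
    return aligned and len(sites) >= 2


# Hypothetical circuit: promoter, recombinase gene and two of its sites.
example_circuit = [
    {"name": "P1", "role": "promoter", "strand": "+"},
    {"name": "intA", "role": "coding", "strand": "+"},
    {"name": "s1", "role": "site", "strand": "+", "recognized_by": "intA"},
    {"name": "s2", "role": "site", "strand": "-", "recognized_by": "intA"},
]
enabled = recombinase_function_enabled(example_circuit, "intA")
```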
Application
The application refers to how the actions, based on the
structure and relationships, are carried out. This stage
focuses on the execution of the parts’ functions from a previously
defined circuit and its environment, as a processing workflow.
Therefore, the input necessary to develop this workflow is a
defined environment (which includes a circuit), and the final
output is one or more modified circuits in their
environments. Multiple outputs might be necessary when
considering alternative scenarios for the execution of the parts'
functions. One cycle of this workflow could encompass one or multiple
circuit modifications, simultaneous or not, depending on the desired
output.
Processing workflow:
1. Define the potential interactions of all elements.
2. Review the environment (excluding the circuit) and define the elements that interact with the circuit to consider them in this cycle of the workflow; the rest can be stored for a following cycle.
3. Traverse the circuit to find the active promoters, terminators, operators and other regulatory parts (i.e. those that can carry out their function of promoting or repressing transcription), based on their interactions with the environment or because they are constitutive.
4. Traverse the circuit starting from the promoter(s) and following the circuit's structure (biologically: downstream) to review the coding parts that can be transcribed based on the alignment and position interactions with the parts determined in step 3.
5. Validate the interactions of the elements (part(s) and other elements of the environment) to ensure the requirements for the function's execution are met.
6. If necessary, define a priority for performing the functions or, otherwise, a simultaneity strategy.
7. Execute the elements' functions following the prioritization or considering as many scenarios as desired to generate one or multiple parallel new states of the circuit, depending on whether alternative scenarios are considered or not.
8. Review changes in the environment, e.g. protein degradation, and extract one or multiple outputs for each modification.
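The two traversal steps of the workflow (finding the active promoters and then following the circuit downstream to the coding parts that can be transcribed) can be sketched as below. The sketch assumes a single strand read left to right, a promoter that is either constitutive or activated by one named inducer, and terminators that always stop transcription; all part names and the inducer `X` are hypothetical.

```python
def expressed_parts(circuit, inducers):
    """Find active promoters, then read downstream until a terminator."""
    expressed = []
    for i, part in enumerate(circuit):
        if part["role"] != "promoter":
            continue
        needed = part.get("inducer")  # None means constitutive
        if needed is not None and needed not in inducers:
            continue  # promoter is not active in this environment
        for downstream in circuit[i + 1:]:
            if downstream["role"] == "terminator":
                break
            if downstream["role"] == "coding":
                expressed.append(downstream["name"])
    return expressed


# Hypothetical circuit: one constitutive and one inducible transcription unit.
workflow_circuit = [
    {"name": "Pconst", "role": "promoter"},
    {"name": "geneA", "role": "coding"},
    {"name": "T1", "role": "terminator"},
    {"name": "Pind", "role": "promoter", "inducer": "X"},
    {"name": "geneB", "role": "coding"},
    {"name": "T2", "role": "terminator"},
]

without_inducer = expressed_parts(workflow_circuit, set())
with_inducer = expressed_parts(workflow_circuit, {"X"})
```

The remaining steps (validation, prioritization and execution of the functions) would operate on the list returned here.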
Integration
The process described above sets the foundation for operating over a
circuit; this process can be the base of larger more complex
applications by incorporating it into a larger workflow. This is
what we call integration and it is an essential part of studying
sequential circuits.
This process of how to integrate a specific circuit cycle simulation is
largely dependent on the application and can be used to incorporate new
elements inline. By modifying a circuit through the processing workflow
and then taking the output (the modified circuit and environment) as the
new input, we are able to study chains of events and their
effects. An alternative to the base cyclic approach would be to
connect a circuit generator and a post-execution circuit
evaluator. This way, smarter selection approaches, such as genetic
algorithms and machine learning applications, can be integrated
into the process, or the compatibility with a desired organism
can be assessed by, for example, analyzing the
endogenous machinery and potential crosstalk.
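The base cyclic approach, in which the output of one cycle becomes the input of the next, can be sketched as a small driver loop. The function names are assumptions, and `toy_process` is only a stand-in for the processing workflow: it consumes one pending input per cycle and reverses the circuit. As a simplification, the loop stops as soon as a cycle leaves the circuit unchanged.

```python
def run_cycles(process, circuit, environment, max_cycles=10):
    """Feed each cycle's output back in as the next input.

    `process` is any callable implementing one cycle:
    (circuit, environment) -> (circuit, environment).
    Returns the history of circuits, starting with the initial one.
    """
    history = [circuit]
    for _ in range(max_cycles):
        new_circuit, environment = process(circuit, environment)
        if new_circuit == circuit:
            break  # no further modification took place
        history.append(new_circuit)
        circuit = new_circuit
    return history


def toy_process(circuit, environment):
    """Stand-in cycle: one pending input reverses the circuit."""
    if not environment["pending_inputs"]:
        return circuit, environment
    remaining = environment["pending_inputs"][1:]
    return circuit[::-1], {"pending_inputs": remaining}


history = run_cycles(toy_process, ["P1", "gfp", "T1"],
                     {"pending_inputs": ["a", "b"]})
```

Because the driver only depends on the signature of `process`, a circuit generator or a post-execution evaluator could be connected around it without changing the cycle itself.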
As mentioned before, the conceptual framework is a set of ideas to guide a
computational or manual design, construction, simulation and analysis of synthetic gene
circuits as state machines. We decided to apply our framework as a base for software
tools’ design, where such tools would aid the exploration of alternative circuit
architectures, while offering flexibility in the implementation and optimization. This
means that we implemented the concepts described, and their coupling with sequential
logic (especially the processing workflow), as computational concepts and a
workflow to be used as a guide for software.
A software framework provides an abstraction for generic software functionalities
upon which other software applications can be built. It makes implementation details
transparent to other developers, which gets rid of most tedious and repetitive
work, while providing a sturdy base for development and allowing most of the development
effort to be tailored towards functional and valuable software. Thus this software
framework can be easily used as a starting point for implementing software
tools.
The objective of this software framework is to set a structural guide for
facilitating the design of software to be used for the generation, assessment
and manipulation of synthetic gene circuits as state machines.
The workflow presented begins from the premise that the desired parts and their
properties are defined. The software framework is divided into four: an initialization
stage, two core processing stages and a post-processing stage.
Generator (initialization stage)
Input: defined parts with their corresponding properties.
Output: a data structure that defines the circuit through its parts and order.
From a defined circuit, a cycle of two core stages is executed. This cycle processes the change from one state to another, i.e. the modification of the circuit architecture. A single cycle, or state change, consists of evaluation and action.
Validator (evaluation stage)
Input: a data structure that defines the circuit through its parts and order.
Output: a list of the parts that get expressed in this state, together with the related inputs and interactions that enable their expression: which promoter and/or which inducer.
Actuator (action stage)
Input: a data structure that defines the circuit through its parts and order, and a list of parts that get expressed in this state.
Output: a new state or states as data structures that define each circuit.
At this point, the output circuit or circuits can be cycled back into the validator to redefine the functional parts. Thus, the validator/actuator cycle is repeated according to the number of desired states or state changes, or until there are no more possible modifications.
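The stages and their inputs and outputs can be sketched as three small functions plus a loop that cycles the output states back in. This is a toy illustration under stated assumptions, not our tool's implementation: parts are (name, strand) pairs, only the forward strand is read, a gene named `gene:A` is assumed to encode a recombinase that recognizes the two `site:A` parts, and used sites are replaced by a `spent_site` marker.

```python
FLIP = {"+": "-", "-": "+"}


def generator(parts):
    """Initialization: build the circuit as an ordered tuple of (name, strand)."""
    return tuple(parts)


def validator(circuit):
    """Evaluation: forward-strand genes downstream of a forward-strand promoter."""
    expressed, reading = [], False
    for name, strand in circuit:
        if strand != "+":
            continue  # this toy only reads the forward strand
        if name == "promoter":
            reading = True
        elif name == "terminator":
            reading = False
        elif reading and name.startswith("gene:"):
            expressed.append(name)
    return expressed


def actuator(circuit, expressed):
    """Action: each expressed recombinase inverts the segment between its sites."""
    new_states = []
    for gene in expressed:
        site = "site:" + gene.split(":", 1)[1]
        pos = [i for i, (name, _) in enumerate(circuit) if name == site]
        if len(pos) != 2 or circuit[pos[0]][1] == circuit[pos[1]][1]:
            continue  # needs exactly two sites in opposite orientation
        a, b = pos
        inverted = tuple((n, FLIP[s]) for n, s in reversed(circuit[a + 1:b]))
        spent = (("spent_site", "+"),)
        new_states.append(circuit[:a] + spent + inverted + spent + circuit[b + 1:])
    return new_states


def explore(parts, max_cycles=5):
    """Cycle validator and actuator until no new state appears."""
    states = [generator(parts)]
    frontier = list(states)
    for _ in range(max_cycles):
        next_frontier = []
        for circuit in frontier:
            for new in actuator(circuit, validator(circuit)):
                if new not in states:
                    states.append(new)
                    next_frontier.append(new)
        if not next_frontier:
            break
        frontier = next_frontier
    return states


# Hypothetical circuit: recombinase A flips a promoter towards gene X.
states = explore([
    ("promoter", "+"), ("gene:A", "+"), ("terminator", "+"),
    ("site:A", "+"), ("promoter", "-"), ("site:A", "-"), ("gene:X", "+"),
])
```

In this example the first state expresses only `gene:A`; after the inversion, the second state also expresses `gene:X`, and no further modification is possible.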
Description
Structural
A circuit seen as an organized set of the parts is implemented as a data structure, where this data structure provides the specific organization.
Behavioral
The interactions and functions of parts and other biological elements are implemented either as functions that each part can call in the validator and actuator, or as information. Alignment interactions define the functional parts in the validator. Other interactions are considered by the actuator to generate the changes the circuit will go through, whether as requirements or as the function itself.
Execution: application/integration
The software framework is the direct application of the execution phase. The
validator-actuator stages execute the parts' functions as in the atomic workflow,
enabling the modelling of the circuits as state machines. On the other hand, as the
software framework's objective is the design of circuits and the exploration of the
design space, the validator-actuator circuit modelling requires the integration of
other stages that complement it. This was implemented as the first and fourth
stages, generator and analysis, which coupled enable the building, saving and analysis
of the circuits.
Roquet, N., Soleimany, A. P., Ferris, A. C., Aaronson, S., & Lu, T. K. (2016). Synthetic recombinase-based state machines in living cells. Science (American Association for the Advancement of Science), 353(6297), aad8559. doi:10.1126/science.aad8559
Oishi, K., & Klavins, E. (2014). Framework for engineering finite state machines in gene regulatory networks. ACS Synthetic Biology, 3(9), 652-665. doi:10.1021/sb4001799
Madec, M., Rosati, E., & Lallement, C. (2021). Feasibility and reliability of sequential logic with gene regulatory networks. PloS One, 16(3), e0249234. doi:10.1371/journal.pone.0249234
Magdevska, L., Pušnik, Ž, Mraz, M., Zimic, N., & Moškon, M. (2017). Computational design of synchronous sequential structures in biological systems. Journal of Computational Science, 18, 24-31. doi:10.1016/j.jocs.2016.11.010
Letsou, W., & Cai, L. (2016). Noncommutative biology: Sequential regulation of complex networks. PLoS Computational Biology, 12(8), e1005089. doi:10.1371/journal.pcbi.1005089
Fritz, G., Buchler, N., Hwa, T., & Gerland, U. (2007). Designing sequential transcription logic: A simple genetic circuit for conditional memory. Systems and Synthetic Biology, 1(2), 89-98. doi:10.1007/s11693-007-9006-8.
Zúñiga, A., Guiziou, S., Mayonove, P., Meriem, Z. B., Camacho, M., Moreau, V., . . . Bonnet, J. (2020). Rational programming of history-dependent logic in cellular populations. Nature Communications, 11(1), 4758. doi:10.1038/s41467-020-18455-z
Stelling, J., Szallasi, Z., & Periwal, V. (2019). System modeling in cell biology: From concepts to nuts and bolts. The MIT Press. Retrieved from https://www.vlebooks.com/vleweb/product/openreader?id=none&isbn=9780262257060
Densmore, D., & Anderson, J. C. (May 2009). Combinational logic design in synthetic biology. Paper presented at the 301-304. 10.1109/ISCAS.2009.5117745 Retrieved from https://ieeexplore.ieee.org/document/5117745
Synthetic biology open language. Retrieved from https://sbolstandard.org/