Team:TEC COSTA RICA/Description

Description

Our project consists of two phases. The first encapsulates the circuit and focuses on the fulfillment of specific requirements through an in-depth architectural design along with a thorough assessment of the circuit’s behavior and characteristics. The second focuses on a conceptual framework: a complete guide for the assessment of sequential logic in synthetic biology, together with its implementation as a model for our software tool’s design.

For the design of said circuit we defined three essential needs:

1. A way to determine the lifetime of our engineered bacteria
2. A mechanism to prevent the life span from running out during laboratory manipulation
3. A multilayered killswitch to ensure a safe death

We tackled our first design requirement through exhaustive brainstorming and research. Our main idea involved a counter system; hence, multiple mechanisms to perform this count were discussed, from binary linear systems dependent on inducible promoters to our final implementation, a sequential-logic-based recombinase device. Some of the counter systems we assessed and used as inspiration are in the comparison section.

Despite the value of the previously mentioned proposals, none of the counters fit our more specific needs, which focused on the complexity, count limit and length of the device. Since we envisioned our circuit being used as an auxiliary to other applications, it was essential that the available count be as high as possible without the length growing exponentially. This left out counters such as the ones proposed by the Groningen Team (2011) and the Paris Liliane Bettencourt Team (2010), since they behave linearly and “in situations where the same signal is being recorded over multiple occurrences (for example, a series of cell division events), reliably rewritable elements are needed to realize geometric increases in data storage capacity (for example, combinatorial counters capable of recording 2^N events given N storage elements)” (ref). Team UT-Tokyo’s (2016) approach, in turn, was not only lengthy but also complex and imposed a high metabolic burden, which were undesired qualities as well. Other proposals, even with 2^n capability, had drawbacks such as the need for very specific regulatory systems, like the delay system presented by Zhao et al. (2019). Also, many of the counters rested on the theoretical assumption that n protein-induced promoters or n serine recombinases without crosstalk issues are available, which sets the limit at the number of orthogonal characterized recombinases (Yang et al., 2014). While these works were useful for establishing our counter’s basis, such as the viability of placing several recombinases next to each other and having them behave correctly, we wanted to go even further. This led us to propose our own counter system, which went through several changes and iterations as explained in {engineering success} until we arrived at a suitable approach, as validated by Dr. Pakpoom Subsoontorn and Dr. Jérôme Bonnet.

It’s important to note that our counter is a practical representation and proof of concept for the sequential-logic based development of genetic circuits that we propose in the {conceptual framework}.

The final genetic suicide circuit design consists of different modules, whose functions are explained next.

01

Repress-Initialize

The first device was designed to keep the other three inactive until the user is ready to trigger the countdown, so that the engineered organism can be manipulated, cultivated, etc. without the counter being activated. If the circuit weren’t under strict control, the growing times and uses of these microorganisms would be limited. The proposed mechanism consists of regulating transcription from the main promoter via repression triggered by an inducible promoter.

02

Count

An autonomous genetic counter whose purpose is to keep track of the number of times a cell goes through cell division (from the moment the counter is activated by the first device). The counter is based on a cell-cycle-coupled promoter, recombinases and unidirectional terminators, as detailed in the {architecture section}. This device is able to count up to 2^n, where n is the number of recombinases.

03

Expand Count

The third device’s purpose is to enable a higher count. It is also regulated by the second device’s promoter and is based on a self-excising DNA segment that resets the counter state of the previous device. This system relies on CRISPR-Cas9 and multiplies the count by x, where x is the number of repeats of the self-excising segment.
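The capacity arithmetic for devices 02 and 03 can be sketched in a few lines. This is only an illustration under two assumptions: that the count limit of the counter is read as 2^n for n recombinases, and that each self-excising segment adds one full pass through the counter, so the limits multiply. The function name `counter_capacity` is hypothetical.

```python
def counter_capacity(n_recombinases: int, n_segments: int = 1) -> int:
    """Upper bound on the count under the stated assumptions.

    n_recombinases: number of recombinases in the Count device (n).
    n_segments: number of self-excising segments in Expand Count (x).
    Assumes 2**n per pass and one pass per segment.
    """
    if n_recombinases < 1 or n_segments < 1:
        raise ValueError("both arguments must be positive integers")
    return n_segments * 2 ** n_recombinases


if __name__ == "__main__":
    # Four recombinases, with and without three expansion segments.
    print(counter_capacity(4))
    print(counter_capacity(4, 3))
```

Under these assumptions the construct length grows roughly linearly with n and x, while the count limit grows geometrically with n.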

04

Kill Switch

Multi-layered containment system designed with the objective of killing the host cells leaving no trace and whose initiation depends on previous devices. Several toxins are involved in this device, including RelE and NucA1, along with self-targeting single guides to ensure its safety.

In the following image you can see the design of our whole device and, by clicking on the boxes below, you can access our insight on each of the different parts and modules.

Promoters

Promoters are the main regulatory units for transcription; they define where transcription starts and under which conditions (iGEM, n.d.a).

Promoter 1: pBAD {BBa_I0500}

This promoter is induced by arabinose. It’s one of the best-characterized promoters for heterologous protein expression (Széliová et al., 2016) and is commonly used in iGEM and in synthetic biology in general (http://parts.igem.org/PBAD_Promoter_Family http://parts.igem.org/Part:BBa_I0500). Many of the other counter systems discussed in this project were developed and characterized with this promoter as their regulatory element (Roquet et al., 2016; Bonnet et al., 2012; Zhao et al., 2019; Yang et al., 2014).

Promoter 2: pnrd {BBa_K2070012}

According to Sun (1992), this promoter is cell-cycle dependent: it is a highly regulated promoter region that contains several DnaA boxes, indicating that its activity is partly affected by the initiation of DNA replication (Messer 2002). The nrd operon in E. coli expresses a ribonucleotide reductase, an enzyme that reduces ribonucleotides into deoxyribonucleotides and is involved in the bacterial cell cycle (Sun 1994). This part was used previously by Stanford Brown in 2012 (https://2012.igem.org/Team:Stanford-Brown), who showed that the promoter turned on once every cell cycle. A variation of this promoter was also used by UT-Tokyo 2016 (link) to build a regulatory system in which gene expression changes after each cell division, which is very similar to our own purpose. The use of this promoter is intended to ensure that our system is activated only once every cell cycle.

Recombinases

DNA site-specific recombinases are enzymes that catalyse the breaking and rejoining of DNA strands at specific points, thereby bringing about precise genetic rearrangements. Serine integrases are a group of recombinases with unusual properties such as recombination directionality and simple site requirements (Stark, 2017). According to Zhao et al. (2019), when inverting, serine recombinases recognize specific genetic sites, attB and attP (PB state), and invert the sequence between these sites, generating the new sites attL and attR (LR state). This inversion is unidirectional, meaning that another component must be present for the inversion to be reversed. This component is the RDF (recombination directionality factor), which, when co-expressed with the recombinase, allows the LR state to be switched back to PB (Zhao et al., 2019). Depending on the DNA arrangement, the use of integrases allows control of the flow of RNA polymerase along the DNA, hence regulating gene expression (Merrick et al., 2018). The selected recombinases have been tested together and display orthogonality (Guiziou et al., 2019). They have also been characterized in several other chassis, making our circuit easier to extrapolate (Tomimatsu et al., 2017; Xu & Brown, 2016). Bxb1 and TP901 in particular have been used in similar proposals such as Hsiao et al. (2016) and Roquet et al. (2016). It’s important to highlight that the order assigned to these recombinases is not random, and was supported by our mathematical model.

Recombinase 1: Bxb1

We chose Bxb1 as the first integrase because it displayed the best recombination efficiency, yielding products with no residual substrates remaining, as reported by Wang et al. (2017). In that study, Bxb1 integrase mediated efficient site-specific recombination in vitro, and its activity was reported as superior to that of the other integrases tested.

Recombinases 2, 3 and 4: TP901, Int5 and Int7, respectively.

There are several versions of each recombinase and other recombinases available in the registry http://parts.igem.org/Recombination/Other

Terminator

Terminators define the end of transcription. They can be uni- or bidirectional and are relevant to correct protein expression (iGEM, n.d.b).

Terminator 1: {BBa_B1002}

Bidirectional terminators that ensure the separation of the different modules.

Terminator 2: {BBa_B0010}

This terminator only functions in its forward direction and can be used to regulate the transcription of polycistronic elements, similarly to the proposal by Friedland et al. (2009), in which they built a DNA Invertase Cascade (DIC).

CRISPR+ligase

The CRISPR-Cas9 gene editing system depends on two main elements: a small RNA fragment containing the guide sequence that binds to the target DNA sequence, and the Cas9 enzyme, which associates with the guide RNA. When the guide is transcribed, it binds to Cas9 in order to recognize the target DNA sequence, which is then cut by the enzyme, creating a double-strand break (DSB). This break can be used to delete, insert or replace DNA when coupled with the cell’s own repair machinery (MedlinePlus, n.d.). In our case, E. coli doesn’t have a non-homologous end joining (NHEJ) repair system, so a double-strand break can only be fixed by homologous recombination (HR). Since we don’t want the insertion of a template to be required, we searched for an alternative and found that a T4 ligase can fix the break, creating indels that knock out the gene and prevent the cell from dying because of the broken DNA (Su et al., 2016). This is according to Su et al. (2019), who report using a ligase to repair a DSB made by Cas9 in the LacZ gene in E. coli, leading to the survival of cells. In turn, Su et al. (2016) report a different strategy, which involved designing the sgRNA to avoid HR and using conserved prokaryotic NHEJ proteins from Mycobacterium tuberculosis H37Rv to repair complete gene deletions, proving that this is in fact possible even though in their case it had only 36% efficiency. We believe that by joining the best of each strategy, the ligase and the sgRNA design, and having a much smaller deletion target (which in turn is closer to DNA deletion consistency at 200bp), we can achieve a much better efficiency in both deletion and repair.

Cas9

http://parts.igem.org/Part:BBa_K1774001:Experience

Sg

Our single guides are to be designed to fulfill several requirements for an increased specificity and efficiency and decreased off-target effects as proposed by Liu et al. (2020).

Ligase

{BBa_K3917004}, designed based on Su et al. (2019).

Operator sites

“A common method of repressing gene expression in prokaryotes involves the binding of a protein to a DNA sequence (a.k.a. operator) near or overlapping its promoter” (Politz et al., 2013). These operators, in our design, were purposely ambiguous, since their identity can vary depending on available technology. The common requirement they must fulfill is binding to specific DNA sequences. A readily available mechanism to do this is taking advantage of already existing repressors such as LacI from the Lac operon or the phage encoded lambda repressor (cI) (Lewis, 2011). The deactivated version of the cas9 protein is also an alternative, but its location would have to be rethought, since it would need to be active and functional for all three cases (Xu & Qi, 2019). Finally, other synthetically designed DNA-binding proteins can be used, TALEs being one of the most used of these categories (Politz et al., 2013).

Multi-layered containment system

A multi-layered safeguard system is a series of barriers that can decrease the escape frequency of synthetically modified microorganisms (Gallagher et al., 2015). In this device, the main objective is to develop a condition-independent biocontainment kill switch.

RelE: modified {BBa_K2449029}, removing the double terminator.

The toxin RelE, an mRNA interferase, inhibits translation by cutting mRNAs positioned at the A site of the ribosome. It cleaves the stop codons of the mRNA between the 2nd and 3rd nucleotide (https://www.uniprot.org/uniprot/P0C077), which blocks processes such as binding to release factors, the release of formed polypeptides and the availability of ribosomes to be used again (SJTU-BioX-Shanghai, 2009). This ultimately hinders correct translation and decreases cell growth. However, since mRNA interferases play a role in bacterial persistence to antibiotics (https://www.uniprot.org/uniprot/P0C077), it is necessary to combine RelE with other death mechanisms.

Nuclease A1: modified {BBa_K3027000}, removing its terminator.

Nuclease A1 is in charge of eliminating the genomic DNA without being associated with cell lysis (GO_Paris-Saclay, 2009), which is important because the genetic material can remain contained during the DNA excision. When the cell is left without DNA, some remnants, like ribosomes and housekeeping proteins, could still get transferred when interacting with other microorganisms and provide them with new functions, including and beyond our specific modification. DNA-less cells are still equivalent to cells with DNA (GO_Paris-Saclay, 2009) with the minimal genome to maintain metabolic homeostasis, reproduce, and evolve (Gil et al., 2004).

CRISPR: GMK target

The mechanism of action has already been described (see CRISPR+ligase). In this module, it includes an sgRNA that targets the nucleoside monophosphate kinase Gmk, a multimeric enzyme shown to be a determinant of the reduction of the nucleotide pool and of decreased growth and/or respiration rate when it is knocked out, due to the specificity established with the phosphate acceptor substrate (EcoCyc, 2012). We also aim to use another set of sgRNAs to generate breaks in our synthetic construct, leaving no possibility for synthetic genomic remnants to be freed. Single guides are to be designed as specified previously, targeting the Gmk gene, the synthetic circuit’s main promoter and its recombinases.

Plasmid

Heterologous expression in E. coli may be achieved through several vectors such as plasmids or bacterial chromosomes, as well as by the insertion of DNA directly into the bacterial artificial chromosome. The ideal implementation for our proposal is chromosomal insertion, in order to prevent the existence of cells with mixed expression within them (Zhao et al., 2019), that is, cells carrying several counters in different states. It’s also a way to allow stable expression of foreign DNA without the need for antibiotic selection, with increased protein expression and decreased metabolic burden (Englaender et al., 2017). The likelihood of uptake of the synthetic DNA by other organisms is also decreased by this strategy. Nonetheless, if the circuit is to be implemented in a plasmid, the plasmid should be low-copy in order to reduce the noise caused by divergent expression between plasmid copies, and the selection mechanism should be via auxotrophy to allow for more diverse implementations.

Degradation tags and RBS optimization

For our system to be as efficient as possible, the expression of every recombinase must be fine-tuned. One of the most common ways to do this is via RBS selection, which allows the recombinases and their expression rates to be balanced by regulating their translation (Jin et al., 2017). This way, the 2:1 ratio needed for correct RDF:recombinase expression can also be achieved. Another relevant point for our system is that recombinases should do their work and then be degraded as quickly as possible so as not to interfere with other recombinases. This is particularly true for RDFs, since they are produced several generations before they participate. Hence, we researched ways to fast-track the degradation of proteins, particularly degradation tags, a mechanism for targeted proteolysis mediated by adding a tag to the coding sequence (McGinness et al., 2006). These have been used in counter systems to improve their efficiency, such as that of the SYSU China Team (2015), and can similarly be applied to our circuit.

Evolutionary Stability

Evolutionary stability of recombinase based memory circuits

According to Fernandez-Rodriguez et al. (2015), one of the design principles for long-term stability is the avoidance of constitutive expression, which we have applied in our circuit. They also mention that “Placing the circuit in the genome could also improve stability, as this is a common approach to improve the robustness of strains in metabolic engineering” (Fernandez-Rodriguez et al., 2015). Another consideration that must be made is the prevention of leaky expression, since it can generate a drift into a different state and even disable the ability to hold a state. In our circuit the expression of all recombinases is regulated by the same promoter, so this issue is not expected to be relevant to its performance. Canton et al. (2008) note that evolutionary stability also depends on the state in which the circuit is carried; the authors note that a circuit in the OFF state can easily be carried over many generations, whereas in the ON state it quickly breaks. This specific quality can be problematic for our circuit, since one way or another it’s always turned on because of the dependence on the cell-cycle promoter. Nonetheless, since different activities are carried out in each specific state and there is no constant expression of regulatory proteins, this might not be relevant to the circuit’s stability, since “the burden is a function of regulator expression, the evolutionary stability depends on the state in which the circuit is carried” (Fernandez-Rodriguez et al., 2015).

This was seconded by Sleight et al. (2010), who state that “Evolutionary stability is a problem in genetic circuits if there is no selective pressure to maintain function of the circuit. The current belief is that this loss-of-function occurs because any cell in the population that acquires a mutation in the genetic circuit often has a growth advantage and can outcompete the cells in the population with all functional plasmids”; hence, if there is no significant metabolic burden, there is no advantage. We weren’t able to carry out an evolutionary stability experiment of our own; however, extrapolating the results obtained by Fernandez-Rodriguez et al. (2015) shows great promise for our circuit. They report a stability of over 400 hours for their memory switch, and credit its breaking to continuous invertase expression and the host’s evolution towards reducing its expression. More importantly, Fernandez-Rodriguez et al. (2015) indicate that during the 400 hour period, 6 cycles of their circuit, “there was no loss of performance, reduction in dynamic range, or increase in population variability”. The main risks to our circuit’s stability are transposable insertion elements and even homologous recombination, which can be avoided by using diverse parts in the construction of the circuit and using strains where insertion elements have been deleted from the genome; nevertheless, both these strategies involve challenges of their own, particularly the need for more robust strains for many applications, as stated by Fernandez-Rodriguez et al. (2015). Finally, although evolutionary stability must be considered when designing a circuit, the main focus should be on the specific end application. The required robustness of a circuit will be directly related to the duration of its application. “For example, circuits do not need to be robust for months if designed for a one week fermentation” (Fernandez-Rodriguez et al., 2015).

The time for the loss of function of a genetic circuit can also be predicted by applying the function developed by Arkin & Fletcher (2006) through a simulation study.

Evolutionary stability of kill switches

According to Stirling et al. (2017), “any kill switch with leaky, low-level expression of a toxin in permissive conditions may be quickly disabled in rapidly growing microbes.” In our circuit this is highly unlikely, since the expression of the toxins is regulated by the same promoter that controls the expression of recombinases, and between them stand several terminators. Furthermore, the toxins from our kill switch are to be expressed only once, induced by the change of state from the previous module (counter circuit), hence avoiding the complexities of an environmentally induced kill switch. The necessity for induction is another decisive factor in the stability of the circuit, since, while uninduced, there is no selective pressure for it to be dismantled. This is supported by Pasotti et al. (2011), who evaluated the stability of an inducible kill switch, finding it to be stable for a maximum of 100 generations and strain dependent.

Further optimization for the final application can be done by applying tools such as “Predicting the Genetic Stability of Engineered DNA Sequences with the EFM Calculator” and making the necessary alterations to the sequence, such as altering codon usage in the different CDSs or swapping out promoters/RBSs of equal strength (Jack et al., 2015).

For further characterization and application examples, check out our Proposed Implementation page.

Modularity

Each of these modules can be applied separately to other proposals, modified accordingly, and even exchanged for others. For example, the final module can be changed to execute a specific function instead of inducing the cell’s death. Even the toxins from the multilayered containment system can be changed if a different death approach is desired, such as one triggered by specific conditions (like pH). Regarding our crucial module, the counter, its number of recombinases can easily be decreased or increased by redesigning it following the general architecture of the circuit: a single promoter at the beginning, then the first recombination site of all the recombinases, then as many basic modules as required (RBS + recombinase + unidirectional terminator + second recombination site + RDF). *The final recombinase’s RDF must only be included if the reset is to be part of the sequential process.

The functioning is based on four main premises based on the architecture:

  • The promoter will fire once per cell cycle.
  • The terminator will only work in its forward direction.
  • Each recombinase (when expressed alone) will irreversibly invert the DNA sequence when it is flanked by PB sites.
  • Each recombinase together with its recombination directionality factor (RDF) (when both are expressed) will irreversibly invert the DNA sequence when it is flanked by LR sites.
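The last two premises can be written down as a small state-transition rule for a single pair of recombination sites. The sketch below is only an abstraction of those premises, not a model of the full circuit; the names `SiteState` and `next_state` are hypothetical, and the handling of the combination the premises do not mention (recombinase plus RDF on PB sites) is an assumption: the state is simply left unchanged.

```python
from enum import Enum


class SiteState(Enum):
    PB = "attB/attP"  # sites before inversion
    LR = "attL/attR"  # sites after inversion


def next_state(state: SiteState, recombinase: bool, rdf: bool) -> SiteState:
    """State of one site pair after one cell cycle of expression."""
    if not recombinase:
        # An RDF has no effect without its corresponding recombinase.
        return state
    if state is SiteState.PB and not rdf:
        # Recombinase alone inverts a PB-flanked sequence.
        return SiteState.LR
    if state is SiteState.LR and rdf:
        # Recombinase plus RDF inverts an LR-flanked sequence back.
        return SiteState.PB
    # Assumption: combinations not covered by the premises change nothing.
    return state


if __name__ == "__main__":
    s = SiteState.PB
    s = next_state(s, recombinase=True, rdf=False)
    print(s.name)
    s = next_state(s, recombinase=True, rdf=True)
    print(s.name)
```

Each recombinase in the counter has its own site pair, so the whole counter state can be pictured as one such PB/LR value per recombinase.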

Repress-Initialize

During lab work with the cloned bacteria, the inducer must be provided in order for the counter to be repressed. Once the bacteria are ready to be used in their specific application, the inducer must be withdrawn from, or not supplied to, the culture. The repressor will then be degraded and the counter will start.

Counter Circuit

A

First generation

Only the first recombinase is transcribed and hence expressed. It irreversibly inverts the sequence between its recognition sites, which go from the PB state to the LR state. This leaves the recombinase and RDF coding sequences in the same orientation (reverse).

B

Second generation

Because of the previous inversion, recombinase 1 and its respective RDF are in reverse and the first terminator is now inactive, which allows the second recombinase to be expressed. Recombinase 2 inverts the sequence between its recognition sites, which go from the PB state to the LR state. This leaves the recombinase 2 and RDF 2 coding sequences in the same orientation (reverse).

C

Third generation

Recombinase 1, its RDF and the terminator are now on the correct strand. Only recombinase 1 and RDF 1 are expressed, which allows the inversion of the sites in the LR state back into the PB state. This leaves the RDF 1 coding sequence in the same position (forward) and recombinase 1 in reverse.

D

Fourth generation

Recombinase 1, and recombinase 2 with RDF 2, are on the reverse strand and the first two terminators are inactive; thus, only the RDF of recombinase 1 and the third recombinase are expressed. The RDF has no effect without its corresponding recombinase. Recombinase 3 irreversibly inverts the sequence between its recognition sites, which go from the PB state to the LR state. This leaves the recombinase 3 and RDF 3 coding sequences in the same orientation (reverse).

E

Fifth generation

RDF 2 and recombinase 1 are expressed. RDF 2 has no effect without its corresponding recombinase. Recombinase 1 irreversibly inverts the sequence between its recognition sites, which go from the PB state to the LR state. This leaves the recombinase 1 and RDF 1 coding sequences in the same orientation (reverse).

F

Sixth generation

Only recombinase 2 and RDF 2 are expressed, which allows the inversion of the sites in the LR state back into the PB state. This leaves the RDF 2 coding sequence in the same position (forward) and recombinase 2 in reverse.

G

Seventh generation

The RDF of recombinase 2 and the first recombinase with its corresponding RDF are expressed. RDF 2 has no effect without its corresponding recombinase. Recombinase 1 with RDF 1 inverts the sequence between the recognition sites in the LR state back into the PB state. This leaves the RDF 1 coding sequence in the same position (forward) and recombinase 1 in reverse.

H

Eighth generation

The fourth recombinase is expressed. It inverts the complete circuit, which is located between its PB sites, and switches those sites to the LR conformation.

I

Ninth generation

The RDF of recombinase 3 and recombinase 1 are expressed; step 1 is repeated. RDF 3 has no effect without its corresponding recombinase. Recombinase 1 irreversibly inverts the sequence between its PB recognition sites, which are now in the LR conformation. This leaves the recombinase 1 and RDF 1 coding sequences in the same orientation (reverse).

J

Tenth generation

RDF 3 and recombinase 2 are expressed. RDF 3 has no effect without its corresponding recombinase. Recombinase 2 irreversibly inverts the sequence between its PB recognition sites, which are now in the LR conformation. This leaves the recombinase 2 and RDF 2 coding sequences in the same orientation (reverse).

K

Eleventh generation

RDF 3, recombinase 1 and RDF 1 are expressed. RDF 3 has no effect without its corresponding recombinase. Recombinase 1 and RDF 1 allow the inversion of the sites in the LR conformation back into the PB conformation. This leaves the RDF 1 coding sequence in the same position (forward) and recombinase 1 in reverse.

L

Twelfth generation

RDF 1, recombinase 3 and RDF 3 are expressed. RDF 1 has no effect without its corresponding recombinase. Recombinase 3 and RDF 3 allow the inversion of the sites in the LR conformation back into the PB conformation. This leaves the RDF 3 coding sequence in the same position (forward) and recombinase 3 in reverse.

M

Thirteenth generation

RDF 3, recombinase 2 and RDF 2 are expressed. RDF 3 has no effect without its corresponding recombinase. Recombinase 2 and RDF 2 allow the inversion of the sites in the LR conformation back into the PB conformation. This leaves the RDF 2 coding sequence in the same position (forward) and recombinase 2 in reverse.

N

Fourteenth generation

RDF 2, RDF 3, recombinase 1 and RDF 1 are expressed. RDF 2 and RDF 3 have no effect without their corresponding recombinases. Recombinase 1 and RDF 1 allow the inversion of the sites in the LR conformation back into the PB conformation. This leaves the RDF 1 coding sequence in the same position (forward) and recombinase 1 in reverse.

O

Fifteenth generation

The counter reaches its end and transcription is able to carry on to the next device.
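In generations A to H above, the recombinase that acts follows the order 1, 2, 1, 3, 1, 2, 1, 4, which is the bit-flip order of a reflected Gray code. The sketch below reproduces that order as an idealized abstraction only. It does not model the nested inversions of the physical circuit, and it is not claimed to match the later generations of the walkthrough. The function names are hypothetical.

```python
def acting_recombinase(generation: int) -> int:
    """1-based index of the element a reflected Gray code flips at this step.

    Equal to the position of the lowest set bit of `generation`.
    """
    if generation < 1:
        raise ValueError("generation must be >= 1")
    lowest_set_bit = generation & -generation
    return lowest_set_bit.bit_length()


def gray_state(generation: int) -> int:
    """Idealized counter state after `generation` steps.

    Bit i-1 set means recombinase i's sites are in the LR state.
    """
    return generation ^ (generation >> 1)


if __name__ == "__main__":
    for g in range(1, 9):
        print(g, acting_recombinase(g), format(gray_state(g), "04b"))
```

In this idealization exactly one site pair changes state per cell cycle, which is the property the counter relies on.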

Expand Count

A single segment is analyzed in this animation. It consists of an inducer for the inducible promoter that controls the fourth recombinase, a Cas9 and a ligase, the recombinase’s directionality factor, which allows it to invert the sequence from LR to PB, and two single guides that enable the excision of the segment by Cas9. Hence, one by one, the auto-excisable segments are expressed and deleted until the last one is reached, giving way for the last module to be expressed. It’s important to note that only a slight change in terminator sequence is needed to yield dysfunctional activity for each module.

Kill Switch

When the RelE toxin is expressed, translation of proteins slowly comes to a halt. Simultaneously genomic DNA will be destroyed by the NucA toxin, the Gmk gene will be targeted by the Cas9 and our construct will be destroyed by this very same enzyme. The coupled action of these enzymes will ultimately kill the cell without releasing its genetic contents.

Overcount Prevention

“A potential challenge in designing a robust biological counter is the ability to count at completion of the event. The existing designs of the counters are sensitive to the pulse duration–a brief pulse will be ignored and a lengthy pulse can cause the counter to count ahead” (Noman et al., 2016). Given the concerns over the regulation of the count expressed by Dr. Pakpoom and Dr. Bonnet, further explained in engineering success, we designed a module to prevent overcount. This system works by placing an proteic inducer for an external promoter and an operator site after the main promoter. When active, the external promoter will produce a repressor which will in turn bind to the operator site preventing translation. The central concept to it is affinity, both the proteic inducer and the operator affinity are tunable, the first more than the latter. Once a pulse is received by the main promoter the proteic inducer will be expressed, along with the respective recombinase depending on the circuit’s state. If designed correctly, the affinity of the proteic inducer and the repressor will be just so that exactly the right amount of recombinase will be produced in order to change state without doing it twice or being shut off before being able to act. We designed a construct to prove this hypothesis, you can check our work on it in Wet Lab.

Another possible solution to the response of a continuous pulse would be to include an incoherent feedforward loop to regulate the type of pulse the system responds to, as suggested by PhD. Rodrigo Mora. This type of standardized regulation consists of an activator which “regulates both a gene and a repressor of the gene”, similar to the manner in which we proposed our overcount system (Goentoro et al., 2009).
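As a rough illustration of the pulse-shaping behaviour described above, the following sketch integrates a toy incoherent feedforward loop with forward Euler steps. The function name `simulate_iffl`, the parameters `k_rep` and `hill_n`, the rate constants and the dimensionless units are all assumptions chosen only for illustration; they are not measured values and not the model used in our design.

```python
def simulate_iffl(t_end=20.0, dt=0.01, k_rep=0.1, hill_n=2):
    """Toy incoherent feedforward loop under a sustained input.

    The input x drives both the output z and a repressor y, and y represses z.
    All parameters are hypothetical and dimensionless.
    """
    x = 1.0   # sustained input (a "lengthy pulse")
    y = 0.0   # repressor level
    z = 0.0   # output level (for example, a recombinase)
    trace = []
    for _ in range(int(t_end / dt)):
        # Output production is activated by x and repressed by y (Hill term).
        production_z = x / (1.0 + (y / k_rep) ** hill_n)
        dy = x - y             # repressor: production minus first-order decay
        dz = production_z - z  # output: production minus first-order decay
        y += dy * dt
        z += dz * dt
        trace.append(z)
    return trace


trace = simulate_iffl()
peak, final = max(trace), trace[-1]
print(f"peak output = {peak:.3f}, final output = {final:.3f}")
```

Under these assumed parameters the output rises briefly and then settles at a much lower level even though the input stays on, which is the qualitative behaviour that would keep a long pulse from being counted more than once.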

Promoters

Additionally, we propose the use of inducible promoters (such as pBAD) instead of pnrd for user-controlled counter systems. The counter could even be associated with specific environmental conditions such as pH, temperature or the presence of certain molecules; this can easily be done by changing the initiating module to one dependent on these conditions.

Containment

Plasmid specific

A toxin-antitoxin system could also be used for regulation and for the actual kill (fourth device). In our proposal it is counterproductive to have another regulatory device of this magnitude, but it could be useful for systems in which a plasmid needs to be added and contained. For example, the toxin could be placed on the plasmid and the respective antitoxin on the genome; this avoids horizontal gene transfer, because another microorganism that acquires the plasmid receives the toxin but not the antitoxin it would need to persist (Torres et al., 2016). However, toxin-antitoxin systems are usually complemented by a synthetic auxotrophy, in which cells can only grow in the presence of an exogenously supplied metabolite. This limits cell survival to the desired conditions but carries some risks (Chan et al., 2016). Toxin-antitoxin systems have been reported to move readily between genomes through horizontal gene transfer, and their stability in the genome is still uncertain because mutations accumulate through genetic drift (Van Melderen & Saavedra De Bast, 2009). Auxotrophy could be satisfied by metabolite cross-feeding or by essential small molecules present in the environment (Gallagher et al., 2015); it also requires extensive genome-wide engineering, and reprogramming it for different environmental conditions is complicated (Chan et al., 2016). Nevertheless, we propose Gmk (https://www.uniprot.org/uniprot/P60546#sequences) as an auxotrophy gene, which has been reported as essential by the single-gene knockout technique (Baba et al., 2006). It is important to note that some toxin-antitoxin systems are able to lyse cells (Van Melderen & Saavedra De Bast, 2009), which is not beneficial for the timing of the whole system we propose.

Specific toxins

If a toxin could be designed to target our strain of E. coli, and only our strain, it could be released from the cells that first reach the end of the counter. This would reduce the noise of signal processing from each cell and ensure that even cells with flaws in the system are killed.

Growth phase association: quorum sensing

Suggested by our users, i.e. M.Sc Alexander Schmidt, this proposal can be achieved in the same manner as the one described in the previous “promoters” section. Nonetheless, its implications are much more extensive, since “quorum sensing allows groups of bacteria to synchronously alter behaviour in response to changes in the population density and species composition of the vicinal community” (Mukherjee & Bassler, 2019). “QS is likely to control behaviours that are crucial for the development and success of these communities in diverse environments such as the human gut and chronic infections in humans” (Whiteley et al., 2017). According to Wu et al. (2020), “Dynamic control of bacterial populations usually includes population size control, dynamic metabolic engineering for desirable products and the regulation of various physiological activities”. This regulation mechanism has been used in the development “of various genetic circuits such as genetic oscillators, toggle switches and logic gates with AHL-based QS devices in Gram-negative bacteria” (Wu et al., 2020).

To further demonstrate the novelty and capacity of our counter circuit we present a comparison with previously developed counters, highlighting their characteristics of interest.

Adan, A., Alizada, G., Kiraz, Y., Baran, Y., & Nalbant, A. (2017). Flow cytometry: basic principles and applications. Critical reviews in biotechnology, 37(2), 163-176. https://doi.org/10.3109/07388551.2015.1128876

Arkin, A. P., & Fletcher, D. A. (2006). Fast, cheap and somewhat in control. Genome biology, 7(8), 1-6. https://doi.org/10.1186/gb-2006-7-8-114

Baba, T., Ara, T., Hasegawa, M., Takai, Y., Okumura, Y., Baba, M., & Mori, H. (2006). Construction of Escherichia coli K‐12 in‐frame, single‐gene knockout mutants: the Keio collection. Molecular systems biology, 2(1), 2006-0008. https://doi.org/10.1038/msb4100050

Beal, J., Baldwin, G. S., Farny, N. G., Gershater, M., Haddock-Angelli, T., Buckley-Taylor, R., & iGEM Interlab Study Contributors. (2021). Comparative analysis of three studies measuring fluorescence from engineered bacterial genetic constructs. PloS one, 16(6), e0252263. https://doi.org/10.1371/journal.pone.0252263

Beal, J., Haddock-Angelli, T., Farny, N., & Rettberg, R. (2018). Time to get serious about measurement in synthetic biology. Trends in biotechnology, 36(9), 869-871. https://doi.org/10.1016/j.tibtech.2018.05.003

Bonnet, J., Subsoontorn, P., & Endy, D. (2012). Rewritable digital data storage in live cells via engineered control of recombination directionality. Proceedings of the National Academy of Sciences, 109(23), 8884-8889. https://doi.org/10.1073/pnas.1202344109

Canton, B., Labno, A., & Endy, D. (2008). Refinement and standardization of synthetic biological parts and devices. Nature biotechnology, 26(7), 787-793. https://doi.org/10.1038/nbt1413

Chan, C. T., Lee, J. W., Cameron, D. E., Bashor, C. J., & Collins, J. J. (2016). 'Deadman' and 'Passcode' microbial kill switches for bacterial containment. Nature chemical biology, 12(2), 82-86. https://doi.org/10.1038/nchembio.1979

EcoCyc. [Database]. Escherichia coli K-12 substr. MG1655 reference genome (EcoCyc). https://biocyc.org/gene?orgid=ECOLI&id=GUANYL-KIN-MONOMER#

Englaender, J. A., Jones, J. A., Cress, B. F., Kuhlman, T. E., Linhardt, R. J., & Koffas, M. A. (2017). Effect of genomic integration location on heterologous protein expression and metabolic engineering in E. coli. ACS synthetic biology, 6(4), 710-720. https://doi.org/10.1021/acssynbio.6b00350

Fedorec, A. J., Robinson, C. M., Wen, K. Y., & Barnes, C. P. (2020). FlopR: an open source software package for calibration and normalization of plate reader and flow cytometry data. ACS synthetic biology, 9(9), 2258-2266. https://doi.org/10.1021/acssynbio.0c00296

Fernandez-Rodriguez, J., Yang, L., Gorochowski, T. E., Gordon, D. B., & Voigt, C. A. (2015). Memory and combinatorial logic based on DNA inversions: dynamics and evolutionary stability. ACS synthetic biology, 4(12), 1361-1372. https://doi.org/10.1021/acssynbio.5b00170

Fredrickson, J. K., Bentjen, S. A., Bolton Jr, H., Li, S. W., Ligotke, M. W., McFadden, K. M., & Van Voris, P. (1989). Evaluation of terrestrial microcosms for assessing the fate and effects of genetically engineered microorganisms on ecological processes (No. PNL-6850). Pacific Northwest Lab., Richland, WA (USA). https://doi.org/10.2172/6294057

Friedland, A. E., Lu, T. K., Wang, X., Shi, D., Church, G., & Collins, J. J. (2009). Synthetic gene networks that count. Science, 324(5931), 1199-1202. https://doi.org/10.1126/science.1172005

Gallagher, R. R., Patel, J. R., Interiano, A. L., Rovner, A. J., & Isaacs, F. J. (2015). Multilayered genetic safeguards limit growth of microorganisms to defined environments. Nucleic acids research, 43(3), 1945-1954. https://doi.org/10.1093/nar/gku1378

Gil, R., Silva, F. J., Peretó, J., & Moya, A. (2004). Determination of the core of a minimal bacterial gene set. Microbiology and Molecular Biology Reviews, 68(3), 518-537. https://doi.org/10.1128/MMBR.68.3.518-537.2004

Goentoro, L., Shoval, O., Kirschner, M. W., & Alon, U. (2009). The incoherent feedforward loop can provide fold-change detection in gene regulation. Molecular cell, 36(5), 894-899. https://doi.org/10.1016/j.molcel.2009.11.018

Guiziou, S., Mayonove, P., & Bonnet, J. (2019). Hierarchical composition of reliable recombinase logic devices. Nature communications, 10(1), 1-7. https://doi.org/10.1038/s41467-019-08391-y

Heyde, K. C., & Ruder, W. C. (2015). Exploring host-microbiome interactions using an in silico model of biomimetic robots and engineered living cells. Scientific reports, 5(1), 1-12. https://doi.org/10.1038/srep11988

Howell, M., Daniel, J. J., & Brown, P. J. (2017). Live cell fluorescence microscopy to observe essential processes during microbial cell growth. Journal of visualized experiments: JoVE, (129). https://doi.org/10.3791/56497

Hsiao, V., Hori, Y., Rothemund, P. W., & Murray, R. M. (2016). A population‐based temporal logic gate for timing and recording chemical events. Molecular systems biology, 12(5), 869. https://doi.org/10.15252/msb.20156663

Inamori, Y., Murakami, K., Sudo, R., Kurihara, Y., & Tanaka, N. (1992). Environmental assessment method for field release of genetically engineered microorganisms using microcosm systems. Water Science and Technology, 26(9-11), 2161-2164. https://doi.org/10.2166/wst.1992.0686

Jack, B. R., Leonard, S. P., Mishler, D. M., Renda, B. A., Leon, D., Suárez, G. A., & Barrick, J. E. (2015). Predicting the genetic stability of engineered DNA sequences with the EFM calculator. ACS synthetic biology, 4(8), 939-943. https://doi.org/10.1021/acssynbio.5b00068

Jin, E., Wong, L., Jiao, Y., Engel, J., Holdridge, B., & Xu, P. (2017). Rapid evolution of regulatory element libraries for tunable transcriptional and translational control of gene expression. Synthetic and systems biotechnology, 2(4), 295-301. https://doi.org/10.1016/j.synbio.2017.10.003

Kim, S., Jeong, H., Kim, E. Y., Kim, J. F., Lee, S. Y., & Yoon, S. H. (2017). Genomic and transcriptomic landscape of Escherichia coli BL21 (DE3). Nucleic acids research, 45(9), 5285-5293. https://doi.org/10.1093/nar/gkx228

Lewis, M. (2011). A tale of two repressors. Journal of molecular biology, 409(1), 14-27. https://doi.org/10.1016/j.jmb.2011.02.023

Liu, G., Zhang, Y., & Zhang, T. (2020). Computational approaches for effective CRISPR guide RNA design and evaluation. Computational and structural biotechnology journal, 18, 35-44. https://doi.org/10.1016/j.csbj.2019.11.006

de Lorenzo, V. (2009). Recombinant bacteria for environmental release: what went wrong and what we have learnt from it. Clinical Microbiology and Infection, 15, 63-65. https://doi.org/10.1111/j.1469-0691.2008.02683.x

McGinness, K. E., Baker, T. A., & Sauer, R. T. (2006). Engineering controllable protein degradation. Molecular cell, 22(5), 701-707. https://doi.org/10.1016/j.molcel.2006.04.027

McKinnon, K. M. (2018). Flow cytometry: an overview. Current protocols in immunology, 120(1), 5-1. https://doi.org/10.1002/cpim.40

MedlinePlus. (n.d.). What are genome editing and CRISPR-Cas9? https://medlineplus.gov/genetics/understanding/genomicresearch/genomeediting/

Merrick, C. A., Zhao, J., & Rosser, S. J. (2018). Serine integrases: advancing synthetic biology. ACS synthetic biology, 7(2), 299-310. https://doi.org/10.1021/acssynbio.7b00308

Mukherjee, S., & Bassler, B. L. (2019). Bacterial quorum sensing in complex and dynamically changing environments. Nature Reviews Microbiology, 17(6), 371-382. https://doi.org/10.1038/s41579-019-0186-5

Noman, N., Inniss, M., Iba, H., & Way, J. C. (2016). Pulse detecting genetic circuit–a new design approach. PLoS One, 11(12), e0167162. https://doi.org/10.1371/journal.pone.0167162

Pasotti, L., Zucca, S., Lupotto, M., De Angelis, M. G. C., & Magni, P. (2011). Characterization of a synthetic bacterial self-destruction device for programmed cell death and for recombinant proteins release. Journal of biological engineering, 5(1), 1-12. https://doi.org/10.1186/1754-1611-5-8

Perez-Garcia, O., Lear, G., & Singhal, N. (2016). Metabolic network modeling of microbial interactions in natural and engineered environmental systems. Frontiers in microbiology, 7, 673. https://doi.org/10.3389/fmicb.2016.00673

Politz, M. C., Copeland, M. F., & Pfleger, B. F. (2013). Artificial repressors for controlling gene expression in bacteria. Chemical communications, 49(39), 4325-4327. https://doi.org/10.1039/c2cc37107c

Roquet, N., Soleimany, A. P., Ferris, A. C., Aaronson, S., & Lu, T. K. (2016). Synthetic recombinase-based state machines in living cells. Science, 353(6297). https://doi.org/10.1126/science.aad8559

Sleight, S. C., Bartley, B. A., Lieviant, J. A., & Sauro, H. M. (2010). Designing and engineering evolutionary robust genetic circuits. Journal of biological engineering, 4(1), 1-20. https://doi.org/10.1186/1754-1611-4-12

Stark, W. M. (2017). Making serine integrases work for us. Current opinion in microbiology, 38, 130-136. https://doi.org/10.1016/j.mib.2017.04.006

Stirling, F., Bitzan, L., O’Keefe, S., Redfield, E., Oliver, J. W., Way, J., & Silver, P. A. (2017). Rational design of evolutionarily stable microbial kill switches. Molecular cell, 68(4), 686-697. https://doi.org/10.1016/j.molcel.2017.10.033

Su, T., Liu, F., Gu, P., Jin, H., Chang, Y., Wang, Q., & Qi, Q. (2016). A CRISPR-Cas9 assisted non-homologous end-joining strategy for one-step engineering of bacterial genome. Scientific reports, 6(1), 1-11. https://doi.org/10.1038/srep37895

Su, T., Liu, F., Chang, Y., Guo, Q., Wang, J., Wang, Q., & Qi, Q. (2019). The phage T4 DNA ligase mediates bacterial chromosome DSBs repair as single component non-homologous end joining. Synthetic and systems biotechnology, 4(2), 107-112. https://doi.org/10.1016/j.synbio.2019.04.001

Széliová, D., Krahulec, J., Šafránek, M., Lišková, V., & Turňa, J. (2016). Modulation of heterologous expression from PBAD promoter in Escherichia coli production strains. Journal of biotechnology, 236, 1-9. https://doi.org/10.1016/j.jbiotec.2016.08.004

Team: Cambridge. (2009). E.chromi. The International Genetically Engineered Machine iGEM. https://2009.igem.org/Team:Cambridge/Project/Amplification/Characterisation

Team: Groningen. (2011). Count coli. The International Genetically Engineered Machine iGEM. https://2011.igem.org/Team:Groningen/project_description

Team: Paris Liliane Bettencourt. (2010). Memo-cell. The International Genetically Engineered Machine iGEM. https://2010.igem.org/Team:Paris_Liliane_Bettencourt/Project/Memo-cell

Team: SJTU-BioX-Shanghai. (2009). E.coli the napper. The International Genetically Engineered Machine iGEM. https://2009.igem.org/Team:SJTU-BioX-Shanghai/Project_design#Overview

Team: Stanford-Brown. (2012). The transit of synthetic astrobiology. The International Genetically Engineered Machine iGEM. https://2012.igem.org/Team:Stanford-Brown

Team: SYSU China. (2015). Micro-time system. The International Genetically Engineered Machine iGEM. https://2015.igem.org/Team:SYSU_CHINA/Design

Team: UT-Tokyo. (2016). Like does not beget like. The way to make a 100-stage cycle. The International Genetically Engineered Machine iGEM. https://2016.igem.org/Team:UT-Tokyo/Project#The_way_to_make_a_100-stage_cycle

The International Genetically Engineered Machine iGEM. (n.d.-a). Promoters/Catalog. https://parts.igem.org/Promoters/Catalog

The International Genetically Engineered Machine iGEM. (n.d.-b). Terminators. https://parts.igem.org/Terminators

Tomimatsu, K., Kokura, K., Nishida, T., Yoshimura, Y., Kazuki, Y., Narita, M., & Ohbayashi, T. (2017). Multiple expression cassette exchange via TP 901‐1, R4, and Bxb1 integrase systems on a mouse artificial chromosome. FEBS open bio, 7(3), 306-317. https://doi.org/10.1002/2211-5463.12169

Torres, L., Krüger, A., Csibra, E., Gianni, E., & Pinheiro, V. B. (2016). Synthetic biology approaches to biological containment: pre-emptively tackling potential risks. Essays in Biochemistry, 60(4), 393-410. https://doi.org/10.1042/EBC20160013

Van Melderen, L., & Saavedra De Bast, M. (2009). Bacterial toxin–antitoxin systems: more than selfish entities?. PLoS genetics, 5(3), e1000437. https://doi.org/10.1371/journal.pgen.1000437

Wang, X., Tang, B., Ye, Y., Mao, Y., Lei, X., Zhao, G., & Ding, X. (2017). Bxb1 integrase serves as a highly efficient DNA recombinase in rapid metabolite pathway assembly. Acta biochimica et biophysica Sinica, 49(1), 44-50. https://doi.org/10.1093/abbs/gmw115

Whiteley, M., Diggle, S. P., & Greenberg, E. P. (2017). Progress in and promise of bacterial quorum sensing research. Nature, 551(7680), 313-320. https://doi.org/10.1038/nature24624

Wu, S., Liu, J., Liu, C., Yang, A., & Qiao, J. (2020). Quorum sensing for population-level control of bacteria and potential therapeutic applications. Cellular and Molecular Life Sciences, 77(7), 1319-1343. https://doi.org/10.1007/s00018-019-03326-8

Xu, Z., & Brown, W. R. (2016). Comparison and optimization of ten phage encoded serine integrases for genome engineering in Saccharomyces cerevisiae. BMC biotechnology, 16(1), 1-10. https://doi.org/10.1186/s12896-016-0241-5

Xu, X., & Qi, L. S. (2019). A CRISPR–dCas toolbox for genetic engineering and synthetic biology. Journal of molecular biology, 431(1), 34-47. https://doi.org/10.1016/j.jmb.2018.06.037

Yang, L., Nielsen, A. A., Fernandez-Rodriguez, J., McClune, C. J., Laub, M. T., Lu, T. K., & Voigt, C. A. (2014). Permanent genetic memory with >1-byte capacity. Nature methods, 11(12), 1261-1266. https://doi.org/10.1038/nmeth.3147

Zhao, J., Pokhilko, A., Ebenhöh, O., Rosser, S. J., & Colloms, S. D. (2019). A single-input binary counting module based on serine integrase site-specific recombination. Nucleic acids research, 47(9), 4896-4909. https://doi.org/10.1093/nar/gkz245

Our motivation

When we decided to explore sequential logic in biology through our software, we encountered two decisive situations from which this conceptual framework was born. First, after several rounds of interviews, we found that most experts we talked to did not know about the relationship between sequential logic and biology. Second, as this is a rather unexplored approach, there is minimal literature and guidance on how to assess a sequentially dependent biological system or how to conceptualize a circuit as a state machine. Due to the lack of information on the topic, we first had to discuss extensively and define foundational information in order to establish the right approach for the software's design. Therefore, we chose to compile what we had gathered and expand on it to generate a guide that would help not only us as the developers of the software, but also its immediate users and even those who wish to implement another tool or just better understand sequential logic in biology.

Our inspiration

Even though the application of sequential logic in synthetic biology is rather unexplored, the area has a lot of potential. As Roquet et al. (2016) remarked, despite the prospect of functional state machines becoming transformative tools in the understanding and engineering of complex biological systems, their implementation has been hindered by the absence of a framework that would offer a scalable and generalizable approach. Roquet et al. (2016) proposed a recombinase-based framework for the implementation of state machines in cells, encoding the state in a particular DNA sequence; they expanded on it with a large, searchable database of such state machine registers to help researchers design circuits. Even earlier, Oishi and Klavins (2014) proposed a framework for building finite state machines with engineered regulatory networks based on repressing transcription factors. However, these approaches pursue highly specific methods and do not focus on a conceptual guide for a wide range of functionalities. Therefore, a more general and modular toolset for the exploration of sequential logic based synthetic biology would greatly benefit scientists and professionals with different levels of knowledge.

Our goal

We set as our objective the proposal of a framework composed of a set of rules and concepts for this approach, which would be the base for the exploration of an incipient field and for the development of novel tools. Such a guide would serve as a structure upon which to design, construct, model and analyze (either computationally or manually) synthetic gene circuits from a sequential logic standpoint. Both the purpose of this framework and the process by which we achieved it were to ask the right questions, those that represented our own inquiries, and to offer the corresponding answers to help and guide others. Such questions and answers would begin a discussion to further advance and understand this area. Beyond the applications that can be given to this framework and the work it can support, it also offers a comprehensive and paradigm-shifting guide in itself; anyone who comes across it can therefore dive into the analysis and advantages of biological systems as sequential systems, gaining a new perspective.

Here we propose a conceptual framework: an efficient and modular approach to tackling sequentially dependent biological tasks, specifically for synthetic genetic circuits and their design. It is a proposal on how to describe, at a high level of abstraction, the biological elements involved in these types of sequential applications, such as parts, circuits, their environment and their interactions, and how to assemble them to simulate their behavior as a finite state machine.

01

Objective

The objective of this framework is to conceptualize, in an abstract and modular manner, biological processes as state machines; specifically synthetic genetic circuits for the exploration of their structure and behavior from a sequential logic point of view.

02

Scope and Applications

Even though we expound the characteristics of the biological problems that match this approach, it is important to note from the beginning that this is a proposal for complex, time- and memory-dependent, or intricately regulated systems. Therefore, other circuits and systems exist whose behavior does not need to be abstracted as sequential logic. For more specific examples of applications, go to implementation, where we expand on the uses proposed by us and by the potential users we talked to.

03

Foundational Concepts

To better understand our approach, it is important to know that the framework is supported by three key concepts: sequential logic, abstraction and modularity. These foundational concepts are core to our project as they guide the development of the framework and are aligned with our purpose.

Logic

The conceptualization of a genetic circuit's function as sequential logic can be daunting and hard to replicate consistently. There are a number of elements that must be considered to properly represent the circuit's structure and behavior. In a sequential system, the output depends not only on the present inputs but also on the current state. For synthetic biology, a circuit and its environment can be portrayed as a sequential system or state machine when the behavior can be described by states whose transitions are controlled by time-ordered inputs and are affected by past behavior and inputs. The modification and behavior of circuits can be explored through the sequential progress of states. Exploring the potential states of a circuit requires repeated simulations of its changes and behavior; hence, an approach that lends itself to iteration is needed to tackle these types of problems. A biological process, natural or engineered, that has characteristics such as being time-dependent, having highly interconnected regulatory networks, behaving differently with repeating inputs, or requiring information to be saved stably (Letsou & Cai, 2016; Madec, Rosati, & Lallement, 2021; Magdevska et al., 2017; Oishi & Klavins, 2014; Roquet et al., 2016) would greatly benefit from being conceptualized and analyzed as a sequential logic problem. Moreover, the ability to sequentially modulate the system's response offers high scalability as it enables larger or more complex applications (Zúñiga et al., 2020).
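The difference between combinational and sequential behavior described above can be sketched in a few lines. The names `and_gate` and `PulseCounter`, and the limit of three pulses, are hypothetical and serve only as an illustration; they do not describe a specific circuit.

```python
def and_gate(a: bool, b: bool) -> bool:
    """Combinational logic: the output depends only on the present inputs."""
    return a and b


class PulseCounter:
    """Sequential logic: the output depends on the input and the stored state."""

    def __init__(self, limit: int):
        self.limit = limit
        self.count = 0  # the state, kept between inputs

    def step(self, pulse: bool) -> bool:
        # A pulse advances the count until the limit is reached.
        if pulse and self.count < self.limit:
            self.count += 1
        # The output turns on once the limit has been reached.
        return self.count >= self.limit


counter = PulseCounter(limit=3)
outputs = [counter.step(True) for _ in range(4)]
print(outputs)  # the same input gives different outputs as the state changes
```

The AND gate always returns the same value for the same pair of inputs, whereas the counter answers an identical pulse differently depending on how many pulses it has already seen.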

Abstraction

To perform a thorough exploration of the design space (i.e. possible architectures), the framework follows a highly abstract approach. This is necessary since there is a direct relationship between biological accuracy and modeling complexity; therefore, by abstracting most of the biological details we are able to focus on performance, simplicity, modularity and expandability. This tradeoff is valuable from a design perspective, where it is more efficient to first perform a coarse analysis, unlimited and unbiased by biological technicalities, and then focus on fine-tuning a targeted, pre-processed set of circuits based on a specific use case. The framework operates within what is biologically viable, but the more complex biological challenges that arise when implementing a circuit, such as transcription rates or residual protein concentration, are meant to be addressed at later stages and solved through other existing solutions or tools. This approach allows for a general-purpose framework.

Modularity

Every biological system is different and should not be represented by a static model or relation. Conceptualizations and models such as truth tables work with the coupling of multiple modules but focus on small scale inputs and outputs rather than an overall behaviour and functionality. When we build a device we focus on processes rather than specific responses, which is why we propose an execution through atomic functions that can be stacked to achieve more complex behavior while maintaining simplicity and modularity. The abstract representation we propose uses a modular approach in terms of composition, allowing the user to adjust, append or remove concepts according to the application. Everything is built upon a primary building block to enable modularity and scalability.

The framework is divided into three stages to address all relevant aspects. First, the framework expands on how a circuit, cell or gene regulatory network can be a sequential system in the first phase: {biological state machines}, which dives into sequential logic applied in synthetic biology and its characteristics. Then, it defines the concepts involved in the rest of the framework in the {description phase}: an approach to conceptualize biological elements and their interactions. Lastly, the framework describes how to assess and develop circuits as state machines; the {execution stage} consists of the coupling of the first two, to fully describe and simulate, from a sequential logic standpoint, biological processes and regulatory networks from synthetic circuits.

When?

In order to conceptualize a circuit as a state machine, the behavior must be dependent on the historical and present inputs.

How?

The system’s input is the composition of the circuit's environment. Meanwhile, the output is an expression profile related to the state.

A state is defined by two elements: the circuit architecture, i.e. the arrangement of genetic parts or base pairs, and the gene expression profile. These elements were inspired by previous literature (Fritz et al., 2007; Oishi & Klavins, 2014; Roquet et al., 2016), as we considered multiple perspectives both necessary and advantageous. Therefore, a change in state would be caused by a change in the DNA sequence, by a change in a gene's expression, or by a change in both. The first element, the circuit's architecture, corresponds to the order, orientation and sequence of parts; thus, DNA modification caused by insertion, deletion, substitution, inversion or mutation would cause a change in state. For this, enzymes that modify the DNA strand are essential. This is a highly evolutionarily stable way to store the state, as it is encoded in the DNA and does not require constant gene expression (with the corresponding metabolic burden) (Roquet et al., 2016). On the other hand, the expression profile refers to the particular genes that are transcribed and their expression levels, which can be reflected in the environment. By extension, it considers gene regulatory networks and epigenetic mechanisms that affect the functionality of the circuit. Although a state defined by this aspect is not as stable, the aim is to treat relevant alterations as state changes, since they will affect the response to future inputs. If a change in expression causes a change in architecture, or conversely a change in architecture causes a change in expression, both could be grouped into the same state change.
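One possible way to represent such a two-element state is sketched below. The class `State`, the helpers `invert` and `is_state_change`, and the part labels (a name followed by `+` or `-` for orientation) are assumptions made for this illustration and are not the data model of our software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    architecture: tuple    # ordered parts with orientation, e.g. "gene:-"
    expression: frozenset  # names of the genes currently expressed


def invert(architecture: tuple, start: int, end: int) -> tuple:
    """Invert the parts in [start, end): reverse the order, flip the orientation."""
    flip = {"+": "-", "-": "+"}
    segment = architecture[start:end]
    inverted = tuple(part[:-1] + flip[part[-1]] for part in reversed(segment))
    return architecture[:start] + inverted + architecture[end:]


def is_state_change(before: State, after: State) -> bool:
    """A state changes if the architecture, the expression profile, or both differ."""
    return (before.architecture != after.architecture
            or before.expression != after.expression)


# Hypothetical parts: a promoter, two recombination sites and a reversed gene.
initial = State(("promoter:+", "siteA:+", "gene:-", "siteB:+"), frozenset())
flipped = State(invert(initial.architecture, 2, 3), frozenset({"gene"}))
print(is_state_change(initial, flipped))
```

Keeping both elements immutable makes states directly comparable and usable as dictionary keys, which is convenient when enumerating the states a circuit can reach.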

To exemplify, consider the following system: a circuit with two different inducible promoters that achieves five states depending on the order of inputs. State A is the initial state, without any induction.

If inducer 1 is the first input, promoter 1 is activated and rfp and recombinase 1 (rec 1) are expressed. Rec 1 inverts the sequence between the sites, which changes the circuit architecture and generates state B.

If inducer 1 is added again, rfp and rec 1 are expressed again, but there is no relevant change and therefore no change of state. If inducer 2 follows instead, rec 2 is expressed and the sequence between the sites is excised, which generates state C.

In state C, if inducer 1 is added next, gfp is expressed; if inducer 2 is added instead, rec 2 is expressed but has no effect.

If, instead of this order of inputs, the system in state A starts with inducer 2, rec 2 is expressed and excises the sequence between the sites, generating state D.

In state D, if inducer 2 is added, rec 2 is expressed but nothing happens. If inducer 1 is added next instead, yfp and the repressor are expressed. The repressor binds to the operator site, which modifies the expression profile and generates state E.

In state E, with time as an input, the repressor degrades, which frees the operator site and returns the system to state D.
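The example above can be written down as a transition table. The sketch below is one hedged way of doing so: the identifiers `TRANSITIONS` and `run` and the input labels are hypothetical. Input combinations that the example does not describe (for instance, an inducer arriving in state E) are assumed here to leave the state unchanged with no output.

```python
# (current state, input) -> (next state, genes expressed in response)
TRANSITIONS = {
    ("A", "inducer1"): ("B", frozenset({"rfp", "rec1"})),
    ("A", "inducer2"): ("D", frozenset({"rec2"})),
    ("B", "inducer1"): ("B", frozenset({"rfp", "rec1"})),
    ("B", "inducer2"): ("C", frozenset({"rec2"})),
    ("C", "inducer1"): ("C", frozenset({"gfp"})),
    ("C", "inducer2"): ("C", frozenset({"rec2"})),
    ("D", "inducer1"): ("E", frozenset({"yfp", "repressor"})),
    ("D", "inducer2"): ("D", frozenset({"rec2"})),
    ("E", "time"): ("D", frozenset()),
}


def run(inputs, state="A"):
    """Apply the inputs in order and return the state history and the outputs."""
    history = [state]
    outputs = []
    for signal in inputs:
        # Assumption: undescribed combinations keep the state and express nothing.
        state, expressed = TRANSITIONS.get((state, signal), (state, frozenset()))
        history.append(state)
        outputs.append(expressed)
    return history, outputs


history, outputs = run(["inducer1", "inducer2", "inducer1"])
print(history, sorted(outputs[-1]))
```

Running the same two inducers in the opposite order, `run(["inducer2", "inducer1"])`, ends in state E instead of state C, which is the order dependence the example is meant to show.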

Characteristics

The application of sequential logic to biological systems has certain characteristics that help to better understand its applicability.

01

Traceability

Sequential logic and state machines generate highly traceable systems, since present actions respond to the previous history. In addition, a state machine saves the complete behavior of the circuit, including the progression of states, inputs and outputs, instead of only a DNA sequence. Furthermore, depending on the structure the state machine follows, the history of states can be known from a single state.

02

Memory-based

Since the current state of the system affects the output and future states, such states can be used for saving information even when the input is gone, rather than losing the response to the input. Therefore, such systems have a memory and can save data.

03

Time-ordered

Since a sequential system responds differently to the same input depending on the current state, the specific point at which the input is present is decisive (Roquet et al., 2016). Due to this ability to modulate the response according to a time-ordered induction, state machines offer the capacity to carry out complex functions (Zúñiga et al., 2020) and to sequentially regulate targets which, if using combinatorial logic, would be co-activated (Letsou & Cai, 2016). This enables applications such as counting the number of stimuli (as in {our counter}), or modifying the behavior depending on the order of inputs (Madec, Rosati, & Lallement, 2021). The latter is why cells accomplish a vast number of events and processes with limited genes (Letsou & Cai, 2016). State changes make it possible to consider how a circuit, and especially its environment, changes and evolves over time.

04

Alternative scenarios

Inherently, sequential logic is well suited to the consideration of alternative routes, as they can be naturally integrated as multiple states that can potentially follow the previous one. This is beneficial when dealing with multiple alternative scenarios, exploring unknown routes, and considering error routes and potential mutations. It allows the proposal of control plans and a better, more thorough exploration, especially when the inputs or their order are not completely straightforward, as in biology. This can be coupled to probabilities for each state change, as in a Markov model. For example, a state machine could be built for a circuit in state 1 that, after an input, moves into the desired state 2 but has a smaller probability of changing into state 3, the alternative scenario. Overall, this characteristic suits biology due to the stochastic element in biochemical reactions (Stelling, Szallasi, & Periwal, 2019).
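The three-state example above can be sketched as a small Markov chain. The probabilities 0.9 and 0.1, the table `P` and the helper `propagate` are hypothetical values and names chosen only to illustrate the idea; states 2 and 3 are assumed to be final.

```python
# Hypothetical probabilities of each state change when one input is applied.
P = {
    "state1": {"state2": 0.9, "state3": 0.1},  # desired route and alternative
    "state2": {"state2": 1.0},                 # assumed final
    "state3": {"state3": 1.0},                 # assumed final
}


def propagate(distribution, transitions):
    """Advance a probability distribution over states by one input."""
    following = {}
    for state, p_state in distribution.items():
        for target, p_move in transitions[state].items():
            following[target] = following.get(target, 0.0) + p_state * p_move
    return following


after_one_input = propagate({"state1": 1.0}, P)
print(after_one_input)
```

Applying `propagate` repeatedly gives the expected fraction of a population in each state after several inputs, which is one way the error routes could be weighed against the desired one.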

05

Multipurpose system

The use of logic in biology enables complex response patterns and overall processes. Since a single input can lead to a different output depending on the state, state machines facilitate multipurpose systems, also tightly related to alternative scenarios and a timed order response. With sequentiality, from a single circuit or state machine, a great number of responses and fates can be derived, as it naturally occurs in living cells.

06

Easier troubleshooting

A state machine facilitates an emphasis on the states' progression rather than on a single final output, contrary to a combinatorial system composed of a truth table and logic gates whose structure is highly focused on the output as a 0/1 (on/off) response. Therefore, troubleshooting becomes more straightforward when using sequential logic, because a failure can be detected without the circuit having to reach its final state. In addition to this, by recording the circuit's history, the specific input that went wrong can be easily distinguished from the rest. This is possible given the system's traceability (due to the relation between the state and the output) together with the capacity to measure the current state, especially when the state is saved in the DNA sequence and can be known by sequencing it. In this case, the last state before the system failed can be known by looking at the sequence, instead of only having a failed/succeeded outcome without a way to know when and where the problem occurred.

07

Concept extrapolation

Sequential logic can be explained as decision making based on previous situations. This is a concept analogous to human behavior: making decisions based on past experiences and current situations and opportunities. Furthermore, as a logic circuit, it is based on electronics. Therefore, the concept is easily comprehensible across, even beyond, biological sciences.

08

Biologically suitable

Sequential logic is an approach distinctly similar to nature, as observed in highly interconnected gene regulatory networks, where regulators affect multiple targets and the targets are affected by multiple regulators (Letsou & Cai, 2016). These systems are not limited by a specific input-output response nor by the number of inducers, but rather respond to the temporally ordered presence of inducers (Letsou & Cai, 2016; Madec, Rosati, & Lallement, 2021; Roquet et al., 2016). By following sequential logic, the design of circuits emulates and expands on the advantages nature offers. This allows encompassing more flexible systems and proposing highly scalable designs (Roquet et al., 2016).
Furthermore, sequentiality reduces the necessity of orthogonal inducers, a relevant limitation in synthetic biology, as an induction directly linked to a point in time (or state) practically eliminates the possibility of crosstalk. Therefore, the number of inducers can be minimized if a system does not demand a different one for each desired output. This facilitates the design of circuits by excluding the biological limitation of quantity of highly specific inducers.
Also related to the inducers is the metabolic burden, which can be reduced by storing information in the DNA sequence rather than requiring a constant gene expression or complex inducer combinations.
Another limitation that is minimized is the size of the circuit, since the number of base pairs can be reduced by modifying the DNA sequence rather than requiring multiple linearly positioned logic gates. This is especially important in synthetic circuits that must conform to a specific number of base pairs.
Lastly, saving the state in the sequence itself leads to an increase in evolutionary stability (Zúñiga et al., 2020). As we mentioned previously, mutations and stochasticity are core elements of biology. By increasing the stability and offering an approach to consider alternative scenarios, state machines are perfectly designed to describe biological genetic systems.

09

Guided design

Sequentially dependent circuits applied in experimental characterization offer an efficient structure that facilitates a guided design of better and more efficient methodologies. By having a comprehensible map of the complete circuit behavior (parts’ interaction, inputs, outputs, etc.) and sequentially activating target genes or products, it is possible to reduce the characterization time, resources and metabolic burden of the constructs, increase the scalability, and propose smarter architectures.

Structural

Biological elements can have multiple representations. For example, SBOL is a standard for the representation of biological designs, specifically for the electronic exchange of information on their structural and functional aspects (Synthetic biology open language). Electrical engineering iconography is used for biological systems to illustrate logic gates as a function of inputs (Densmore & Anderson, 2009), and even further, strings of the letters ATCG are used to represent DNA sequences, common in DNA manipulation and design tools. Each representation best suits the purpose of its application, whether that is design, mathematical modeling, detailed description, synthesis, bioinformatics analysis or databases. For our framework, the structural segment encompasses the basic representations, as conceptual units, of biological elements that best suit our goal and seek to be abstract and modular to be used for further applications.

Parts

A part is the framework's abstraction of a coding or non-coding DNA sequence with a function. Each part is a conceptual unit with a biological role, e.g. coding, recognizing, expressing, promoting or repressing transcription. This abstraction is based on the function rather than the sequence itself; thus, the biological role is the essential element of a part, regardless of the molecular structure it is directly related to: DNA, RNA or protein. Besides the biological role, parts can represent more specific information related to them, such as the corresponding DNA sequence. Parts have properties, interact with other parts or elements, and execute functions; they are the building blocks for genetic circuits.

Circuit

The circuit abstraction is what provides cohesion to a set of parts. Circuits are not static, they can modify themselves or be affected by other elements. In its simplest version, a circuit is just an organized set of parts. The organization method defines the circuit architecture and provides functionality. This organization can be established by any structure; whether it is a drawing or a fully-fledged software data structure.

The circuit structure to be used depends on the desired functionality, but at the same time conditions the behavior and relationships between the parts. The most standard structure is a simple one dimensional array and is what probably works best for most applications, since it is the closest to the biological organization of DNA. In this structure, circuits would be limited to adjacent parts, while other non-adjacent parts, circuits or genetic material somehow functionally related would be addressed as part of the environment. This helps set a clear limit to what a single circuit encompasses, since functional interactions with other genetic material can be infinite, and it would separate the circuit (synthetic DNA) from the organism's endogenous genetic material.

If a simple one dimensional structure doesn’t fit the functionality, a circuit can be defined using other structures, such as circular lists, hierarchical tree structures or graphs. The only requirement is that the circuit must be an organized container; the structure can vary.
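A minimal sketch of the part and circuit abstractions, assuming the simple one dimensional array described above. The class names, field names and example parts (P1, geneA, T1) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Part:
    """Conceptual unit defined by its biological role (hypothetical fields)."""
    name: str
    role: str             # e.g. "promoter", "coding", "terminator"
    forward: bool = True  # orientation relative to the circuit
    sequence: str = ""    # optional, more specific information


@dataclass
class Circuit:
    """An organized container of parts; here a one dimensional list."""
    parts: list = field(default_factory=list)

    def downstream_of(self, index):
        """Parts located after the given position."""
        return self.parts[index + 1:]

    def upstream_of(self, index):
        """Parts located before the given position."""
        return self.parts[:index]


circuit = Circuit([
    Part("P1", "promoter"),
    Part("geneA", "coding"),
    Part("T1", "terminator"),
])
downstream_names = [part.name for part in circuit.downstream_of(0)]
```

Swapping the list for a circular list, tree or graph would only change how the container is traversed; the parts themselves stay the same.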

Environment

A circuit can interact with many different elements apart from itself; such elements can modify the circuit structurally, behaviorally or both. The environment describes both the circuit and the simultaneous conditions that interact with it or could affect it but are not part of the circuit itself. The environment can be subdivided into circuits, internal elements (plasmid, genome, intracellular proteins) and external elements (media, environmental conditions).

Since this framework is intended to be applied in a wide range of areas, including some whose specific requirements we have not yet considered, it is especially important for the environment to be a broad concept that can modularly accommodate factors we have not yet anticipated.

Behavioral

Since circuit state changes are allowed by the function of the circuit's parts and the environment, it is required to conceptualize how a part, multiple parts and the complete circuit act and interact with each other. The multiple ways in which parts relate with each other and the environment enable them to act, carry out their biological role, expand it, and inhibit other parts. These relationships can achieve high levels of complexity, especially since circuits can modify themselves and the environment, and thus change states, requiring their constant validation and integrating dependent networks of interactions. The behavioral stage is still part of the description phase; thus, these are only the definitions of the actions. The process of carrying them out relates to the second phase, the execution.

Interactions

Interactions describe how multiple parts and even other elements of the environment influence each other, affecting their transcription, translation or activity. These interactions are not mutually exclusive and may even overlap. Some of these relationships are:

Alignment

Parts can have a directionality relative to each other in the circuit.
E.g. Two recombinase recognition sites facing each other, facing away, or facing the same direction determine the action carried out; a unidirectional terminator's direction relative to a coding part.

Position

Parts have a location in the circuit relative to each other.
E.g. a promoter upstream or downstream of a coding part.

Co-existence

Parts can potentially interact with each other when coexisting in the same environment.
E.g. A recombinase and an RDF (recombinase directionality factor) can potentially interact to perform the inversion.

Co-dependence

When two or more elements require each other to function, there is a codependency between them, which requires the parts not only to be present but also to be functional.
E.g. An sgRNA requires a Cas protein for it to act; a recombinase requires at least two recognition sites.

Recognition

This relationship describes any kind of recognition between two or more elements, such as DNA-protein, protein-protein, DNA-RNA, RNA-protein.
E.g. An inhibitor and an operator site; an RNA polymerase and the promoter; a protein and an associable subunit.

Inhibition

Takes place when elements inhibit one another's transcription, action or overall presence.
E.g. A terminator directly upstream of a coding sequence prevents its transcription.

Functions

A function is an action carried out by a part or group of parts determined by a set of interactions. Some functions are: inversion, excision, editing, insertion, repression, expression.
For example, the function of a recombinase is determined by an interaction of alignment with a promoter so that the coding sequence can be expressed, an interaction of codependency with the recognition sites, and an interaction of recognition of the sites.
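The way interactions determine a function can be sketched for the recombinase example. This assumes a common simplification in which two sites in opposite orientation lead to inversion and two sites in the same orientation lead to excision; the function name and the '+'/'-' encoding are hypothetical.

```python
def recombinase_function(site_orientations, recombinase_expressed):
    """Return the function enabled by a pair of recognition sites, or None.

    site_orientations: list of '+' or '-' strings, one per recognition site.
    Simplifying assumption: opposite orientations give an inversion,
    equal orientations give an excision.
    """
    # Co-dependence: the recombinase must be expressed and needs two sites.
    if not recombinase_expressed or len(site_orientations) < 2:
        return None
    # Alignment: the relative direction of the sites selects the function.
    first, second = site_orientations[:2]
    return "excision" if first == second else "inversion"


facing = recombinase_function(["+", "-"], recombinase_expressed=True)
same_direction = recombinase_function(["+", "+"], recombinase_expressed=True)
single_site = recombinase_function(["+"], recombinase_expressed=True)
```

Sites facing each other and sites facing away are both treated as opposite orientations in this sketch; a fuller model would distinguish them where the biology requires it.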

Application

The application refers to how the actions, based on the structure and relationships, are carried out. This stage focuses on the execution of the parts’ functions from a previously defined circuit and its environment, as a processing workflow. Therefore, the input necessary to develop this workflow is a defined environment (which includes a circuit), and the final output is one or more modified circuits in their environments. Multiple outputs might be necessary when considering alternative scenarios for the execution of the parts’ functions. One cycle of this workflow could encompass one or multiple circuit modifications, simultaneous or not, depending on the desired output.

Processing workflow:

  1. Define the potential interactions of all elements.
  2. Review the environment (excluding the circuit) and define the elements that interact with the circuit to consider them in this cycle of the workflow; the rest can be stored for a following cycle.
  3. Traverse the circuit to find the active promoters, terminators, operators and other regulatory parts (i.e. those that can carry out their function of promoting or repressing transcription), based on their interactions with the environment or because they are constitutive.
  4. Traverse the circuit starting from the promoter(s) and following the circuit’s structure (biologically: downstream) to review the coding parts that can be transcribed based on the alignment and position interactions with the parts determined in step 3.
  5. Validate the interactions of the elements (parts and other elements of the environment) to ensure the requirements for the function’s execution are met.
  6. If necessary, define a priority to perform the functions or else a simultaneity strategy.
  7. Execute the elements’ functions following the prioritization or considering as many scenarios as desired to generate one or multiple parallel new states of the circuit, depending on whether alternative scenarios are considered or not.
  8. Review changes in the environment, e.g. protein degradation, and extract one or multiple outputs for each modification.

Integration

The process described above sets the foundation for operating over a circuit; this process can be the base of larger, more complex applications by incorporating it into a larger workflow. This is what we call integration and it is an essential part of studying sequential circuits. How a specific circuit cycle simulation is integrated is largely dependent on the application, and integration can be used to incorporate new elements inline. By modifying a circuit through the processing workflow and then taking the output (the modified circuit and environment) as the new input, we are able to study chains of events and their effects. An alternative to the base cyclic approach would be to connect a circuit generator and a post-execution circuit evaluator; this way, smarter selection approaches such as genetic algorithms and machine learning applications can be integrated into the process, or the compatibility with a desired organism can be assessed by, for example, analyzing the endogenous machinery and potential crosstalk.

As mentioned before, the conceptual framework is a set of ideas to guide the computational or manual design, construction, simulation and analysis of synthetic gene circuits as state machines. We decided to apply our framework as a base for the design of software tools, where such tools would aid the exploration of alternative circuit architectures while offering flexibility in the implementation and optimization. This means that we implemented the concepts described and their coupling with sequential logic (especially the {processing workflow}) as computational concepts and a workflow to be used as a guide for software.

A software framework provides an abstraction for generic software functionalities upon which other software applications can be built. It makes implementation details transparent to other developers, which gets rid of most tedious and repetitive work, while providing a sturdy base for development and allowing most of the development effort to be tailored towards functional and valuable software. Thus, this software framework can be easily used as a starting point for implementing software tools.

The objective of this software framework is to set a structural guide for facilitating the design of software to be used for the generation, assessment and manipulation of synthetic gene circuits as state machines.

The workflow presented begins from the premise that the desired parts and their properties are defined. The software framework is divided into four stages: an initialization stage, two core processing stages and a post-processing stage.

The initialization stage, which precedes the processing, is a circuit generator, where the environment with the circuit is defined. The circuit can be a preexisting one, built upon another one, or designed from scratch using a set of parts. The construction of the circuit can consider different levels of randomness and/or certain constraints, such as base pair length, number of parts, obligatory parts, number of promoters, etc. This stage can also tap into machine learning for a guided design of the circuits by, for example, clustering the generated outputs and using that data to generate artificial new circuits with similar characteristics. The chosen parameters for the circuit generation can direct the process to have a starting point more suitable for the specific objective.

Input: defined parts with their corresponding properties.

Output: a data structure that defines the circuit through its parts and order.
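A minimal sketch of such a circuit generator, assuming a purely random choice of parts constrained by the number of parts and a list of obligatory parts. The function name, its parameters and the part names are hypothetical; a real generator could add constraints such as base pair length.

```python
import random


def generate_circuit(parts, length, obligatory=(), seed=None):
    """Return a random ordering of `length` parts that includes the obligatory ones.

    parts: names of the available parts (illustrative strings here).
    seed: optional value to make the random choice reproducible.
    """
    if len(obligatory) > length:
        raise ValueError("more obligatory parts than positions in the circuit")
    rng = random.Random(seed)
    free_positions = length - len(obligatory)
    chosen = list(obligatory) + [rng.choice(parts) for _ in range(free_positions)]
    rng.shuffle(chosen)  # the order of the parts defines the circuit
    return chosen


available = ["promoter", "coding", "terminator", "site"]
circuit = generate_circuit(available, length=5, obligatory=["promoter"], seed=1)
```

The returned list is the simplest form of the output described above: a data structure that defines the circuit through its parts and their order.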




From a defined circuit, a cycle of two core stages is executed. This cycle processes the change from one state to another, i.e. the modification of the circuit architecture. A single cycle, or state change, consists of evaluation and action.

The evaluator traverses the circuit following its organization -given by the circuit's data structure- to define the parts that are functional, i.e. expressed, by considering relevant dependencies, such as repressors, inducers, promoters, terminators. The functionality relates to the interactions involving coding parts and other parts that promote or inhibit transcription, such as promoters, terminators, operators, recognition sites.

Input: a data structure that defines the circuit through its parts and order.

Output: a list of parts that get expressed in this state, together with the related inputs and interactions that enable the expression of each part: which promoter and/or which inducer.
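The evaluator can be sketched as a single traversal of a one dimensional circuit. This is a simplified illustration that assumes all parts face the same direction, that a terminator always stops transcription, and that a promoter is either constitutive or switched on by one inducer; the dictionary keys and part names are hypothetical.

```python
def evaluate(circuit, inducers=frozenset()):
    """Return (expressed part, driving promoter) pairs for the current state.

    circuit: ordered list of dicts with 'name', 'role' and, for promoters,
    an optional 'inducer' (absent or None means constitutive).
    inducers: names of the inducers present in the environment.
    """
    expressed = []
    active_promoter = None
    for part in circuit:
        role = part["role"]
        if role == "promoter":
            inducer = part.get("inducer")
            if inducer is None or inducer in inducers:
                active_promoter = part["name"]
        elif role == "terminator":
            active_promoter = None  # transcription stops here
        elif role == "coding" and active_promoter is not None:
            expressed.append((part["name"], active_promoter))
    return expressed


circuit = [
    {"name": "pConst", "role": "promoter"},
    {"name": "int1", "role": "coding"},
    {"name": "T1", "role": "terminator"},
    {"name": "pInd", "role": "promoter", "inducer": "arabinose"},
    {"name": "reporter", "role": "coding"},
]
without_inducer = evaluate(circuit)
with_inducer = evaluate(circuit, {"arabinose"})
```

Each pair records which promoter enabled the expression, which keeps the evaluator output traceable for the following stage.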

In the actuator phase, the functions of the previously tagged functional parts are executed. This process must follow the same organization and can consider simultaneous functions, alternate scenarios, or follow a single route. After the function is executed, a different circuit state is obtained. Therefore, the state change is processed by carrying out the functionality.

Input: a data structure that defines the circuit through its parts and order, and a list of parts that get expressed in this state.

Output: a new state or states as data structures that define each circuit.
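A minimal sketch of the actuator for one function, inversion. It assumes that each expressed recombinase recognizes one pair of sites, that the segment between the sites is reversed and strand-flipped, and that the sites themselves are left unchanged (a simplification). The names invert, actuate and targets are hypothetical.

```python
def invert(circuit, start, end):
    """Return a new circuit with positions start..end reversed and strand-flipped."""
    segment = [(name, -strand) for name, strand in reversed(circuit[start:end + 1])]
    return circuit[:start] + segment + circuit[end + 1:]


def actuate(circuit, expressed, targets):
    """Apply the inversion of every expressed recombinase and return the new state.

    circuit: ordered list of (name, strand) tuples, strand being 1 or -1.
    expressed: names of the parts tagged as expressed by the evaluator.
    targets: hypothetical map from recombinase name to the site it recognizes.
    """
    for recombinase in expressed:
        site = targets.get(recombinase)
        positions = [i for i, (name, _) in enumerate(circuit) if name == site]
        if len(positions) == 2:  # a recombinase needs two recognition sites
            circuit = invert(circuit, positions[0] + 1, positions[1] - 1)
    return circuit


state = [("site1", 1), ("promoter", 1), ("gene", 1), ("site1", -1)]
new_state = actuate(state, ["int1"], {"int1": "site1"})
```

Because the functions return new lists instead of modifying the input, the previous state stays available, which makes it easy to keep a history or branch into alternative scenarios.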




At this point, the output circuit or circuits can be cycled back into the evaluator to redefine the functional parts. Thus, the evaluator/actuator cycle is repeated according to the number of desired states/state changes, or until there are no more possible modifications.
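The repetition of cycles can be sketched as a small driver loop that stops either after a maximum number of cycles or when a cycle no longer changes the circuit. The driver name run_cycles and the toy evaluator and actuator below are hypothetical stand-ins, used only to make the example runnable.

```python
def run_cycles(circuit, evaluate, actuate, max_cycles=10):
    """Repeat evaluator/actuator cycles and return the history of states."""
    history = [circuit]
    for _ in range(max_cycles):
        expressed = evaluate(history[-1])
        new_state = actuate(history[-1], expressed)
        if new_state == history[-1]:  # no more possible modifications
            break
        history.append(new_state)
    return history


# Toy stand-ins: a state is a tuple of flags and one flag flips per cycle.
def toy_evaluate(state):
    """Return the index of the first flag that has not flipped yet."""
    return [i for i, flipped in enumerate(state) if not flipped][:1]


def toy_actuate(state, expressed):
    """Flip the flags whose indices were selected by the evaluator."""
    return tuple(flag or (i in expressed) for i, flag in enumerate(state))


history = run_cycles((False, False, False), toy_evaluate, toy_actuate)
```

The returned history holds every state the circuit went through, which is the information the analysis stage would later store and query.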

After the desired set of cycles is run, or throughout it, the analysis stage is performed. The analysis is based on a database fed with the designed and modelled circuits. Every time an evaluator/actuator cycle is performed, the generated state or states are stored here, thus saving the history of the circuit’s behavior. Through the database, the circuits can be analyzed and filtered using simple queries based on the circuit’s characteristics. The filters allow the development and exploration of circuits to be guided.
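One possible shape for this analysis database is sketched here with Python's built-in sqlite3 module and an in-memory database. The table layout, column names and helper functions are assumptions made for illustration, not the schema of an existing tool.

```python
import sqlite3


def open_store():
    """Create an in-memory database with one table of circuit states."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE states (circuit_id TEXT, cycle INTEGER, parts TEXT, n_parts INTEGER)"
    )
    return db


def save_state(db, circuit_id, cycle, parts):
    """Store the state generated by one evaluator/actuator cycle."""
    db.execute(
        "INSERT INTO states VALUES (?, ?, ?, ?)",
        (circuit_id, cycle, ",".join(parts), len(parts)),
    )


def states_with_at_most(db, max_parts):
    """Filter stored states by a simple characteristic: the number of parts."""
    rows = db.execute(
        "SELECT circuit_id, cycle, parts FROM states "
        "WHERE n_parts <= ? ORDER BY circuit_id, cycle",
        (max_parts,),
    )
    return rows.fetchall()


db = open_store()
save_state(db, "c1", 0, ["site", "promoter", "gene", "site"])
save_state(db, "c1", 1, ["site"])
short_states = states_with_at_most(db, 2)
```

Storing the cycle number with each state keeps the circuit's history ordered, so a query can recover the last state reached before a failure.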

Description

Structural

A circuit, seen as an organized set of parts, is implemented as a data structure that provides the specific organization.

Behavioral

The interactions and functions of parts and other biological elements are applied as functions that each part can call on the evaluator and actuator, or as information. Alignment interactions define the functional parts in the evaluator. Other interactions are considered for the actuator to generate the changes the circuit will go through, whether as requirements or as the function itself.

Execution: application/integration

The software framework is the direct application of the execution phase. The evaluator-actuator stages execute the parts' functions as in the atomic workflow, enabling the modelling of the circuits as state machines. On the other hand, as the software framework's objective is the design of circuits and the exploration of the design space, the evaluator-actuator circuit modelling requires the integration of other stages that complement it. This was implemented as the first and fourth stages, generator and analysis, which together enable the building, saving and analysis of the circuits.

Roquet, N., Soleimany, A. P., Ferris, A. C., Aaronson, S., & Lu, T. K. (2016). Synthetic recombinase-based state machines in living cells. Science (American Association for the Advancement of Science), 353(6297), aad8559. doi:10.1126/science.aad8559

Oishi, K., & Klavins, E. (2014). Framework for engineering finite state machines in gene regulatory networks. ACS Synthetic Biology, 3(9), 652-665. doi:10.1021/sb4001799

Madec, M., Rosati, E., & Lallement, C. (2021). Feasibility and reliability of sequential logic with gene regulatory networks. PloS One, 16(3), e0249234. doi:10.1371/journal.pone.0249234

Magdevska, L., Pušnik, Ž, Mraz, M., Zimic, N., & Moškon, M. (2017). Computational design of synchronous sequential structures in biological systems. Journal of Computational Science, 18, 24-31. doi:10.1016/j.jocs.2016.11.010

Letsou, W., & Cai, L. (2016). Noncommutative biology: Sequential regulation of complex networks. PLoS Computational Biology, 12(8), e1005089. doi:10.1371/journal.pcbi.1005089

Fritz, G., Buchler, N., Hwa, T., & Gerland, U. (2007). Designing sequential transcription logic: A simple genetic circuit for conditional memory. Systems and Synthetic Biology, 1(2), 89-98. doi:10.1007/s11693-007-9006-8

Zúñiga, A., Guiziou, S., Mayonove, P., Meriem, Z. B., Camacho, M., Moreau, V., . . . Bonnet, J. (2020). Rational programming of history-dependent logic in cellular populations. Nature Communications, 11(1), 4758. doi:10.1038/s41467-020-18455-z

Stelling, J., Szallasi, Z., & Periwal, V. (2019). System modeling in cell biology: From concepts to nuts and bolts. The MIT Press. Retrieved from https://www.vlebooks.com/vleweb/product/openreader?id=none&isbn=9780262257060

Densmore, D., & Anderson, J. C. (2009, May). Combinational logic design in synthetic biology. Conference paper, 301-304. doi:10.1109/ISCAS.2009.5117745. Retrieved from https://ieeexplore.ieee.org/document/5117745

Synthetic biology open language. Retrieved from https://sbolstandard.org/