Team:Kyoto/Model

Model

Introduction

To produce biomolecules efficiently, we designed a stable production platform based on an asymmetric plasmid partitioning (APP) system. In APP, expression of a DNA-binding protein prevents plasmids from being partitioned equally at cell division, so a dividing cell yields one daughter cell that retains the plasmid and one that does not. In this way, both "stem" cells and "differentiated" cells can be generated within a single bacterial culture, as shown in the figure below. With APP, two different genes can be expressed at different times, and this time difference can be used to optimize biomolecule production. We define "BLOOM" as a biosynthetic circuit that uses the APP system to express two different genes at different times.

Fig.1

Initially, we had no information about the time difference generated by our system. Our dry team built a model of BLOOM to answer the following questions.

Q0. How can we model our system?

Q1. How can we customize the time difference of expression?

Q2. How can the system be optimized for production?

Q3. How can we further develop this system?

As a result, we built a model of APP and confirmed that BLOOM can generate the desired time difference. We also concluded that the best way to customize the time difference is to control the degradation rates of the repressors. This outcome shaped the experimental design of our wet team: they decided to run experiments on degradation rates using the ssRA tag. In addition, we simulated how efficiently our device performs in a given production system and confirmed that BLOOM is widely applicable.

First, we addressed the question "How can we model our system?".

Building a model of asymmetric plasmid partitioning

First, to understand our system, we built a model of APP, because no mathematical model of APP had been available since the system was first reported in 2019 [1]. In APP, a self-aggregating protein binds to a particular DNA sequence on a target plasmid to form plasmid-protein complexes, which are segregated asymmetrically when E. coli divides. In this section, we model the aggregation of plasmid-protein complexes and cell division at a given time.

Fig.2

Aggregation of plasmid

We first modeled the aggregation of plasmid-protein complexes.
Let P be the target plasmid. We define P0 as one copy of P that is not bound to the aggregating protein, and P1 as one copy of P that is bound to the aggregating protein. Following the same rule, we define P2 as a complex in which two copies of P are held together by the aggregating protein. P3, P4, P5, ... are defined in the same way. With these variables, the state of an E. coli cell can be represented by a vector "P".

Fig.3

By using these variables, the probability density of binding Pi and Pj is formulated as below.

Fig.4

In addition, we assumed that the probability density of binding P0 and aggregating protein ParB is formulated as below.

Fig.5

When we formulated the probability density of binding between Pi and Pj, we made the following assumptions.

Aggregation of plasmids is regarded as binding between the aggregating proteins that are bound to the plasmids.
Steric hindrance by plasmids is negligible, and the surface of an aggregating protein cannot be completely covered by plasmids. Therefore, the probability of binding between Pi and Pj depends only on the probability of collision between Pi and Pj.

Fig.6

Cell division

Next, we built a model of cell division at a given time. We made the assumptions below for this model.

At cell division, each aggregate of P in E. coli is passed to either of the two daughter cells with a probability of 50%.
If an E. coli cell still has P immediately after cell division, P is replenished from its replication origin until the copy number is the same as before cell division.
Only if an E. coli cell loses P completely at cell division is P not replenished, because no P carrying a replication origin remains in the cell.

Fig.7

In each simulation run, we randomly selected one of the two daughter cells after every cell division and tracked that cell.
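The division rules above can be sketched in a few lines of Python. This is a minimal illustration, not the team's actual code: the list layout `counts[i]` (number of P_i species, with index 0 for free plasmid), the function names, and the choice to add replenished copies as free P0 are all assumptions made for the example.

```python
import random

def plasmid_total(counts):
    """Total plasmid copies: P_0 and P_1 carry one copy each, P_i (i >= 2) carries i."""
    return sum(n * max(size, 1) for size, n in enumerate(counts))

def divide(counts, copy_number, rng):
    """Return the state of the tracked daughter cell after one division.

    counts[i] is the number of P_i species in the mother cell.
    Each species goes to the tracked daughter with probability 0.5.
    """
    daughter = [0] * len(counts)
    for size, n in enumerate(counts):
        for _ in range(n):
            if rng.random() < 0.5:
                daughter[size] += 1
    total = plasmid_total(daughter)
    if total > 0:
        # Assumption: replenished copies appear as free plasmid (P_0).
        daughter[0] += max(0, copy_number - total)
    return daughter

rng = random.Random(0)
mother = [20] + [0] * 20          # 20 free copies, no aggregates
print(plasmid_total(divide(mother, 20, rng)))
```

A cell that inherits at least one copy returns to the full copy number, while a cell that inherits none stays empty in every later division.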

Fig.8

The parameters used so far are shown in the following table.

Parameter	Values	Units	Description	Ref
kagg	3.00E-31	L/min	Reaction rate constant of aggregation	Estimate

Supplementary 1 (about the Gillespie algorithm)
With these components, we completed a model of APP.
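As a rough illustration of how a Gillespie-type simulation can drive the aggregation P_i + P_j -> P_(i+j), the sketch below draws one event at a time. The propensity used here (a constant `rate_per_pair` multiplied by the number of distinct pairs) is an assumption chosen for simplicity; the expressions actually used in the model are those in Fig.4 and Fig.5, and `rate_per_pair` is a hypothetical parameter, not a conversion of kagg.

```python
import random

def aggregation_step(counts, rate_per_pair, rng):
    """Perform one Gillespie event P_i + P_j -> P_(i+j) for i, j >= 1.

    counts[i] is the number of complexes containing i plasmids.
    Returns (waiting_time, new_counts), or (None, counts) if nothing can react.
    """
    sizes = [i for i in range(1, len(counts)) if counts[i] > 0]
    reactions = []
    for a, i in enumerate(sizes):
        for j in sizes[a:]:
            if i == j:
                pairs = counts[i] * (counts[i] - 1) // 2
            else:
                pairs = counts[i] * counts[j]
            if pairs > 0 and i + j < len(counts):
                reactions.append((rate_per_pair * pairs, i, j))
    total = sum(r[0] for r in reactions)
    if total == 0:
        return None, counts
    wait = rng.expovariate(total)        # exponential waiting time
    pick = rng.random() * total          # choose a reaction by its propensity
    acc = 0.0
    for propensity, i, j in reactions:
        acc += propensity
        if pick < acc:
            break
    new = list(counts)
    new[i] -= 1
    new[j] -= 1
    new[i + j] += 1
    return wait, new

rng = random.Random(0)
state = [0, 20] + [0] * 19               # twenty P_1 complexes
t = 0.0
while True:
    wait, state = aggregation_step(state, 1.0, rng)
    if wait is None:
        break
    t += wait
print(state[20], t)
```

Starting from twenty P_1 complexes, repeated events merge everything into a single P_20 complex while the total number of plasmids stays constant.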

Kinetic model of E.coli

Furthermore, we simulated the expression of the products triggered by APP.
As described above, the model of APP lets us simulate how the plasmid copy number in a tracked E. coli cell changes over time. At the same time, we can simulate the kinetics of the tracked cell from the plasmid copy number. The kinetic model of E. coli is formulated as below.

Fig.9

Parameter	Values	Units	Description	Ref
c3	1.50E-07	M/min	Maximum rate of expression repressed by TetR	[2]
K_TetR	1.79E-10	M	TetR binding affinity	[2]
n_TetR	3		Binding co-operativity between TetR and DNA	[2]
γ_TetR	0.0693	1/min	TetR degradation rate	[3]
c4	4.00E-07	M/min	Maximum rate of expression repressed by cI	[4]
K_cI	8.00E-12	M	cI binding affinity	[2]
n_cI	2		Binding co-operativity between cI and DNA	[2]
γ_cI	0.042	1/min	cI degradation rate	[5]
α	2.40E+01	1/min	Translation rate	[6]
β	2.88E-01	1/min	RNA degradation rate	[6]
γ_GFP	2.70E-02	1/min	GFP degradation rate	[7]
γ_RFP	4.10E-03	1/min	RFP degradation rate	[8]
N1	600		Normalization parameter for c1 and c2	[9]
N2	600		Normalization parameter for c3 and c4	[9]
c1	5901.4	M/min	Anderson promoter activity for expression of TetR (strong promoter)	[6]
c2	593.15	M/min	Anderson promoter activity for expression of cI (weak promoter)	[6]
numA	600		Copy number of plasmid A(pMB1:high)	[9]
numD	20		Copy number of plasmid D(ori:p15A:medium)	[9]
γ_YFP	8.00E-03	1/min	YFP degradation rate	[10]
α_YFP	1.25	1/min	Translation rate for YFP	[11]
K_max_ara	4.17	1/min	Maximum rate of arabinose-induced transcription	[11]
K_half_ara	1.60E-04	M	Ara binding affinity	[11]
n_ara	2.65		Hill coefficient	[11]
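To make the role of the binding affinity, co-operativity, and degradation parameters concrete, here is a minimal sketch of a Hill-repressed reporter with an mRNA and a protein equation. The equation form is a common textbook choice and is an assumption here; the team's actual equations are those in Fig.9 and may include further terms such as the normalization parameters N1 and N2. Pairing the TetR-repressed promoter with the GFP degradation rate is also only illustrative.

```python
def hill_repression(max_rate, repressor, k, n):
    """Transcription rate under repression: max_rate / (1 + (R/K)^n)."""
    return max_rate / (1.0 + (repressor / k) ** n)

def simulate_reporter(repressor, max_rate, k, n, alpha, beta, gamma,
                      t_end, dt=0.01):
    """Forward-Euler integration of
         dm/dt = hill_repression(...) - beta * m
         dp/dt = alpha * m - gamma * p
    with the repressor concentration held constant. Returns (m, p) at t_end."""
    m = p = 0.0
    for _ in range(int(round(t_end / dt))):
        dm = hill_repression(max_rate, repressor, k, n) - beta * m
        dp = alpha * m - gamma * p
        m += dm * dt
        p += dp * dt
    return m, p

# Values taken from the parameter table above.
C3, K_TETR, N_TETR = 1.50e-07, 1.79e-10, 3
ALPHA, BETA, GAMMA_GFP = 24.0, 0.288, 0.027

m, p = simulate_reporter(0.0, C3, K_TETR, N_TETR, ALPHA, BETA, GAMMA_GFP, 600)
print(f"unrepressed reporter after 600 min: {p:.3e} M")
```

In this simplified form the unrepressed steady state is alpha * c3 / (beta * gamma), and the transcription rate drops to half its maximum when the repressor concentration equals K.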

Model assumptions

Aggregation of plasmids is regarded as binding between the aggregating proteins that are bound to the plasmids. (See above)
Steric hindrance by plasmids is negligible, and the surface of an aggregating protein cannot be completely covered by plasmids. Therefore, the probability of binding between Pi and Pj depends only on the probability of collision between Pi and Pj. (See above)
Dissociation of plasmid-protein complexes, and dissociation of the aggregating protein from the plasmid, are not considered.

Fig.10

The decrease in the concentration of free aggregating protein caused by aggregation is not considered.
The aggregating protein reaches equilibrium immediately, and its expression level depends only on the copy number of the plasmid that encodes it.
Every E. coli cell divides every 20 minutes.
No protein or plasmid is lost from the cells during cell division.
At cell division, each aggregate of P in E. coli is passed to either of the two daughter cells with a probability of 50%. (See above)
If an E. coli cell still has P immediately after cell division, P is replenished from its replication origin until the copy number is the same as before cell division. (See above)
Only if an E. coli cell loses P completely at cell division is P not replenished, because no P carrying a replication origin remains in the cell. (See above)
The degradation rates of the repressors are constant over time.
The starting point is the equilibrium reached when APP is not triggered (no kind of plasmid is missing).

By running the simulation a sufficient number of times and averaging the results, we can calculate how the average total number of plasmids in E. coli and the average concentration of the product change over time.
With this framework, we completed a model of BLOOM. In the next section, we answer the next question, "How can we customize the time difference of expression?", by simulating BLOOM.

Result 1 - A1. How can we customize the time difference of expression?

2 plasmid system

Next, we used the model to simulate the 2 plasmid system. Plasmids A and D in the 2 plasmid system are as follows.

Fig.11

In the 2 plasmid system, the mechanism of APP is described by the figure below. This mechanism is explained in detail on this page.

Fig.12

We simulated the 2 plasmid system 10,000 times in this way and averaged the total number of plasmid D in E. coli over time. The result is as follows.

Fig.13

Although the vector D is (20, 0, 0, ...) at the start, the fraction of complexes containing many copies of plasmid D increases as aggregation progresses, so the total number of D0 and plasmid-protein complexes decreases. As aggregation progresses further, only two kinds of cells remain: E. coli containing a single D20 complex and E. coli containing no plasmid D. Ultimately, almost every tracked cell loses D20 at some cell division (because D20 is passed to the daughter cell that is not tracked), so plasmid D is lost completely from almost all tracked E. coli.
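The late stage described here, where all copies sit in one D20 complex, gives a simple check on the averaging procedure. The sketch below assumes the fully aggregated limit from the start (a simplification of the full model), a division every 20 minutes, and a 50% chance that the tracked daughter inherits the complex; the function names are hypothetical.

```python
import random

def track_one_cell(copy_number, n_divisions, rng):
    """Plasmid count of the tracked cell before and after each division,
    assuming all copies form a single aggregate."""
    counts = [copy_number]
    has_plasmid = True
    for _ in range(n_divisions):
        if has_plasmid and rng.random() >= 0.5:
            has_plasmid = False      # aggregate went to the untracked daughter
        counts.append(copy_number if has_plasmid else 0)
    return counts

def average_runs(copy_number, n_divisions, n_runs, seed=0):
    """Average the tracked-cell plasmid count over many independent runs."""
    rng = random.Random(seed)
    sums = [0] * (n_divisions + 1)
    for _ in range(n_runs):
        for k, c in enumerate(track_one_cell(copy_number, n_divisions, rng)):
            sums[k] += c
    return [s / n_runs for s in sums]

DIVISION_INTERVAL_MIN = 20
avg = average_runs(copy_number=20, n_divisions=6, n_runs=10_000)
for k, value in enumerate(avg):
    print(f"t = {k * DIVISION_INTERVAL_MIN:3d} min  mean copies = {value:.2f}")
```

In this limit the mean copy number is expected to halve at every division, which is the kind of decay the averaged curve approaches once aggregation is complete.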

We compared our simulation with data from the paper we referred to when building the model, and adjusted the unknown reaction rate constant kagg to fit those data. As a result, we estimated kagg to be 3.00 × 10^-31 L/min.

We also averaged the concentrations of the reporters. The results are as follows.

Fig.14

From the above results, we confirmed that BLOOM generates a time difference. The two reporters can be replaced with a target protein and a protein used to collect the target protein (e.g. by lysis or aggregation of cells). In some cases, producing the target protein takes a long time, and the expression of the collecting protein should be delayed. Conversely, in other cases, the collecting protein should be expressed earlier, immediately after production of the target protein finishes. To meet such needs, we studied how to customize the time difference freely.

To make BLOOM useful, we then investigated which parameter is most effective for customizing the time difference.

The degradation rates of repressors

Fig.15

Original
(γTetR=0.0693(1/min), γcI=0.042(1/min))

Fig.16 γTetR=0.05(1/min), γcI=0.042(1/min)

Fig.17 γTetR=0.10(1/min), γcI=0.042(1/min)

Fig.18 γTetR=0.30(1/min), γcI=0.042(1/min)

Fig.19 γTetR=0.0693(1/min), γcI=0.03(1/min)

Fig.20 γTetR=0.0693(1/min), γcI=0.05(1/min)

Fig.21 γTetR=0.0693(1/min), γcI=0.10(1/min)

The promoter activities of repressors

Fig.22 Original
(The promoter of TetR is weak, and the promoter of cI is strong.)

Fig.23 The promoter of TetR is strong, and the promoter of cI is strong.

Fig.24 The promoter of TetR is weak, and the promoter of cI is weak.

Because differences in the promoter activities of the repressors change the results as shown above, we told our wet team which promoter set maximizes the time difference.

Maximum of the copy number of plasmids

Fig.25 Original
(Maximum of the copy number of plasmid is 20.)

Fig.26 Maximum of the copy number of plasmid is 100.

Arabinose concentration

Fig.27 Original
(0.200% arabinose)

Fig.28 0.020% arabinose

Fig.29 0.002% arabinose

As a result, we concluded that the degradation rates of the repressors are critical for the time difference, while the other parameters we tested do not noticeably affect it. Accordingly, the most important step for using BLOOM in versatile applications is to prepare various repressors with different degradation rates. By doing this, we can customize the time difference between the expression of two proteins and collect the products at the optimal time.
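One way to see why the degradation rate matters so much is to consider a repressor that decays by first-order kinetics once its plasmid is lost. The sketch below rests on that simplifying assumption (no residual production, no dilution by cell division), and `R0` and `THRESHOLD` are hypothetical concentrations in arbitrary units, not values from the model.

```python
import math

def half_life(gamma):
    """Half-life (min) of a protein with first-order degradation rate gamma (1/min)."""
    return math.log(2) / gamma

def derepression_time(r0, threshold, gamma):
    """Time (min) for R(t) = r0 * exp(-gamma * t) to fall to the threshold."""
    if r0 <= threshold:
        return 0.0
    return math.log(r0 / threshold) / gamma

R0, THRESHOLD = 100.0, 1.0   # hypothetical starting level and de-repression level
for gamma in (0.05, 0.0693, 0.10, 0.30):   # TetR rates scanned in the figures above
    print(f"gamma = {gamma:.4f} 1/min  "
          f"half-life = {half_life(gamma):5.1f} min  "
          f"de-repression after {derepression_time(R0, THRESHOLD, gamma):6.1f} min")
```

Under this simplified picture the de-repression time scales as 1/gamma, whereas the starting level of the repressor enters only through a logarithm, so changing the degradation rate shifts the timing far more than changing the amount of repressor.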

We reported this conclusion to our wet team, who then mutated the ssRA tag (a protein degradation tag), which regulates the degradation rate of the repressor, to obtain variants with various degradation rates. As a result, we obtained several variants of the protein degradation tag with different activities, which made BLOOM more widely applicable.

Result 2 - A2. Can this system be effective for production?

Production

Next, we investigated whether BLOOM is effective for production.
Continuous cultures, in which E. coli is harvested continuously by collecting culture while liquid medium flows in at a constant rate, are generally more efficient for biomolecule production than batch cultures. On the other hand, complex systems can be used in batch cultures but cannot be applied to continuous cultures. Using BLOOM, we devised a production method whose efficiency is close to that of continuous culture and which can run a complex system as batch culture can. In this section, we compare a continuous culture system with BLOOM to batch culture by calculating production efficiency. By continuously culturing E. coli carrying BLOOM (2 plasmid system), we can obtain, permanently and automatically, E. coli in which the product has reached equilibrium.

Fig.30

To simulate continuous culture with BLOOM, we consider a system in which E. coli produces GFP in a 1 L culture tank over 24 hours. The simulated E. coli aggregates and becomes collectable by expressing the aggregation protein "AG43" immediately after GFP reaches equilibrium.

Fig.31

To optimize our system, we replaced TetR in the 2 plasmid system with TetR carrying the wild-type (WT) ssRA tag, and replaced cI with cI carrying a mutant ssRA tag whose degradation rate is 0.126 (1/min).

Parameter	Values	Units	Description	Ref
γ_ssRA-protein	1.4	1/min	Protein tagged with ssRA degradation rate	[12]
γ_AG43	2.70E-02	1/min	AG43 degradation rate	Estimate

The change over time of the averaged concentrations of GFP and AG43 in one E. coli cell is as follows. The x-axis is the time after the addition of arabinose, the inducer that starts the reactions.

Fig.32

The blue auxiliary line marks the average time at which all plasmids have been lost by APP. In non-stem cells, the transition between the GFP and AG43 concentrations starts at the blue auxiliary line.
In our system, differentiated cells lacking plasmids are continuously generated from stem cells, and each differentiated cell expresses the two proteins in stages following the schedule in Fig.32.
For comparison, we considered a 1 L batch culture system incubated for 8 hours. The batch culture protocol follows the peptide purification protocol of our wet team.
We used the assumptions below.

Aggregation by AG43 is completed 120 min after translation of the aggregation protein starts [13]. Because it takes 360 min from the start until the aggregation protein begins to be expressed, the total interval from the start is 480 min. Therefore, we assume that the concentration of E. coli in the culture tank reaches 3 × 10^12 cells/L and that half of the E. coli in the tank aggregate immediately after 480 min.
The degradation rate of AG43 is the same as that of GFP.
All E. coli in our system are in the logarithmic growth phase. The concentration of E. coli is constant after the interval.
The flow rate in our system is about 34 mL/min. This flow rate is optimized for obtaining the product in continuous culture. (Details are given in Supplementary 2.)
In our system, the culture solution contains only E. coli and surplus culture medium. The solution containing non-aggregated (uncollected) E. coli and excess culture medium is returned to the culture tank together with additional culture medium.
The amount of culture medium per E. coli cell in our system is the same as in batch culture.
In continuously cultured E. coli carrying BLOOM (2 plasmid system), the concentration of GFP when the copy number of plasmid D is zero from the beginning is the same as the equilibrium concentration of GFP in E. coli in batch culture.
The concentration of E. coli in batch culture is 3 × 10^12 cells/L when the batch culture ends.
GFP is in equilibrium when the batch culture ends.
One batch culture takes 8 hours.

The amount of product in our system, Vc (mol), can be formulated as a function of the time t (h), the interval int (h), the volume of the culture tank L (L), and the concentration of the product in E. coli at equilibrium P (M), as below.

Fig.33

On the other hand, the amount of product in batch culture, Vb (mol), can be formulated as a function of the time t (h), the interval int (h), the volume of the culture tank L (L), and the concentration of the product in E. coli at equilibrium P (M), as below.

Fig.34

Using these assumptions and formulas, we calculated the amount of GFP obtainable from a 1 L culture tank in 24 hours. The amount of product is 6.48 μmol in batch culture and 38.1 μmol in our system. Therefore, both the amount of product obtained and the amount of culture medium used in our system are 5.88 times those in batch culture. If we incubate for a very long time, the amount of product obtained and the amount of culture medium consumed in our system are 8.32 times those in batch culture.
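The arithmetic behind two of the numbers quoted in this section can be checked directly. The sketch below only reproduces the 24-hour yield ratio and the 480 min interval from values stated in the text; it does not implement the formulas in Fig.33 and Fig.34, and the function names are hypothetical.

```python
def yield_ratio(continuous_umol, batch_umol):
    """Ratio of product obtained in continuous culture to that in batch culture."""
    return continuous_umol / batch_umol

def interval_minutes(delay_to_ag43_expression, aggregation_time):
    """Total interval before the first cells can be collected."""
    return delay_to_ag43_expression + aggregation_time

ratio_24h = yield_ratio(38.1, 6.48)      # amounts of GFP in umol over 24 h
interval = interval_minutes(360, 120)    # minutes, from the assumptions above

print(f"interval = {interval} min ({interval / 60:.0f} h)")   # 480 min (8 h)
print(f"yield ratio over 24 h = {ratio_24h:.2f}")             # 5.88
```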
From the above, we showed that continuous culture with BLOOM is more efficient than batch culture. Our system also has the following merits.

Because the product is collected after it reaches equilibrium in E. coli, its quality is stable.
If nutrients are sufficient, the product can be produced permanently at a constant rate.
Because BLOOM is a flexible system, various conditions can be set for automatic expression.

Therefore, this model shows quantitatively that BLOOM is effective for production.

Result 3 - A3. How can we develop this system in the future?

3 plasmid system

As an example of an application of BLOOM, we simulated a 3 plasmid system, in which the 2 plasmid system is extended with one more APP system. Like the 2 plasmid system, the 3 plasmid system relies on aggregating proteins that bind particular DNA sequences. In the 3 plasmid system, two orthogonal APP systems coexist. The 3 plasmid system uses the following three plasmids: A, B, and C.

Fig.35

In the 3 plasmid system, the mechanism of APP is described by the figure below. This mechanism is explained in detail on this page.

Fig.36

The new parameters used in this simulation are listed in the table below.

Parameter	Values	Units	Description	Ref
γ_sopB	0.027	1/min	sopB degradation rate	Estimate
numB	5		Copy number of plasmid B(pSC101:low)	[9]
numC	20		Copy number of plasmid C(ori:p15A:medium)	[9]
Na	6.02E+23	1/mol	Avogadro constant	

We simulated the 3 plasmid system 10,000 times in this way and averaged the total numbers of plasmids B and C in E. coli over time. The result is shown below.

Fig.37

This graph suggests that the two kinds of plasmids are lost at different times. The result suggests not only that BLOOM can generate a time difference between the expression of two kinds of genes, but also that a more complicated system can be built with 3 or more plasmids.

Future application

From the simulation results, we concluded not only that we can customize the time difference between the expression of two kinds of genes, but also that we can build a more complicated system with 3 or more plasmids and genes. Possible applications of BLOOM include the following.

By expressing different genes depending on which kind of plasmid is missing, we can make E. coli behave differently for each kind.
If 3 or more genes have to be expressed in stages, this can be done automatically in each E. coli cell without adding inducers manually.
By combining a Toxin-Antitoxin system with BLOOM, we can make E. coli express genes in stages without leakage.
By programming E. coli to kill itself after production and collection, we can create a system with guaranteed biosafety.

Fig.38

We can confirm visually that the reaction has finished by programming E. coli to fluoresce after expression reaches equilibrium.

Fig.39

If an intermediate in a production pathway is toxic, we can shorten the time during which the toxic substance is present by programming the cell to express, in advance, the enzyme of the next reaction, which consumes the intermediate.

Fig.40

If a product is toxic in a production system, we can increase efficiency by programming the cell so that the reactant reaches equilibrium before the enzyme of the last reaction is expressed.

Supplementary

Supplementary 1 (Gillespie algorithm)

Supplementary 2 (about production)

References

[1] Molinari, S., Shis, D.L., Bhakta, S.P., Chappell, J., Igoshin, O.A., and Bennett, M.R. (2019) "A synthetic system for asymmetric cell division in Escherichia coli", Nat. Chem. Biol. 15, 917–924.
[2] iGEM TUDelft 2009 Modeling_Parameters
[3] Elowitz, M.B., and Leibler, S. (2000) "A synthetic oscillatory network of transcriptional regulators", Nature 403, 335–338.
[4] iGEM Waseda 2020 Model
[5] Arkin, A., Ross, J., and McAdams, H.H. (1998) "Stochastic Kinetic Analysis of Developmental Pathway Bifurcation in Phage λ-Infected Escherichia coli Cells", Genetics 149, 1633–1648.
[6] Wu, C.-H., Lee, H.-C., and Chen, B.-S. (2011) "Robust synthetic gene network design via library-based search method", Bioinformatics 27, 2700–2706.
[7] iGEM 2018 Valencia_UPV Model
[8] iGEM 2012 NTU-Taida Modeling Parameters
[9] (Japanese) 橋本義輝, "Et cetera concerning pUC plasmids"
[10] BBa_K592101 Parts page
[11] iGEM 2015 Oxford Modeling
[12] Lies, M., and Maurizi, M.R. (2008) "Turnover of Endogenous SsrA-tagged Proteins Mediated by ATP-dependent Proteases in Escherichia coli", J. Biol. Chem. 283, 22918–22929.
[13] BBa_K1352001 Parts page
[14] (Japanese) 菅原一秀, "Time-domain quantization of the stochastic simulation algorithm"