iGEM Wiki

Model

Overview

Modelling played a pivotal role in shaping our iGEM project. We used modelling to better understand the various systems, optimize the circuit design, and guide the direction of the overall project. Data from experimental characterization were then used to generate predictive equations and to estimate the parameters of our models. We designed three main models that allowed us to:

Rationalize and Optimize the Blue-Light Activation
Rationalize the Dynamics of the Flocculation
Estimate time taken for 99% flocculation in our bioreactor

The modelling results greatly assisted the DBTL (Design-Build-Test-Learn) process. Specifically, they helped the Wet Lab Team optimise the Blue-Light System and directed the hardware design.

In the absence of light, the EL222 proteins remain monomeric. Upon exposure to blue light, however, the EL222 subunits change conformation, exposing both a LOV domain and a helix-turn-helix domain. This conformational change allows EL222 to form homodimers that bind to a target C120 sequence, bringing the activation domain into close proximity and activating downstream transcription.

Figure 1. Light inducible system for the mRNA of mKO (orange fluorescence).

We modelled the inducible blue-light system to capture its mechanistic kinetics, as illustrated in Figure 1. We then used the curve-fitting result to identify and explore possible improvements to expression under blue-light activation.

Method and Model

Optical Density (OD) and fluorescence data were obtained from the experiments conducted by the Wet Lab. A model was then fitted and analyzed using the Python libraries lmfit and SALib. To generate enough data points for the curve fitting and to further understand the effect of the light duty cycle on the activation of the C120 promoter, we performed the following investigative experiments:

Blue Light Off for 6 hours
Blue Light On for 6 hours
Blue Light Toggled On/Off every 30 min for 6 hours

To simulate the light pattern, we initially used a stepwise function with 0 and 1 representing the off and on states. However, the discontinuity of this light function strongly affected the result of the simulation. We therefore chose a ‘soft’ stepwise function, which is continuous. The soft stepwise function is built from sigmoid functions (Figure 2).

Figure 2. Comparison of light pattern using stepwise function and soft stepwise function.
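
The idea of softening the light signal can be sketched in a few lines of Python. The exact expression used in our model is not reproduced in this text, so the form below (a sigmoid applied to a sine wave) is only one possible construction. The function names, the `steepness` parameter, and the choice that the light starts in the on state are assumptions made for illustration.

```python
import math

def hard_light(t, period=1.0):
    """Ideal square wave: 1 during the first half of each period, else 0."""
    return 1.0 if (t % period) < period / 2 else 0.0

def soft_light(t, period=1.0, steepness=50.0):
    """Smooth square wave: a sigmoid applied to a sine of the same period.

    steepness is a hypothetical tuning knob; larger values approach hard_light.
    """
    phase = math.sin(2 * math.pi * t / period)
    return 1.0 / (1.0 + math.exp(-steepness * phase))

# Toggling every 30 min corresponds to a period of 1 hour (t in hours).
for t in (0.10, 0.25, 0.50, 0.75, 0.90):
    print(f"t={t:.2f} h  hard={hard_light(t):.0f}  soft={soft_light(t):.4f}")
```

Away from the switching instants the two functions agree closely, while at a switching instant the soft version passes smoothly through 0.5, which is what makes it friendlier to an ODE solver.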

To simplify our model, we made a few assumptions:

The initial concentration of dimerized EL222 (EL222*) is 0
The concentration of EL222 monomers (EL222total) is constant (10000 a.u., scaled based on experimental data)
The initial concentration of mKO mRNA is 0
The initial concentration of mKO is 0

The following 9 factors were accounted for while modelling the inducible system:

k_dim = The rate of EL222 dimerization
k_und = The rate of EL222 undimerization
bas_C120 = The basal transcription rate of C120
syn_C120 = The max transcription rate of C120
n_EL222 = The Hill coefficient for the promoter bound EL222 dimers
K_EL222 = The concentration of EL222 dimers at which half of maximum transcription rate is reached
deg_mRNA = The rate of mRNA degradation
syn_pro = The rate of translation
deg_pro = The rate of mKO degradation

The system of differential equations is:
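
The equations themselves are not reproduced in this text. As a rough illustration of how the nine parameters could fit together, the sketch below integrates a three-state model (EL222 dimer, mKO mRNA, mKO protein) with a hand-written RK4 stepper. The light-gated first-order activation term is an assumption borrowed from common EL222 models and may differ from the equations we actually fitted. The parameter values are the fitted values reported in the Result section. The names `rhs` and `simulate` are hypothetical.

```python
# Fitted values from the parameter table in the Result section.
PARAMS = {
    "k_dim": 0.014, "k_und": 0.034, "bas_C120": 0.1, "syn_C120": 28.732,
    "n_EL222": 1.103, "K_EL222": 300.089, "deg_mRNA": 3.746,
    "syn_pro": 54.766, "deg_pro": 0.141,
}
EL222_TOTAL = 10000.0  # a.u., held constant as stated in the assumptions

def rhs(t, y, p, light):
    """Right-hand side for y = (EL222 dimer, mKO mRNA, mKO protein)."""
    dimer, mrna, pro = y
    dimer = max(dimer, 0.0)  # guard against tiny negative values
    n, K = p["n_EL222"], p["K_EL222"]
    hill = dimer ** n / (K ** n + dimer ** n)
    # ASSUMPTION: light-gated, first-order activation of the free EL222 pool.
    d_dimer = p["k_dim"] * light(t) * (EL222_TOTAL - dimer) - p["k_und"] * dimer
    d_mrna = p["bas_C120"] + p["syn_C120"] * hill - p["deg_mRNA"] * mrna
    d_pro = p["syn_pro"] * mrna - p["deg_pro"] * pro
    return (d_dimer, d_mrna, d_pro)

def simulate(light, p=PARAMS, t_end=6.0, dt=0.01):
    """Integrate from the all-zero initial state with classic RK4 (t in hours)."""
    y, t = (0.0, 0.0, 0.0), 0.0
    for _ in range(round(t_end / dt)):
        k1 = rhs(t, y, p, light)
        k2 = rhs(t + dt / 2, tuple(a + dt / 2 * b for a, b in zip(y, k1)), p, light)
        k3 = rhs(t + dt / 2, tuple(a + dt / 2 * b for a, b in zip(y, k2)), p, light)
        k4 = rhs(t + dt, tuple(a + dt * b for a, b in zip(y, k3)), p, light)
        y = tuple(a + dt / 6 * (b + 2 * c + 2 * d + e)
                  for a, b, c, d, e in zip(y, k1, k2, k3, k4))
        t += dt
    return y

dark = simulate(lambda t: 0.0)  # blue light off for 6 hours
lit = simulate(lambda t: 1.0)   # blue light on for 6 hours
print("dark:", dark)
print("lit: ", lit)
```

In the dark case the dimer stays at zero and the mRNA settles at bas_C120 / deg_mRNA, which is a quick sanity check on the implementation. Any light function of time, including a soft square wave, can be passed as `light`.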

Result

The fitting was done using the basin-hopping algorithm to search for the global minimum. The model exhibited a good fit to the experimental data (Figure 3).

Figure 3. Result of curve fitting to the blue light induction data.

Parameter	Optimized Value	Unit
k_dim	0.014	AU FU^-1 hr^-1
k_und	0.034	hr^-1
bas_C120	0.1	FU AU^-1 hr^-1
syn_C120	28.732	FU AU^-1 hr^-1
n_EL222	1.103	dimensionless
K_EL222	300.089	FU AU^-1
deg_mRNA	3.746	hr^-1
syn_pro	54.766	hr^-1
deg_pro	0.141	hr^-1

Next, we performed a sensitivity analysis in which each parameter was varied within 10% of its fitted value (Figure 4). The result showed that the model output is most sensitive to K_EL222, followed by the protein translation rate, the protein degradation rate, the mRNA degradation rate, the maximum transcription rate, and the Hill coefficient. We therefore decided to improve K_EL222 and, if possible, the Hill coefficient n_EL222. We hypothesized that increasing the number of repeats of the C120 domain would increase the sensitivity of the inducible promoter and improve these two parameters, giving better expression.

Figure 4. Result of sensitivity analysis of blue light system model.

Based on this result, the Wet Lab team constructed a new yeast strain with 3 repeats of the C120 domain (3C120-CYCp mKO). The experimental protocol for curve fitting was repeated for the new strain, except that this time only K_EL222 and n_EL222 were fitted; the other parameters were kept at the values established by the previous curve fitting. The curve fitting of 3C120-CYCp mKO showed a good fit to the experimental data (Figure 5), implying that the effect of increasing the number of C120 repeats on the other parameters is negligible. Furthermore, under 6 hours of blue light, the maximum fluorescence/OD value of the 3C120-CYCp strain was 20% higher than that of the original strain under the same condition.

Figure 5. Result of curve fitting to the blue light induction for 3C120-CYCp mKO strain.

Subsequently, we compared the K_EL222 and n_EL222 values between the old and new strains. We found a clear improvement of both parameters in the new strain with 3 C120 repeats. The K_EL222 value decreased 3-fold and the n_EL222 value increased almost 8-fold.
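
To see what these two changes mean for the promoter response, the Hill term can be evaluated for both parameter sets. In the sketch below, the old-strain values are the fitted ones from the parameter table. The new-strain values are illustrative only, obtained by applying "3-fold" and "8-fold" to the old values, and are not the fitted values of the new strain. The function name `hill` is hypothetical.

```python
def hill(dimer, K, n):
    """Fraction of maximal transcription at a given EL222 dimer level."""
    return dimer ** n / (K ** n + dimer ** n)

# Original strain: fitted values from the parameter table.
K_OLD, N_OLD = 300.089, 1.103
# New strain: ILLUSTRATIVE values only (K / 3 and n * 8), not fitted values.
K_NEW, N_NEW = K_OLD / 3, N_OLD * 8

print("dimer  old    new")
for dimer in (50, 100, 200, 400):
    print(dimer, round(hill(dimer, K_OLD, N_OLD), 3), round(hill(dimer, K_NEW, N_NEW), 3))
```

A lower K_EL222 shifts the half-maximal point to lower dimer levels, and a higher n_EL222 makes the response more switch-like: lower activity below the threshold and near-maximal activity above it.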

Figure 6.1. Comparison of n_EL222 between 2 yeast strains.

Figure 6.2. Comparison of K_EL222 between 2 yeast strains.

Figure 6.3. Comparison of Max Fluorescence/OD between 2 yeast strains.

Conclusion

The inducible system can be modelled accurately using the nine parameters listed above. The sensitivity results helped the Wet Lab improve the induction of expression by increasing the number of C120 repeats. Although the model exhibited a good fit, we did not calibrate the protein concentration against the fluorescence/OD data. Such a calibration would allow the units to be converted from fluorescence/OD to molar concentration and would let us validate our data by comparing it with other published results.

Reference

[1] D. Benzinger and M. Khammash, “Tuning gene expression variability and multi-gene regulation by dynamic transcription factor control,” 2018.

[2] P. Jayaraman, K. Devarajan, T. K. Chua, H. Zhang, E. Gunawan, and C. L. Poh, “Blue light-mediated transcriptional activation and repression of gene expression in bacteria,” Nucleic Acids Research, vol. 44, no. 14, pp. 6994–7005, 2016.

Model Development

Modelling flocculation is challenging! Flocculation is a complex dynamic process, where the cell surface properties, buoyancy of the cell, and physical agitation (shaking/stirring) play a significant role in determining the kinetics of the overall process. Three kinetic processes happen during flocculation:

Perikinetic flocculation – Random Brownian motion of cell particles
Orthokinetic flocculation – Power input due to external mixing
Sedimentation due to gravitation

Most flocculation papers attempt to model the sequential forming and breaking of flocs of varying sizes, often resulting in a complex model with a vast number of differential equations and a high computational cost [1]. Furthermore, curve fitting for these models often requires an extensive amount of experimental data and expensive hardware, such as cell counting equipment. Other models approach the problem by classifying cells as either flocculated or single and predicting their dynamics using simple exponential models [2]. Although this approach greatly simplifies the flocculation model, it fails to capture the dynamics of flocs of different sizes. Consequently, our goal for this iGEM project was to establish a new flocculation model that takes the best of both worlds: a model that is not only computationally inexpensive but also easy to verify via OD measurement using only a spectrophotometer.

Our protocol to quantify flocculation was to measure the supernatant OD after 2 minutes of resting. Applying Stokes' law of sedimentation, we can calculate the distance that a floc of each size travels due to sedimentation:

Here, ρ denotes density, μ is the media viscosity, g is the gravitational acceleration, and d_i is the number of cells inside one floc (floc size).

We assumed that flocs are spherical, with a diameter proportional to the number of cells inside each floc. For media of height 1 cm, almost every floc larger than 3 cells sank to the bottom of the flask within 2 minutes. A similar result was seen for flocs of size 5 cells in 2 cm of media. We therefore hypothesize that, in almost every case, the upper bound on the size of the flocs retained in the supernatant is 5 cells.
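
The settling calculation can be sketched with the textbook form of Stokes' law, v = (ρ_particle − ρ_medium) g d² / (18 μ), multiplied by the resting time. The physical values used as defaults below (cell diameter, densities, viscosity) are placeholders chosen for illustration and are not the values we used, so the absolute distances will not match our results. What the sketch does show is the quadratic dependence of settling distance on floc size under the spherical-floc assumption.

```python
def stokes_velocity(diameter_m, rho_particle, rho_medium, viscosity, g=9.81):
    """Terminal settling velocity (m/s) of a small sphere from Stokes' law."""
    return (rho_particle - rho_medium) * g * diameter_m ** 2 / (18.0 * viscosity)

def settling_distance(n_cells, rest_time_s=120.0, cell_diameter_m=5e-6,
                      rho_cell=1100.0, rho_medium=1000.0, viscosity=1e-3):
    """Distance (m) settled by a floc of n_cells during the resting period.

    Floc diameter = n_cells * cell_diameter_m, following the text's assumption.
    All default physical values are placeholders, not the team's parameters.
    """
    diameter = n_cells * cell_diameter_m
    return stokes_velocity(diameter, rho_cell, rho_medium, viscosity) * rest_time_s

for n_cells in range(1, 6):
    print(n_cells, "cells:", settling_distance(n_cells), "m in 2 min")
```

Because the distance grows with the square of the floc size, a small increase in floc size produces a large drop in the time a floc stays in the supernatant, which is why a sharp size cut-off is a reasonable approximation.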

Since each floc size contributes differently to the overall OD, the OD of supernatant is a summation of multiple factors:

Here, N is the total number of cells/flocs inside the flask. S_i is the fraction of flocs of size d_i, with the contribution of each floc size to the OD being proportional to its square. α is the conversion ratio. From the work of Hamersveld et al. (1997) [3], we can assume that the floc size follows a log-normal distribution. The cumulative distribution function (CDF) of the log-normal distribution has the form:

where μ is the mean floc size (not to be confused with the media viscosity above), σ is the standard deviation, and erf() is the Gauss error function:
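
For reference, the textbook log-normal CDF is F(x) = ½ [1 + erf((ln x − μ) / (σ √2))], with erf(z) = (2/√π) ∫₀ᶻ exp(−t²) dt. In this standard parameterisation, μ and σ are the mean and standard deviation of the logarithm of the floc size, which may differ from the convention used in our own expression. A minimal Python sketch follows; the helper `fraction_in_supernatant` and its arguments are hypothetical and simply apply the CDF to the 5-cell upper bound discussed earlier.

```python
import math

def lognormal_cdf(x, mu, sigma):
    """P(size <= x) for a log-normal variable; mu and sigma describe ln(size)."""
    if x <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))

def fraction_in_supernatant(median_size, sigma, max_size=5.0):
    """Fraction of flocs no larger than max_size cells (hypothetical helper).

    median_size = exp(mu) is used so the argument is in units of cells.
    """
    return lognormal_cdf(max_size, math.log(median_size), sigma)

for median in (1.0, 2.0, 5.0, 10.0):
    print(median, round(fraction_in_supernatant(median, sigma=0.5), 4))
```

As the typical floc size grows, the fraction of flocs small enough to stay in the supernatant falls, which is the mechanism by which flocculation lowers the measured OD.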

Therefore,

Thus, the change in OD due to growth and flocculation is:

where the change of cell numbers follows traditional Monod kinetics and depends on the concentration of carbon source [G].

Based on the definition of the erf() function, the derivative of S_i with respect to the mean floc size is:

Here, we assume that the mean floc size shifts from 1 cell to higher values due to flocculation, and the maximal floc size is limited due to breakage and cell loss. The increase in the mean size of flocs depends on the expression of FLO1 protein:

Therefore, the finalized equation for OD is:

Other equations in the system include the expression of FLO1 and the decrease of carbon source over time. The decrease of carbon source is dependent on the number of cells:
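
As an illustration of the growth and carbon-source terms only, the sketch below integrates textbook Monod kinetics, dN/dt = μ_max [G] / (K_s + [G]) N, together with a yield-based substrate balance, d[G]/dt = −(1/Y) dN/dt. The yield-based form and every default value are assumptions made for this sketch; they are not our fitted equations or parameters, and the flocculation terms are left out.

```python
def simulate_growth(n0=0.1, g0=20.0, mu_max=0.4, k_s=0.5, yield_coeff=0.05,
                    t_end=24.0, dt=0.01):
    """Euler integration of Monod growth with substrate depletion.

    n0: initial biomass (OD units), g0: initial carbon source (g/L), t in hours.
    All default values are placeholders, not fitted parameters.
    """
    n, g = n0, g0
    for _ in range(round(t_end / dt)):
        growth = mu_max * g / (k_s + g) * n * dt  # Monod-limited biomass gain
        growth = min(growth, yield_coeff * g)     # cannot consume more than is left
        n += growth
        g = max(g - growth / yield_coeff, 0.0)    # substrate used for that growth
    return n, g

final_n, final_g = simulate_growth()
print("final biomass:", final_n, "remaining carbon source:", final_g)
```

Growth slows as the carbon source runs out, and the final biomass is bounded by n0 + Y·g0, so the carbon balance gives a simple check on the integration.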

Our yeast expresses FLO1 under the control of a GAL promoter. Despite the complexity of galactose metabolic flux and the sequential activation/inhibition of the GAL promoter by many transcription factors, we decided to treat the GAL promoter as a constitutive promoter, for which mRNA expression reaches a quasi-steady state.