# Team:ZJUT-China/Model

Model

## We built a deterministic gene expression model that accounts for RNA polymerase (RNAp) performing transcription on the DNA template and ribosomes performing translation on mRNA.

Abstract

To better understand gene expression in a Cell-Free system and to assist in experimental design, we built a deterministic gene expression model. The model accounts for RNA polymerase (RNAp) performing transcription on the DNA template and ribosomes performing translation on mRNA, and builds on earlier work by Vincent N. et al.[1].

Sensitivity analysis of the biochemical parameters was performed to find the key parameters that cause the protein synthesis rate to saturate as the concentration of DNA template increases. By fitting the model to our experimental data, we redetermined the strengths of three untranslated regions (UTR1, UTR2, UTR3), which helped us improve an existing part (BBa_K2205002) and create a new part (BBa_K3885311).

We then used the model to explore a two-stage transcriptional activation cascade that uses E. coli $$\sigma^{28}$$ as a transcription factor, an important part of our gene circuit. In the simulations, we determined the optimal concentration of each plasmid to add to the Cell-Free system, which greatly reduced trial and error and accelerated our experiments.

Parameter values for the simulations were either assumed, taken from the literature, or obtained by fitting the model to experimental data.

Model
Figure 1. (a) Schematic of mRNA and protein synthesis in a Cell-Free system. Only the species included in the model are shown.
(b) Equations describing a single transcription unit regulated by σN.
Kinetics of mRNA and protein synthesis
Reaction

The mRNA and protein synthesis in the model comprises four basic reactions: DNA transcription, mRNA degradation, mRNA translation and protein maturation. Except for protein maturation, all other reactions are described by Michaelis-Menten kinetics.

#### Protein maturation

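Table 1. Critical species and parameters for DNA transcription, mRNA degradation and mRNA translation.

| Enzyme | Substrate | Product | Michaelis-Menten constant | Catalytic rate constant |
| --- | --- | --- | --- | --- |
| RNA polymerase holoenzyme / $${E\sigma}^N$$ | DNA template / $$t$$ | mRNA / $$m$$ | $$K_{M,N}$$ | $$k_{cat,N}$$ |
| RNase / $$X$$ | mRNA / $$m$$ | null | $$K_{M,X}$$ | $$k_{deg,m}$$ |
| Ribosome / $$R$$ | mRNA / $$m$$ | Protein / $$p$$ | $$K_{M,R}$$ | $$k_{cat,p}$$ |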

#### Why don't we include protein degradation in the model?

Protein degradation in the Cell-Free system is carried out by the endogenous ClpXP protease and follows zeroth-order kinetics. ClpXP recognizes proteins carrying the ssrA degradation tag and hydrolyzes them. Proteins without an ssrA degradation tag are almost stable in the Cell-Free system. We confirmed this in one of our experiments: we added purified eGFP to a Cell-Free system and continuously measured its fluorescence intensity.

Figure 2. Kinetics of the fluorescence intensity of purified eGFP in a Cell-Free system. The fluorescence intensity decays at a very slow rate, so we assume that proteins without the ssrA tag are stable in the Cell-Free system.
Assumptions and derivations
1. The Cell-Free system is under quasi-steady state for Michaelis-Menten kinetics.
2. The substances required for gene expression are in unlimited supply.
3. The only sigma factor present in the cell extract is σ70.
   This was verified by a crosstalk assay, which concluded that all of the alternative sigma factors are either not present in the extract or are present at insufficient concentrations to activate specific promoters[4].
4. The concentration of DNA template ($$t$$) is lower than the concentration of RNAp holoenzyme ($${E\sigma}^N$$).
5. The concentration of synthesized mRNA ($$m$$) is larger than the concentration of RNase ($$X$$).
   In what follows, $$k_{deg,m}[X]$$ is replaced by $$k_{d,m}$$.
6. The concentration of synthesized mRNA ($$m$$) is lower than the concentration of free ribosomes ($$R_{free}$$).
7. The maturation of protein ($$p_{unm}$$) follows first-order kinetics.
   The maturation time of deGFP has been well determined[1]; for simplicity, we assume that maturation follows first-order kinetics.

Equations

With these assumptions, the kinetics of mRNA and protein synthesis are described as follows:
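As a minimal sketch, the four reactions can be written as rate expressions and integrated with a simple forward-Euler scheme. The Michaelis-Menten forms below are an assumption chosen to match the assumptions listed above (enzyme in excess for transcription and translation, substrate in excess for degradation). The holoenzyme and free-ribosome concentrations are held constant here for simplicity, the names `Params`, `derivatives` and `simulate` are hypothetical, and every numerical value is an illustrative placeholder rather than a fitted parameter.

```python
from dataclasses import dataclass


@dataclass
class Params:
    # All values are illustrative placeholders, not the team's fitted values.
    k_cat_N: float = 0.06    # s^-1, transcription catalytic rate constant
    K_M_N: float = 1.0       # nM, Michaelis-Menten constant for transcription
    k_d_m: float = 6.6       # nM/s, lumped k_deg,m * [X]
    K_M_X: float = 8000.0    # nM, Michaelis-Menten constant for mRNA degradation
    k_cat_p: float = 0.006   # s^-1, translation catalytic rate constant
    K_M_R: float = 10.0      # nM, Michaelis-Menten constant for translation
    k_mat: float = 0.0008    # s^-1, first-order maturation rate constant
    E_holo: float = 30.0     # nM, RNAp holoenzyme, held constant in this sketch
    R_free: float = 1000.0   # nM, free ribosomes, held constant in this sketch


def derivatives(t_dna, m, p_unm, prm):
    """Return (dm/dt, dp_unm/dt, dp_mat/dt) in nM/s."""
    # DNA template is scarcer than holoenzyme, so the rate saturates in E_holo.
    transcription = prm.k_cat_N * t_dna * prm.E_holo / (prm.K_M_N + prm.E_holo)
    # mRNA is in excess of RNase, so the rate saturates in m.
    degradation = prm.k_d_m * m / (prm.K_M_X + m)
    # mRNA is scarcer than free ribosomes, so the rate saturates in R_free.
    translation = prm.k_cat_p * m * prm.R_free / (prm.K_M_R + prm.R_free)
    maturation = prm.k_mat * p_unm
    return transcription - degradation, translation - maturation, maturation


def simulate(t_dna, prm, hours=8.0, dt=1.0):
    """Forward-Euler integration; returns the final (m, p_unm, p_mat) in nM."""
    m = p_unm = p_mat = 0.0
    for _ in range(int(hours * 3600 / dt)):
        dm, dpu, dpm = derivatives(t_dna, m, p_unm, prm)
        m, p_unm, p_mat = m + dm * dt, p_unm + dpu * dt, p_mat + dpm * dt
    return m, p_unm, p_mat


prm = Params()
m_end, p_unm_end, p_mat_end = simulate(5.0, prm)
print(f"mRNA: {m_end:.1f} nM, immature: {p_unm_end:.1f} nM, mature: {p_mat_end:.1f} nM")
```

Because the sketch contains no protein degradation term, mature protein keeps accumulating while mRNA settles to the level at which synthesis balances degradation.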

Conservation of sigma factors, core RNAp and ribosome
Species

#### Sigma factor

Every sigma factor has two forms: free ($$\sigma_{free}^N$$) or complexed with core RNAp ($${E\sigma}^N$$).

#### Core RNAp

In the presence of a single sigma factor species, core RNAp has three forms: free ($$E_{free}$$), complexed with sigma factor ($${E\sigma}^N$$) or performing transcription on DNA template ($$E_{tx}$$).

#### Ribosome

Ribosome exists in two forms: free ($$R_{free}$$) or performing translation on mRNA ($$R_{tl}$$).

Assumptions and Derivations
1. The concentrations of transcription and translation machinery remain constant.
   This is the prerequisite for building conservation equations for sigma factor, core RNA polymerase and ribosome.
2. The binding between sigma factor and core RNAp is a fast biochemical reaction.
   This keeps the following biochemical reaction in equilibrium with the dissociation constant $$K_N$$ at all times.

   $$E_{free} + \sigma_{free}^N \rightleftharpoons {E\sigma}^N$$

   Thus, the concentration of RNAp holoenzyme is found to be

3. The length of the transcribed gene is $$L_m$$.
4. The core RNAp performing transcription on the DNA template moves forward along the DNA template at a constant rate (TX).
   The rate of mRNA synthesis is given by

   which indicates that the total number of bases per second that core RNA polymerase moves along the DNA template is

   Dividing this equation by the elongation rate constant of core RNAp (TX), we obtain the concentration of the core RNA polymerase that is synthesizing the mRNA:

   Considering the RNA polymerase bound to the promoter, the concentration of core RNAp performing transcription on DNA template ($$E_{tx}$$) is given by

5. The ribosome performing translation on mRNA moves forward along the mRNA at a constant rate (TL).
   Proceeding in a similar manner, the concentration of ribosome performing translation on mRNA ($$R_{tl}$$) is given by

Equations

Using these assumptions, we get the conservation equations for core RNAp and ribosomes:
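The derivation above can be sketched numerically. The sketch assumes that sigma-factor conservation closes the binding equilibrium, that promoter and mRNA occupancy take the same Michaelis-Menten form as the corresponding rates, and that gene length and elongation rates share units (bases and bases per second). All function names and numerical values are hypothetical placeholders, not the team's.

```python
def holoenzyme(E_free, sigma_total, K_N):
    """Holoenzyme from K_N = [E_free][sigma_free]/[E sigma] combined with
    sigma_total = sigma_free + [E sigma]."""
    return sigma_total * E_free / (K_N + E_free)


def rnap_on_dna(t_dna, E_holo, k_cat_N, K_M_N, L_m, TX):
    """Core RNAp engaged on DNA (E_tx): promoter-bound plus elongating."""
    bound = t_dna * E_holo / (K_M_N + E_holo)   # assumed occupancy form
    # (transcripts/s) * (bases/transcript) / (bases/s per RNAp) = RNAp in elongation
    elongating = k_cat_N * bound * L_m / TX
    return bound + elongating


def ribosomes_on_mrna(m, R_free, k_cat_p, K_M_R, L_m, TL):
    """Ribosomes engaged on mRNA (R_tl), built the same way as rnap_on_dna."""
    bound = m * R_free / (K_M_R + R_free)
    return bound + k_cat_p * bound * L_m / TL


def solve_free(total, engaged, tol=1e-9):
    """Bisection for x in [0, total] such that x + engaged(x) = total.
    Assumes engaged(x) is non-decreasing and engaged(0) = 0."""
    lo, hi = 0.0, total
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid + engaged(mid) > total:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Placeholder totals (nM) and kinetic constants.
E_total, sigma_total, K_N = 400.0, 35.0, 1.0
t_dna, m = 5.0, 300.0


def core_engaged(E_free):
    """Core RNAp that is not free: holoenzyme plus RNAp working on DNA."""
    E_holo = holoenzyme(E_free, sigma_total, K_N)
    return E_holo + rnap_on_dna(t_dna, E_holo, 0.06, 1.0, 800.0, 10.0)


def ribo_engaged(R_free):
    return ribosomes_on_mrna(m, R_free, 0.006, 10.0, 800.0, 7.5)


E_free = solve_free(E_total, core_engaged)   # E_total = E_free + E_sigma + E_tx
R_free = solve_free(1000.0, ribo_engaged)    # R_total = R_free + R_tl
print(f"E_free = {E_free:.1f} nM, R_free = {R_free:.1f} nM")
```

Bisection is used here only because it needs no external library; the ribosome balance is a quadratic in `R_free` and could equally be solved in closed form.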

Single transcription unit

Combining the above two parts of equations, we obtained the set of equations describing the gene expression of a single transcription unit:

Sensitivity analysis

Since the model takes into account the consumption of RNA polymerase and ribosomes during transcription and translation, there is a theoretical maximum for the rate of protein synthesis. To find the key parameters that limit the maximum rate of protein synthesis as the concentration of DNA template increases, we tested the sensitivity of the model by varying the value of each biochemical parameter. The biochemical constants of P70a-UTR1-deGFP were used as original inputs[1].

Figure 3. Parameter sensitivity analysis of the model. In all cases, the maximum rate of protein synthesis is almost linearly related to the concentration of DNA template before the Cell-Free system saturates.

In our simulations, high sensitivity is observed only when varying $$k_{cat,N}$$, $$k_{d,m}$$, $$R_{total}$$, and $$k_{cat,p}$$. However, changes in $$k_{cat,N}$$ and $$k_{d,m}$$ only affect how quickly the output saturates as the DNA template concentration increases; they do not change the value at which the output saturates. Therefore, we consider $$R_{total}$$ and $$k_{cat,p}$$, especially $$R_{total}$$, to be the key parameters that lead to saturation of the protein synthesis rate.
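A one-at-a-time sweep of this kind can be sketched with a reduced steady-state version of the model. The sketch below is a simplification under stated assumptions: the holoenzyme concentration is held fixed, mRNA is taken at its steady state, ribosome occupancy uses an assumed Michaelis-Menten form, and all parameter values are illustrative placeholders rather than the values in Table 2. The names `BASE`, `protein_rate` and `sweep` are hypothetical.

```python
import math

BASE = dict(
    k_cat_N=0.06, K_M_N=1.0, E_holo=30.0,   # transcription (placeholders)
    k_d_m=6.6, K_M_X=8000.0,                # mRNA degradation
    k_cat_p=0.006, K_M_R=10.0,              # translation
    R_total=1000.0, L_m=800.0, TL=7.5,      # ribosome pool, gene length, elongation
)


def protein_rate(t_dna, p):
    """Steady-state protein synthesis rate (nM/s) at DNA concentration t_dna (nM)."""
    s = p["k_cat_N"] * t_dna * p["E_holo"] / (p["K_M_N"] + p["E_holo"])
    if s >= p["k_d_m"]:
        raise ValueError("transcription outpaces degradation; no steady state")
    m = s * p["K_M_X"] / (p["k_d_m"] - s)          # synthesis = degradation
    a = 1.0 + p["k_cat_p"] * p["L_m"] / p["TL"]    # initiating plus elongating ribosomes
    # R_total = R_free + a*m*R_free/(K_M_R + R_free) is a quadratic in R_free:
    # R_free^2 + b*R_free - K_M_R*R_total = 0.
    b = p["K_M_R"] + a * m - p["R_total"]
    root = math.sqrt(b * b + 4.0 * p["K_M_R"] * p["R_total"])
    # Pick the algebraically equivalent form that avoids cancellation.
    if b < 0:
        R_free = (root - b) / 2.0
    else:
        R_free = 2.0 * p["K_M_R"] * p["R_total"] / (b + root)
    return p["k_cat_p"] * m * R_free / (p["K_M_R"] + R_free)


def sweep(name, factors=(0.5, 2.0), dna=(1.0, 50.0)):
    """Rate relative to baseline when one parameter is scaled, at low and high DNA."""
    out = {}
    for f in factors:
        scaled = dict(BASE, **{name: BASE[name] * f})
        out[f] = tuple(protein_rate(t, scaled) / protein_rate(t, BASE) for t in dna)
    return out


results = {name: sweep(name) for name in ("k_cat_N", "k_d_m", "R_total", "k_cat_p")}
for name, res in results.items():
    for f, (low, high) in res.items():
        print(f"{name} x{f}: low-DNA ratio {low:.2f}, high-DNA ratio {high:.2f}")
```

With these placeholder values, scaling `k_cat_N` or `k_d_m` changes the rate at low DNA concentration but leaves the high-DNA plateau essentially unchanged, whereas scaling `R_total` or `k_cat_p` shifts the plateau itself, which is the qualitative pattern described above.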

Table 2. Values of biochemical parameters adopted for sensitivity analysis. The redetermination of $$k_{cat,p}$$ and $$R_{total}$$ will be discussed later.

Two-stage transcriptional activation cascade

Based on the single-transcription-unit model, we explored the performance of the two-stage transcriptional activation cascade. The cascade consists of two transcription units, one of which encodes an alternative sigma factor required to activate the other.

Figure 4. Schematic of a two-stage transcriptional activation cascade using E. coli $$\sigma^{28}$$. The untranslated region of both single transcription units is UTR1.

The system of equations describing the cascade P70a-UTR1-σ28 $$\rightarrow$$ P28-UTR1-deGFP is as follows:

Parameter determination

RNA biomarkers can be detected by our Cell-Free biosensor. The measurements, experiments, and hardware testing demonstrated that the fluorescence intensity was significantly lower without the RNA biomarker than with it, which demonstrates the feasibility of our concept. The RNA biomarker concentration is also related to the sensitivity of the biosensor, but we did not have time to test this attribute, so we put forward a plan for the next step.

Single transcription unit

Currently, the strongest expression of a single transcription unit in the Cell-Free system is obtained by combining the strongest promoter (P70a) with the strongest UTR (UTR1) reported so far. P70b and P70c are derived from P70a by mutation (strengths: P70a > P70b > P70c), and UTR2 and UTR3 are derived from UTR1 by mutation (strengths: UTR1 > UTR2 > UTR3)[2]. By changing the combination of promoter and UTR, the expression of a single transcription unit can be adjusted to an appropriate strength. The catalytic rate constants of these promoters and UTRs have been well determined[1].

Before characterizing these nine combinations in the laboratory, we first simulated the expression of P70a-UTR1-deGFP in a Cell-Free system.

Figure 5. Simulation vs our data. After refitting the model to our experimental data to redetermine $$k_{cat,p}$$ and $$R_{total}$$, the simulations match the data well. The concentration of plasmid is 5 nM.

Although we used the same Cell-Free system as in Vincent N. et al. (myTXTL kit from Arbor Biosciences), a large deviation between experiments and simulations was observed. Considering that the reference was published in 2019, we attribute this deviation to differences in the concentration of transcriptional and translational machinery in Cell-Free systems (the quality of myTXTL kit may have improved).

Sensitivity analysis showed that the model is most sensitive to $$k_{cat,p}$$ and $$R_{total}$$. So, we redetermined $$k_{cat,p}$$ and $$R_{total}$$ by fitting the model to our experimental data and the data from The all E. coli TXTL Toolbox 2.0[2]. We found that the total concentration of ribosomes and the catalytic rate constant of UTR1 were about twice as high as in the literature.

We also characterized the other eight combinations. We then fit the model to our experimental data and redetermined $$k_{cat,p}$$ for UTR2 and UTR3.

Table 3. Comparison of the best numerical fit values of $$k_{cat,p}$$ and $$R_{total}$$. The fit values were obtained by fitting the model to different data.
Figure 6. Kinetics of nine combinations of promoters and UTRs. The concentration of plasmid is 5 nM.

Two-stage transcriptional activation cascade

This transcriptional activation cascade consists of two single transcription units: P70a-UTR1-σ28, encoding the transcription factor, and P28-UTR1-deGFP, encoding the reporter protein. To explore the performance of the cascade, we made the following assumptions:

1. $$K_{28}=0.74\ nM$$
   The dissociation constant for $$\sigma^{28}$$ binding to core RNAp was obtained from reference[3].
2. $$K_{M,28}=1\ nM$$
   Because the model is insensitive to $$K_{M,28}$$, we set the value of $$K_{M,28}$$ to be the same as $$K_{M,70}$$.
3. $$L_{m_{\sigma^{28}}}=800\ bp$$
   The length of the transcribed gene of $$\sigma^{28}$$ is similar to that of deGFP, so 800 bp was taken for simplicity.
4. $$k_{cat,28}=0.021\ s^{-1}$$
   P70a-UTR1-deGFP reaches its maximal protein synthesis rate at 5 nM. For optimal expression, P70a-UTR1-σ28 and P28-UTR1-deGFP should be set at 0.5 nM and 15 nM, respectively. Because its concentration is low, we assume that P70a-UTR1-σ28 provides sufficient σ28 without depleting the transcriptional and translational machinery. Thus, the maximum protein synthesis rate of P28-UTR1-deGFP saturates at 15 nM, three times the concentration at which P70a-UTR1-deGFP saturates. We therefore take $$k_{cat,28}$$ to be about one-third of $$k_{cat,70}$$:

   $$k_{cat,28}\approx\frac{1}{3}k_{cat,70}$$
5. The value of $$k_{mat,\sigma^{28}}$$ should be significantly larger than $$k_{mat,deGFP}$$.
   $$\sigma^{28}$$ is necessary for flagellar biosynthesis and chemotaxis in many bacteria. A slow maturation rate would prevent efficient regulation of gene expression.

By fitting the model to the data from The all E. coli TXTL Toolbox 2.0, the protein maturation rate constant $$k_{mat,\sigma^{28}}$$ is found to be $$0.16\ s^{-1}$$ (two hundred times faster than that of deGFP), which supports our assumption.
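The two estimates above reduce to short arithmetic. The sketch below assumes one reading of the saturation argument, namely that both promoters saturate the system at the same total transcription flux, so that $$k_{cat,70}\times 5\ nM \approx k_{cat,28}\times 15\ nM$$. The implied $$k_{cat,70}$$ and deGFP maturation constant are back-calculated from the values quoted in this section, not taken from the literature, and the variable names are hypothetical.

```python
import math

# Values quoted in this section.
k_cat_28 = 0.021        # s^-1
sat_P70a_nM = 5.0       # DNA concentration at which P70a-UTR1-deGFP saturates
sat_P28_nM = 15.0       # DNA concentration at which P28-UTR1-deGFP saturates
k_mat_sigma28 = 0.16    # s^-1
fold_faster = 200.0     # sigma28 matures about 200 times faster than deGFP

# Assumption: equal transcription flux at saturation for both promoters.
strength_ratio = sat_P70a_nM / sat_P28_nM       # k_cat,28 / k_cat,70
implied_k_cat_70 = k_cat_28 / strength_ratio    # s^-1

# First-order maturation: the fraction matured after time t is 1 - exp(-k*t),
# so the half-time is ln(2)/k.
k_mat_deGFP = k_mat_sigma28 / fold_faster       # s^-1
half_time_sigma28_s = math.log(2) / k_mat_sigma28
half_time_deGFP_min = math.log(2) / k_mat_deGFP / 60.0

print(f"k_cat,28 / k_cat,70 = {strength_ratio:.2f}")
print(f"implied k_cat,70 = {implied_k_cat_70:.3f} s^-1")
print(f"half-time: sigma28 {half_time_sigma28_s:.1f} s, deGFP {half_time_deGFP_min:.1f} min")
```

Under these assumptions the implied $$k_{cat,70}$$ is about $$0.063\ s^{-1}$$, and the maturation half-time is a few seconds for σ28 compared with roughly 14 minutes for deGFP.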

Table 4. Values of parameters adopted in the simulation of two-stage transcriptional activation cascade.
Figure 7. Simulation vs data.
Assistance in experimental design

A complex gene circuit comprises multiple single transcription units. Each transcription unit interacts with the others to affect the output of the system as a whole, so any change in plasmid concentration will affect the performance of the system. Therefore, it is crucial to determine the optimal concentration of each plasmid for the experiment.

Simulations based on experimental data provide design considerations for experiments. To ensure that our gene circuit functions properly in a Cell-Free system, we need to set each plasmid at an appropriate concentration. The saturation concentration of P28-UTR1-deGFP in the Cell-Free system is 15 nM. In our experiments, we set it to 10 nM so as not to affect the expression of the other plasmids. To achieve optimal deGFP expression, we needed to redetermine the concentration of P70a-UTR1-σ28. By varying the concentration of P70a-UTR1-σ28 in simulation, we found that deGFP expression is maximal when P70a-UTR1-σ28 is at 2 nM. Finally, we characterized this cascade in the laboratory and found that the data matched the simulation well.

Figure 8. Simulation vs data. P70a-UTR1-σ28 was set at 2 nM and P28-UTR1-deGFP was set at 10 nM.
Figure 9. Expression of P28-UTR1-deGFP at different P70a-UTR1-σ28 concentrations. deGFP had the highest expression when P70a-UTR1-σ28 was 2 nM.
Discussion

Our model describes gene expression of a single transcription unit in a Cell-Free system and explores the performance of a two-stage transcriptional activation cascade. By performing sensitivity analysis of the model and fitting the model to experimental data, we are convinced that the maximum rate of protein synthesis in Cell-Free systems is mainly limited by the translation process.

By fitting the model to our experimental data, we redetermined the strengths of UTR1, UTR2 and UTR3. In addition, we provide reference values for the strength of P28 and the maturation rate of σ28. We hope that this model will provide experimental design considerations for future iGEM teams who use Cell-Free systems in their projects.