Overview
The goals of our modeling work are to:
- Build an independent, easy-to-use simulation of every part of our design related to SCFA detection.
- Analyze their function both qualitatively and quantitatively.
- Predict their behavior and response under a range of conditions.
- Use the data produced to improve the design of the wet lab work.
G Protein-Coupled Receptor (GPCR) Module
For the modeling of our FFAR2 GPCR, we started from a previously documented and developed SBML model (Heitzler et al., 2012), which we obtained by consulting the BioModels database (Li et al., 2010). Many modifications were needed, though, because the scarcity of documentation on FFAR2 meant our research could only be narrowed down to receptors that conduct β-arrestin-mediated signaling in general, rather than to the specific FFAR2 type. We also removed various second messengers and signals and fine-tuned the model to better depict our project design.
Figure 1: The FFAR2 GPCR genetic circuit. Since the model contains no protein degradation step, we assume that receptor degradation after cleavage does not occur; we plan to include such reactions in an updated version of the model.
The simulations below indicate that the main effectors of the detection module, β-arrestin and the GPCR, need to be present at an optimal ratio at the protein level for the best signal to be produced. In future iterations, the wet lab team will experiment with different combinations of promoters/RBSs and inducer concentrations to achieve optimal signal production.
Figure 2: Normal function of the GPCR. The yellow curve represents β-arrestin 2 concentration and the red curve the LacI output signal.
Figure 3: The GPCR’s output signal (LacI) as the concentration of SCFAs increases.
Figure 4: The GPCR’s output signal (LacI) as β-arrestin 2 concentration increases. Higher β-arrestin 2 concentrations result in faster convergence of LacI to its final value and oscillations of smaller amplitude.
lac operon: the NOT Gate
To model the NOT gate of our detection system, we employed components of the lac operon, a well-documented paradigm of genetic regulation with bi-stable, switch-like behavior that is also used in the wet lab design. We focused on the core of the negation logic, namely the repressor-operator interaction in the lac operon and the chemical kinetics surrounding it. The reactions, parameters and rate rules of the first version of our model were derived from Stamatakis and Mantzaris (2009).
Degradation reactions:
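Following the first-order degradation assumption listed in Technical Details and the λ constants tabulated below, they take the form sketched here (degradation products are omitted; the exact reaction set is that of Stamatakis and Mantzaris, 2009):

$$
M_R \xrightarrow{\lambda_{M_R}} \varnothing,\qquad
M_Y \xrightarrow{\lambda_{M_Y}} \varnothing,\qquad
R \xrightarrow{\lambda_{R}} \varnothing,\qquad
R_2 \xrightarrow{\lambda_{R_2}} \varnothing,
$$
$$
Y \xrightarrow{\lambda_{Y}} \varnothing,\qquad
YI_{ex} \xrightarrow{\lambda_{YI_{ex}}} \varnothing,\qquad
I_2R_2 \xrightarrow{\lambda_{I_2R_2}} \varnothing .
$$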
Given that [O]T = [O] + [R2O] (the total operator concentration is constant and equal to the free operator concentration plus that bound to the repressor), the mass balances are depicted in the following equations:
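Two representative balances, written with the symbols and rate constants defined in the tables below, illustrate their form (the full set follows Stamatakis and Mantzaris, 2009):

$$
\frac{d[M_R]}{dt} = k_{sM_R} - \lambda_{M_R}[M_R],
\qquad
\frac{d[R]}{dt} = k_{sR}[M_R] - 2k_{2R}[R]^2 + 2k_{-2R}[R_2] - \lambda_{R}[R],
$$

with the repressor-bound operator eliminated through the conservation relation $[R_2O] = [O]_T - [O]$.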
| Symbol | Species Denoted |
| --- | --- |
| MR | LacI repressor mRNA |
| R | LacI repressor monomer |
| R2 | LacI repressor dimer |
| O | lacO operator |
| R2O | Repressor-operator complex |
| I | Intracellular inducer IPTG |
| Iex | Extracellular inducer IPTG |
| I2R2 | Repressor-inducer complex |
| MY | LacY permease mRNA |
| Y | LacY permease |
| YIex | Permease-inducer complex |
The parameters that appear in the reaction model:
| Symbol | Value | Units | Description |
| --- | --- | --- | --- |
| [O]T | ~2.8 | nM | total operator concentration |
| ksMR | 0.23 | nM·min⁻¹ | lacI transcription rate |
| ksR | 15 | min⁻¹ | LacI monomer translation rate constant |
| k2R | 50 | nM⁻¹·min⁻¹ | LacI dimerization rate constant |
| k-2R | 10⁻³ | min⁻¹ | LacI dimer dissociation rate constant |
| kr | 960 | nM⁻¹·min⁻¹ | association rate constant for repression |
| k-r | 2.4 | min⁻¹ | dissociation rate constant for repression |
| kdr1 | 3×10⁻⁷ | nM⁻²·min⁻¹ | association rate constant for 1st derepression |
| k-dr1 | 12 | min⁻¹ | dissociation rate constant for 1st derepression |
| kdr2 | 3×10⁻⁷ | nM⁻²·min⁻¹ | association rate constant for 2nd derepression |
| k-dr2 | 4.8×10³ | nM⁻¹·min⁻¹ | dissociation rate constant for 2nd derepression |
| ks1MY | 0.5 | min⁻¹ | lacY transcription rate constant |
| ksOMY | 0.01 | min⁻¹ | leak lacY transcription rate constant |
| ksY | 30 | min⁻¹ | lacY translation rate constant |
| kp | 0.12 | nM⁻¹·min⁻¹ | LacY-IPTGex association rate constant |
| k-p | 0.1 | min⁻¹ | LacY-IPTGex dissociation rate constant |
| kft | 6×10⁴ | min⁻¹ | IPTG facilitated-transport rate constant |
| kt | 0.92 | min⁻¹ | IPTG passive-diffusion rate constant |
| λMR | 0.462 | min⁻¹ | lacI mRNA degradation constant |
| λMY | 0.462 | min⁻¹ | lacY mRNA degradation constant |
| λR | 0.2 | min⁻¹ | repressor monomer degradation constant |
| λR2 | 0.2 | min⁻¹ | repressor dimer degradation constant |
| λY | 0.2 | min⁻¹ | LacY degradation constant |
| λYIex | 0.2 | min⁻¹ | LacY-inducer complex degradation constant |
| λI2R2 | 0.2 | min⁻¹ | repressor-inducer complex degradation constant |
Figure 5: The standard lac operon genetic circuit.
To create a detectable signal, we used GFP (Green Fluorescent Protein) in our simulations. Specifically, we replaced the model’s LacY output: instead of producing LacY and activating the IPTG positive feedback loop, our engineered system expresses GFP. We therefore needed to define a standard input and a standard output for this module, and the solution was to “open” the lac operon genetic circuit by cutting off the feedback loop the permease is responsible for.
Nevertheless, explicit literature on the mathematical modeling of GFP expression kinetics proved difficult to find, mainly because most of the literature we encountered features GFP as a reporter module in case studies of cells of a particular type and provides, at best, experimental expression/fluorescence data fitted with various methods, rather than deterministic kinetic equations for its expression.
Lastly, we set LacI at the protein level as the system’s input, ignoring only the mRNA_LacI species, since LacI arrives as an input from the preceding in-line detection module rather than being expressed here, and we updated our equations with all of the above changes to ensure they still follow mass balance principles. For a comprehensive design of the Detection System, see our Engineering page.
The diagrams below show the concentrations of the various species during model simulations, with and without induction by IPTG, the system’s fail-safe inducer.
Figure 6: The lac operon modified, “open” genetic circuit.
Figure 7.1: Comments on figure 7.2.
Figure 7.2: The response of the system when it is not induced. In both cases the extracellular IPTG is equal to zero; the difference is that in the first case LacI is present and in the second it is absent. As a result of non-induction, the final quantity of GFP is very close to zero, meaning that no reporting sequence is triggered. The repression, on the other hand, takes place regardless, through the stray repressor and operator quantities present in the model.
Figure 8.1: Comments on figure 8.2.
Figure 8.2: The response of the system when it is induced. There is a non-zero initial quantity of extracellular IPTG in both cases. In the first case LacI is absent, while in the second it is present. With a little tuning of the initial extracellular IPTG value, we can see that when initial LacI is equal to zero, LacO (the operator) is temporarily repressed by stray quantities of LacI dimer and then produced normally, whereas when initial LacI is non-zero, LacO is repressed to a negligible value. This encapsulates the NOT gate logic. Of course, in contrast to the previous diagrams, here we observe a very large production of GFP.
Reporter Module
At first, in order to understand how our reporter module behaves in modeling terms, we went through a vast literature dealing with the electronic interrogation of biological systems that perform redox signaling. Redox signaling is signal transduction that involves redox reactions, and consequently changes in a solution’s charge, and depends strictly on the kinetics and thermodynamics of electron movement. In our research we also encountered systems that, interestingly enough, perform redox cycling, that is, a reduction and an oxidation reaction occurring back and forth in a periodic manner, a behavior that describes systems such as genetic oscillators. So, one intriguing question we had to answer was: “Does our reporter perform redox cycling?”. It turned out, after some discussion with the wet lab, that in our system there are no oscillatory redox phenomena: the production of L-DOPA quinone from L-DOPA is a one-way reaction (oxidation), and the release of electrons is constant, producing a steady signal. Note that this realization was the fruit of a better understanding of the system’s inherent characteristics, rather than an assumption.
That being said, the final version of our lac operon module has to produce tyrosine to activate the reporting sequence, so GFP needs to be replaced with tyrosine. We need to be cautious not to lose any information in this “replacement” step, though: GFP is an imaging, semi-quantitative indicator of a biological process. Its expression is not linearly related to the process it reports, but is an approximate gauge of how much of it is happening. Its expression also introduces a lot of noise; across the literature it is modeled only with the help of stochastic methods and algorithms, and the only way to extract a kinetic formula for it is by fitting experimental data. The questions that arise are: “Does tyrosine behave the same way?” and “What are the differences in tyrosine’s expression kinetics?”. In our modeling, all of the lac operon’s expression kinetics are deterministic equations based on mass balance principles, and we make no exception for tyrosine or GFP. In fact, our initial equations are more likely to predict tyrosine’s kinetics accurately (tyrosine production is usually assumed to follow formulas such as Michaelis-Menten) than GFP’s vague, random expression. Adopting our initial equations for tyrosine therefore rests on an assumption that is in no way far-fetched, and results in more precise modeling overall.
Figure 9: The Reporter Module genetic circuit.
To simulate the reporter module’s first-order reactions deterministically, we used Mass Action kinetics, one of the most general-purpose, versatile options SimBiology provides for a reaction. We also tuned the tyrosine and tyrosinase initial values experimentally so that the results appear within a specific time frame (approx. 12 minutes). As expected, the concentration curves change very smoothly and converge with certainty to their final values, while L-DOPA quinone, our final product, whose concentration quantifies the final signal, reaches a saturation level that can be translated into a steady, constant signal. An interesting observation is that with the ode45 solver we get curves that are more angular and less precise, and the simulation runs for longer, whereas with ode15s the results are smoother, more refined and obtained more quickly, meaning that a solver made for stiff problems works better for this module. For more on solvers, see the Technical Details section.
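To illustrate the comparison, the following minimal sketch (not our actual SimBiology project; the two-step chain, rate constants and initial values are hypothetical placeholders standing in for the reporter reactions) solves the same mass-action ODEs with both solvers and reports their run times:

```matlab
% Minimal sketch: a two-step mass-action chain standing in for the reporter
% reactions, tyrosine -> L-DOPA -> L-DOPA quinone. The rate constants k1, k2
% and the initial values are hypothetical placeholders.
k1 = 1.2;  k2 = 0.8;                         % assumed first-order rate constants (1/min)
rhs = @(t, y) [ -k1*y(1);                    % d[tyrosine]/dt
                 k1*y(1) - k2*y(2);          % d[L-DOPA]/dt
                 k2*y(2) ];                  % d[L-DOPA quinone]/dt
y0    = [50; 0; 0];                          % assumed initial concentrations (nM)
tspan = [0 12];                              % minutes, matching the time frame above

tic; [tA, yA] = ode45(rhs, tspan, y0);  timeNonstiff = toc;   % non-stiff solver
tic; [tB, yB] = ode15s(rhs, tspan, y0); timeStiff    = toc;   % stiff solver

plot(tB, yB(:,3), '-', tA, yA(:,3), '--');
legend('L-DOPA quinone (ode15s)', 'L-DOPA quinone (ode45)');
xlabel('time (min)'); ylabel('concentration (nM)');
fprintf('ode45: %.3f s, ode15s: %.3f s\n', timeNonstiff, timeStiff);
```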
For the stochastic simulation, after some searching we selected the implicit tau solver as the best fit for this case (we will not expand on the implicit tau-leaping method, as it would most likely take our analysis off topic, into concepts that are not easily digestible). Using the same set of initial values as before, the curves appear to follow roughly the same paths, but in a more linear, steeper fashion. In the L-DOPA quinone concentration specifically, we observe a spike-like rise somewhere between 2 and 8 minutes in most runs. Note that the moment this rise appears, as well as its slope, vary from run to run of the simulation. Also, in a sizable fraction of runs the concentration of L-DOPA quinone remains at zero, exhibiting the effects of intrinsic noise and randomness: noise is always present, but those are the runs in which it is observed at full strength.
Figure 11: The results of stochastic simulation with the implicit tau solver, for 25 runs. Paths that appear more intense and bold are more frequently followed.
Having solved the problem in two different ways, we can now compare them. We are confident that such a comparison will at least provide a helpful tool for better understanding the problem, if not the ideal solution. The deterministic solution guarantees smooth, predictable curves. They can easily be discretized, quantized down to measurement points at given moments, and then recreated with little or no loss of information using mathematical methods (interpolation, regression etc.). This makes for an effortlessly processable final signal. Nevertheless, it is not as realistic as it should be, considering it treats the problem as a textbook one, isolated from any source of external noise or disturbance, a factor that is well integrated in the stochastic solution. Stochasticity depicts fairly the effects of randomness in the genetic circuit. In spite of that, it does not grant us high-quality, precise data: when a circuit’s components are few in number and the circuit is relatively small and simple, the possible routes the curve can follow on the 2D plane (or, formally, the potential states of the system) are drastically reduced, and so is the noise, making a stochastic simulation seem inaccurate and pointless. That happens because the factors contributing to any divergence from a standard curve are very few, just like the circuit’s components. This proportionality between the number of components and the “realism” of noise discourages us from using a purely stochastic solution, in the same way that the lack of random phenomena or noise prevents us from applying a genuinely deterministic one. If forced to choose between the two, we would opt for determinism, for its quality of data, consistency and ease of processing. We regard both methods as suitable or unsuitable for different reasons, though, which is why the ideal approach, offering optimal data quality, would be a hybrid one combining the two methods. This remark certainly inspires a future version of our modeling of the Reporter Module.
Signal Processing: A Noise Study
Thermal Noise: Also known as Johnson-Nyquist noise, this is the electronic noise caused by the thermal motion of electrons in an electrical conductor, and it is present in electronic circuits regardless of any applied voltage.
Biological Noise: Our main field of study. Biological noise is the unwanted divergence in a measurement, behavior or process at the gene level, due to the instability and randomness of protein transcription and translation. In the work of Swain, Elowitz & Siggia (2002), a clear distinction is made between intrinsic and extrinsic noise: intrinsic noise is the inherent stochasticity of transcription and translation, while extrinsic noise covers the variations caused by fluctuations in the amounts or states of other cellular components.
Roundoff Noise: A purely computational term. Roundoff noise is the noise produced by quantization errors within a filter when it is applied to a signal. Simply put, it is the information lost because of tolerances in the design of a digital filter; after all, nothing is 100% accurate in computer simulations. We explain the definition and use of filters later in this section.
This study primarily focuses on biological noise, and especially on intrinsic biological noise. Even though intrinsic noise is normally hard to distinguish from extrinsic noise, our model only simulates isolated processes of gene expression, so the only simulation noise observed will be intrinsic. We will also very briefly concern ourselves with roundoff noise.
The next question that arises is “How and why do we process a signal?”. There are, of course, numerous techniques, but if we had to pick one starting point it would definitely be Fourier analysis. Fourier analysis is the representation and analysis of a signal with Fourier series, a mathematical way to describe a signal that is initially a function of time, s(t), in terms of the signal’s spectrum. Or, formally:
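In standard notation (treating the signal as periodic with period P, and writing S[k] for the coefficients discussed below), the series takes the form:

$$ s(t) \;=\; \sum_{k=-\infty}^{\infty} S[k]\, e^{\,i 2\pi k t / P} $$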
Basically, it uses separate periodic signals, sinusoidal ones in fact, in the form of complex exponentials e^(i2πkt/P), where the integer k indexes the harmonics of the fundamental period P, multiplies each one by a different coefficient, and adds them up to make a new, approximate mathematical representation of the signal: a linear combination of periodic functions. But what is the spectrum quantity, or, in other words, the set of coefficients S[k] of that linear combination, and what does it represent?
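In its standard form, written as a function of continuous frequency f:

$$ S(f) \;=\; \int_{-\infty}^{\infty} s(t)\, e^{-i 2\pi f t}\, dt $$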
This is the Continuous Time Fourier Transform (CTFT), also known as the spectrum of a signal. Complex exponentials have by nature the ability to describe periodically changing quantities with ease, since they are periodic themselves, and that is one good intuitive reason to justify their presence in the formula. Each individual complex exponential is a periodic signal that oscillates in time with a certain frequency; as a piece of information, it represents a particular frequency that can be observed in the initial signal (the initial signal hardly ever “contains”, or better yet ranges over, just one frequency). An infinite summation (i.e. integration) over all these elementary “coefficient” signals gives us the initial signal’s spectrum, a new signal that represents the initial one as a function of frequency rather than time. With this algebraic manoeuvre we officially leave the time domain and enter the frequency domain. Of course, the process is completely invertible, so we can retrieve the signal’s time-domain representation at will.
Having distinguished “continuous” from “discrete”, we should notice that the quantity shown above is purely continuous with respect to time, being a weighted integral of complex exponentials. As a rule of thumb, quantities that contain a sum are discrete sequences, while ones that integrate over a variable are continuous. Bearing this informal yet on-point observation in mind, one might say that the Fourier series formula is discrete, but that is not the case, since it is an infinite sum. Hence, both representations are continuous-time representations, something we could not claim for them when examined with respect to frequency: as noted before, the initial signal is assumed to be periodic, so its frequency content is sampled (at the harmonics indexed by k). Generally, each method is used in correspondence with the data we process; the CTFT, for example, is applied only to continuous-time data. To move on, we will need a more computationally realistic representation, one that is discrete not only with respect to frequency but also with respect to time, and that would be the Discrete Fourier Transform (DFT):
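In its standard form, for a length-N sequence s[n]:

$$ S[k] \;=\; \sum_{n=0}^{N-1} s[n]\, e^{-i 2\pi k n / N}, \qquad k = 0, 1, \dots, N-1 $$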
Admittedly, it differs from the previous representations only in small yet crucial details. In order to compute it efficiently, we used the Fast Fourier Transform (FFT). The FFT, as its name suggests, is an algorithm designed to compute the DFT in a much shorter execution time; it is the most popular way to perform this computation programmatically and is often cited as one of the most important algorithms developed in the 20th century.
As the DFT expression indicates, the calculations it performs are linear and polynomial: it is a summation of different components multiplied by specific coefficients. Without immersing our analysis in linear algebra, we will pay brief homage to the Fast Fourier Transform simply by emphasizing that its ingenuity lies in the way it represents these components as polynomials in vector notation and optimizes the multiplications and additions by breaking the two operations into smaller and smaller ones, a technique widely known as “divide and conquer”; there are many possible variations and implementations of this algorithm. In terms of algorithm analysis and complexity, the FFT performs O(N·log₂N) operations, whereas the conventional DFT takes O(N²). Roughly, this means that to compute the DFT on N points, the number of algebraic operations needed by the FFT is proportional to N·log₂N, orders of magnitude lower than the N² operations the direct DFT would require.
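As a small illustration of that gap (not part of our analysis pipeline; the test signal is an arbitrary placeholder), one can time a direct evaluation of the DFT sum against MATLAB’s built-in fft:

```matlab
% Illustrative sketch of the complexity gap between a naive DFT and the FFT.
N = 4096;
s = randn(N, 1);                                    % arbitrary test signal
n = (0:N-1).';

tic;
S_naive = zeros(N, 1);
for k = 0:N-1
    S_naive(k+1) = sum(s .* exp(-1i*2*pi*k*n/N));   % direct evaluation of the DFT sum
end
timeNaive = toc;

tic; S_fft = fft(s); timeFft = toc;                 % FFT-based computation

fprintf('naive DFT: %.3f s, fft: %.5f s, max |difference|: %.2e\n', ...
        timeNaive, timeFft, max(abs(S_naive - S_fft)));
```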
As we have already mentioned, the simulation data we process are categorized not only as continuous or discrete, but also as deterministic or stochastic. Strictly speaking, the methods examined above apply to deterministic signals, so, on paper, we had to follow another route: the autocorrelation function:
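In its standard form, for a stochastic process {Xt}:

$$ R_{XX}(t_1, t_2) \;=\; \mathrm{E}\!\left[\, X_{t_1}\, \overline{X_{t_2}} \,\right] $$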
In this case, our information signal is represented as a stochastic process {Xt}, in which the bar notation represents complex conjugation and E[·] is the expected value. The autocorrelation function measures the statistical correlation between values of a given stochastic process at two points in time, meaning that it quantifies the extent to which two chronologically neighboring values of that process are codependent and statistically affect one another. Intuitively, it is a representation that helps us understand how abruptly the signal of interest changes, and a crude measure of how random it is. Beyond providing a time-domain representation of the signal’s information, its real contribution to our analysis is that it enables us to extract a formal frequency-domain representation through the Power Spectral Density:
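In its standard (Wiener-Khinchin) form, assuming a wide-sense stationary process so that the autocorrelation depends only on the lag τ:

$$ S_{XX}(f) \;=\; \int_{-\infty}^{\infty} R_{XX}(\tau)\, e^{-i 2\pi f \tau}\, d\tau $$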
This is a common tactic worth mentioning: it takes the place of the Fourier transform and covers the need for a tool that lets us study signals in the frequency domain in many disciplines dealing with stochastic phenomena. Theoretically, this was the way to go, but in computational terms things are a little different. In order to write a MATLAB script that calculates the autocorrelation function, we had to use a function which, at a low level, uses the Fourier transform (the FFT algorithm) to compute the autocorrelation in the frequency domain and then converts back to the time domain using an inverse Fourier transform, as we learned from the function’s documentation (see “Autocorrelation function in the MATLAB language” in the References). Consequently, there was no point in implementing our code this way, since we gained nothing in performance or quality of the solutions. That said, it is easy to see why we chose to use the FFT directly from the beginning instead of taking the other route; to be fair, this method is certainly not unusual in applications that address non-deterministic signal processing, such as speech and sound processing.
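The following sketch illustrates that equivalence (it assumes the Signal Processing Toolbox for xcorr; the test signal is an arbitrary placeholder): the biased autocorrelation computed directly and the one computed through the FFT agree to numerical precision.

```matlab
% Sketch of the equivalence between a direct autocorrelation estimate and the
% FFT-based route (discrete Wiener-Khinchin relation).
N = 1024;
x = randn(N, 1);

% (a) biased autocorrelation estimate computed directly
acfA = xcorr(x, 'biased');              % lags -(N-1) ... N-1
acfA = acfA(N:end);                     % keep non-negative lags 0 ... N-1

% (b) the same quantity through the frequency domain
X    = fft(x, 2*N - 1);                 % zero-pad to avoid circular wrap-around
acfB = ifft(abs(X).^2) / N;             % inverse transform of the power spectrum
acfB = real(acfB(1:N));                 % non-negative lags

fprintf('max |difference| between the two routes: %.2e\n', max(abs(acfA - acfB)));
```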
The application of a filter in the frequency domain (known as filtering) can be as simple as a multiplication of two signal functions. As a curve, the filter is equal to zero in the frequency bands where the noise is present, and non-zero in the bands that carry the actual information. This multiplication represents the process during which the signal passes through the filter. A full explanation of this “passage” would also carry a heavy intuitive load and a time-domain perspective, but we describe the process only in the frequency domain for the sake of convenience, given that the time domain makes things far more complicated and requires the introduction of concepts that are unnecessary for the reader.
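As a toy illustration of this multiplication (the signal, the noise level and the cutoff frequency below are hypothetical placeholders, not values from our pipeline):

```matlab
% Toy illustration of filtering as a multiplication in the frequency domain.
fs   = 100;                                 % sampling rate (Hz)
t    = (0:1/fs:10-1/fs).';
x    = 1 ./ (1 + exp(-(t - 5)));            % slow, sigmoid-like "information" signal
xn   = x + 0.1*randn(size(x));              % the same signal corrupted by white noise

N    = numel(xn);
f    = (0:N-1).' * fs / N;                  % frequency of each DFT bin
fcut = 1;                                   % assumed cutoff frequency (Hz)
H    = double(f <= fcut | f >= fs - fcut);  % ideal low-pass mask (keep the mirrored bins too)

xf   = real(ifft(fft(xn) .* H));            % multiply the spectra, return to the time domain

plot(t, xn, ':', t, xf, '-');
legend('noisy signal', 'filtered signal'); xlabel('time (s)');
```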
To find the optimal technique for our analysis, we once again carried out a literature search and found an extremely wide range of filtering applications on biological signals for a variety of purposes. We observed that, for attenuating intrinsic noise created by a stochastic simulation, high-end techniques mostly use non-linear filtering (i.e. filters whose outputs are not linear functions of their inputs). As for the way they cut off the undesirable frequencies, these techniques feature low-pass (cutting off high frequencies), band-pass (cutting off all frequencies except a specific band) and band-stop (cutting off a specific band) filters. The vast majority of the methods we encountered dictated the application of a low-pass filter, which in some special cases comes with the risk of non-negligible signal distortion on account of accuracy issues. Particularly popular filtering techniques are Bessel filters (Chung & Kennedy, 1991; Bierwirth & Goellner, 2007), Kalman filters (Andrews, Yi & Iglesias, 2006) and Butterworth filters used for E. coli chemotaxis optimization (Chung & Kennedy, 1991). Honorable mentions include Gaussian and Laplacian filters used for image analysis (Walker, Nghe & Tans, 2016; Bokde, 2015) and Least Mean Squares filters used for cancelling noise in ECG signals (Bokde, 2015).
After considering all the possible options and consulting Oppenheim & Schafer (2010), we came to the conclusion that a very good solution to the noise problem is to design and use an Infinite Impulse Response (IIR) low-pass Butterworth filter, since it is a type of filter that combines simplicity, popularity and a relatively small loss of information in various applications (general-purpose or not). And so we did, by constructing the following data pipeline:
SimBiology offers us the ability to run multiple simulations, store all the results into datasheets, extract them to an .xlsx file, and save them there as independent and static data. With this in mind, we developed a MATLAB script that performs an automated analysis on them. Firstly, it parses the data, meaning it “reads” them and stores them into data structures the MATLAB language can manipulate. Then, it stores the average of the reporter module’s output values of all the runs for each specific moment (time value) in the simulation - averaging is a useful tool here, considering we are talking about a stochastic simulation. Finally, it transforms the data to frequency-domain data, applies the Butterworth filter as we previously described, converts them back to their time-domain representation and plots them accordingly, leaving us with smoother, more regular curves.
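A condensed sketch of what such a script might look like is shown below; the file name, sheet layout, filter order and cutoff frequency are assumptions for illustration, not the exact values of our implementation, and the filter-design functions assume the Signal Processing Toolbox.

```matlab
% Condensed sketch of the analysis pipeline described above.
runs = readmatrix('reporter_runs.xlsx');     % assumed layout: column 1 = time, columns 2..end = runs
t    = runs(:, 1);
y    = mean(runs(:, 2:end), 2);              % average the stochastic runs at each time point

fs   = 1 / mean(diff(t));                    % sampling rate implied by the time grid
fc   = 0.05 * (fs/2);                        % assumed low-pass cutoff (fraction of Nyquist)
[b, a] = butter(4, fc/(fs/2), 'low');        % 4th-order IIR Butterworth design

N    = numel(y);
H    = freqz(b, a, N, 'whole');              % filter response at the N DFT frequencies
yf   = real(ifft(fft(y) .* H));              % apply the filter as a frequency-domain product
% (an alternative is filtfilt(b, a, y), which applies the filter in the time domain with zero phase)

plot(t, y, ':', t, yf, '-');
legend('averaged stochastic output', 'filtered output'); xlabel('time (min)');
```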
There is one important thing to observe here: as mentioned in the Reporter Module section, due to the module’s simplicity there are few possible sources contributing to noise, and therefore a stochastic model gives us really steep, switch-like curves that stay at zero until they ascend linearly to a final value at a significant rate. This results in many samples of the curves being equal or close to zero, which strongly affects the averaging process and pulls the whole outcome towards zero. This supports the point made in the previous section about complexity in the reporter module, and confirms that a purely stochastic simulation of the reporter module takes a toll on accuracy. Please also note that this lack of accuracy is genuinely a consequence of the stochastic simulation and not of the noise-reduction process we designed, which makes the latter a tool that can be applied independently. Any magnitude distortion observed is a consequence of the filtering process but, again, may be negligible for a more complex reaction network. Lastly, the seemingly irrational fluctuations of the curve at the beginning and the end of the measurements, which also become negative in particular intervals, are most probably an overshoot caused by the peculiar manner in which the Fourier series behaves at jump discontinuities, widely known as the Gibbs phenomenon. Of course, it is reduced as we manually increase the accuracy of the FFT, at the cost of increased execution time.
Technical Details
| Scope | Assumption |
| --- | --- |
| lac operon | Repressor dimer binds to operator with 1-1 stoichiometry |
| lac operon | Inducer binding reaction with repressor vs repressor-operator: cooperativity is equal to 2 |
| lac operon | Operator-repressor complex does not degrade |
| lac operon | IPTG is non-metabolizable; it does not degrade |
| lac operon | Repressor does not degrade when bound to the operator |
| lac operon | All degradation reactions follow first-order kinetics |
| lac operon | The total DNA and operator concentrations are constant; the cell’s volume is constant; cell division and growth are neglected |
| lac operon | Extracellular IPTG concentration is constant |
| lac operon | All the species follow Mass-Action kinetics |
| Reporter Module | The charge present is directly proportional to L-DOPA concentration |
| Noise Study | The intrinsic biological noise is white and Gaussian |
To solve these problems, we first had to opt for the right simulation platform, and there are certainly numerous tools to choose from when it comes to genetic circuit simulation (Appleton et al., 2017; MacDonald et al., 2011; Marchisio & Stelling, 2009). Our main criteria were: level of abstraction, language, usability, compatibility and popularity. We chose MATLAB’s SimBiology tool: its level of abstraction is on the genetic circuit scale (a tool working on the DNA sequence scale would be too “low-level”, and a tool unable to simulate on the genetic scale would be too “high-level”), the programming language it uses is the well-known MATLAB language, it is very user-friendly, it is compatible with many document types (SBML files being a good example) and, finally, it is very popular for synthetic biology applications, with a big community supporting it. Lastly, it gives us the option of either creating the model and the simulation programs programmatically with MATLAB scripts or using its Graphical User Interface; we used each option as we saw fit.
Now, in the numerical methods domain, the capriciousness of the biological systems we mentioned before manifests itself as stiffness. An ODE problem is stiff when the equation(s) describing it are very sensitive to specific terms they include, which results in large, unpredictable variations between nearby solutions (solutions found at iterations close to one another in time). This, in turn, can make the error skyrocket, so the method of interest needs to adjust (reduce, essentially) its step to follow the rapid changes in the dynamically improving solutions. Stiffness has a major impact on a method’s stability: many methods are ruled out as inappropriate when a stiff problem is at hand, because they fail to converge when solving one, or fail to do so efficiently. The methods that are appropriate take many small steps rather than a few big ones, so stiffness is mainly an efficiency issue, considering that we can solve such problems, but at the cost of a significant rise in simulation time.
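A textbook illustration (not taken from our model) is the equation

$$ \frac{dy}{dt} = -1000\,\bigl(y - \cos t\bigr), $$

whose solutions relax toward cos t on a time scale of about 10⁻³, while the component of interest varies on a time scale of order 1; an explicit, non-stiff solver is forced to keep its step at the fast scale even after the fast transient has died out.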
Choosing a numerical solving method in our case means choosing a solver. When we started building a deterministic model of the lac operon using a set of ODEs, we used the standard ode45 (Dormand-Prince) solver, which under the hood uses an explicit Runge-Kutta method and is pretty much everyone’s first choice for solving an ODE problem (Runge-Kutta is a family of iterative methods that numerically solve ODE problems, whose “RK4” member is usually labeled “the classic Runge-Kutta method” or simply “the Runge-Kutta method”). Nevertheless, we observed the enormous simulation times that choice brought, soon suspected the stiffness of our equations, and considered experimenting with other solvers. The ones that stood out as the wisest options were ode23s and ode15s; the “s” in their names stands for “stiff”, meaning they specialize in solving such problems. Dr. Stamatakis later confirmed our switch of solvers, as well as our suspicion about the nature of the problem. Although there is no easy choice between the two, we stuck with ode15s, sacrificing a little speed for some accuracy.
The species-oriented philosophy of our model, as well as the useful advice of Dr. Stamatakis, led us to choose Mass Action kinetics to depict its dynamics. Mass Action kinetics is popular for transcription units whose proteins (monomers), once synthesized, dimerize, and it can provide a model receptive to assumptions that simplify it without distorting the information we extract. Mass Action, based on the homonymous law, introduces a set of ODEs, each associated with one species; the system is then represented compactly with a stoichiometric matrix and solved numerically. Certain enzymatic reactions were modeled with Michaelis-Menten kinetics, which is usually employed when an enzyme converts a substrate into a product. In these cases, a Mass Action treatment would require the solution of a full system of ODEs, something that is not necessary with Michaelis-Menten, since it allows us to make simplifications: when the binding of the enzyme to the substrate is much faster than both the dissociation of the enzyme from the substrate and the conversion of the complex into the product with release of free enzyme, the enzyme-substrate complex changes so slowly over time that it can be treated as if it were at equilibrium (Marchisio M. A., 2018).
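Under that assumption, the familiar Michaelis-Menten rate law follows, with [S] the substrate concentration, [E]_T the total enzyme concentration and K_M the Michaelis constant:

$$ v \;=\; \frac{d[P]}{dt} \;=\; \frac{V_{\max}\,[S]}{K_M + [S]}, \qquad V_{\max} = k_{cat}\,[E]_T . $$

Under the rapid-equilibrium assumption described above, K_M reduces to the dissociation constant of the enzyme-substrate complex.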
References
- Amalthea: A modular platform for monitoring gastrointestinal health. (2020). iGEM 2020, Team Thessaly. https://2020.igem.org/Team:Thessaly/Model
- Andrews, B. W., Yi, T. M., & Iglesias, P. A. (2006). Optimal noise filtering in the chemotactic response of Escherichia coli. PLoS Computational Biology, 2(11), 1407–1418. https://doi.org/10.1371/journal.pcbi.0020154
- Appleton, E., Madsen, C., Roehner, N., & Densmore, D. (2017). Design automation in synthetic biology. Cold Spring Harbor Perspectives in Biology, 9(4). https://doi.org/10.1101/cshperspect.a023978
- Bartley, B. A., Kim, K., Medley, J. K., & Sauro, H. M. (2017). Synthetic Biology: Engineering Living Systems from Biophysical Principles. Biophysical Journal, 112(6), 1050–1058. https://doi.org/10.1016/j.bpj.2017.02.013
- Bierwirth, M., & Goellner, J. (2007). Modeling of electrochemical noise transients. Materials and Corrosion, 58(12), 992–996. https://doi.org/10.1002/maco.200704093
- Bokde, P. R. (2015). Implementation of adaptive filtering algorithms for removal of noise from ECG signal. 6(1), 51–56.
- Cabrita, L. D., Gilis, D., Robertson, A. L., Dehouck, Y., Rooman, M., & Bottomley, S. P. (2007). Enhancing the stability and solubility of TEV protease using in silico design. Protein Science, 16(11), 2360–2367. https://doi.org/10.1110/ps.072822507
- Chung, S. H., & Kennedy, R. A. (1991). Forward-backward non-linear filtering technique for extracting small biological signals from noise. Journal of Neuroscience Methods, 40(1), 71–86. https://doi.org/10.1016/0165-0270(91)90118-J
- Dogra, S., Sona, C., Kumar, A., & Yadav, P. N. (2016). Tango assay for ligand-induced GPCR-β-arrestin2 interaction: Application in drug discovery. In Methods in Cell Biology (Vol. 132). Elsevier Ltd. https://doi.org/10.1016/bs.mcb.2015.11.001
- Gomez-Uribe, C., Verghese, G. C., & Mirny, L. A. (2007). Operating regimes of signaling cycles: Statics, dynamics, and noise filtering. PLoS Computational Biology, 3(12), 2487–2497. https://doi.org/10.1371/journal.pcbi.0030246
- Heitzler, D., Durand, G., Gallay, N., Rizk, A., Ahn, S., Kim, J., Violin, J. D., Dupuy, L., Gauthier, C., Piketty, V., Crépieux, P., Poupon, A., Clément, F., Fages, F., Lefkowitz, R. J., & Reiter, E. (2012). Competing G protein-coupled receptor kinases balance G protein and β-arrestin signaling. Molecular Systems Biology, 8(590). https://doi.org/10.1038/msb.2012.22
- Hsiao, V., Swaminathan, A., & Murray, R. M. (2018). Control theory for synthetic biology: Recent advances in system characterization, control design, and controller implementation for synthetic biology. IEEE Control Systems Magazine, 38(3), 32–62. https://doi.org/10.1109/MCS.2018.2810459
- Ji, Z., Yan, K., Li, W., Hu, H., & Zhu, X. (2017). Mathematical and computational modeling in complex biological systems. BioMed Research International, 2017. https://doi.org/10.1155/2017/5958321
- Li, C., Donizelli, M., Rodriguez, N., Dharuri, H., Endler, L., Chelliah, V., Li, L., He, E., Henry, A., Stefan, M. I., Snoep, J. L., Hucka, M., Le Novère, N., & Laibe, C. (2010). BioModels Database: An enhanced, curated and annotated resource for published quantitative kinetic models. BMC Systems Biology, 4, 92. https://doi.org/10.1186/1752-0509-4-92
- MacDonald, J. T., Barnes, C., Kitney, R. I., Freemont, P. S., & Stan, G. B. V. (2011). Computational design approaches and tools for synthetic biology. Integrative Biology, 3(2), 97–108. https://doi.org/10.1039/c0ib00077a
- Marchisio, M. A., & Stelling, J. (2009). Computational design tools for synthetic biology. Current Opinion in Biotechnology, 20(4), 479–485. https://doi.org/10.1016/j.copbio.2009.08.007
- Marchisio, M. A. (2018). Introduction to Synthetic Biology (Learning Materials in Biosciences). Springer. http://www.springer.com/series/15430
- Nam, H., Hwang, B. J., Choi, D. Y., Shin, S., & Choi, M. (2020). Tobacco etch virus (TEV) protease with multiple mutations to improve solubility and reduce self-cleavage exhibits enhanced enzymatic activity. FEBS Open Bio, 10(4), 619–626. https://doi.org/10.1002/2211-5463.12828
- O'Connor, C. M., & Adams, J. U. (2010). Essentials of Cell Biology. Cambridge, MA: NPG Education.
- Oppenheim, A. V., & Schafer, R. W. (2010). Discrete-Time Signal Processing (3rd ed.). Pearson.
- Stamatakis, M., & Mantzaris, N. V. (2009). Comparison of deterministic and stochastic models of the lac operon genetic network. Biophysical Journal, 96(3), 887–906. https://doi.org/10.1016/j.bpj.2008.10.028
- Swain, P. S., Elowitz, M. B., & Siggia, E. D. (2002). Intrinsic and extrinsic contributions to stochasticity in gene expression. Proceedings of the National Academy of Sciences of the United States of America, 99(20), 12795–12800. https://doi.org/10.1073/pnas.162041399
- Vanarsdale, E., Hörnström, D., Sjöberg, G., Järbur, I., Pitzer, J., Payne, G. F., Van Maris, A. J. A., & Bentley, W. E. (2020). A coculture based tyrosine-tyrosinase electrochemical gene circuit for connecting cellular communication with electronic networks. ACS Synthetic Biology, 9(5), 1117–1128. https://doi.org/10.1021/acssynbio.9b00469
- Walker, N., Nghe, P., & Tans, S. J. (2016). Generation and filtering of gene expression noise by the bacterial cell cycle. BMC Biology, 14(1), 1–10. https://doi.org/10.1186/s12915-016-0231-z
- Autocorrelation function in the MATLAB language. MathWorks Documentation.
- Solvers in SimBiology. MathWorks Documentation.