Difference between revisions of "Team:Warwick/Model"

Line 326: Line 326:
 
       </ul>
 
       </ul>
 
     </li>
 
     </li>
     <li><a href="#modelling-with-nupack">Modelling with NUPACK</a>
+
     <li><a href="#modelling-with-nupack">Modelling with NUPACK</a></li>
      <ul>
+
    <li><a href="#references">References</a></li>
        <li><a href="#references">References</a></li>
+
      </ul>
+
    </li>
+
 
   </ul>
 
   </ul>
 
</nav>
 
</nav>
Line 465: Line 462:
 
                 <tr>
 
                 <tr>
 
                     <td>
 
                     <td>
                         <p>This option has been removed</p>
+
                         <input class="sliderInput" id="initially_infected_slider" max="50" min="0" oninput="initially_infected_text.value=initially_infected_slider.value" step="1" type="range" value="10"/>
 +
                        <input class="textInput" id="initially_infected_text" max="50" min="0" oninput="initially_infected_slider.value=initially_infected_text.value" type="number" value="10"/>
 +
                        <br/><button onclick="model.Params.INITIALLY_INFECTED = parseInt(document.getElementById('initially_infected_slider').value)">
 +
                            Set the number of people initially infected</button>
 
                     </td>
 
                     </td>
 
                     <td>
 
                     <td>
Line 804: Line 804:
 
</div>
 
</div>
  
<p>From this, we can derive a set of differential equations which express the system, and which follow directly from the above state transitions. Note that these are expressed in continuous time; however, this is just the limit as the size of the time increment goes to zero.
+
<p>From this, we can derive a set of differential equations which express the system, and which follow directly from the above state transitions. Note that these are expressed in continuous time; however, this is just the limit as the size of the time increment goes to zero.</p>
$$
+
<p>$$
 
\frac{d[S]}{dt} = - \beta [S][I]
 
\frac{d[S]}{dt} = - \beta [S][I]
 
$$</p>
 
$$</p>
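<p>As an illustration of the limit described above, the following minimal sketch steps an SIR system forward in discrete time increments. Only the equation for $[S]$ is shown above; the updates for $[I]$ and $[R]$ are assumed here to take the standard Kermack-McKendrick form [1], and the names <code>beta</code>, <code>gamma</code> and <code>dt</code>, along with their values, are illustrative rather than taken from our model.</p>

```python
def sir_step(s, i, r, beta, gamma, dt):
    """Advance the S, I, R fractions by one forward-Euler step of size dt."""
    new_infections = beta * s * i * dt  # from d[S]/dt = -beta [S][I]
    new_recoveries = gamma * i * dt     # assumed standard recovery term
    return (s - new_infections,
            i + new_infections - new_recoveries,
            r + new_recoveries)


def simulate(s0, i0, r0, beta, gamma, dt, t_end):
    """Repeat sir_step until t_end; a smaller dt approaches the continuous model."""
    s, i, r = s0, i0, r0
    for _ in range(round(t_end / dt)):
        s, i, r = sir_step(s, i, r, beta, gamma, dt)
    return s, i, r


# Illustrative parameters only: a coarse and a fine time increment.
coarse = simulate(0.99, 0.01, 0.0, beta=0.5, gamma=0.1, dt=1.0, t_end=50)
fine = simulate(0.99, 0.01, 0.0, beta=0.5, gamma=0.1, dt=0.01, t_end=50)
print("dt=1.0 :", coarse)
print("dt=0.01:", fine)
```

<p>Every person who leaves one state enters another, so the three fractions always sum to the same total regardless of the step size.</p>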
Line 890: Line 890:
 
      
 
      
 
      
 
      
     <img alt="The state transition diagram of every state a person within the population can take" class="centered-image" decoding="async" src="../assets/content/model/general.png"/>
+
     <img alt="The state transition diagram of every state a person within the population can take" class="centered-image" decoding="async" src="https://static.igem.org/mediawiki/2021/f/f2/T--Warwick--content--model--general.png"/>
 
      
 
      
 
      
 
      
Line 914: Line 914:
  
 
<h4 id="2-treatment-and-mutation">2. Treatment and mutation</h4>
 
<h4 id="2-treatment-and-mutation">2. Treatment and mutation</h4>
<p>Antibiotics are used in a specific order, and are numbered accordingly for clarity (with $1$ being the first administered, and $n$ being the last, for antibiotics $1..n$ ). This simulates the real world, where different antibiotics are used in a tiered system, with the last reserved for highly dangerous, multi-drug resistant pathogens. It is an important aspect of our model, as our product attempts to identify <em>carbapenem-resistant Enterobacteriaceae</em> (CRE), which are one type of these resistant pathogens.</p>
+
<p>Antibiotics are used in a specific order, and are numbered accordingly for clarity (with $1$ being the first administered, and $n$ being the last, for antibiotics $1..n$ ). This simulates the real world, where different antibiotics are used in a tiered system, with the last reserved for highly dangerous, multi-drug resistant pathogens. It is an important aspect of our model, as our product attempts to identify carbapenem-resistant Enterobacteriaceae (CRE), which are one type of these resistant pathogens.</p>
 
<p>Pathogens have a small chance of mutating to develop resistance to antibiotics being used to treat them, as such strains will only become dominant when there is a pressure giving them a survival advantage.</p>
 
<p>Pathogens have a small chance of mutating to develop resistance to antibiotics being used to treat them, as such strains will only become dominant when there is a pressure giving them a survival advantage.</p>
 
<div class="highlight"><div style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">
 
<div class="highlight"><div style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">
Line 1,151: Line 1,151:
 
</div>
 
</div>
  
<p>Above shows a block diagram of steps in model design - taken from “Testing and Validation of Computer Simulation Models: Principles, Methods and Applications” [6]</p>
 
 
<p>We went through three iterative design stages of increasing complexity and proximity to real life before settling on our production code:</p>
 
<p>We went through three iterative design stages of increasing complexity and proximity to real life before settling on our production code:</p>
 
<ol>
 
<ol>
Line 1,231: Line 1,230:
 
</div>
 
</div>
 
</div><p>If the tests are run many times, each time with different random inputs, these unit tests can be thought of as property-based tests. Property-based testing checks that a function fulfils a property by providing it with random values from its input domain and checking that the resulting outputs fulfil the property. This strategy was pioneered in the functional programming language Haskell [7], and is often considered preferable to unit tests [8].</p>
 
</div><p>If the tests are run many times, each time with different random inputs, these unit tests can be thought of as property-based tests. Property-based testing checks that a function fulfils a property by providing it with random values from its input domain and checking that the resulting outputs fulfil the property. This strategy was pioneered in the functional programming language Haskell [7], and is often considered preferable to unit tests [8].</p>
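<p>The idea can be sketched with nothing more than the standard library. The helper <code>initial_states</code> below is hypothetical and is not part of our model's code; it stands in for any function whose outputs should satisfy a property for every valid input.</p>

```python
import random


def initial_states(population_size, initially_infected):
    """Hypothetical helper: mark the first few people as infected."""
    infected = min(initially_infected, population_size)
    return (["infected"] * infected
            + ["susceptible"] * (population_size - infected))


def check_initial_states_property(trials=1000, seed=0):
    """Check the property on many randomly drawn inputs."""
    rng = random.Random(seed)  # seeded so that a failure is reproducible
    for _ in range(trials):
        size = rng.randint(0, 500)
        requested = rng.randint(0, 50)
        states = initial_states(size, requested)
        # Property 1: every person is assigned exactly one state.
        assert states.count("infected") + states.count("susceptible") == size
        # Property 2: never more infected people than were requested.
        assert states.count("infected") <= requested
    return trials


checked = check_initial_states_property()
print(f"property held for {checked} random inputs")
```

<p>If the checking function were given a name beginning with <code>test_</code>, <code>pytest</code> would collect and run it like any other test.</p>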
 +
<p>The whole set of property-based tests we wrote can then be run using the <code>pytest</code> command:</p>
 +
<div class="text-center">
 +
   
 +
   
 +
    <img alt="A screenshot of the output of running the pytest command" class="centered-image" decoding="async" src="https://static.igem.org/mediawiki/2021/a/a0/T--Warwick--content--model--pytest.png"/>
 +
   
 +
   
 +
        <p class="hugo-figure-caption">A screenshot of the output of running the <code>pytest</code> command</p>
 +
   
 +
</div>
 +
 
<h4 id="version-control-and-cicd">Version control and CI/CD</h4>
 
<h4 id="version-control-and-cicd">Version control and CI/CD</h4>
 
<p>Having implemented a robust testing strategy, we now had all the building blocks for a continuous integration/continuous delivery workflow, as shown below:</p>
 
<p>Having implemented a robust testing strategy, we now had all the building blocks for a continuous integration/continuous delivery workflow, as shown below:</p>
Line 1,244: Line 1,254:
  
 
<p>The build phase is relatively simple: writing the code in an editor of your choice and running it with the Python interpreter. The testing phase is discussed above.</p>
 
<p>The build phase is relatively simple: writing the code in an editor of your choice and running it with the Python interpreter. The testing phase is discussed above.</p>
<p>Throughout the entire project, we used <code>git</code> for version control. We linked the project to a remote repository on GitHub, which forms the main way to access the most up-to-date code - forming the merge and continuous delivery steps.</p>
+
<p>Throughout the entire project, we used <code>git</code> for version control. We linked the project to a remote repository on GitHub, which is the main way to access the most up-to-date code, and so forms both the merge and continuous delivery steps. When code is pushed to the remote repository, actions are automatically run to check that the code is both syntactically and conceptually correct, using the static analysis tool <a href="https://flake8.pycqa.org/en/latest/">flake8</a> for the former, and running <code>pytest</code> on all the code, as discussed above, for the latter.</p>
<p>We chose not to automate publishing the code to PyPI (discussed below), which could be considered the production aspect of the modelling, as the project was under active development, and minor changes to the repository should not necessarily be pushed, as their general stability and usefulness are not fully known.</p>
+
<p>We chose not to automate publishing the code to PyPI (discussed below), which could be considered the production aspect of the modelling, as the project was under active development, and minor changes to the repository should not necessarily be pushed - as their general stability and usefulness are not fully known.</p>
 
<h4 id="transpilation-to-javascript">Transpilation to JavaScript</h4>
 
<h4 id="transpilation-to-javascript">Transpilation to JavaScript</h4>
 
<p>In order to create the web-based version of the model, we needed to use a language which can be run client-side in the browser. Since Python cannot do this, we needed to convert the source code into a language which can - with the obvious choice being JavaScript, due to its ubiquitous use in this environment.</p>
 
<p>In order to create the web-based version of the model, we needed to use a language which can be run client-side in the browser. Since Python cannot do this, we needed to convert the source code into a language which can - with the obvious choice being JavaScript, due to its ubiquitous use in this environment.</p>
Line 1,253: Line 1,263:
 
<h4 id="uploading-to-pypi">Uploading to PyPI</h4>
 
<h4 id="uploading-to-pypi">Uploading to PyPI</h4>
 
<p>Since we developed our model in Python, we uploaded it to PyPI, the Python Package Index. We did this as it greatly simplifies the way in which the package can be distributed. Instead of cloning the repository and running the code directly from it:</p>
 
<p>Since we developed our model in Python, we uploaded it to PyPI, the Python Package Index. We did this as it greatly simplifies the way in which the package can be distributed. Instead of cloning the repository and running the code directly from it:</p>
<div class="highlight"><pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-bash" data-lang="bash">git clone https://github.com/Warwick-iGEM-2021/modelling
+
<div class="highlight"><div style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">
 +
<table style="border-spacing:0;padding:0;margin:0;border:0;width:auto;overflow:auto;display:block;"><tbody><tr><td style="vertical-align:top;padding:0;margin:0;border:0;">
 +
<pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code><span style="margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f">1
 +
</span><span style="margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f">2
 +
</span><span style="margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f">3
 +
</span></code></pre></td>
 +
<td style="vertical-align:top;padding:0;margin:0;border:0;;width:100%">
 +
<pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-bash" data-lang="bash">git clone https://github.com/Warwick-iGEM-2021/modelling
 
cd modelling/tiered-antibiotic-resistance-model
 
cd modelling/tiered-antibiotic-resistance-model
 
python3 model.py
 
python3 model.py
</code></pre></div><p>The module can be installed using <code>pip</code> on the command line, then just imported directly in a Python file:</p>
+
</code></pre></td></tr></tbody></table>
<div class="highlight"><pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-bash" data-lang="bash">pip install tiered-antibiotic-resistance-model
+
</div>
</code></pre></div><div class="highlight"><div style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">
+
</div><p>The module can be installed using <code>pip</code> on the command line, then just imported directly in a Python file:</p>
 +
<div class="highlight"><div style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">
 +
<table style="border-spacing:0;padding:0;margin:0;border:0;width:auto;overflow:auto;display:block;"><tbody><tr><td style="vertical-align:top;padding:0;margin:0;border:0;">
 +
<pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code><span style="margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f">1
 +
</span></code></pre></td>
 +
<td style="vertical-align:top;padding:0;margin:0;border:0;;width:100%">
 +
<pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-bash" data-lang="bash">pip install tiered-antibiotic-resistance-model
 +
</code></pre></td></tr></tbody></table>
 +
</div>
 +
</div><div class="highlight"><div style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">
 
<table style="border-spacing:0;padding:0;margin:0;border:0;width:auto;overflow:auto;display:block;"><tbody><tr><td style="vertical-align:top;padding:0;margin:0;border:0;">
 
<table style="border-spacing:0;padding:0;margin:0;border:0;width:auto;overflow:auto;display:block;"><tbody><tr><td style="vertical-align:top;padding:0;margin:0;border:0;">
 
<pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code><span style="margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f">1
 
<pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code><span style="margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f">1
Line 1,550: Line 1,576:
 
<p>While the purpose of our product was to decrease the overall number of deaths, we cannot know it has the intended effect on infection, mortality and death rates until we have tested it. Hence, we conducted two-sided hypothesis tests. This means that our alternative hypothesis (as opposed to the null hypothesis) was that the mean values for infection, mortality and death rates were <em>either higher or lower</em> when using the product than when not using it.</p>
 
<p>While the purpose of our product was to decrease the overall number of deaths, we cannot know it has the intended effect on infection, mortality and death rates until we have tested it. Hence, we conducted two-sided hypothesis tests. This means that our alternative hypothesis (as opposed to the null hypothesis) was that the mean values for infection, mortality and death rates were <em>either higher or lower</em> when using the product than when not using it.</p>
 
<p>We can assume that the outcomes of the model follow a normal distribution. However, we do not know the standard deviation of the outcomes. Therefore, we were left with two options: to approximate a normal distribution, or to base the test on a Student’s t-distribution. Since we ran the simulations 10 times with the product and 10 times without it, we have a sample size of 10 from which to calculate each mean value. This is a very small sample size, which suggested that the most appropriate distribution was a Student’s t-distribution. To determine the degrees of freedom (DoF) of the t-test, we had to conduct F-tests of equality of variances. The null hypothesis of these F-tests was that the variances were equal, while the two-sided alternative hypothesis was that they were not equal. For these tests we used a significance level of 10%.</p>
 
<p>We can assume that the outcomes of the model follow a normal distribution. However, we do not know the standard deviation of the outcomes. Therefore, we were left with two options: to approximate a normal distribution, or to base the test on a Student’s t-distribution. Since we ran the simulations 10 times with the product and 10 times without it, we have a sample size of 10 from which to calculate each mean value. This is a very small sample size, which suggested that the most appropriate distribution was a Student’s t-distribution. To determine the degrees of freedom (DoF) of the t-test, we had to conduct F-tests of equality of variances. The null hypothesis of these F-tests was that the variances were equal, while the two-sided alternative hypothesis was that they were not equal. For these tests we used a significance level of 10%.</p>
<p>To test equality of variances, we use this formula if $S_1^2 &gt; S_2^2$ :
+
<p>To test equality of variances, we use this formula if $S_1^2 &gt; S_2^2$ :</p>
$$
+
<p>$$
 
P \left( F_{n_1-1, n_2-1} &gt; \frac{S_1^2}{S_2^2} \right)
 
P \left( F_{n_1-1, n_2-1} &gt; \frac{S_1^2}{S_2^2} \right)
$$
+
$$</p>
And if $S_1^2 &lt; S_2^2$ , we use:
+
<p>And if $S_1^2 &lt; S_2^2$ , we use:</p>
$$
+
<p>$$
 
P \left( F_{n_2-1, n_1-1} &gt; \frac{S_2^2}{S_1^2} \right)
 
P \left( F_{n_2-1, n_1-1} &gt; \frac{S_2^2}{S_1^2} \right)
$$
+
$$</p>
Where $S_1^2$ is the sample variance of any given outcome variable when not using the product and $S_2^2$ is the equivalent when using the product.</p>
+
<p>Where $S_1^2$ is the sample variance of any given outcome variable when not using the product and $S_2^2$ is the equivalent when using the product.</p>
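<p>As a minimal sketch of the two cases above, the function below places the larger sample variance in the numerator and returns the matching degrees of freedom. The function name and the two lists of outcomes are made up for illustration and are not results from our simulations.</p>

```python
from statistics import variance


def f_statistic(without_product, with_product):
    """Return (F, numerator DoF, denominator DoF), larger variance on top."""
    s1_sq = variance(without_product)  # sample variance S_1^2
    s2_sq = variance(with_product)     # sample variance S_2^2
    n1, n2 = len(without_product), len(with_product)
    if s1_sq > s2_sq:
        return s1_sq / s2_sq, n1 - 1, n2 - 1
    return s2_sq / s1_sq, n2 - 1, n1 - 1


# Illustrative data only: 10 runs without and 10 runs with the product.
without_product = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
with_product = [10, 11, 10, 12, 11, 10, 11, 12, 10, 11]

f_value, dof_top, dof_bottom = f_statistic(without_product, with_product)
print(f"F = {f_value:.3f} with ({dof_top}, {dof_bottom}) degrees of freedom")
```

<p>The tail probability $P(F &gt; \ldots)$ then comes from the F-distribution with those degrees of freedom, for example via <code>scipy.stats.f.sf</code> if SciPy is available. The sketch does not handle a sample whose variance is zero.</p>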
<p>If we find the variances to be equal, we calculate the probability of a Type I error using this formula:
+
<p>If we find the variances to be equal, we calculate the probability of a Type I error using this formula:</p>
$$
+
<p>$$
 
P \left( t_{DoF} &gt; \frac{\overline{x_1} - \overline{x_2}}{\sqrt{\frac{S_0^2}{n_1} + \frac{S_0^2}{n_2}}} \right)
 
P \left( t_{DoF} &gt; \frac{\overline{x_1} - \overline{x_2}}{\sqrt{\frac{S_0^2}{n_1} + \frac{S_0^2}{n_2}}} \right)
$$
+
$$</p>
With $DoF=n_1+n_2-2$.</p>
+
<p>With $DoF=n_1+n_2-2$.</p>
<p>If we find the variances not to be equal, we calculate the probability of a Type I error using the same formula but with different degrees of freedom. With equal sample sizes, the degrees of freedom for unequal variances can be calculated and simplified as follows:
+
<p>If we find the variances not to be equal, we calculate the probability of a Type I error using the same formula but with different degrees of freedom. With equal sample sizes, the degrees of freedom for unequal variances can be calculated and simplified as follows:</p>
$$
+
<p>$$
 
DoF = \frac{(\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2})^2}{\frac{(\frac{S_1^2}{n_1})^2}{n_1-1} + \frac{(\frac{S_2^2}{n_2})^2}{n_2-1}} = (n-1) \frac{(S_1^2 + S_2^2)^2}{S_1^4 + S_2^4}
 
DoF = \frac{(\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2})^2}{\frac{(\frac{S_1^2}{n_1})^2}{n_1-1} + \frac{(\frac{S_2^2}{n_2})^2}{n_2-1}} = (n-1) \frac{(S_1^2 + S_2^2)^2}{S_1^4 + S_2^4}
$$
+
$$</p>
We let $\overline{x_1}$ be the mean value for any given outcome variable when not using the product and $\overline{x_2}$ the mean value when using the product. $n_1$ and $n_2$ were the sample sizes, which were 10 in both cases. Since the initial assumption is that the null hypothesis holds, $S_0^2$ is the hypothesised variance of the hypothesised real distribution, or in other words the square of the standard deviation of the hypothesised distribution.</p>
+
<p>We let $\overline{x_1}$ be the mean value for any given outcome variable when not using the product and $\overline{x_2}$ the mean value when using the product. $n_1$ and $n_2$ were the sample sizes, which were 10 in both cases. Since the initial assumption is that the null hypothesis holds, $S_0^2$ is the hypothesised variance of the hypothesised real distribution, or in other words the square of the standard deviation of the hypothesised distribution.</p>
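<p>The simplification of the degrees of freedom for equal sample sizes can be checked numerically. The sketch below implements both the general expression and the simplified one; the function names and the variances passed in are illustrative assumptions, not values from our simulations.</p>

```python
def welch_dof(s1_sq, n1, s2_sq, n2):
    """General degrees of freedom for unequal variances."""
    a = s1_sq / n1
    b = s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


def welch_dof_equal_n(s1_sq, s2_sq, n):
    """Simplified form, valid only when both samples have size n."""
    return (n - 1) * (s1_sq + s2_sq) ** 2 / (s1_sq ** 2 + s2_sq ** 2)


# Illustrative variances with n = 10 runs per group.
general = welch_dof(4.0, 10, 1.0, 10)
simplified = welch_dof_equal_n(4.0, 1.0, 10)
print(general, simplified)

# With equal variances the result is 2(n - 1), which equals n_1 + n_2 - 2.
print(welch_dof_equal_n(2.5, 2.5, 10))
```

<p>The two functions agree whenever $n_1 = n_2$, and the value never exceeds $n_1 + n_2 - 2$, the degrees of freedom used when the variances are equal.</p>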
 
<p>Since the sample sizes are equal, we calculate the hypothesised variance using the formula:</p>
 
<p>Since the sample sizes are equal, we calculate the hypothesised variance using the formula:</p>
 
<p>$$
 
<p>$$
Line 2,106: Line 2,132:
 
<div class="content-page">
 
<div class="content-page">
  
<h3 id="references">References</h3>
+
<h2 id="references">References</h2>
 
<p>[1] Kermack, W O. McKendrick, A G. 1927. <em>A contribution to the mathematical theory of epidemics</em>. Proc. R. Soc. Lond. A 115: 700–721 <a href="http://doi.org/10.1098/rspa.1927.0118">http://doi.org/10.1098/rspa.1927.0118</a></p>
 
<p>[1] Kermack, W O. McKendrick, A G. 1927. <em>A contribution to the mathematical theory of epidemics</em>. Proc. R. Soc. Lond. A 115: 700–721 <a href="http://doi.org/10.1098/rspa.1927.0118">http://doi.org/10.1098/rspa.1927.0118</a></p>
 
<p>[2] Simon, C., 2020. <em>The SIR dynamic model of infectious disease transmission and its analogy with chemical kinetics</em>. PeerJ Physical Chemistry, 2, p.e14.</p>
 
<p>[2] Simon, C., 2020. <em>The SIR dynamic model of infectious disease transmission and its analogy with chemical kinetics</em>. PeerJ Physical Chemistry, 2, p.e14.</p>

Revision as of 23:57, 20 October 2021

Model