Difference between revisions of "Team:Vilnius-Lithuania/Engineering"

 
(7 intermediate revisions by 4 users not shown)
Line 77: Line 77:
 
             <h3 class="index-headline">Overview</h3>
 
             <h3 class="index-headline">Overview</h3>
 
             <p>
 
             <p>
               Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus
+
               In the process of our project development, we performed many engineering cycles.
               ac enim id metus rutrum blandit sed non dolor. Pellentesque
+
               We decided to describe the five that had the most distinct stages.
              feugiat odio eu imperdiet rutrum. Duis consectetur porttitor enim,
+
            </p>
              id elementum nibh tempus in. Nulla ut massa rutrum, ullamcorper
+
            <p>
               dui et, posuere velit. Cras viverra, tortor at porta pulvinar,
+
               Since there are four engineering stages (design, build, test, and learn), <b>we imagined the engineering process as a unit circle, which can also be represented as a sinusoid.</b> One full sine wave corresponds to one engineering cycle.
              libero eros pulvinar orci, sed pulvinar neque felis eget dui.
+
            </p>
              Fusce laoreet libero vitae nunc hendrerit consequat. Nunc eget
+
            <p>
               bibendum turpis. Vestibulum pulvinar interdum mauris nec congue.
+
               Below you can see the general animation describing our vision. It shows the
               Etiam id nunc ac risus dictum semper sed in nisl.
+
               engineering process that contains three engineering cycles.
 
             </p>
 
             </p>
 
             <svg viewBox="0 0 710 350" id="overview-animation"></svg>  
 
             <svg viewBox="0 0 710 350" id="overview-animation"></svg>  
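             <p>
               The unit-circle picture can be sketched in a few lines of Python. This is only an illustration: the assignment of each stage to one quarter of the period and the names <code>stage_at</code> and <code>wave_height</code> are assumptions made here, not part of the animation code.
             </p>

```python
import math

# Assumed mapping: each stage occupies one quarter of a full period.
STAGES = ("design", "build", "test", "learn")


def stage_at(cycles_elapsed: float) -> str:
    """Return the engineering stage for a position measured in cycles."""
    phase = cycles_elapsed % 1.0  # fraction of the current cycle
    # min() guards against floating-point modulo returning exactly 1.0
    index = min(int(phase * len(STAGES)), len(STAGES) - 1)
    return STAGES[index]


def wave_height(cycles_elapsed: float) -> float:
    """Height of the sinusoid, with one full period per engineering cycle."""
    return math.sin(2 * math.pi * cycles_elapsed)


if __name__ == "__main__":
    # Walk through three engineering cycles in quarter-cycle steps.
    for step in range(12):
        position = step / 4
        print(f"{position:4.2f}  {stage_at(position):6s}  {wave_height(position):+.2f}")
```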
Line 93: Line 93:
 
               <h4>Design</h4>
 
               <h4>Design</h4>
 
               <p>
 
               <p>
                 At first we looked for a protein marker that could be used to identify
+
                 At first, we looked for a protein marker that could be used to identify
                 <i>Entamoeba histolytica</i>, we found a few promising candidates and in the end we chose
+
                 <i>Entamoeba histolytica.</i> We found a few promising candidates and, in the end, we chose cysteine proteinase 5 (CP5) and <b>pyruvate phosphate dikinase (PPDK)</b> as our analytes because they are unique markers of <i>E. histolytica</i>. One of them has already been considered as a marker for a test in previous papers [1,2]. We decided to put it in the<b> pET28a(+)
                cysteine proteinase 5 (CP5) and pyruvate phosphate dikinase (PPDK) as our analytes because
+
                 plasmid</b> because that way we could manipulate the insert in various ways and get the construct in both
                they are unique markers of the <i>E. histolytica</i> and one of them has even been already
+
                considered as a marker for a test in previous papers [1]. We decided to put it in the pET28a(+)
+
                 plasmid because we could manipulate the insert in various ways and get the construct in both
+
 
                 N-His and C-His forms.
 
                 N-His and C-His forms.
 
               </p>
 
               </p>
 
               <h4>Build</h4>
 
               <h4>Build</h4>
 
               <p>
 
               <p>
                 The cloning of PPDK was unsuccessful as the sequencing showed that there were a few mutations
+
                 The cloning of PPDK was unsuccessful - the sequencing showed that there were <b>a few mutations in our gene of interest that made it impossible to use the protein.</b> We tried to clone it again.
                in our gene of interest that made it impossible to use the protein. We tried to clone it again,
+
                 After four failed attempts we were running out of our ordered construct, so we decided to
                 but, after four failed attempts we were running out of our ordered construct so we decided to
+
                 <b>clone it into a pUC19 vector</b> with simple blunt-end cloning. Since that worked the first time,
                 clone it into a pUC19 vector with simple blunt-end cloning. Since that worked at the first time
+
                 we decided that there was probably something wrong with our plasmid backbone. Finally, <b>we successfully cloned PPDK into the pET28a(+) vector</b>.  
                 we decided that there was probably something wrong with our plasmid backbone and finally successfully
+
                cloned PPDK into the pET28a(+) vector.  
+
 
               </p>
 
               </p>
 
               <h4>Test</h4>
 
               <h4>Test</h4>
 
               <p>
 
               <p>
                 We tested a few different E coli strains and growing conditions for the induction, some had no
+
                 We tested a few different <i>E. coli</i> strains and growing conditions for the induction. Some had no
                 visible protein at all, some had a lot, but none of them had any appreciable amount in the soluble
+
                 visible protein at all, some had a lot, but none of them had a substantial amount in the soluble
                 fraction. After determining that the BL21 (DE3) strain had the most protein in the insoluble
+
                 fraction. After determining that <b>the BL21 (DE3) strain had the most protein</b> in the insoluble
                 fraction we also tested out different lysis conditions warrying the saccharose,
+
                 fraction, we tested out different lysis conditions, varying the saccharose,
 
                 NP-40 and Triton X-100 concentrations in the lysis buffers.  
 
                 NP-40 and Triton X-100 concentrations in the lysis buffers.  
 
               </p>
 
               </p>
 
               <h4>Learn</h4>
 
               <h4>Learn</h4>
 
               <p>
 
               <p>
                 After all the testing we found out that the best conditions for induction are in the <i>E. coli</i>
+
                 After all the testing we found out that <b>the best conditions for induction are in the <i>E. coli</i>
                 strain BL21 (DE3) with a 0.6 mM IPTG concentration for 3 hours in 37°C in the TB medium [2].
+
                 strain BL21 (DE3) with a 1 mM IPTG concentration for 3 hours at 37°C in the TB medium [2].</b>
                 And the most protein was found in the soluble fraction when we used a lysis buffer with 50 mM
+
                 The most protein was found in the soluble fraction when we used a lysis buffer with 50 mM
                 NaH<sub>2</sub>PO<sub>4</sub>, 500 mM NaCl, 10 mM imidazole and 0.5 % NP-40. The protein
+
                 NaH<sub>2</sub>PO<sub>4</sub>, 500 mM NaCl, 10 mM imidazole, and 0.5 % NP-40. The protein
                 was not clean enough to start the SELEX and there wasn’t much of it so we needed to find an alternative.
+
                 was not pure enough to start the SELEX and there was not much of it; therefore, we needed to find an alternative.
 
               </p>
 
               </p>
  
Line 130: Line 125:
 
               <h4>Design</h4>
 
               <h4>Design</h4>
 
               <p>
 
               <p>
                 Since we could not synthesise an appreciable amount of our target protein we decided to perform
+
                 Since we could not synthesize a significant amount of our target protein, we decided to <b>perform
                 our SELEX process on a denatured (and refolded) protein and the denatured the proteins in the
+
                 our SELEX process on a denatured (and refolded) protein.</b> We decided to denature the proteins in the
 
                 blood sample in our test.
 
                 blood sample in our test.
 
               </p>
 
               </p>
 
               <h4>Build</h4>
 
               <h4>Build</h4>
 
               <p>
 
               <p>
                 At first, we needed to find out at what specific concentration of a strong detergent. Then we needed to find a buffer solution which could keep the protein soluble and would also not interfere with the SELEX process downstream.
+
                 At first, we needed to <b>find out the required concentration of a strong detergent.</b> Then we  
 +
                needed <b>to find a buffer solution </b>that could keep the protein soluble and would not interfere  
 +
                with the SELEX process downstream.
 
               </p>
 
               </p>
 
               <h4>Test</h4>
 
               <h4>Test</h4>
 
               <p>
 
               <p>
                 We tried to solubilise the PPDK containing biomass with a gradient of urea buffers to find out at what concentration does the protein solubilise the best. After determining the denaturing lysis conditions we dialysed the lysate against a soluble lysis buffer containing 5 different additives - arginine (0.375 M), trehalose(0.75 M), proline(0.5 M), mannitol(0.5 M) and CuCl<sub>2</sub> (10 mM) [3, 4].
+
                 We tried to dissolve the PPDK-containing biomass with a gradient of urea buffers to find out  
 +
                at what concentration the protein dissolves the best. After determining the denaturing  
 +
                lysis conditions, we dialyzed the lysate against a soluble lysis buffer containing  
 +
                <b>five different additives</b>: arginine (0.375 M), trehalose (0.75 M), proline (0.5 M), mannitol
 +
                (0.5 M) and CuCl<sub>2</sub> (10 mM) [3, 4].
 
               </p>
 
               </p>
 
               <h4>Learn</h4>
 
               <h4>Learn</h4>
 
               <p>
 
               <p>
                 We found out that PPDK becomes soluble when a lysis buffer supplemented with 6 M urea is used. It is possible to transfer the solubilised protein in a phosphate buffer containing 500 mM of NaCl and 0.375 M arginine. The protein stays soluble in this solution and can be used in downstream applications.
+
                 We found out that<b> PPDK becomes soluble when a lysis buffer supplemented with 6 M urea is used.</b>
 +
                It is possible to transfer the solubilized protein into a phosphate buffer containing 500 mM of  
 +
                NaCl and 0.375 M arginine. The protein stays soluble in this solution and can be used in downstream applications.
 
               </p>
 
               </p>
 
             </div>
 
             </div>
Line 152: Line 155:
 
               <h4>Design</h4>
 
               <h4>Design</h4>
 
               <p>
 
               <p>
                 At first we looked for a protein marker that could be used to identify <i>Entamoeba histolytica</i>, we found a few promising candidates and in the end chose cysteine proteinase 5 (CP5) and pyruvate phosphate dikinase (PPDK) as our analytes because they are unique markers of the entamoeba and one of them has even been considered as a marker for a test already in previous papers [1]. We decided to put it in the pET28a(+) plasmid because we could manipulate the insert in various ways and get the construct in both N-His and C-His forms.
+
                 At first, we looked for a protein marker that could be used to identify <i>Entamoeba histolytica</i>. We  
 +
                found a few promising candidates. In the end, we chose<b> cysteine proteinase 5 (CP5)</b> and pyruvate phosphate  
 +
                dikinase (PPDK) as our analytes because they are unique markers of <i>E. histolytica</i> and one of them  
 +
                has already been considered as a marker for a test in previous papers [1,2]. We decided to put it in  
 +
              <b> the pET28a(+) plasmid </b>because we could manipulate the insert in various ways and get the construct in  
 +
                both N-His and C-His forms.
 
               </p>
 
               </p>
 
               <h4>Build</h4>
 
               <h4>Build</h4>
 
               <p>
 
               <p>
                 The cloning of CP5 into the pET28a(+) vector was successful with both N-His and C-His tags, then it was transformed into the <i>E. coli</i> BL21(DE3) strain.
+
                 The cloning of CP5 into the pET28a(+) vector was successful with both N-His and C-His tags. The construct  
 +
                was transformed into the <i>E. coli</i> BL21 (DE3) strain.
 
               </p>
 
               </p>
 
               <h4>Test</h4>
 
               <h4>Test</h4>
 
               <p>
 
               <p>
                 The protein synthesis was successful and we didn’t need to optimize the growing conditions besides the IPTG concentration – we compared the protein expression at 1 mM and 0.6 mM IPTG and we found that the lower one works better. We ran into a bunch of problems while trying to process the insoluble protein because it had to be denatured, renatured and then activated [5, 6]. The protein was solubilized and purified successfully, but we were not able to refold or activate it. After a few consultations with relevant experts from the lab of our PI we tried to increase our refolding buffer’s glycerol concentration, adding saccharose, changing the refolding conditions from 4 degrees to 37 degrees, but none of it worked. The only condition left that we could change was the ratio of buffer to protein, but that wasn’t feasible because we did not have the equipment to concentrate large volumes of buffer after refolding, the method we did try was pulling out all the water from the buffer through a dialysis membrane with carboxymethyl cellulose, but it took more than 4 days with constant cellulose cleanup and change for the buffer to go from 100 ml to about 20-30 mL.
+
                 The protein synthesis was successful and we did not need to optimize the growing conditions  
 +
                besides the IPTG concentration – <b>we compared the protein expression at 1 mM and 0.6 mM IPTG  
 +
                and we found that the lower one works better.</b> We ran into several problems while trying  
 +
                to process the insoluble protein because it had to be denatured, renatured, and then activated [5, 6].  
 +
              <b> The protein was dissolved and purified successfully, but we were not able to refold or activate it.</b>
 +
                After a few consultations with relevant experts from the lab of our PI, we tried to increase our  
 +
                refolding buffer’s glycerol concentration, adding saccharose, and changing the refolding conditions  
 +
                from 4 °C to 37 °C, but none of it worked. The only condition left that  
 +
                we could change was the ratio of buffer to protein. However, that was not feasible because we did  
 +
                not have the equipment to concentrate large volumes of the buffer after refolding.
 +
                The method we did try was pulling out all the water from the buffer through a dialysis membrane  
 +
                with carboxymethyl cellulose. It took more than 4 days with constant cellulose clean-up and  
 +
                change for the buffer to go from 100 mL to about 20-30 mL.
 
               </p>
 
               </p>
 
               <h4>Learn</h4>
 
               <h4>Learn</h4>
 
               <p>
 
               <p>
                 In the end we could not find out where the problem was in the CP5 processing because it is not possible to determine if the protein was folded correctly and the activation process was the problem or if the protein didn’t refold at all because all we could see is the 32 kDa band on our SDS gels. Since we could not produce the soluble protein, but could purify insoluble protein relatively easily we decided to change our approach to the test itself. We did our selex process on the denatured CP5 protein and then made a test in which we would denature the proteins in the testes blood sample too.
+
                 In the end, we could not find out what the problem with CP5 processing was, for the following reasons.
 +
                It is not possible to determine if the protein folded correctly or did not fold at all, and whether the  
 +
                problem is in the protein activation process - all we could see was the 32 kDa band on  
 +
                our SDS gels. Since we could not produce the soluble protein, but could purify insoluble protein  
 +
                relatively easily, we decided to change our approach to the test itself.<b> We considered doing our SELEX process on the denatured
 +
                CP5 protein and then making a test in which we would denature the proteins in the blood sample too.</b>
 
               </p>
 
               </p>
 
               <h3>Cycle 2</h3>
 
               <h3>Cycle 2</h3>
 
               <h4>Design</h4>
 
               <h4>Design</h4>
 
               <p>
 
               <p>
                 The design of the test still stayed the same, we just needed a different approach to get the soluble and active CP5 protein.  
+
                 <b>The design of the test stayed the same;</b> we just needed a different approach to get the soluble and active CP5 protein.  
 
               </p>
 
               </p>
 
               <h4>Build</h4>
 
               <h4>Build</h4>
 
               <p>
 
               <p>
                 Since we could not manage to produce an active and soluble protein we consulted with a relevant expert in our PI’s lab and got advised that we maybe could find some fraction of the soluble protein inside the cells. Every time we grew our E. coli cells with the CP5 construct we saw a noticeable decrease in the weight of the biomass compared to other proteins so there must have been some fraction of the proteinase that lysed the cells from the inside.
+
                 Since we did not manage to produce an active and soluble protein, we consulted with a relevant expert in our PI’s lab and were advised that we might find some fraction of the soluble protein inside the cells. <b>Every  
 +
                time we grew our <i>E. coli</i> cells with the CP5 construct we saw a noticeable decrease in the weight of  
 +
                the biomass compared to other proteins.</b> Therefore, we concluded that there must have been some  
 +
                fraction of the proteinase that lysed the cells from the inside.
 
               </p>
 
               </p>
 
               <h4>Test</h4>
 
               <h4>Test</h4>
 
               <p>
 
               <p>
                 We tried seven different E. coli strains: BL21(DE3), rosetta gami (De3), rosetta (De3), C41, Arctic express, HMS(17) and KRX. None of them produced any soluble CP5 protein in any appreciable amount although the biomass was smaller in every instance.
+
                 We tried <b>seven different <i>E. coli</i> strains:</b> BL21(DE3), Rosetta-gami (DE3), Rosetta (DE3), C41, ArcticExpress,  
 +
                HMS(17) and KRX. <b>None of them produced any soluble CP5 protein in any considerable amount,</b> although the biomass was  
 +
                smaller in every instance.
 
               </p>
 
               </p>
 
               <h4>Learn</h4>
 
               <h4>Learn</h4>
 
               <p>
 
               <p>
                 The protein was obviously activated and folded in correctly in some cells because every could only grow about half as much biomass compared to other proteins with identical growth conditions. Since a foreign proteinase was activated inside the E. coli cell the cells were lysed and the active protein was probably denatured and lost in the growth medium.
+
                 The <b>protein was obviously activated and folded correctly in some cells</b> because every culture could only grow about half  
 +
                as much biomass compared to other proteins with identical growth conditions. Since a foreign proteinase was activated inside
 +
                the <i>E. coli</i> cell, the cells were lysed and the active protein was probably denatured and  
 +
                lost in the growth medium.
 
               </p>
 
               </p>
 
               <h3>Cycle 3</h3>
 
               <h3>Cycle 3</h3>
 
               <h4>Design</h4>
 
               <h4>Design</h4>
 
               <p>
 
               <p>
                 Since we could not produce any active protein for the selex process we tried to produce some soluble denatured protein in the pro-form and do the selex process with it and then process the blood sample in such a way that the proteins in it would be similar to our denatured analites in the lab.
+
                 As we could not produce any active protein for the SELEX process, we tried to <b>produce some soluble  
 +
                denatured protein in the pro-form</b> and perform the SELEX process with it and then process the blood  
 +
                sample in such a way that the proteins in it would be similar to our denatured analytes in the lab.
 
               </p>
 
               </p>
 
               <h4>Build</h4>
 
               <h4>Build</h4>
 
               <p>
 
               <p>
                 At first we lysed the cells in denaturing guanidine-hydrochloride and urea buffers to solubilise the inclusion bodies. The we decided to just dialise the whole cell lysate against a the soluble lysis buffer with various different additives that might help to stabilize the protein in solution then centrifugate the insoluble cell debris and use this solution for downstream selex application as the lysate.
+
                 At first, we <b>lysed the cells in denaturing guanidine-hydrochloride and urea buffers to solubilize the inclusion bodies.</b>
 +
                Then we decided to just dialyze the whole cell lysate against the soluble lysis buffer with various  
 +
                different additives that might help to stabilize the protein in solution. Then we centrifuged the insoluble cell  
 +
                debris and considered using this solution as the lysate for the downstream SELEX application.
 
               </p>
 
               </p>
 
               <h4>Test</h4>
 
               <h4>Test</h4>
 
               <p>
 
               <p>
                 The additives we tried out were arginine (0.375 M), trehalose (0.75 M), proline (0.5 M), mannitol (0.5 M) and CuCl2 (10 mM) [5,6]. Some of them were clear as water after the centrifugation which showed that nothing from our lisate was soluble in them, but a few were cloudy to some degree which was promising.
+
                 The additives we tried out were arginine (0.375 M), trehalose (0.75 M), proline (0.5 M), mannitol (0.5 M)  
 +
                and CuCl<sub>2</sub> (10 mM) [5,6]. Some of them were clear as water after the centrifugation - this result showed that nothing  
 +
                from our lysate was soluble in them, but<b> a few were cloudy to some degree which was promising.</b>
 
               </p>
 
               </p>
 
               <h4>Learn</h4>
 
               <h4>Learn</h4>
 
               <p>
 
               <p>
                 We found out that the protein was stable in a soluble lysis buffer containing 10mM CuCl2, but the amount of the stable protein was very small and not enough for further applications downstream so in the end we decided to not use CP5 as a biomarker for entamoeba histolytica.
+
                 We found out that the protein was stable in a soluble lysis buffer containing 10 mM CuCl<sub>2</sub>,  
 +
                but the amount of the stable protein was very small and not enough for further applications.
 +
                In the end, <b>we decided not to use CP5 as a biomarker for <i>Entamoeba histolytica.</i></b>
 
               </p>
 
               </p>
 
             </div>
 
             </div>
Line 206: Line 249:
 
               <h4>Design</h4>
 
               <h4>Design</h4>
 
               <p>
 
               <p>
                 Since one of the parts of our project is to create a fusion protein system for the bottleneck
+
                 Since one of the parts of our project was to create a fusion protein system for the bottleneck
 
                 reaction of naringenin synthesis, we decided to model it <i>in silico</i>. The model’s primary purpose is
 
                 reaction of naringenin synthesis, we decided to model it <i>in silico</i>. The model’s primary purpose is
                to check whether the shorter distance between active sites leads to the higher production of naringenin.
+
              <b> to check whether a shorter distance between active sites leads to higher production of naringenin.</b>
 
                 In the process, we found out that even the modeling part needed to cover the main steps of engineering:
 
                 In the process, we found out that even the modeling part needed to cover the main steps of engineering:
 
                 design, build, test, and learn. Therefore, we began with the already existing modeling workflow described
 
                 design, build, test, and learn. Therefore, we began with the already existing modeling workflow described
Line 227: Line 270:
 
               <p>
 
               <p>
 
                 The building stage of the modeling flow consisted of performing the initial design flow.
 
                 The building stage of the modeling flow consisted of performing the initial design flow.
                 In this cycle we have raised a hypothesis that homology-based protein modeling approaches
+
                 In this cycle, we hypothesized that <b>homology-based protein modeling approaches
                 should not be the most suitable method for our case, since there are few fusion models existing.
+
                 would not be the most suitable method for our case,</b> since few fusion models exist.
 
               </p>
 
               </p>
 
               <ol>
 
               <ol>
 
                 <li>The protein candidates were chosen to be 4-coumarate-CoA ligase 2 from <i>Glycine max</i> and chalcone synthase from <i>Arabidopsis thaliana</i></li>
 
                 <li>The protein candidates were chosen to be 4-coumarate-CoA ligase 2 from <i>Glycine max</i> and chalcone synthase from <i>Arabidopsis thaliana</i></li>
                 <li>Seven linkers were chosen for our fusion system. Four of them were flexible glycine and serine linkers (GSG, (GGGGS)n, where n is equal to 1, 2, and 3) and the other three were rigid ((EAAAK)n, where n is equal to 1, 2, and 3)</li>
+
                 <li><b>Seven linkers were chosen for our fusion system.</b> Four of them were flexible glycine and serine linkers (GSG, (GGGGS)n, where n is equal to 1, 2, and 3) and the other three were rigid ((EAAAK)n, where n is equal to 1, 2, and 3)</li>
                 <li>For a primary structural analysis we studied PDB: 3TSY structure - an experimentally determined structure of the fusion system that has 4CL protein in it. We took this structure as a starting point of what we can expect from our models</li>
+
                 <li>For a primary structural analysis we studied PDB: <b>3TSY structure - an experimentally determined structure of the fusion system that has 4CL protein in it.</b> We took this structure as a starting point of what we can expect from our models</li>
                 <li>Protein modeling was initially attempted to perform using homology-based modeling program SWISS-MODEL [8]</li>
+
                 <li>Protein modeling was initially attempted using the <b>homology-based modeling program SWISS-MODEL </b>[8]</li>
 
                 <li>The modeled structures in the first cycle were evaluated using VoroMQA [9], QMEAN [10], and QMEANDisCo [11] scores</li>
 
                 <li>The modeled structures in the first cycle were evaluated using VoroMQA [9], QMEAN [10], and QMEANDisCo [11] scores</li>
 
                 <li>The plot was not drawn in this cycle</li>
 
                 <li>The plot was not drawn in this cycle</li>
Line 242: Line 285:
 
               <h4>Test</h4>
 
               <h4>Test</h4>
 
               <p>
 
               <p>
                 The output of SWISS-MODEL showed that sequence identity between the templates found and the sequence
+
                 The output of SWISS-MODEL showed that <b>sequence identity between the templates found and the sequence
                 that is modeled is 66.42 - 69.36%. According to the specialists in protein modeling,
+
                 that is modeled is 66.42 - 69.36%.</b> According to the specialists in protein modeling,
 
                 homology-based methods are suitable, when the sequence identity between the template
 
                 homology-based methods are suitable, when the sequence identity between the template
                 and the modeled sequence is more than 25%, belonging to the “daylight" zone [13], thus, in theory,
+
                 and the modeled sequence is more than 25%, belonging to the “daylight” zone [13], thus, <b>in theory,
                 the method could be applied.
+
                 the method could be applied.</b>
 
               </p>
 
               </p>
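               <p>
                 The threshold check described above can be sketched as follows. This is a minimal illustration, assuming that identity is counted over aligned, gap-free columns; SWISS-MODEL may compute its reported value differently, and the function names are hypothetical.
               </p>

```python
# Cut-off quoted in the text for the "daylight" zone, in percent.
DAYLIGHT_THRESHOLD = 25.0


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free alignment columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def homology_modeling_applicable(identity_percent: float,
                                 threshold: float = DAYLIGHT_THRESHOLD) -> bool:
    """True when the identity lies above the threshold (assumed strict)."""
    return identity_percent > threshold


if __name__ == "__main__":
    # The identity range reported for the templates found in this cycle.
    for identity in (66.42, 69.36):
        print(identity, homology_modeling_applicable(identity))
```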
 
               <p>
 
               <p>
                 The modeling flow was tested with PyMOL [12] visualization program using commands
+
                 The <b>modeling flow was tested with the PyMOL</b> [12] visualization program using the commands
 
                 `cealign`, `set seq_view`. We were looking for an accurate representation of the
 
                 `cealign`, `set seq_view`. We were looking for an accurate representation of the
 
                 linked proteins in the fusion system by comparing them to their distinct versions.
 
                 linked proteins in the fusion system by comparing them to their distinct versions.
                 The main focus on the system was the linker region, which in the homology-based modeling
+
                 <b>The main focus in the system was the linker region</b>, which in the homology-based modeling
 
                 case was composed of the uncoiled domain of 4CL that made up a highly disordered massive linker region.
 
                 case was composed of the uncoiled domain of 4CL that made up a highly disordered massive linker region.
                 We stopped the flow after the visualization because we decided that structure refinement using molecular
+
                 We stopped the flow after the visualization because we decided that <b>structure refinement using molecular
                 dynamics will not introduce any significant changes to the structure.
+
                 dynamics would not introduce any significant changes to the structure.</b>
 
               </p>
 
               </p>
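               <p>
                 One way to reproduce this visual check is to generate a short PyMOL command script. The sketch below only builds the script text; the file names and the object names <code>fusion</code> and <code>standalone</code> are hypothetical, and the exact commands the team ran beyond `cealign` and `set seq_view` are not known.
               </p>

```python
def build_pymol_script(fusion_path: str, standalone_path: str) -> str:
    """Build a PyMOL command script that superposes a fusion model onto
    the stand-alone structure of one of its proteins."""
    lines = [
        f"load {fusion_path}, fusion",
        f"load {standalone_path}, standalone",
        # cealign takes the target first, then the mobile object
        "cealign standalone, fusion",
        # show the sequence viewer to inspect the linker region
        "set seq_view, 1",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    print(build_pymol_script("fusion_model.pdb", "standalone_4cl.pdb"), end="")
```

               <p>
                 The printed text can be saved to a file and opened in PyMOL, which makes it easy to repeat the same comparison for each linker variant.
               </p>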
 
               <h4>Learn</h4>
 
               <h4>Learn</h4>
Line 261: Line 304:
 
                 Although in theory the homology-based modeling method could be applied,
 
                 Although in theory the homology-based modeling method could be applied,
 
                 these modeling results proved that this approach could not be applied to our fusion protein system.
 
                 these modeling results proved that this approach could not be applied to our fusion protein system.
                 Therefore, we decided to try running <i>ab initio</i> protein modeling programs.
+
                 Therefore, <b>we decided to try running <i>ab initio</i> protein modeling programs.</b>
 
               </p>
 
               </p>
 
               <h3>Cycle 2</h3>
 
               <h3>Cycle 2</h3>
Line 274: Line 317:
 
               <h4>Build</h4>
 
               <h4>Build</h4>
 
               <ol>
 
               <ol>
                 <li>Protein candidates stayed the same as in the first cycle</li>
+
                 <li><b>Protein candidates stayed the same</b> as in the first cycle</li>
                 <li>Linkers stayed the same as in the first cycle</li>
+
                 <li><b>Linkers stayed the same</b> as in the first cycle</li>
                 <li>Primary structure analysis was not required to be redone in the second cycle</li>
+
                 <li><b>Primary structure analysis did not need</b> to be redone in the second cycle</li>
                 <li>Protein modeling was performed using trRosetta and RoseTTAFold simultaneously</li>
+
                 <li>Protein modeling was performed<b> using trRosetta and RoseTTAFold</b> simultaneously</li>
                 <li>The modeled structures were evaluated using VoroMQA score</li>
+
                 <li>The modeled structures were evaluated <b>using the VoroMQA score</b></li>
                 <li>Energy minimization with Yasara.</li>
+
                 <li><b>Energy minimization with Yasara</b></li>
 
                 <li>The Ramachandran plots were drawn before molecular dynamics (MD) simulations</li>
 
                 <li>The Ramachandran plots were drawn before molecular dynamics (MD) simulations</li>
                 <li>We performed short MD simulations for the trivial linker (GSG) case</li>
+
                 <li>We performed <b>short MD simulations for the trivial linker (GSG)</b> case</li>
 
                 <li>The Ramachandran plots were drawn before molecular dynamics (MD) simulations</li>
 
                 <li>The Ramachandran plots were drawn before molecular dynamics (MD) simulations</li>
 
               </ol>
 
               </ol>
 
               <h4>Test</h4>
 
               <h4>Test</h4>
 
               <p>
 
               <p>
                 The modeled structures visually were not satisfactory - the disordered domain of 4CL
+
                 The modeled structures were not visually satisfactory - <b>the disordered domain of 4CL
                 protein was still present in the structure. The energy minimization step with Yasara
+
                 protein was still present in the structure.</b> The energy minimization step with Yasara
 
                 and MD simulations did not introduce any significant changes to the structure.  
 
                 and MD simulations did not introduce any significant changes to the structure.  
 
               </p>
 
               </p>
 
               <h4>Learn</h4>
 
               <h4>Learn</h4>
 
               <p>
 
               <p>
                 After this step we decided to consult the bioinformatician who works with protein modeling.
+
                 After this step, we decided to <b>consult a bioinformatician</b> who works with protein modeling.
                 He suggested we include multiple sequence alignment files into our modeling flow. Additionally,
+
                 He suggested we <b>include multiple sequence alignment files</b> in our modeling flow. Additionally,
                 we got advice to use more scoring functions in the evaluation step.
+
                 we were advised to use <b>more scoring functions in the evaluation step.</b>
 
                 We also consulted a protein molecular dynamics specialist to get a professional
 
                 We also consulted a protein molecular dynamics specialist to get a professional
 
                 insight into how we should run the MD for our system to get a more significant output.
 
                 insight into how we should run the MD for our system to get a more significant output.
                 After the consultation we adjusted the box size of the system and the duration of the simulations.
+
                 After the consultation <b>we adjusted the box size of the system and the duration of the simulations.</b>
 
               </p>
 
               </p>
 
               <h3>Cycle 3</h3>
 
               <h3>Cycle 3</h3>
 
               <h4>Design</h4>
 
               <h4>Design</h4>
 
               <p>
 
               <p>
                 In the third engineering cycle we included our own generated multiple sequence alignment (MSA)
+
                 In the third engineering cycle, we included <b>our own generated multiple sequence alignment (MSA)</b>
 
                 files as input to the protein modeling programs of our choice.
 
                 files as input to the protein modeling programs of our choice.
                 In addition, we took into account the advice from the specialists and included
+
                 In addition, we took into account the advice from the specialists and <b>included
                 QMEAN and QMEANDisCo scores into the evaluation of the structures. The structure
+
                 QMEAN and QMEANDisCo scores</b> in the evaluation of the structures. The structure
 
                 refinement was not performed in this cycle due to the lack of computational resources at the time.
 
                 refinement was not performed in this cycle due to the lack of computational resources at the time.
 
               </p>
 
               </p>
 
               <h4>Build</h4>
 
               <h4>Build</h4>
 
               <ol>
 
               <ol>
                 <li>Protein candidates stayed the same as in the previous cycle</li>
+
                 <li>Protein candidates <b>stayed the same</b> as in the previous cycle</li>
                 <li>Linkers stayed the same as in the previous cycle</li>
+
                 <li>Linkers <b>stayed the same</b> as in the previous cycle</li>
 
                 <li>Primary structure analysis was not required to be redone</li>
 
                 <li>Primary structure analysis was not required to be redone</li>
                 <li>Protein modeling was performed using RoseTTAFold</li>
+
                 <li>Protein modeling was performed<b> using RoseTTAFold</b></li>
                 <li>The modeled structures were evaluated using QMEAN, QMEANDisCo, and VoroMQA scores</li>
+
                 <li>The modeled structures were evaluated <b>using QMEAN, QMEANDisCo, and VoroMQA scores</b></li>
 
                 <li>The plot was not drawn in this cycle</li>
 
                 <li>The plot was not drawn in this cycle</li>
 
                 <li>The structure refinement was not performed in this cycle</li>
 
                 <li>The structure refinement was not performed in this cycle</li>
Line 323: Line 366:
 
               <p>
 
               <p>
 
                 The modeled structures were visually satisfactory -
 
                 The modeled structures were visually satisfactory -
                 the disordered domain of 4CL protein was less disordered in the modeled structure.
+
                 the <b>disordered domain of 4CL protein was less disordered </b>in the modeled structure.
 
               </p>
 
               </p>
 
               <h4>Learn</h4>
 
               <h4>Learn</h4>
Line 332: Line 375:
 
               <h4>Design</h4>
 
               <h4>Design</h4>
 
               <p>
 
               <p>
                 According to the output of the third cycle, the fourth engineering cycle was not required.
+
                 According to the output of the third cycle,<b> the fourth engineering cycle was not required.</b>
                 However, after the latter cycle was finished, AlphaFold2 [17] became available for public use.
+
                 However, after the latter cycle was finished, <b>AlphaFold2 [17] became available for public use.</b>
                 Therefore, we decided to apply this highly evaluated tool in our modeling workflow.
+
                 Therefore, we decided to <b>apply this highly regarded tool</b> in our modeling workflow.
 
                 The structure refinement was not performed in this cycle due to the insignificant impact of
 
                 The structure refinement was not performed in this cycle due to the insignificant impact of
 
                 MD for structures recorded in the studies [18].
 
                 MD for structures recorded in the studies [18].
Line 340: Line 383:
 
               <h4>Build</h4>
 
               <h4>Build</h4>
 
               <ol>
 
               <ol>
                 <li>Protein candidates stayed the same as in the previous cycle</li>
+
                 <li>Protein candidates <b>stayed the same</b> as in the previous cycle</li>
                 <li>Linkers stayed the same as in the previous cycle</li>
+
                 <li>Linkers <b>stayed the same</b> as in the previous cycle</li>
 
                 <li>Primary structure analysis was not required to be redone</li>
 
                 <li>Primary structure analysis was not required to be redone</li>
                 <li>Protein modeling was performed using AlphaFold2</li>
+
                 <li>Protein modeling was performed <b>using AlphaFold2</b></li>
                 <li>The modeled structures were evaluated using QMEAN, QMEANDisCo, ProQ2D, ProQRosCenD, ProQRosFAD, ProQ3D scores</li>
+
                 <li>The modeled structures were evaluated <b>using QMEAN, QMEANDisCo, ProQ2D, ProQRosCenD, ProQRosFAD, and ProQ3D scores</b></li>
 
                 <li>The plots were drawn in this cycle</li>
 
                 <li>The plots were drawn in this cycle</li>
 
                 <li>The structure refinement was not performed in this cycle</li>
 
                 <li>The structure refinement was not performed in this cycle</li>
                 <li>The distances between active sites in fusion protein systems with rigid linkers were calculated</li>
+
                 <li><b>The distances between active sites</b> in fusion protein systems with rigid linkers were calculated</li>
 
               </ol>
 
               </ol>
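               <p>
                 The active-site distance calculation mentioned in the last step can be illustrated with a minimal sketch. It assumes that each active site is represented by the centroid of a few atom coordinates (in angstroms) taken from a modeled structure. The function names and the coordinates below are hypothetical and are not taken from the project's models or scripts.
               </p>

```python
import math


def centroid(points):
    """Return the geometric centre of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def active_site_distance(site_a, site_b):
    """Euclidean distance between the centroids of two active sites."""
    return math.dist(centroid(site_a), centroid(site_b))


# Hypothetical atom coordinates in angstroms (illustration only).
site_a = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
site_b = [(1.0, 30.0, 40.0), (1.0, 30.0, 40.0)]

print(active_site_distance(site_a, site_b))
```

               <p>
                 Using centroids is only one possible convention; the distance between specific catalytic atoms could be measured in the same way by passing single-coordinate lists.
               </p>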
 
               <h4>Test</h4>
 
               <h4>Test</h4>
 
               <p>
 
               <p>
                 The output of the fourth cycle had the less disordered domain just as the output of the third cycle.
+
                 The output of the fourth cycle had a <b>less disordered domain</b>, just like the output of the third cycle.
 
               </p>
 
               </p>
 
               <h4>Learn</h4>
 
               <h4>Learn</h4>
 
               <p>
 
               <p>
                 We were visually satisfied with the modeled complex.
+
                 We were <b>visually satisfied </b>with the modeled complex.
                 Nonetheless, AlphaFold2 models do not require the usage of molecular
+
                 Although <b>AlphaFold2 models do not require the usage of molecular
                 dynamics for structure refinement, we considered applying them to get a
+
                 dynamics</b> for structure refinement, we considered applying them to get a
 
                 better insight into the system with flexible linkers. 
 
                 better insight into the system with flexible linkers. 
 
               </p>
 
               </p>
Line 366: Line 409:
 
               <h4>Design</h4>
 
               <h4>Design</h4>
 
               <p>
 
               <p>
                 Before beginning SELEX with purified soluble proteins we chose a few SELEX protocols available from the internet. One in industry called XELEX [19] and another from an article specifically intended for fool-proof aptamer discovery[20].
+
                 Before beginning SELEX with purified soluble proteins, we chose a few SELEX protocols available online, including one used in industry called XELEX [19] and another from an article specifically intended for fool-proof aptamer discovery [20].
 
               </p>
 
               </p>
 
               <h4>Build</h4>
 
               <h4>Build</h4>
 
               <p>
 
               <p>
                 We tried the initial SELEX runs with the default parameters but for some reason it kept failing, so we chose as many parameters as we could in order to optimise the process. We settled on the bead and protein amount in the SELEX mixture, the heating of aptamers before the incubation length, the storage buffer of the aptamers post-cycle, and the polymerase itself.  
+
                 We tried the <b>initial SELEX runs with the default parameters</b>, but they kept failing, so we identified as many adjustable parameters as we could in order to optimize the process. We settled on <b>the bead and protein amount</b> in the SELEX mixture, <b>the duration of aptamer heating</b> before incubation, <b>the storage buffer</b> of the aptamers post-cycle, and the <b>polymerase</b> itself. 
 
               </p>
 
               </p>
 
               <h4>Test</h4>
 
               <h4>Test</h4>
 
               <p>
 
               <p>
                 We lowered the amount of protein and beads because we suspected that it might interfere with both the SELEX process and the downstream PCR reaction. We also lowered the heating of the aptamers before the selection round from 10 to 5 minutes based on another source  [21]. We also substituted the TE exchange buffer with distilled water because the EDTA in the TE buffer interferes with the PCR reaction polymerase. Finally, we tested the efficiency of the PCR reactions with both DreamTaq and Phusion polymerases.
+
                 We <b>lowered the amount of protein and beads</b> because we suspected that they might interfere with both the SELEX process and the downstream PCR reaction. We also <b>shortened the heating</b> of the aptamers before the selection round from 10 to 5 minutes based on another source [21]. We also <b>substituted the TE exchange buffer with distilled water</b> because the EDTA in the TE buffer interferes with the polymerase in the PCR reaction. Finally, we <b>tested the efficiency of the PCR</b> reactions with both DreamTaq and Phusion polymerases.
 
               </p>
 
               </p>
 
               <h4>Learn</h4>
 
               <h4>Learn</h4>
 
               <p>
 
               <p>
                 In the end we did manage to optimize the SELEX process enough for it to produce aptamers. We determined that more protein and beads result in obstruction of SELEX. Half of the initial quantity should be applicable for second and further rounds. we also found out that we should heat aptamer pool for 5 minutes instead of 10 minutes after first round of SELEX. Heating for too long results in degradation or other form of loss of our dsDNA from previous round. The aptamer pool post-selection round should be stored in water, not in TE buffer as the EDTA interferes with the functions of polymerase, Phusion polymerase seems to not be affected as much as DreamTaq, and works better in both water and TE buffer.
+
                 In the end, <b>we did manage to optimize the SELEX process</b> enough for it to produce aptamers. We determined that larger amounts of protein and beads obstruct SELEX. Half of the initial quantity is suitable for the second and further rounds. We also found out that <b>we should heat the aptamer pool for 5 minutes instead of 10 minutes</b> after the first round of SELEX. Heating for too long results in degradation or other forms of loss of our dsDNA from the previous round. The aptamer pool post-selection round should be stored in water, not in TE buffer, as the EDTA interferes with the functions of the polymerase. <b>Phusion polymerase seems to not be affected as much as DreamTaq</b> and works better in both water and TE buffer.
 
               </p>
 
               </p>
 
               <h3>Cycle 2</h3>
 
               <h3>Cycle 2</h3>
 
               <h4>Design</h4>
 
               <h4>Design</h4>
 
               <p>
 
               <p>
                 For the second cycle we researched that less bead and protein complex should have better effect in regenerating aptamers. The first SELEX round has more quantity of beads for making sure that all possible aptamers bind, however second and further rounds should have half of initial volume. It also increases stringency parallel to other parameters.
+
                 For the second cycle, our research suggested that <b>a smaller amount of bead and protein complex should be more effective in regenerating aptamers.</b> The first SELEX round should have a <b>higher quantity of beads</b> to make sure that all possible aptamers bind. However, the second and further rounds should have <b>half of the initial volume.</b> This also increases stringency in parallel with other parameters.
 
               </p>
 
               </p>
 
               <h4>Build</h4>
 
               <h4>Build</h4>
 
               <p>
 
               <p>
                 In order to check which quantity of complex suits best we developed a plan to try changes on separate initial and move on to further rounds.
+
                 In order to check which quantity of complex suits us best, we developed a plan to try the changes on separate initial rounds and then move on to further rounds.
 
               </p>
 
               </p>
 
               <h4>Test</h4>
 
               <h4>Test</h4>
 
               <p>
 
               <p>
                 We tested unchanged bead and protein complex quantities for a few rounds and side by side tried various complex volumes for second and further rounds. Cutting volume by ⅓, ½ and ⅔ . The change to half of initial volume suited our SELEX protocol the best.
+
                 We tested unchanged bead and protein complex quantities for a few rounds and, side by side, tried various complex volumes for the second and further rounds. Comparing volumes reduced by ⅓, ½ and ⅔, we found that <b>reducing to half of the initial volume suited our SELEX protocol the best.</b>
 
               </p>
 
               </p>
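               <p>
                 The volume reductions compared above amount to simple arithmetic, sketched here for illustration. The initial volume of 60 µL is a hypothetical value chosen for the example, not a value from our protocol, and the helper name is an assumption.
               </p>

```python
from fractions import Fraction


def reduced_volume(initial_volume, reduction):
    """Volume left after cutting the initial volume by the given fraction."""
    return initial_volume * (1 - reduction)


# Hypothetical first-round bead-protein complex volume in microlitres.
initial_ul = 60

# The three reductions compared in the Test stage: 1/3, 1/2 and 2/3.
for reduction in (Fraction(1, 3), Fraction(1, 2), Fraction(2, 3)):
    print(f"cut by {reduction}: {reduced_volume(initial_ul, reduction)} uL")
```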
 
               <h4>Learn</h4>
 
               <h4>Learn</h4>
 
               <p>
 
               <p>
                 First round should have larger volume of bead complex and lower in later. More protein and beads result in obstruction of SELEX. Half of the initial quantity should be applicable for second and further rounds.
+
                 We found that <b>the first round should have a larger volume of bead complex and later rounds a lower one.</b> Larger amounts of protein and beads obstruct SELEX. Half of the initial quantity is suitable for the second and further rounds.
 
               </p>
 
               </p>
 
             </div>
 
             </div>
Line 403: Line 446:
 
               <h4>Design</h4>
 
               <h4>Design</h4>
 
               <p>
 
               <p>
                 One of our project’s parts is genetically modified probiotics which are capable of performing <i>de novo</i> synthesis of natural flavonoid naringenin. This flavonoid has reached our attention as it remarkably reduces <i>Entamoeba histolytica</i> viability [22] and does not cause any harm to humans. Conversely, it has been found that naringenin has antioxidant, antitumor, antiviral, antibacterial, anti-inflammatory, anti-adipogenic and cardioprotective features [23]. In order to create an efficiently working naringenin production system, we have been searching for various tools, such as the synthetic protein quality control (ProQC) system, the most efficient metabolic pathway enzymes, promoters of particular strength, etc. However, it is not only important to construct a fully functional system but also ensure its stability in the chosen chassis. For this reason we chose to insert naringenin synthesis metabolic pathway into our probiotic strains: E. coli Nissle 1917 and <i>Lactobacillus casei</i> BL23. Insertion into genomic DNA has several advantages. Firstly, it provides stability to the system. If the new metabolic pathway does not cause too much burden to the chassis, it should be maintained in genom active (other way there might be some mutations suppressing its functionality, reduced cell growth, enhanced cell sensitivity to nutrient deficiency [24]). Surprisingly the integration of the genetic circuit of the metabolic pathway into the chassis genome contributes to the burden reduction as it serves as DNA copy number tuning. Secondly, by inserting the desired sequences in the genome we eliminate the need of antibiotic or any other environmental pressure usage in order to maintain needed genetic circuits in the chosen chassis. This is very important when the designed naringenin producing bacteria will live in the human intestine.  
+
                 One part of our project is <b>genetically modified probiotics</b> that are capable of performing <i>de novo</i> synthesis of the natural flavonoid naringenin. This flavonoid caught our attention as it remarkably reduces <i>Entamoeba histolytica</i> viability [22] and does not cause any harm to humans. On the contrary, it has been found that naringenin has <b>antioxidant, antitumor, antiviral, antibacterial, anti-inflammatory, anti-adipogenic, and cardioprotective features [23].</b> To create an efficiently working naringenin production system, we searched for various tools, such as the synthetic protein quality control (ProQC) system, the most efficient metabolic pathway enzymes, and promoters of particular strength. However, it is important not only to construct a fully functional system but also to ensure its stability in the chosen chassis. For this reason, we chose to insert the naringenin synthesis metabolic pathway into the genomes of our probiotic strains: <i>E. coli</i> Nissle 1917 and <i>Lactobacillus casei</i> BL23. Insertion into genomic DNA has several advantages. Firstly, it <b>provides stability to the system.</b> If the new metabolic pathway does not cause too much burden to the chassis, it should remain active in the genome (otherwise there might be mutations suppressing its functionality, reduced cell growth, or enhanced cell sensitivity to nutrient deficiency [24]). Surprisingly, the integration of the genetic circuit of the metabolic pathway into the chassis genome contributes to burden reduction, as it serves as DNA copy number tuning. Secondly, by inserting the desired sequences into the genome we <b>eliminate the need for antibiotics</b> or any other environmental pressure to maintain the needed genetic circuits in the chosen chassis. This is very important because the designed naringenin-producing bacteria are intended to live in the human intestine. 
 
               </p>
 
               </p>
 
               <p>
 
               <p>
                 In order to reach this goal we did a huge research in the genome editing techniques field. With the help of our advisor Inga Songailienė, who is focusing on CRISPR-Cas systems in her PhD thesis, we have chosen to use the pCas-pTarget plasmids system which enables the usage of CRISPR-Cas9 and Lambda Red recombination combinment into the one system to acquire the efficient <i>E. coli</i> Nissle 1917 genome editing [25]. For <i>L. casei</i> BL23 we have chosen a single plasmid pLCNICK based system relying on CRISPR-Cas9 D10A Nickase-Assisted genome editing [26]. For the genomic insertion site in <i>L. casei</i> BL23 we decided to follow the pLCNICK already provided sequences required for genome editing because we have run out of resources for new DNA sequences, and for this chassis there is a need of longer homology arms. Meanwhile, we turned our all focus on E coli Nissle 1917 in order to design the robust metabolic pathway functionality. In the future we would adapt it to the L. casei BL23, too.  
+
                 In order to reach this goal, we did extensive research in the field of genome editing techniques. With the help of our advisor Inga Songailienė, who is focusing on CRISPR-Cas systems in her Ph.D. thesis, we have <b>chosen to use the pCas-pTarget plasmid system</b>, which combines CRISPR-Cas9 and Lambda Red recombination in one system to achieve efficient <i>E. coli</i> Nissle 1917 genome editing [25]. For <i>L. casei</i> BL23 we have chosen <b>a single-plasmid pLCNICK-based system</b> relying on CRISPR-Cas9 D10A nickase-assisted genome editing [26]. For the genomic insertion site in <i>L. casei</i> BL23, we decided to use the sequences required for genome editing that are already provided with pLCNICK. We followed this approach because we had run out of resources for new DNA sequences, and this chassis needs longer homology arms. Meanwhile, we turned all our focus on <i>E. coli</i> Nissle 1917 to design a functional and robust metabolic pathway. In the future, we would adapt it to <i>L. casei</i> BL23, too. 
 
               </p>
 
               </p>
 
               <p>
 
               <p>
                 During the design stage firstly we considered which genomic sequences would be used for metabolic pathway insertion. From literature we learned that 16S rRNA genes might be used for this purpose [27], however it would not let us implement a ProQC system as 16S rRNA transcript is created as polycistronic RNA. For this reason the targeted non-essential gene - ekolin. Also, we have considered possible metabolic flux enhancement toward naringenin synthesis. It covers acetate kinase (ackA), phosphate acetyltransferase (pta) knockouts creation to channel carbon flows toward acetyl-CoA [28] subsequently leading to the enhanced malonyl-CoA amounts needed for bottle-neck reaction in naringenin production [29]. Also, tyrosine specific transporter (tyrP) knockout mutants are potentially able to produce 10% higher amounts of L-tyrosine (naringenin synthesis precursor) than that of the original strains [30]. In addition, aldehyde-alcohol dehydrogenase (adhE) knockout disrupts the conversion of acetyl-CoA to ethanol in the cell, therefore more acetyl-CoA can be converted to malonyl-CoA [31] further enhancing naringenin synthesis.
+
                 During the design stage, we first <b>considered genomic sequences</b> that would be used for metabolic pathway insertion. From the literature, we learned that <b>16S rRNA genes might be used for this purpose</b> [27]. However, it would not let us implement a ProQC system due to the 16S rRNA transcript being created as polycistronic RNA. For this reason, we targeted a <b>non-essential gene</b>, <i>colicin</i>. We have also considered possible metabolic flux enhancement toward naringenin synthesis. This involves creating acetate kinase (ackA) and phosphate acetyltransferase (pta) knockouts to channel carbon flow toward acetyl-CoA [28]. Subsequently, this leads to the enhanced malonyl-CoA amounts needed for the bottleneck reaction in naringenin production [29]. Also, tyrosine-specific transporter (tyrP) knockout mutants are potentially able to produce <b>10% higher amounts of L-tyrosine</b> (naringenin synthesis precursor) than the original strains [30]. In addition, aldehyde-alcohol dehydrogenase (adhE) knockout disrupts the conversion of acetyl-CoA to ethanol in the cell, so more acetyl-CoA can be converted to malonyl-CoA [31], further enhancing naringenin synthesis.
 
               </p>
 
               </p>
 
               <h4>Build</h4>
 
               <h4>Build</h4>
 
               <p>
 
               <p>
                 To fulfill our experimental plan, we have designed sgRNA for genomic insertion and earlier mentioned knockouts generation. For this reason we used CRISPOR tool [32] to assess the probability of sgRNA efficiency and off-targets generation. Furthermore, after selecting sgRNA for each gene we have checked how many other similar DNA sequences exist in the <i>E. coli</i> Nissle 1917 genome. For this we have used NCBI Blast tool [33] to analyse whether there are any other identical sequences to the seed region of the chosen sgRNA. It is important to be aware of it because it might contribute to possible off-targets [34]. Also, for genome insertion we decided to use approximately 70-90 bp long homology sequences flanking the inserts as it should work on this system [35]. By following these considerations we have constructed sgRNAs specific to <i>ackA</i> (<a href="http://parts.igem.org/Part:BBa_K3904401">BBa_K3904401</a>), <i>pta</i> (<a href="http://parts.igem.org/Part:BBa_K3904402">BBa_K3904402</a>), <i>adhE</i> (<a href="http://parts.igem.org/Part:BBa_K3904403">BBa_K3904403</a>), <i>tyrP</i> (<a href="http://parts.igem.org/Part:BBa_K3904404">BBa_K3904404</a>), <i>ekolin</i> (<a href="http://parts.igem.org/Part:BBa_K3904405">BBa_K3904405</a>) genes.
+
                 To fulfill our experimental plan, we have designed <b>sgRNA for genomic insertion</b> and for generation of the knockouts mentioned earlier. For this purpose, we used the <b>CRISPOR tool</b> [32] to assess sgRNA efficiency and the probability of off-target generation. Furthermore, after selecting sgRNA for each gene, we checked how many other similar DNA sequences exist in the <i>E. coli</i> Nissle 1917 genome. For this purpose, we used the <b>NCBI BLAST tool</b> [33] to analyze whether there are any other sequences identical to the seed region of the chosen sgRNA. It is important to be aware of this because such sequences might contribute to possible off-targets [34]. Also, for genome insertion we decided to use <b>approximately 70-90 bp long homology sequences</b> flanking the inserts, as this should work in this system [35]. By following these considerations we have constructed sgRNAs specific to <i>ackA</i> (<a href="http://parts.igem.org/Part:BBa_K3904401">BBa_K3904401</a>), <i>pta</i> (<a href="http://parts.igem.org/Part:BBa_K3904402">BBa_K3904402</a>), <i>adhE</i> (<a href="http://parts.igem.org/Part:BBa_K3904403">BBa_K3904403</a>), <i>tyrP</i> (<a href="http://parts.igem.org/Part:BBa_K3904404">BBa_K3904404</a>), and <i>colicin</i> (<a href="http://parts.igem.org/Part:BBa_K3904405">BBa_K3904405</a>) genes.
 
               </p>
 
               </p>
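               <p>
                 The seed-region check described above can be illustrated with a minimal sketch. It assumes a 12-nt PAM-proximal seed and an NGG PAM, and it scans a toy sequence on both strands by exact string matching. This is a simplified stand-in for a BLAST search, not the procedure we ran; the spacer, the toy genome, and the function names are hypothetical.
               </p>

```python
import re


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_seed_sites(genome, spacer, seed_len=12):
    """Count sites, on both strands, where the PAM-proximal seed of the
    spacer is immediately followed by an NGG PAM."""
    seed = spacer[-seed_len:]
    # Lookahead so that overlapping sites are also counted.
    pattern = re.compile("(?=" + re.escape(seed) + "[ACGT]GG)")
    strands = (genome, reverse_complement(genome))
    return sum(len(pattern.findall(strand)) for strand in strands)


# Hypothetical 20-nt spacer and toy "genome" (illustration only).
spacer = "GACGTTAGCCATGCAAGTCC"
toy_genome = "TTT" + spacer + "TGG" + "AAAAA" + spacer[-12:] + "AGG" + "TTT"

# One on-target site plus one extra seed match next to a PAM.
print(count_seed_sites(toy_genome, spacer))
```

               <p>
                 A count above one suggests that the seed is not unique next to a PAM, which is the situation that may contribute to off-targets.
               </p>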
 
               <h4>Test</h4>
 
               <h4>Test</h4>
 
               <p>
 
               <p>
                 We have performed genome editing as it is described in the literature [25]. However, genome editing has not been very efficient as very few cells have survived and not all of them contained desired modifications.
+
                 We performed genome editing as described in the literature [25]. However, genome editing was not very efficient, as very <b>few cells survived</b> and not all of them contained the desired modifications.
 
               </p>
 
               </p>
 
               <h4>Learn</h4>
 
               <h4>Learn</h4>
 
               <p>
 
               <p>
                 As we have received these results, we have hypothesized that some of the cells might lose pCas plasmid as it has a temperature sensitive replicon.  
+
                 Having received these results, we hypothesized that some of the cells <b>might lose the pCas plasmid</b>, as it has a temperature-sensitive replicon. 
 
               </p>
 
               </p>
 
               <h3>Cycle 2</h3>
 
               <h3>Cycle 2</h3>
 
               <h4>Design</h4>
 
               <h4>Design</h4>
 
               <p>
 
               <p>
                 In the second genome editing cycle we have used the same parts but changed some experimental conditions. We have contacted our advisor Inga Songailienė, who gave us some insights about this system.  
+
                 In the second genome editing cycle, we used the same parts but <b>changed some experimental conditions.</b> We contacted our advisor Inga Songailienė, who gave us some insights about this system. 
 
               </p>
 
               </p>
 
               <h4>Build</h4>
 
               <h4>Build</h4>
Line 433: Line 476:
 
               </p>
 
               </p>
 
               <ol>
 
               <ol>
                 <li>Lower growth temperature from 30°C to 27.5°C because a water filled incubator might show a bit lower temperature than it is in it.</li>
+
                 <li>Lowered the growth temperature <b>from 30°C to 27.5°C</b> because a water-filled incubator might display a slightly lower temperature than the actual temperature inside it.</li>
                 <li>Cells have been induced for recombination at 0.5-0.8 OD instead of 0.4-0.6 OD during preparation for co-electroporation with pTarget and recombination template.</li>
+
                 <li>Cells have been induced for recombination at <b>0.5-0.8 OD instead of 0.4-0.6 OD</b> during preparation for co-electroporation with pTarget and recombination template.</li>
                 <li>Enhanced the L-arabinose concentration till 0.2 % (previously it was 0.15 %).</li>
+
                 <li>Increased the L-arabinose concentration to<b> 0.2 % (previously it was 0.15 %).</b></li>
 
               </ol>
 
               </ol>
 
               <h4>Test</h4>
 
               <h4>Test</h4>
 
               <p>
 
               <p>
                 We get <i>pta</i> and <i>ackA</i> genes knockouts. However, even after several repeats and recombination template amount increase (600 ng instead of 400 ng), new recombination template preparation, <i>adhE</i> and <i>tyrP</i> genes left unmodified.
+
                 We obtained <i>pta</i> and <i>ackA</i> gene knockouts. However, even after several repeats, an increased recombination template amount (600 ng instead of 400 ng), and preparation of a new recombination template,<b> the <i>adhE</i> and <i>tyrP</i> genes remained unmodified.</b>
 
               </p>
 
               </p>
 
               <h4>Learn</h4>
 
               <h4>Learn</h4>
 
               <p>
 
               <p>
                 By these changes we learned that it is very important the timing of recombination induction. However, we still had some unsolved problems with two left genes.
+
                 By these changes, we learned that the <b>timing of recombination induction is very important.</b> However, we still had some unsolved problems with the two remaining genes.
 
               </p>
 
               </p>
 
               <h3>Cycle 3</h3>
 
               <h3>Cycle 3</h3>
 
               <h4>Design</h4>
 
               <h4>Design</h4>
 
               <p>
 
               <p>
                 As we see there are still some undetected problems. After further literature research we found some alteration of this system [26]. Also, by this time we had constructed our kill-switch system, so we decided to insert it and superfolder green fluorescent protein (sfGFP) into the <i>ekolin</i> gene. sfGFP is to check how well the designed system is working and kill-switch VapXD system integration into genomic DNA is important for evaluation of how this system would perform in the final product.
+
                 As we saw there were still some undetected problems. After further literature research, we found some alterations to this system [26]. Also, by this time we had constructed our<b> kill-switch system, </b>so we decided to insert it and superfolder green fluorescent protein (sfGFP) into the <i>colicin</i> gene. We used sfGFP to check how well the designed system is working and kill-switch VapXD system integration into genomic DNA is important for evaluation of how this system would perform in the final product.
 
               </p>
 
               </p>
 
               <h4>Build</h4>
 
               <h4>Build</h4>
 
               <p>
 
               <p>
                 In this engineering cycle we have changed the induction timing - L-arabinose to final 0.2 % concentration has been added at 0.2-0.3 OD600 and cells harvested by centrifugation then the OD600 has reached 0.6-0.8. Recombination template amount for kill-switch constructs, GFP - 400 ng, while for <i>tyrP</i> and <i>adhE</i> - 800 ng as previously there were many colonies resistant to both antibiotics but none of them insertion-positive.
+
                 In this engineering cycle, we changed the induction timing: <b>L-arabinose was added to a final concentration of 0.2 % at 0.2-0.3 OD600 </b>and cells were harvested by centrifugation when <b>the OD600 reached 0.6-0.8.</b> The recombination template amount was 400 ng for the kill-switch and GFP constructs and 800 ng for <i>tyrP</i> and <i>adhE</i>, because previously there were many colonies resistant to both antibiotics but <b>none of them was insertion-positive.</b>
 
               </p>
 
               </p>
 
               <h4>Test</h4>
 
               <h4>Test</h4>
 
               <p>
 
               <p>
                 Both kill-switch systems and sfGFP constructs have been inserted into genomic DNA with almost 100 % accuracy leading to 7-12 insertion-positive colonies and just two survived colonies in the control plate. Meanwhile, <i>adhE</i> and <i>tyrP</i> genes knockout mutants still have not been generated. In control plates of those two mutant creation experiments there were from hundreds to thousands of colonies. A bit smaller numbers occurred in plates where mutants were expected but not found.
+
                 Both kill-switch systems and sfGFP constructs have been inserted into genomic DNA with almost 100 % accuracy leading to<b> 7-12 insertion-positive colonies</b> and just <b>two surviving colonies</b> in the control plate. Meanwhile, <i>adhE</i> and <i>tyrP</i> gene knockout mutants still have not been generated. In the control plates of those two mutant creation experiments there were hundreds to thousands of colonies. Slightly smaller numbers occurred on plates where mutants were expected but not found.
 
               </p>
 
               </p>
 
               <h4>Learn</h4>
 
               <h4>Learn</h4>
 
               <p>
 
               <p>
                 The results of kill-switch and sfGFP insertion into the genome proves the system and chosen sgRNA for the <i>ekolin</i> gene is working. However, just a bit of GFP fluorescence could be seen by eye after blue light enlightening, This might be because very little GFP is produced from only one DNA copy of this construct. Further investigation needs to be done.
+
                 The results of kill-switch and sfGFP insertion into the genome proved that the system and the chosen sgRNA for the <i>colicin</i> gene are working. However, only a little GFP fluorescence could be seen by eye under blue light. This might be because very little GFP is produced from only one DNA copy of this construct. Thus, further investigation needed to be done.
 
               </p>
 
               </p>
 
               <p>
 
               <p>
                 The amount of colonies raised after several tries to generate <i>tyrP</i> and <i>adhE</i> genes knockouts leads to two possible explanations. Firstly, there is a huge possibility that chosen sgRNA is not capable to properly lead Cas9 endonuclease to a particular genomic site and cause double-strand break. This leads to almost all bacteria surviving even if they have Cas9 and gene specific sgRNA inside. Second possible explanation would be that those two genes are too important for bacteria surveillance, so either it escapes double strand break occurrence, or it dies because of nonfunctional gene creation after successful genomic recombineering.
+
                 The number of colonies that grew after several attempts to generate <i>tyrP</i> and <i>adhE</i> gene knockouts leads to two possible explanations. Firstly, there is a high probability that the chosen <b>sgRNA is incapable of properly leading the Cas9 endonuclease to a particular genomic site</b> and causing a double-strand break. This leads to almost all bacteria surviving even if they have Cas9 and gene-specific sgRNA inside. The second possible explanation is that those<b> two genes are too important for bacterial survival,</b> so a bacterium either escapes the double-strand break or dies because of the nonfunctional gene created after successful genomic recombineering.
 
               </p>
 
               </p>
 
               <h3>Cycle 4</h3>
 
               <h3>Cycle 4</h3>
 
               <h4>Design</h4>
 
               <h4>Design</h4>
 
               <p>
 
               <p>
                 After taking all earlier mentioned considerations together, we decided to design new sgRNA for <i>adhE</i> and <i>tyrP</i> genes to test whether our previously designed sgRNA are truly non efficient or can not be created mutants of <i>tyrP</i> and <i>adhE</i> genes knockouts. Moreover, as GFP levels from colonies with genomic insertion of its construct do not generate any clear fluorescence, we looked deeper into the pattern of genome wide <i>E. coli</i> transcriptional map [27]. From the literature and consultation with our advisor Inga Songailienė, we have chosen another genomic target for metabolic pathway insertion. <i>nupG</i> gene is responsible for broad-specific nucleoside permease synthesis.
+
                 After taking the earlier mentioned considerations together, we decided to design <b>new sgRNA for <i>adhE</i> and <i>tyrP</i> genes</b> to test whether our previously designed <b>sgRNA are truly inefficient </b>or whether <i>tyrP</i> and <i>adhE</i> gene knockout mutants cannot be created. Moreover, as colonies with a genomic insertion of the GFP construct <b>did not generate any clear fluorescence,</b> we looked deeper into the genome-wide <i>E. coli</i> transcriptional map [27]. Based on the literature and consultation with our advisor Inga Songailienė, we chose <b>another genomic target for metabolic pathway insertion.</b> The <i>nupG</i> gene is responsible for the synthesis of a broad-specificity nucleoside permease.
 
               </p>
 
               </p>
 
               <h4>Build</h4>
 
               <h4>Build</h4>
 
               <p>
 
               <p>
                 <i>nupG</i> specific sgRNA have been selected by CRISPOR [22] tool. Recombination template created by three steps PCR with long overhangs.
+
                 <i>nupG</i> <b>specific sgRNA </b>was selected with the CRISPOR [22] tool. The recombination template was created by three-step PCR with long overhangs.
 
               </p>
 
               </p>
 
               <h4>Test</h4>
 
               <h4>Test</h4>
 +
              <p>
 +
                GFP, linked 4CL-CHS, TAL and CHI genes have been inserted into <i>nupG</i> and <i>colicin</i> genes. <i>tyrP, adhE</i> knockouts have not been obtained.
 +
              </p>
 
               <h4>Learn</h4>
 
               <h4>Learn</h4>
 +
              <p>
 +
               From GFP fluorescence measurements we learned that <i>nupG</i> is transcriptionally more active. We also found that the primer pair we designed for the third PCR round of recombination template creation for insertion into the <i>nupG</i> gene is not suitable. These primers do not generate any PCR product, probably because they do not anneal to the genomic DNA. This might be due to some nucleotide variation in the <i>nupG</i> gene in the Nissle genome. To avoid this problem, the gene chosen for genomic editing should be sequenced first, and only then should all other required DNA constructs be designed.
 +
              </p>
 
             </div>
 
             </div>
 
           </div>
 
           </div>

Latest revision as of 03:42, 22 October 2021

ENGINEERING SUCCESS


Overview

During the development of our project, we performed many engineering cycles. We decided to describe the five that had the most distinct stages.

Since there are four engineering stages (design, build, test, and learn), we imagined the engineering process as a unit circle, which can also be represented as a sinusoid. One sinusoid wave corresponds to one engineering cycle.
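The unit-circle picture can be sketched in a few lines of Python. This is only an illustration of the idea, not the code behind the animation: the function names are hypothetical, and the assignment of each stage to one quarter of the wave, in the order design, build, test, learn, is an assumption.

```python
import math
from typing import Tuple

# Assumed order: each stage occupies one quarter of a full turn of the unit circle.
STAGES = ("design", "build", "test", "learn")


def stage_at(progress: float) -> Tuple[int, str]:
    """Map progress (measured in whole cycles, >= 0) to (cycle number, stage)."""
    if progress < 0:
        raise ValueError("progress must be non-negative")
    cycle = int(progress // 1) + 1   # cycles are numbered from 1
    fraction = progress % 1.0        # position inside the current cycle
    return cycle, STAGES[int(fraction * 4)]


def wave_height(progress: float) -> float:
    """Height of the sinusoid: one full period per engineering cycle."""
    return math.sin(2 * math.pi * progress)


if __name__ == "__main__":
    for p in (0.0, 0.3, 1.6, 2.8):
        print(p, stage_at(p), round(wave_height(p), 3))
```

Under these assumptions, a process with three engineering cycles corresponds to progress values from 0 to 3, that is, three full periods of the sinusoid.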

Below you can see the general animation describing our vision. It shows the engineering process that contains three engineering cycles.

Protein production - PPDK

Cycle 1

Design

At first, we looked for a protein marker that could be used to identify Entamoeba histolytica. We found a few promising candidates and, in the end, chose cysteine proteinase 5 (CP5) and pyruvate phosphate dikinase (PPDK) as our analytes because they are unique markers of E. histolytica. One of them has already been considered as a marker for a test in previous papers [1,2]. We decided to put it in the pET28a(+) plasmid because that way we could manipulate the insert in various ways and get the construct in both N-His and C-His forms.

Build

The cloning of PPDK was unsuccessful: sequencing showed a few mutations in our gene of interest that made it impossible to use the protein. We tried to clone it again. After four failed attempts we were running out of our ordered construct, so we decided to clone it into a pUC19 vector with simple blunt-end cloning. Since that worked on the first attempt, we concluded that there was probably something wrong with our plasmid backbone. Finally, we successfully cloned PPDK into the pET28a(+) vector.

Test

We tested a few different E. coli strains and growing conditions for the induction. Some produced no visible protein at all and some produced a lot, but none had a substantial amount in the soluble fraction. After determining that the BL21 (DE3) strain had the most protein in the insoluble fraction, we tested different lysis conditions, varying the saccharose, NP-40 and Triton X-100 concentrations in the lysis buffers.

Learn

After all the testing, we found that the best conditions for induction are the E. coli strain BL21 (DE3) with 1 mM IPTG for 3 hours at 37°C in TB medium [2]. The most protein was found in the soluble fraction when we used a lysis buffer with 50 mM NaH2PO4, 500 mM NaCl, 10 mM imidazole, and 0.5 % NP-40. However, the protein was not pure enough to start SELEX and there was not much of it, so we needed to find an alternative.

Cycle 2

Design

Since we could not synthesize a significant amount of our target protein, we decided to perform our SELEX process on a denatured (and refolded) protein. We also decided to denature the proteins in the blood sample in our test.

Build

First, we needed to determine the concentration of the strong detergent required. Then we needed to find a buffer solution that could keep the protein soluble and would not interfere with the SELEX process downstream.

Test

We tried to dissolve the PPDK-containing biomass with a gradient of urea buffers to find the concentration at which the protein dissolves best. After determining the denaturing lysis conditions, we dialyzed the lysate against a soluble lysis buffer containing 5 different additives - arginine (0.375 M), trehalose (0.75 M), proline (0.5 M), mannitol (0.5 M) and CuCl2 (10 mM) [3, 4].
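When preparing such additive buffers, the mass to weigh out follows from concentration × volume × molar mass. The sketch below is a convenience calculation, not part of the original protocol. It assumes the anhydrous, free form of each additive, because the text does not state which salt or hydrate was used, and the 100 mL volume in the example is hypothetical.

```python
# Molar masses in g/mol for the anhydrous, free forms (assumption: the text
# does not say which salt or hydrate of each additive was actually used).
MOLAR_MASS_G_PER_MOL = {
    "arginine": 174.20,
    "trehalose": 342.30,
    "proline": 115.13,
    "mannitol": 182.17,
    "CuCl2": 134.45,
}

# Additive concentrations in mol/L, as listed in the text.
ADDITIVE_MOLARITY = {
    "arginine": 0.375,
    "trehalose": 0.75,
    "proline": 0.5,
    "mannitol": 0.5,
    "CuCl2": 0.010,
}


def grams_needed(additive: str, volume_l: float) -> float:
    """Mass of additive (g) needed to reach its listed molarity in volume_l litres."""
    return ADDITIVE_MOLARITY[additive] * volume_l * MOLAR_MASS_G_PER_MOL[additive]


if __name__ == "__main__":
    # Hypothetical example volume: 100 mL of buffer per additive.
    for name in ADDITIVE_MOLARITY:
        print(f"{name}: {grams_needed(name, 0.1):.2f} g per 100 mL")
```

If a hydrate or hydrochloride salt is used instead, the corresponding molar mass has to be substituted before weighing.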

Learn

We found that PPDK becomes soluble when a lysis buffer supplemented with 6 M urea is used. The solubilized protein can then be transferred into a phosphate buffer containing 500 mM NaCl and 0.375 M arginine. The protein stays soluble in this solution and can be used in downstream applications.

Protein production - CP5

Cycle 1

Design

At first, we looked for a protein marker that could be used to identify Entamoeba histolytica and found a few promising candidates. In the end, we chose cysteine proteinase 5 (CP5) and pyruvate phosphate dikinase (PPDK) as our analytes because they are unique markers of E. histolytica, and one of them has already been considered as a marker for a test in previous papers [1,2]. We decided to put it in the pET28a(+) plasmid because we could manipulate the insert in various ways and get the construct in both N-His and C-His forms.

Build

The cloning of CP5 into the pET28a(+) vector was successful with both N-His and C-His tags. The constructs were then transformed into the E. coli BL21 (DE3) strain.

Test

The protein synthesis was successful, and we did not need to optimize the growing conditions besides the IPTG concentration: we compared protein expression at 1 mM and 0.6 mM IPTG and found that the lower concentration works better. We ran into several problems while trying to process the insoluble protein because it had to be denatured, renatured, and then activated [5, 6]. The protein was dissolved and purified successfully, but we were not able to refold or activate it. After a few consultations with relevant experts from the lab of our PI, we tried increasing our refolding buffer’s glycerol concentration, adding saccharose, and changing the refolding temperature from 4 °C to 37 °C, but none of it worked. The only condition left that we could change was the ratio of buffer to protein. However, that was not feasible because we did not have the equipment to concentrate large volumes of buffer after refolding. The one method we did try was drawing the water out of the buffer through a dialysis membrane with carboxymethyl cellulose. It took more than 4 days, with constant cellulose clean-up and replacement, for the buffer volume to go from 100 mL to about 20-30 mL.

Learn

In the end, we could not determine what the problem with CP5 processing was. It was not possible to tell whether the protein folded correctly or did not fold at all, or whether the problem was in the protein activation process: all we could see was the 32 kDa band on our SDS gels. Since we could not produce the soluble protein but could purify the insoluble protein relatively easily, we decided to change our approach to the test itself. We considered performing our SELEX process on the denatured CP5 protein and then making a test in which we would denature the proteins in the blood sample too.

Cycle 2

Design

The design of the test stayed the same; we just needed a different approach to get the soluble and active CP5 protein.

Build

Since we did not manage to produce an active and soluble protein, we consulted a relevant expert in our PI’s lab and were advised that we might find some fraction of the soluble protein inside the cells. Every time we grew our E. coli cells with the CP5 construct, we saw a noticeable decrease in the weight of the biomass compared to other proteins. Therefore, we concluded that there must have been some fraction of the proteinase that lysed the cells from the inside.

Test

We tried seven different E. coli strains: BL21 (DE3), Rosetta-gami (DE3), Rosetta (DE3), C41, Arctic express, HMS(17) and KRX. None of them produced any soluble CP5 protein in any considerable amount, although the biomass was smaller in every instance.

Learn

The protein was evidently activated and folded correctly in some cells, because every strain could only grow about half as much biomass as with other proteins under identical growth conditions. Since a foreign proteinase was activated inside the E. coli cells, the cells were lysed, and the active protein was probably denatured and lost in the growth medium.

Cycle 3

Design

As we could not produce any active protein for the SELEX process, we tried to produce some soluble denatured protein in the pro-form, perform the SELEX process with it, and then process the blood sample in such a way that the proteins in it would be similar to our denatured analytes in the lab.

Build

At first, we lysed the cells in denaturing guanidine-hydrochloride and urea buffers to solubilize the inclusion bodies. Then we decided to dialyze the whole cell lysate against the soluble lysis buffer with various additives that might help to stabilize the protein in solution. Finally, we removed the insoluble cell debris by centrifugation and considered using the resulting solution as the lysate for the downstream SELEX application.

Test

The additives we tried were arginine (0.375 M), trehalose (0.75 M), proline (0.5 M), mannitol (0.5 M) and CuCl2 (10 mM) [5,6]. Some of the solutions were clear as water after centrifugation, which showed that nothing from our lysate was soluble in them, but a few were cloudy to some degree, which was promising.

Learn

We found that the protein was stable in a soluble lysis buffer containing 10 mM CuCl2, but the amount of stable protein was very small and not enough for further applications. In the end, we decided not to use CP5 as a biomarker for Entamoeba histolytica.

Fusion protein modeling

Cycle 1

Design

Since one part of our project was to create a fusion protein system for the bottleneck reaction of naringenin synthesis, we decided to model it in silico. The model’s primary purpose is to check whether a shorter distance between active sites leads to higher production of naringenin. In the process, we found that even the modeling part needed to cover the main steps of engineering: design, build, test, and learn. Therefore, we began with an existing modeling workflow described for a fusion construct with a malaria pre-erythrocytic vaccine candidate [7]. The described modeling approach was initially designed as follows:

  1. Choosing the protein candidates
  2. Selecting linkers
  3. Primary structural analysis of the fusion system
  4. Protein modeling
  5. Scoring modeled complexes to choose the model for visualization
  6. Plotting Ramachandran graphs before structure refinement
  7. Structure refinement
  8. Plotting Ramachandran graphs after structure refinement

Build

The building stage of the modeling flow consisted of performing the initial design flow. In this cycle we raised the hypothesis that homology-based protein modeling approaches would not be the most suitable method for our case, since few fusion models exist.

  1. The protein candidates were chosen to be 4-coumarate-CoA ligase 2 from Glycine max and chalcone synthase from Arabidopsis thaliana
  2. Seven linkers were chosen for our fusion system. Four of them were flexible glycine and serine linkers (GSG, (GGGGS)n, where n is equal to 1, 2, and 3) and the other three were rigid ((EAAAK)n, where n is equal to 1, 2, and 3)
  3. For a primary structural analysis we studied PDB: 3TSY structure - an experimentally determined structure of the fusion system that has 4CL protein in it. We took this structure as a starting point of what we can expect from our models
  4. Protein modeling was initially attempted with the homology-based modeling program SWISS-MODEL [8]
  5. The modeled structures in the first cycle were evaluated using VoroMQA [9], QMEAN [10], and QMEANDisCo [11] scores
  6. The plot was not drawn in this cycle
  7. The structure refinement was not performed in this cycle
  8. The plot was not drawn in this cycle
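The seven linkers named in step 2 can be generated programmatically, which helps when assembling the fusion sequences passed to a modeling program. The following is a minimal sketch with hypothetical function names. The domain order (4CL first, CHS second) is an assumption, and the short strings in the example are placeholders, not the real 4CL or CHS sequences.

```python
def build_linkers() -> dict:
    """Return the seven linkers: GSG, (GGGGS)n and (EAAAK)n for n = 1, 2, 3."""
    linkers = {"GSG": "GSG"}
    for n in (1, 2, 3):
        linkers[f"(GGGGS){n}"] = "GGGGS" * n   # flexible glycine-serine linkers
        linkers[f"(EAAAK){n}"] = "EAAAK" * n   # rigid linkers
    return linkers


def fuse(n_terminal: str, linker: str, c_terminal: str) -> str:
    """Concatenate two protein sequences with a linker between them."""
    return n_terminal + linker + c_terminal


if __name__ == "__main__":
    # Placeholder fragments only; real sequences must be taken from a database.
    four_cl, chs = "MAAA", "MKKK"
    for name, seq in build_linkers().items():
        print(name, fuse(four_cl, seq, chs))
```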

Test

The output of SWISS-MODEL showed that the sequence identity between the templates found and the modeled sequence is 66.42 - 69.36%. According to specialists in protein modeling, homology-based methods are suitable when the sequence identity between the template and the modeled sequence is more than 25%, which belongs to the “daylight” zone [13]. Thus, in theory, the method could be applied.

The modeling flow was tested with the PyMOL [12] visualization program using the commands `cealign` and `set seq_view`. We were looking for an accurate representation of the linked proteins in the fusion system by comparing them to their separate versions. Our main focus was the linker region, which in the homology-based modeling case was composed of the uncoiled domain of 4CL, forming a massive, highly disordered linker region. We stopped the flow after the visualization because we decided that structure refinement using molecular dynamics would not introduce any significant changes to the structure.

Learn

Although in theory the homology-based modeling method could be applied, these modeling results showed that this approach was not suitable for our fusion protein system. Therefore, we decided to try running ab initio protein modeling programs.

Cycle 2

Design

The design (main steps) of the modeling flow stayed the same as in the first cycle, but we also included one more step, energy minimization with Yasara [16], which was inserted after the protein modeling step. The main change in the second cycle was the choice of protein modeling programs. In this engineering cycle we used the ab initio modeling programs trRosetta [14] and RoseTTAFold [15] to model our proteins.

Build

  1. Protein candidates stayed the same as in the first cycle
  2. Linkers stayed the same as in the first cycle
  3. Primary structure analysis was not required to be redone in the second cycle
  4. Protein modeling was performed using trRosetta and RoseTTAFold simultaneously
  5. The modeled structures were evaluated using VoroMQA score
  6. Energy minimization with Yasara
  7. The Ramachandran plots were drawn before molecular dynamics (MD) simulations
  8. We performed short MD simulations for the trivial linker (GSG) case
  9. The Ramachandran plots were drawn after the MD simulations

Test

The modeled structures were visually not satisfactory - the disordered domain of the 4CL protein was still present in the structure. The energy minimization step with Yasara and the MD simulations did not introduce any significant changes to the structure.

Learn

After this step we decided to consult a bioinformatician who works with protein modeling. He suggested that we include multiple sequence alignment files in our modeling flow. Additionally, we were advised to use more scoring functions in the evaluation step. We also consulted a protein molecular dynamics specialist to get professional insight into how we should run the MD for our system to get a more meaningful output. After the consultation we adjusted the box size of the system and the duration of the simulations.

Cycle 3

Design

In the third engineering cycle, we included our own generated multiple sequence alignment (MSA) files as input to the protein modeling programs of our choice. In addition, we took into account the advice from the specialists and included QMEAN and QMEANDisCo scores into the evaluation of the structures. The structure refinement was not performed in this cycle due to the lack of computational resources at the time.

Build

  1. Protein candidates stayed the same as in the previous cycle
  2. Linkers stayed the same as in the previous cycle
  3. Primary structure analysis was not required to be redone
  4. Protein modeling was performed using RoseTTAFold
  5. The modeled structures were evaluated using QMEAN, QMEANDisCo, and VoroMQA scores
  6. The plot was not drawn in this cycle
  7. The structure refinement was not performed in this cycle
  8. The plot was not drawn in this cycle
  9. The distances between active sites in fusion protein systems with rigid linkers were calculated
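The text does not state how the distance between active sites was measured. One common choice, assumed here, is the Euclidean distance between the centroids of the atoms that make up each site. The sketch below uses hypothetical coordinates in ångströms rather than values from the actual models.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def centroid(points: Sequence[Point]) -> Point:
    """Geometric center of a set of 3D coordinates."""
    n = len(points)
    if n == 0:
        raise ValueError("at least one coordinate is required")
    return (
        sum(p[0] for p in points) / n,
        sum(p[1] for p in points) / n,
        sum(p[2] for p in points) / n,
    )


def active_site_distance(site_a: Sequence[Point], site_b: Sequence[Point]) -> float:
    """Euclidean distance between the centroids of two active sites."""
    return math.dist(centroid(site_a), centroid(site_b))


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    site_4cl = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    site_chs = [(1.0, 3.0, 4.0)]
    print(active_site_distance(site_4cl, site_chs))
```

In practice the coordinates would be read from the modeled structure files, for example the catalytic residues of each domain. Comparing the resulting distances across the rigid linkers of different lengths is what would address the question raised in the design stage.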

Test

The modeled structures were visually satisfactory - the disordered domain of 4CL protein was less disordered in the modeled structure.

Learn

The multiple sequence alignment (MSA) files have a significant impact on the modeling output.

Cycle 4

Design

According to the output of the third cycle, a fourth engineering cycle was not required. However, after the third cycle was finished, AlphaFold2 [17] became available for public use. Therefore, we decided to apply this highly regarded tool in our modeling workflow. The structure refinement was not performed in this cycle because studies have recorded an insignificant impact of MD on such structures [18].

Build

  1. Protein candidates stayed the same as in the previous cycle
  2. Linkers stayed the same as in the previous cycle
  3. Primary structure analysis was not required to be redone
  4. Protein modeling was performed using AlphaFold2
  5. The modeled structures were evaluated using QMEAN, QMEANDisCo, ProQ2D, ProQRosCenD, ProQRosFAD, ProQ3D scores
  6. The plots were drawn in this cycle
  7. The structure refinement was not performed in this cycle
  8. The distances between active sites in fusion protein systems with rigid linkers were calculated

Test

The output of the fourth cycle had a less disordered domain, just like the output of the third cycle.

Learn

We were visually satisfied with the modeled complex. Although AlphaFold2 models do not require molecular dynamics for structure refinement, we considered applying it to get better insight into the system with flexible linkers.

SELEX

Cycle 1

Design

Before beginning SELEX with purified soluble proteins, we chose a few SELEX protocols available online: one used in industry, called XELEX [19], and another from an article specifically intended for fool-proof aptamer discovery [20].

Build

We tried the initial SELEX runs with the default parameters, but they kept failing, so we selected as many parameters as we could in order to optimize the process. We settled on the bead and protein amount in the SELEX mixture, the duration of aptamer heating before the incubation, the storage buffer of the aptamers post-cycle, and the polymerase itself.

Test

We lowered the amount of protein and beads because we suspected that it might interfere with both the SELEX process and the downstream PCR reaction. We also shortened the heating of the aptamers before the selection round from 10 to 5 minutes based on another source [21]. We substituted the TE exchange buffer with distilled water because the EDTA in the TE buffer interferes with the polymerase in the PCR reaction. Finally, we tested the efficiency of the PCR reactions with both DreamTaq and Phusion polymerases.

Learn

In the end, we did manage to optimize the SELEX process enough for it to produce aptamers. We determined that more protein and beads result in obstruction of SELEX. Half of the initial quantity should be used for the second and further rounds. We also found that we should heat the aptamer pool for 5 minutes instead of 10 minutes after the first round of SELEX. Heating for too long results in degradation or other forms of loss of our dsDNA from the previous round. The aptamer pool after a selection round should be stored in water, not in TE buffer, as the EDTA interferes with the function of the polymerase. Phusion polymerase seems to be less affected than DreamTaq and works better in both water and TE buffer.

Cycle 2

Design

For the second cycle, our research suggested that a smaller amount of bead and protein complex should work better for regenerating aptamers. The first SELEX round should have a higher quantity of beads to make sure that all possible aptamers bind. However, the second and further rounds should use half of the initial volume. This also increases stringency alongside the other parameters.

Build

In order to check which quantity of complex suits us best, we developed a plan to test the changes separately, starting with the initial round and then moving on to further rounds.

Test

We tested unchanged bead and protein complex quantities for a few rounds and tried various complex volumes for the second and further rounds side by side. Comparing volumes reduced by ⅓, ½ and ⅔, we found that reducing to half of the initial volume suited our SELEX protocol the best.
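The side-by-side comparison amounts to a simple volume calculation, sketched below. The initial volume of 90 µL is a hypothetical value chosen for easy arithmetic; the actual volumes are not given in the text.

```python
from fractions import Fraction

# Reductions compared in the text: by one third, one half and two thirds.
REDUCTIONS = (Fraction(1, 3), Fraction(1, 2), Fraction(2, 3))


def reduced_volumes(initial_ul, reductions=REDUCTIONS):
    """Map each reduction to the bead-protein complex volume left for later rounds."""
    return {r: initial_ul * (1 - r) for r in reductions}


if __name__ == "__main__":
    # Hypothetical first-round volume of 90 uL.
    for reduction, volume in reduced_volumes(90).items():
        print(f"reduced by {reduction}: {float(volume):.0f} uL")
```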

Learn

We found that the first round should have a larger volume of bead complex and later rounds a smaller one. More protein and beads result in obstruction of SELEX. Half of the initial quantity should be used for the second and further rounds.

Genome editing

Cycle 1

Design

One part of our project is genetically modified probiotics that are capable of performing de novo synthesis of the natural flavonoid naringenin. This flavonoid caught our attention because it remarkably reduces Entamoeba histolytica viability [22] and does not cause any harm to humans. On the contrary, naringenin has been found to have antioxidant, antitumor, antiviral, antibacterial, anti-inflammatory, anti-adipogenic, and cardioprotective features [23]. To create an efficiently working naringenin production system, we have been searching for various tools - the synthetic protein quality control (ProQC) system, the most efficient metabolic pathway enzymes, promoters of particular strength, etc. However, it is important not only to construct a fully functional system but also to ensure its stability in the chosen chassis. For this reason, we chose to insert the naringenin synthesis metabolic pathway into the genomes of our probiotic strains: E. coli Nissle 1917 and Lactobacillus casei BL23. Insertion into genomic DNA has several advantages. Firstly, it provides stability to the system. If the new metabolic pathway does not cause too much burden to the chassis, it should be maintained in the genome in an active state (otherwise there might be mutations suppressing its functionality, reduced cell growth, or enhanced cell sensitivity to nutrient deficiency [24]). Surprisingly, the integration of the genetic circuit of the metabolic pathway into the chassis genome contributes to burden reduction, as it serves as DNA copy number tuning. Secondly, by inserting the desired sequences into the genome we eliminate the need for antibiotics or any other environmental pressure to maintain the needed genetic circuits in the chosen chassis. This is very important because the designed naringenin-producing bacteria will live in the human intestine.

To reach this goal, we researched genome editing techniques extensively. With the help of our advisor Inga Songailienė, whose Ph.D. thesis focuses on CRISPR-Cas systems, we chose the pCas-pTarget plasmid system, which combines CRISPR-Cas9 and Lambda Red recombination in one system to achieve efficient genome editing of E. coli Nissle 1917 [25]. For L. casei BL23 we chose a single-plasmid, pLCNICK-based system relying on CRISPR-Cas9 D10A nickase-assisted genome editing [26]. For the genomic insertion site in L. casei BL23, we decided to use the genome editing sequences already provided in pLCNICK. We followed this approach because we had run out of resources for new DNA sequences, and this chassis requires longer homology arms. Meanwhile, we turned all our focus to E. coli Nissle 1917 to design a functional and robust metabolic pathway. In the future, we would adapt it to L. casei BL23, too.

During the design stage, we first considered which genomic sequences could be used for metabolic pathway insertion. From the literature, we learned that 16S rRNA genes might be used for this purpose [27]. However, this would not let us implement the ProQC system, because the 16S rRNA transcript is produced as a polycistronic RNA. For this reason, we targeted a non-essential gene, colicin. We also considered ways to enhance metabolic flux toward naringenin synthesis. These include acetate kinase (ackA) and phosphate acetyltransferase (pta) knockouts to channel carbon flow toward acetyl-CoA [28]. This in turn increases the amount of malonyl-CoA needed for the bottleneck reaction in naringenin production [29]. Also, tyrosine-specific transporter (tyrP) knockout mutants can potentially produce 10% more L-tyrosine (a naringenin synthesis precursor) than the original strains [30]. In addition, an aldehyde-alcohol dehydrogenase (adhE) knockout disrupts the conversion of acetyl-CoA to ethanol in the cell, so more acetyl-CoA can be converted to malonyl-CoA [31], further enhancing naringenin synthesis.

Build

To fulfill our experimental plan, we designed sgRNAs for the genomic insertion and for generating the knockouts mentioned earlier. We used the CRISPOR tool [32] to assess predicted sgRNA efficiency and the likelihood of off-targets. Furthermore, after selecting an sgRNA for each gene, we checked how many similar DNA sequences exist in the E. coli Nissle 1917 genome, using the NCBI BLAST tool [33] to find any other sequences identical to the seed region of the chosen sgRNA. This matters because such sequences might contribute to off-target effects [34]. For genome insertion, we decided to use homology sequences of approximately 70-90 bp flanking the inserts, as this length should work in this system [35]. Following these considerations, we constructed sgRNAs specific to the ackA (BBa_K3904401), pta (BBa_K3904402), adhE (BBa_K3904403), tyrP (BBa_K3904404), and colicin (BBa_K3904405) genes.
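The seed-region check described above can be illustrated with a minimal Python sketch. It is not the BLAST search we actually ran; it only counts exact matches of the seed on both strands of a genome string. The function names, the 12-nt seed length, and the assumption that the spacer is written 5'→3' with the PAM-proximal end last are illustrative choices, not values taken from our protocol.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def count_occurrences(genome: str, pattern: str) -> int:
    """Count overlapping exact matches of pattern on one strand."""
    count = 0
    start = genome.find(pattern)
    while start != -1:
        count += 1
        start = genome.find(pattern, start + 1)
    return count


def seed_hits(genome: str, spacer: str, seed_length: int = 12) -> int:
    """Count exact seed matches on both strands of the genome.

    Assumption: the seed is the last `seed_length` nucleotides of the
    spacer (the PAM-proximal end when the spacer is written 5'->3').
    """
    genome = genome.upper()
    seed = spacer.upper()[-seed_length:]
    rc_seed = reverse_complement(seed)
    hits = count_occurrences(genome, seed)
    if rc_seed != seed:  # do not count a palindromic seed twice
        hits += count_occurrences(genome, rc_seed)
    return hits
```

A seed that occurs only once (at the intended target) returns 1; any higher count flags sites worth inspecting more closely for possible off-target cleavage.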

Test

We performed genome editing as described in the literature [25]. However, it was not very efficient: very few cells survived, and not all of them contained the desired modifications.

Learn

Given these results, we hypothesized that some of the cells might have lost the pCas plasmid, as it has a temperature-sensitive replicon.

Cycle 2

Design

In the second genome editing cycle, we used the same parts but changed some experimental conditions. We contacted our advisor Inga Songailienė, who gave us some insights into this system.

Build

These are the experimental conditions we decided to change:

  1. Lowered the growth temperature from 30 °C to 27.5 °C, because a water-filled incubator might display a slightly lower temperature than the actual temperature inside it.
  2. Induced cells for recombination at OD 0.5-0.8 instead of OD 0.4-0.6 during preparation for co-electroporation with pTarget and the recombination template.
  3. Increased the L-arabinose concentration to 0.2 % (previously 0.15 %).
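As a worked example of the concentration change in item 3, the sketch below computes how much L-arabinose stock is needed to reach a given final concentration. The 20 % stock and the 50 mL culture volume used in the usage note are hypothetical values chosen for illustration only; they are not taken from our protocol.

```python
def stock_volume_ml(final_percent: float, culture_ml: float, stock_percent: float) -> float:
    """Volume of stock (mL) to add so the culture reaches final_percent.

    Solves C_stock * V = C_final * (V_culture + V), so the added volume
    is counted as part of the final volume.
    """
    if not 0 < final_percent < stock_percent:
        raise ValueError("final concentration must be positive and below the stock concentration")
    return final_percent * culture_ml / (stock_percent - final_percent)
```

With the hypothetical 20 % stock and a 50 mL culture, `stock_volume_ml(0.2, 50, 20)` gives roughly 0.51 mL, compared with roughly 0.38 mL for the earlier 0.15 % condition.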

Test

We obtained pta and ackA gene knockouts. However, even after several repeats, an increased amount of recombination template (600 ng instead of 400 ng), and preparation of a new recombination template, the adhE and tyrP genes remained unmodified.

Learn

From these changes, we learned that the timing of recombination induction is very important. However, we still had unsolved problems with the two remaining genes.

Cycle 3

Design

As we saw, there were still some undetected problems. After further literature research, we found some alterations to this system [36]. Also, by this time we had constructed our kill-switch system, so we decided to insert it and superfolder green fluorescent protein (sfGFP) into the colicin gene. We used sfGFP to check how well the designed system works, while integrating the VapXD kill-switch system into genomic DNA is important for evaluating how it would perform in the final product.

Build

In this engineering cycle, we changed the induction timing: L-arabinose was added to a final concentration of 0.2 % at OD600 0.2-0.3, and cells were harvested by centrifugation when OD600 reached 0.6-0.8. The recombination template amount was 400 ng for the kill-switch and GFP constructs and 800 ng for tyrP and adhE, because previously there were many colonies resistant to both antibiotics but none of them was insertion-positive.

Test

Both kill-switch systems and the sfGFP constructs were inserted into genomic DNA with almost 100 % accuracy, yielding 7-12 insertion-positive colonies and just two surviving colonies on the control plate. Meanwhile, adhE and tyrP knockout mutants still were not generated. In those two experiments, there were hundreds to thousands of colonies on the control plates, and slightly fewer on the plates where mutants were expected but not found.

Learn

The results of the kill-switch and sfGFP insertions into the genome showed that the system and the sgRNA chosen for the colicin gene work. However, only faint GFP fluorescence was visible by eye under blue light. This might be because very little GFP is produced from a single DNA copy of the construct. Thus, further investigation was needed.

The number of colonies that grew after several attempts to generate tyrP and adhE knockouts led to two possible explanations. Firstly, it was highly possible that the chosen sgRNAs could not properly guide the Cas9 endonuclease to the particular genomic site and cause a double-strand break. This would let almost all bacteria survive even though they carry Cas9 and the gene-specific sgRNA. The second possible explanation is that these two genes are too important for bacterial survival, so cells either escape the double-strand break or die because of the nonfunctional gene created by successful genomic recombineering.

Cycle 4

Design

Taking the earlier considerations together, we decided to design new sgRNAs for the adhE and tyrP genes to test whether our previously designed sgRNAs were truly inefficient or whether tyrP and adhE knockout mutants cannot be created at all. Moreover, as colonies carrying the genomic GFP insertion did not show clear fluorescence, we looked deeper into the genome-wide transcriptional map of E. coli [37]. Based on the literature and consultation with our advisor Inga Songailienė, we chose another genomic target for metabolic pathway insertion: the nupG gene, which is responsible for the synthesis of a broad-specificity nucleoside permease.

Build

The nupG-specific sgRNA was selected with the CRISPOR tool [32]. The recombination template was created by three-step PCR with long overhangs.

Test

The GFP, linked 4CL-CHS, TAL, and CHI genes were inserted into the nupG and colicin genes. The tyrP and adhE knockouts were not obtained.

Learn

From GFP fluorescence measurements, we learned that nupG is transcriptionally more active. We also found that the primer pair we designed for the third PCR round of recombination template creation for insertion into the nupG gene is not suitable. These primers do not generate any PCR product, probably because they do not anneal to the genomic DNA. This might be due to nucleotide variation in the nupG gene of the Nissle genome. To avoid this problem, the gene chosen for genome editing should be sequenced first, and only then should all the other required DNA constructs be designed.
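The lesson above suggests checking primers against the sequenced locus before ordering them. The following Python sketch shows one simple way to do so, assuming that only the 3' end of a primer with a long overhang needs to match the template exactly. The function names and the 15-nt default for the annealing region are hypothetical; this is an illustration, not a tool we used.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def primer_anneals(template: str, primer: str, three_prime_len: int = 15) -> bool:
    """Check whether the primer's 3' end matches the template exactly.

    The template is given as the top strand, 5'->3'. A forward primer
    matches the top strand itself; a reverse primer matches its reverse
    complement. Any 5' overhang on the primer is ignored.
    """
    anchor = primer.upper()[-three_prime_len:]
    top = template.upper()
    return anchor in top or anchor in reverse_complement(top)
```

Running this against the sequence obtained from the actual Nissle strain, rather than a reference genome, would reveal single-nucleotide differences in the annealing region before any PCR is attempted.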

References

1.
Wong, W. K., Tan, Z. N., Othman, N., Lim, B. H., Mohamed, Z., Olivos Garcia, A., & Noordin, R. (2011). Analysis of Entamoeba histolytica Excretory-Secretory Antigen and Identification of a New Potential Diagnostic Marker. Clinical and Vaccine Immunology, 18(11), 1913–1917. To the article.
2.
Saidin, S., Yunus, H. M., Zakaria, D. N., Razak, A. K., Huat, B. L., Othman, N., & Noordin, R. (2014). Erratum to: Production of recombinant Entamoeba histolytica pyruvate phosphate dikinase and its application in a lateral flow dipstick test for amoebic liver abscess [BMC Infect Dis, 14, (2014), 182, doi: 10.1186/1471-2334-14-182]. BMC Infectious Diseases, 14(1), 1–9. To the article.
3.
Leibly, D. J., Nguyen, T. N., Kao, L. T., Hewitt, S. N., Barrett, L. K., & Van Voorhis, W. C. (2012). Stabilizing Additives Added during Cell Lysis Aid in the Solubilization of Recombinant Proteins. PLoS ONE, 7(12), e52482. To the article.
4.
Tsumoto, K., Umetsu, M., Kumagai, I., Ejima, D., Philo, J., & Arakawa, T. (2004). Role of Arginine in Protein Refolding, Solubilization, and Purification. Biotechnology Progress, 20(5), 1301–1308. To the article.
5.
Cornick, S., Moreau, F., & Chadee, K. (2016). Entamoeba histolytica Cysteine Proteinase 5 Evokes Mucin Exocytosis from Colonic Goblet Cells via αvβ3 Integrin. PLOS Pathogens, 12(4), e1005579. To the article.
6.
Hellberg, A., Nowak, N., Leippe, M., Tannich, E., & Bruchhaus, I. (2002). Recombinant expression and purification of an enzymatically active cysteine proteinase of the protozoan parasite Entamoeba histolytica. Protein Expression and Purification, 24(1), 131–137. To the article.
7.
Shamriz, S., & Ofoghi, H. (2016). Design, structure prediction and molecular dynamics simulation of a fusion construct containing malaria pre-erythrocytic vaccine candidate, Pf CelTOS, and human interleukin 2 as adjuvant. BMC bioinformatics, 17(1), 1-15. To the article.
8.
Waterhouse, A., Bertoni, M., Bienert, S., Studer, G., Tauriello, G., Gumienny, R., ... & Schwede, T. (2018). SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic acids research, 46(W1), W296-W303. To the article.
9.
Olechnovič, K., & Venclovas, Č. (2019). VoroMQA web server for assessing three-dimensional structures of proteins and protein complexes. Nucleic acids research, 47(W1), W437-W442. To the article.
10.
Benkert, P., Tosatto, S. C. E., & Schomburg, D. (2008). QMEAN: A comprehensive scoring function for model quality assessment. Proteins: Structure, Function, and Bioinformatics, 71(1), 261–277. doi:10.1002/prot.21715 To the article.
11.
Studer, G., Rempfer, C., Waterhouse, A. M., Gumienny, R., Haas, J., & Schwede, T. (2020). QMEANDisCo—distance constraints applied on model quality estimation. Bioinformatics, 36(6), 1765-1771. To the article.
12.
DeLano, W. L. (2002). Pymol: An open-source molecular graphics tool. CCP4 Newsletter on protein crystallography, 40(1), 82-92. To the article.
13.
Venclovas, Č. (2011). Methods for sequence–structure alignment. Homology Modeling, 55-82. To the article.
14.
Yang, J., Anishchenko, I., Park, H., Peng, Z., Ovchinnikov, S., & Baker, D. (2020). Improved protein structure prediction using predicted interresidue orientations. Proceedings of the National Academy of Sciences, 117(3), 1496-1503. To the article.
15.
Baek, M., DiMaio, F., Anishchenko, I., Dauparas, J., Ovchinnikov, S., Lee, G. R., ... & Baker, D. (2021). Accurate prediction of protein structures and interactions using a 3-track network. bioRxiv. To the article.
16.
Land H., Humble M.S. (2018) YASARA: A Tool to Obtain Structural Guidance in Biocatalytic Investigations. In: Bornscheuer U., Höhne M. (eds) Protein Engineering. Methods in Molecular Biology, vol 1685. Humana Press, New York, NY. To the article.
17.
Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., ... & Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 1-11. To the article.
18.
Simpkin, A. J., Rodríguez, F. S., Mesdaghi, S., Kryshtafovych, A., & Rigden, D. J. (2021). Evaluation of model refinement in CASP14. Proteins: Structure, Function, and Bioinformatics. To the article.
19.
Roboklon. (2019, May). XELEX DNA Core Kit. To the article.
20.
Wang, T., Yin, W., AlShamaileh, H., Zhang, Y., Tran, P., Nguyen, T., … Duan, W. (2019). A detailed protein-SELEX protocol allowing visual assessments of individual steps for high success rate. Human Gene Therapy Methods. To the article.
21.
Sefah, K., Shangguan, D., Xiong, X., O’Donoghue, M. B., & Tan, W. (2010). Development of DNA aptamers using Cell-SELEX. Nature Protocols, 5(6), 1169–1185. To the article.
22.
Quintanilla-Licea, R., Vargas-Villarreal, J., Verde-Star, M. J., Rivas-Galindo, V. M., & Torres-Hernández, Á. D. (2020). Antiprotozoal activity against Entamoeba histolytica of flavonoids isolated from Lippia graveolens kunth. Molecules, 25(11), 2464. To the article.
23.
Salehi, B., Fokou, P., Sharifi-Rad, M., Zucca, P., Pezzani, R., Martins, N., & Sharifi-Rad, J. (2019). The Therapeutic Potential of Naringenin: A Review of Clinical Trials. Pharmaceuticals (Basel, Switzerland), 12(1), 11. To the article.
24.
Wu, G., Yan, Q., Jones, J. A., Tang, Y. J., Fong, S. S., & Koffas, M. A. (2016). Metabolic burden: cornerstones in synthetic biology and metabolic engineering applications. Trends in biotechnology, 34(8), 652-664. To the article.
25.
Jiang, Y., Chen, B., Duan, C., Sun, B., Yang, J., & Yang, S. (2015). Multigene editing in the Escherichia coli genome via the CRISPR-Cas9 system. Applied and environmental microbiology, 81(7), 2506-2514. To the article.
26.
Song, X., Huang, H., Xiong, Z., Ai, L., & Yang, S. (2017). CRISPR-Cas9D10A nickase-assisted genome editing in Lactobacillus casei. Applied and environmental microbiology, 83(22), e01259-17. To the article.
27.
Riedel, C. U., Casey, P. G., Mulcahy, H., O'Gara, F., Gahan, C. G., & Hill, C. (2007). Construction of p16Slux, a novel vector for improved bioluminescent labeling of gram-negative bacteria. Applied and environmental microbiology, 73(21), 7092–7095. To the article.
28.
Ku, J. T., Chen, A. Y., & Lan, E. I. (2020). Metabolic engineering design strategies for increasing acetyl-CoA flux. Metabolites, 10(4), 166. To the article.
29.
Dunstan, M. S., Robinson, C. J., Jervis, A. J., Yan, C., Carbonell, P., Hollywood, K. A., ... & Scrutton, N. S. (2020). Engineering Escherichia coli towards de novo production of gatekeeper (2 S)-flavanones: naringenin, pinocembrin, eriodictyol and homoeriodictyol. Synthetic Biology, 5(1), ysaa012. To the article.
30.
Wang, Q., Zeng, W., & Zhou, J. (2019). Effect of gene knockout of L-tyrosine transport system on L-tyrosine production in Escherichia coli. Sheng wu gong cheng xue bao= Chinese journal of biotechnology, 35(7), 1247-1255. To the article.
31.
Wu, J., Du, G., Chen, J., & Zhou, J. (2015). Enhancing flavonoid production by systematically tuning the central metabolic pathways based on a CRISPR interference system in Escherichia coli. Scientific reports, 5(1), 1-14. To the article.
32.
Concordet, J. P., & Haeussler, M. (2018). CRISPOR: intuitive guide selection for CRISPR/Cas9 genome editing experiments and screens. Nucleic acids research, 46(W1), W242-W245. To the article.
33.
Altschul, S.F., Gish, W., Miller, W., Myers, E.W. & Lipman, D.J. (1990) "Basic local alignment search tool." J. Mol. Biol. 215:403-410. To the article.
34.
Wu, X., Kriz, A. J., & Sharp, P. A. (2014). Target specificity of the CRISPR-Cas9 system. Quantitative biology (Beijing, China), 2(2), 59–70. To the article.
35.
Datsenko, K. A., & Wanner, B. L. (2000). One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. Proceedings of the National Academy of Sciences of the United States of America, 97(12), 6640–6645. To the article.
36.
Hou, M., Sun, S., Feng, Q., Dong, X., Zhang, P., Shi, B., Liu, J., & Shi, D. (2020). Genetic editing of the virulence gene of Escherichia coli using the CRISPR system. PeerJ, 8, e8881. To the article.
37.
Scholz, S. A., Diao, R., Wolfe, M. B., Fivenson, E. M., Lin, X. N., & Freddolino, P. L. (2019). High-resolution mapping of the Escherichia coli chromosome reveals positions of high and low transcription. Cell systems, 8(3), 212-225. To the article.