Team:Tec-Monterrey/Software

SOFTWARE

Overview

A core strength of synthetic biology is the ability to use digital tools, such as programming, to design biological systems. The interaction between Dry-Lab and Wet-Lab is key to the success of our endeavor and an important part of our project. Dry-Lab designed and coded software that allowed us to create and analyze in silico the toehold structures used in our detection system, predicting and screening the best candidates among several synthetic RNA structures. Our code is open source and is available in our GitHub repository. Feel free to fork it, use it, and improve it.

Primer design by PrimedRPA

For our system to work, it must detect a specific DNA fragment in the biological sample, so that fragment first has to be amplified. Among the many techniques available, we chose Recombinase Polymerase Amplification (RPA), an isothermal amplification method. To create and filter RPA primers we used PrimedRPA, a Python-based package. The user provides three things: an input sequence, sequence files, and filtering parameters. The parameters are amplicon, primer, and probe length; GC content; tolerance of binding to background DNA; and the tendency to form secondary structures and heterodimers. PrimedRPA produces three output files, only one of which is of interest to us [1].

Our software uses the file containing the primer pairs and their respective parameters to create a pandas dataframe. Primer selection followed the TwistAmp Assay Design Manual from TwistDx [2], which recommends choosing a target region with average sequence characteristics, such as a GC content between 30% and 70%. We also required an amplicon size of 100-200 bp and a MaxDim score < 40. This last property describes the tendency of the sequence to bind to itself: the lower the value, the less likely the primer is to form dimers. The software filters the dataframe on these criteria, producing a reduced list of primer pairs for the downstream steps, which define the amplicon of each pair and assemble the toeholds from it.
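The filtering step described above can be sketched with pandas as follows. This is a minimal illustration, not the team's actual code: the column names (`gc_percent`, `amplicon_size`, `max_dimer_score`) and the sample rows are hypothetical, and the real PrimedRPA output headers may differ.

```python
import pandas as pd

# Hypothetical primer table; real PrimedRPA column names may differ.
primers = pd.DataFrame(
    {
        "forward_primer": ["FP1", "FP2", "FP3", "FP4"],
        "reverse_primer": ["RP1", "RP2", "RP3", "RP4"],
        "amplicon_size": [150, 250, 120, 180],
        "gc_percent": [45.0, 50.0, 75.0, 38.0],
        "max_dimer_score": [22, 18, 30, 41],
    }
)


def filter_primer_pairs(df, gc_range=(30, 70), size_range=(100, 200), max_dimer=40):
    """Keep primer pairs that meet the GC, amplicon size and MaxDim thresholds."""
    mask = (
        df["gc_percent"].between(*gc_range)        # inclusive on both ends
        & df["amplicon_size"].between(*size_range)  # inclusive on both ends
        & (df["max_dimer_score"] < max_dimer)       # strict, as in "MaxDim < 40"
    )
    return df[mask].reset_index(drop=True)


selected = filter_primer_pairs(primers)
```

Keeping the thresholds as function arguments makes it easy to relax or tighten a criterion when too few or too many primer pairs survive.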

Toehold Design with Toehold Switch Creator

Our phytopathogen detection system is based on synthetic RNA riboswitches called Toehold Switches. The tools generally used to design these riboswitches are scattered, so to aid the design process we wrote a Python script that combines original code with several open source programs. The software requires three inputs. The first is a parameters file, which contains the parameters that PrimedRPA needs to compute the RPA primers. The second is a FASTA file, which contains the sequence to be detected and is used by the program to compute both the toehold switches and the primers. The third is the energy range used to calculate the suboptimal structures of the RNA toehold switches. With these three inputs we generate the toehold structures in a two-step process in which we assemble each toehold and evaluate its viability to perform as expected.

The first step in the toehold computation is generating the sequences themselves. We do not generate the toeholds entirely de novo, nor do we compute thermodynamics without a reference; instead, we use the standardized parts given by Pardee et al. (2016) as a starting point. We take their dot-bracket secondary structure as a reference and slide a 36-nt window along each amplicon generated from the primers dataframe and the FASTA file. Each 36-nt fragment is converted into its reverse complement and concatenated with the standardized “loop” and “linker” sections, resulting in a toehold library for each amplicon of the original sequence [3].
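The windowing and concatenation described above can be sketched as follows. This is a simplified illustration under stated assumptions: `[LOOP]` and `[LINKER]` are placeholders rather than the published sequences, the concatenation order is assumed, and the function names are hypothetical. A real assembly must also reproduce the stem of the reference dot-bracket structure.

```python
WINDOW = 36  # window length stated in the text

# DNA or RNA input base -> complementary RNA base
_COMPLEMENT = str.maketrans("ACGTU", "UGCAA")


def reverse_complement_rna(seq):
    """Return the reverse complement of a DNA/RNA sequence, written as RNA."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def sliding_windows(amplicon, size=WINDOW):
    """Yield (start, fragment) for every window of `size` nt, stepping by 1 nt."""
    for start in range(len(amplicon) - size + 1):
        yield start, amplicon[start:start + size]


def build_library(amplicon, loop, linker, size=WINDOW):
    """Assemble one candidate per window: reverse complement + loop + linker.

    The part order is an assumption made for illustration only.
    """
    return [
        {
            "start": start,
            "target": fragment,
            "switch": reverse_complement_rna(fragment) + loop + linker,
        }
        for start, fragment in sliding_windows(amplicon, size)
    ]


# Placeholder parts; substitute the standardized loop and linker sequences.
library = build_library("ACGT" * 10, loop="[LOOP]", linker="[LINKER]")
```

An amplicon of length n yields n - 35 candidates, so the library grows linearly with amplicon size.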

These two standardized elements, together with our sequence window, give us the primary library of toehold sequences. Each sequence is composed of three sections: the loop, which contains the ribosome binding site (RBS) along with the start codon; the linker, which acts as a bridge between the hairpin structure and the reporter gene; and the toehold sequence itself, built from the nt windows taken from the FASTA file given by the user.

We also need to take certain precautions while generating the toehold sequences. Filters at the very beginning of the computation eliminate the candidates that include a stop codon. Once the sequences containing stop codons are discarded, the software assembles the different parts into the final structures, which are then tested against several scoring and thermodynamic parameters to obtain our final toeholds (see more in Modeling). Some of the functions used in Toehold Switch Creator for the toehold assembly process come from the Toehold Designer software created by the iGEM 2017 EPFL team [4].
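A stop codon filter of this kind could look like the sketch below. The text does not say whether the check is done in frame or anywhere in the sequence; this example assumes an in-frame check from a given offset (for instance, the position of the start codon). The function names are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def has_in_frame_stop(seq, frame_start=0):
    """Return True if a stop codon occurs in the reading frame starting at frame_start.

    Assumption: only in-frame codons matter; out-of-frame matches are ignored.
    """
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA input
    codons = (rna[i:i + 3] for i in range(frame_start, len(rna) - 2, 3))
    return any(codon in STOP_CODONS for codon in codons)


def drop_stop_codon_candidates(sequences, frame_start=0):
    """Keep only the sequences without an in-frame stop codon."""
    return [s for s in sequences if not has_in_frame_stop(s, frame_start)]


kept = drop_stop_codon_candidates(["AUGUAAGGG", "AUGGUAAGG", "ATGTGACCC"])
```

Here only the second sequence is kept: its `UAA` is out of frame, while the other two carry `UAA` and `UGA` (after converting T to U) in frame.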

Thermodynamic Analysis with ViennaRNA

ViennaRNA is an open source package consisting of a set of standalone programs and libraries for the prediction and analysis of RNA secondary structures [5]. With it we calculated several thermodynamic parameters that evaluate the stability of our toeholds.

After making all the necessary calculations we filtered the toehold library, selecting the toehold switches by the following criteria, in order: the lowest score, the highest Gibbs free energy from the RBS to the linker region in the toehold-target complex, and the lowest minimum free energy (see more about these parameters in Model). Once these sequences were selected, we calculated their minimum free energy structure, as well as their suboptimal structures within the energy range given by the user (usually 1 kcal/mol). This last step lets us evaluate the stability of the toehold within that energy range: if the toehold structure does not vary within the range, it is expected to maintain its functionality.
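One way to read the selection order above is as a lexicographic sort: score first, then the RBS-to-linker free energy, then the minimum free energy. The sketch below takes that reading as an assumption; the field names and the numeric values are invented for illustration and are not results from the project.

```python
# Illustrative candidates; values are made up, not real results.
candidates = [
    {"name": "t1", "score": 2.0, "dg_rbs_linker": -3.0, "mfe": -20.0},
    {"name": "t2", "score": 1.0, "dg_rbs_linker": -5.0, "mfe": -18.0},
    {"name": "t3", "score": 1.0, "dg_rbs_linker": -2.0, "mfe": -15.0},
    {"name": "t4", "score": 1.0, "dg_rbs_linker": -2.0, "mfe": -25.0},
]


def rank_toeholds(rows):
    """Sort by lowest score, then highest RBS-linker dG, then lowest MFE."""
    return sorted(
        rows,
        key=lambda r: (r["score"], -r["dg_rbs_linker"], r["mfe"]),
    )


ranked_names = [row["name"] for row in rank_toeholds(candidates)]
```

Negating `dg_rbs_linker` in the sort key turns "highest first" into an ascending sort, so all three criteria can be handled in a single pass.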

Web App

Installing and running a Linux-based program can be complex. To make the design process easier, we are currently developing a web app that lets users generate toeholds for any sequence they want, all within a user-friendly interface.

The web app requires three inputs: a parameters file, containing the data PrimedRPA needs to generate the RPA primers; a FASTA file, containing the sequence to be detected, which the program uses to compute both the toehold switches and the primers; and the energy range used to calculate the suboptimal structures of the RNA toehold switches.

Output

The results are delivered to the user in a .csv file. Each tab of the file contains a pair of RPA primers with its parameters, such as amplicon, amplicon size, and GC percentage. Each tab also includes a table with the resulting toeholds for that primer pair, ranked in descending order according to the calculated thermodynamic parameters. Below that table, the user can find the suboptimal structures for each toehold within the energy range they provided to the software.

Together, these elements let users select the toehold switches they want for the sequence they wish to detect. We strongly recommend using a secondary structure visualization tool such as NUPACK to better understand the RNA structures, and selecting the toeholds with the fewest suboptimal structures for experimental validation. Further development of the software is planned so that users can view images of the minimum free energy secondary structure of each toehold, such as those shown in Modeling > Results.
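The recommendation to prefer toeholds with fewer suboptimal structures can be sketched as a simple count. The example assumes the suboptimal structures are already available as dot-bracket strings per toehold (for instance, parsed from the output file); the dictionary layout, names, and toy structures are hypothetical.

```python
def count_distinct_structures(subopt_by_toehold):
    """Map each toehold name to its number of distinct dot-bracket structures."""
    return {name: len(set(structs)) for name, structs in subopt_by_toehold.items()}


def rank_by_fewest_structures(subopt_by_toehold):
    """Return toehold names ordered from fewest to most suboptimal structures."""
    counts = count_distinct_structures(subopt_by_toehold)
    return sorted(counts, key=counts.get)


# Toy dot-bracket strings for illustration only, not real predictions.
example = {
    "toehold_A": ["((((....))))", "(((......)))", ".(((....)))."],
    "toehold_B": ["((((....))))"],
    "toehold_C": ["((((....))))", "(((......)))"],
}
ranking = rank_by_fewest_structures(example)
```

A toehold with a single structure in the chosen energy range, like `toehold_B` here, is the kind of candidate the text suggests prioritizing for experimental validation.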

References

  1. Higgins, M., Ravenhall, M., Ward, D., Phelan, J., Ibrahim, A., Forrest, M. S., Clark, T. G., & Campino, S. (2019). PrimedRPA: primer design for recombinase polymerase amplification assays. Bioinformatics (Oxford, England), 35(4), 682–684. https://doi.org/10.1093/bioinformatics/bty701
  2. TwistDx. (2018). TwistAmp® DNA Amplification Kits Assay Design Manual. Available at: https://www.twistdx.co.uk/docs/default-source/RPA-assay-design/twistamp-assay-design-manual-v2-5.pdf
  3. Pardee, K., Green, A. A., Takahashi, M. K., Braff, D., Lambert, G., Lee, J. W., Ferrante, T., Ma, D., Donghia, N., Fan, M., Daringer, N. M., Bosch, I., Dudley, D. M., O’Connor, D. H., Gehrke, L., & Collins, J. J. (2016). Rapid, Low-Cost Detection of Zika Virus Using Programmable Biomolecular Components. Cell, 165(5), 1255–1266. https://doi.org/10.1016/j.cell.2016.04.059
  4. EPFL. (2017). Toehold Designer. Available at: https://2017.igem.org/Team:EPFL/Software
  5. Lorenz, R., Bernhart, S. H., Höner zu Siederdissen, C., Tafer, H., Flamm, C., Stadler, P. F., & Hofacker, I. L. (2011). ViennaRNA Package 2.0. Algorithms for Molecular Biology, 6, 26. https://www.researchgate.net/publication/51828551_ViennaRNA_package_20