Interoperability: MoClo, ProClo
To enable rapid, simple, high-throughput construction of genetic devices that can be inexpensively screened in B. subtilis for recombinant protein expression, secretion, and purification, we sought to develop a wetware toolkit that could interoperate with and leverage parts of the FreeGenes E. coli Protein Expression Toolkit (PET). PET contains, among other things, a large collection of peptide tags, including periplasmic export tags, affinity purification tags (for purification with nickel-NTA resin, silica, starch/maltose, chitin, cellulose, or calmodulin), fluorescent and chromoprotein reporter tags, and cleavage tags that proteolytically cleave other tags off the CDS.
The ability to quickly and easily fuse composite tags to a CDS is potentially very useful for our project of democratizing the optimization of recombinant protein production and purification. It could allow us to increase the throughput and decrease the cost of screening for a desired function (e.g., secretion, solubility, or retention on a purification substrate), accelerating the design-build-test cycle for developing high-expressing strains for a protein of interest. For instance, the combination of a silica-binding affinity peptide, a fluorescent reporter, and a protease cleavage site could enable rapid measurement of the expression levels of different constructs, testing of different protocols to purify the target protein, and comparison of the enzymatic function of tagged and untagged versions of the purified protein.
To enable the assembly of multiple composite tags onto CDSs, an expanded and modified version of the MoClo assembly standard was designed for the PET collection (the FreeGenes Protein Cloning, or ProClo, assembly standard). Five additional 4 bp overhangs were added to the assembly standard, two of which replaced the 3’ overhangs in the ‘RBS’ and ‘CDS’ part definitions. These additions made space for up to 3 N-terminal and 3 C-terminal tag parts (defined generically as Tag1 - Tag6) to be appended to ProClo CDS parts in a single Golden Gate reaction. As with OYC, the choice of additional overhang sequences capable of high-fidelity assembly with the standard MoClo overhangs was guided by NEB’s Golden Gate utility tools. Two further constraints applied: the ProClo RBS 3’ overhang had to end with the beginning of a start codon, and all other new overhangs had to be able to code for a pair of small, hydrophilic amino acids (glycine, serine, or threonine), so that the scar sequence between different tags would be less likely to affect the function of the tags or of the protein they are attached to. The resulting set of 10 overhangs, including the MoClo transcription unit overhangs, has a predicted ligation fidelity of 99%.
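The overhang constraints described above can be illustrated with a minimal Python sketch. It is not the NEB tooling and does not estimate ligation fidelity; it only checks basic properties of a candidate overhang set. All function names are hypothetical. The test that an overhang "can code for" a Gly/Ser/Thr pair is an assumed interpretation: the 4 bp overhang must occur somewhere inside a 6-nt sequence made of two such codons.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Standard genetic code codons for the small, hydrophilic residues named in the text.
SMALL_HYDROPHILIC_CODONS = {
    "G": ["GGA", "GGC", "GGG", "GGT"],
    "S": ["TCA", "TCC", "TCG", "TCT", "AGC", "AGT"],
    "T": ["ACA", "ACC", "ACG", "ACT"],
}


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(overhang):
    """A palindromic overhang can ligate to itself, so it is usually avoided."""
    return overhang == reverse_complement(overhang)


def conflicts(overhangs):
    """Return pairs that are identical or reverse complements of each other."""
    problems = []
    for i, a in enumerate(overhangs):
        for b in overhangs[i + 1:]:
            if a == b or a == reverse_complement(b):
                problems.append((a, b))
    return problems


def ends_with_start_codon_prefix(overhang):
    """True if the overhang ends with 'A', 'AT', or 'ATG' (start of a start codon)."""
    return any(overhang.endswith("ATG"[:k]) for k in (1, 2, 3))


def linker_codings(overhang):
    """List (dipeptide, 6-nt scar) pairs of Gly/Ser/Thr codons containing the overhang.

    Assumption: any reading-frame offset within the two codons is acceptable.
    """
    codons = [(aa, c) for aa, cs in SMALL_HYDROPHILIC_CODONS.items() for c in cs]
    hits = []
    for (aa1, c1), (aa2, c2) in product(codons, repeat=2):
        if overhang in c1 + c2:
            hits.append((aa1 + aa2, c1 + c2))
    return hits
```

For example, `conflicts(["GGAG", "GCTT", "CGCT"])` returns an empty list for the three overhangs that appear in the terminator schematic in this document, and `linker_codings("GGTT")` includes the Gly-Ser scar `GGTTCA`. A real design would still need a fidelity estimate from ligation data, which this sketch does not attempt.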
Pfu-Sso7d sequence optimization
Since the PET collection already has most of the affinity, reporter, and cleavage tags we would want to use, we focused on (1) a model protein coding sequence, (2) a few transcriptional terminators that work in Bacillus, and (3) the secretion tags that would be required to export that protein out of B. subtilis.
For the model protein, we chose Pfu-Sso7d, a fast, high-fidelity thermostable DNA polymerase (a commercial iteration of this protein is known as Phusion). Because it came off patent just last year, we thought it would be useful to the world if it became exceedingly abundant and cheap. As detailed in our Software section, we built a pipeline for optimizing coding sequences that identifies potentially problematic homopolymers, low or high GC content, forbidden restriction sites, rare codons, and excessively stable predicted secondary structure, and then removes those problematic sequences. Our design pipeline optimized codon frequencies across the gene to match the frequencies in a set of genes highly expressed in B. subtilis under nutrient-limiting conditions (a codon optimization method that works well in E. coli). For generality’s sake, and to potentially test multi-copy expression, we also ran an optimization that harmonized codon frequencies to fall between those of the first strategy for B. subtilis and those for E. coli, aiming this time for a CDS that could express well in both organisms.
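The screening step of such a pipeline might look like the following Python sketch. It is not the pipeline from our Software section: the function names, window sizes, and thresholds are illustrative assumptions, and only three of the listed checks (homopolymers, GC content, forbidden restriction sites) are shown. Rare-codon and secondary-structure checks are omitted.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Type IIS sites used for assembly in this document; a real pipeline may forbid more.
FORBIDDEN_SITES = {"BsaI": "GGTCTC", "BbsI": "GAAGAC"}


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def find_homopolymers(seq, min_len=6):
    """Return (position, run) for every single-base run of at least min_len."""
    pattern = r"([ACGT])\1{%d,}" % (min_len - 1)
    return [(m.start(), m.group()) for m in re.finditer(pattern, seq)]


def gc_fraction(seq):
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_outliers(seq, window=50, step=10, low=0.30, high=0.70):
    """Return (start, gc) for windows whose GC fraction is outside [low, high]."""
    outliers = []
    for start in range(0, len(seq) - window + 1, step):
        gc = gc_fraction(seq[start:start + window])
        if gc < low or gc > high:
            outliers.append((start, gc))
    return outliers


def find_forbidden_sites(seq, sites=FORBIDDEN_SITES):
    """Return (enzyme, position, strand) for each site on either strand.

    Both sites here are non-palindromic, so each strand is searched separately.
    """
    hits = []
    for name, site in sites.items():
        for strand, motif in (("+", site), ("-", reverse_complement(site))):
            # Lookahead so that overlapping occurrences are also reported.
            for m in re.finditer("(?=%s)" % motif, seq):
                hits.append((name, m.start(), strand))
    return hits
```

Each function only reports problem locations. Removing them requires synonymous codon substitution that preserves the protein sequence, which depends on the codon usage table chosen and is left out of this sketch.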
For the transcriptional terminators, we incorporated sequences from previous iGEM competitions, including rrnBT1 T7TE (Bba_B0015), T7TE LuxtA (Bba_B0014), the Lambda T1 transcriptional terminator (Bba_K864601), and a B. subtilis transcriptional terminator known as Bba_K780000. Each of these was adapted for Golden Gate assembly:
-->BbsI>GGAG-->BsaI>GCTT---Term Terminator---CGCT<BsaI<--CGCT<BbsI--
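The layout in the schematic above can be sketched in Python as a helper that wraps a terminator sequence in the nested BbsI and BsaI sites. The recognition sequences and cut offsets are the standard ones (BsaI: GGTCTC, cutting 1 nt downstream; BbsI: GAAGAC, cutting 2 nt downstream; both leave a 4-nt overhang). The function name, the choice of spacer bases, and the absence of extra padding between the BbsI overhang and the BsaI site are assumptions made for illustration, not the sequences we ordered.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

BSAI = "GGTCTC"  # cuts 1 nt downstream on the top strand, leaving a 4-nt overhang
BBSI = "GAAGAC"  # cuts 2 nt downstream on the top strand, leaving a 4-nt overhang


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def flank_terminator(insert,
                     bsai_left="GCTT", bsai_right="CGCT",
                     bbsi_left="GGAG", bbsi_right="CGCT",
                     bsai_spacer="A", bbsi_spacer="AA"):
    """Return the top strand of a part laid out as in the schematic.

    Spacer bases are arbitrary placeholders (assumption); only their
    lengths matter for positioning the cut sites.
    """
    left = BBSI + bbsi_spacer + bbsi_left + BSAI + bsai_spacer + bsai_left
    right = (bsai_right + bsai_spacer + reverse_complement(BSAI)
             + bbsi_right + bbsi_spacer + reverse_complement(BBSI))
    return left + insert + right


def count_sites(seq, site):
    """Count occurrences of a site on both strands (non-overlapping count)."""
    return seq.count(site) + seq.count(reverse_complement(site))
```

Calling `flank_terminator` on a terminator sequence and then checking that `count_sites` returns exactly 2 for each enzyme is a quick way to confirm that the terminator itself does not carry an internal BsaI or BbsI site, which would otherwise be cut during assembly.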
Finally, for the secretion tags, we built library plasmids.