On Library Plasmids

For some bioengineering applications, many alternative genetic parts could do the task, but it is difficult to predict beforehand which ones will perform as desired in a specific genetic device. Screening of variants (in quantities ranging from ‘several’ to ‘many’) is required, but importantly, variants that fail to work as desired in one device could work perfectly in a different one. Selecting a ribosome binding site (RBS) to give a specific level of expression for a particular gene under control of a particular promoter in a particular strain of bacteria is such a task: there are many possible RBSs, and it is difficult to predict with high accuracy which one will give the desired expression level, but each RBS variant could work for some future device, even if it doesn’t for the one at hand. Similarly, as Brockmeier et al. demonstrated, there are many possible secretion tags for exporting recombinant protein out of B. subtilis cells, and which tag works best depends in non-obvious ways on the sequence and structure of the recombinant protein of interest (i.e., the best tag is virtually impossible to predict).

These tasks call for libraries of part variants that can be stored, retrieved, and dropped in parallel into genetic devices, which are then screened for the desired function. However, synthesizing, storing and retrieving each such variant on its own unique plasmid quickly becomes onerous and expensive, especially in experiments that require purifying, combinatorially assembling and testing many variants (for instance, ~150 B. subtilis secretion tags). This problem gets worse as genetic designs become more complex, as the same part may need to be stored with multiple variants of assembly syntax in order to quickly and easily access broader regions of design space. For instance, RBSs are defined with TACT and AATG overhangs to build simple transcription units with the MoClo assembly standard, but are defined with TACT and CCAT overhangs in the ProClo standard to enable assembly of modular, composite peptide tags onto protein CDSs. Moreover, in the case of RBSs, the parts are often so small that it is difficult to cheaply buy many clonal variants of them (a major supplier of synthetic dsDNA, for instance, will not synthesize dsDNA genetic parts shorter than 300 bp).

We want to rapidly and cheaply synthesize, store, retrieve, and deploy part variants in functional screens. For this task, we don’t want a plasmid library, we want library plasmids! A library plasmid encodes a tandem array of parts that all perform a similar function. Each variant (V) on the plasmid vector (Vec) is flanked by type IIS restriction sites (BsaI) and by identical, function-defined assembly overhangs (OH), and adjacent variants are separated by a spacing gap (Gap). For just two parts, the layout looks like this:

 

Vec_5’--->BsaI>-OH1--V1--OH2-<BsaI<--Gap1-->BsaI>-OH1--V2--OH2-<BsaI<--Gap2--Vec_3’
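The layout above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variant and gap sequences are made-up toy strings, the single filler bases ("A" and "T") next to each BsaI site are arbitrary, and the function names are hypothetical. It assumes every sequence has already been domesticated so that no BsaI site occurs inside a variant or gap. The TACT/AATG overhangs are the MoClo RBS overhangs mentioned earlier.

```python
BSAI_FWD = "GGTCTC"  # BsaI recognition site on the top strand
BSAI_REV = "GAGACC"  # its reverse complement (the inward-facing downstream site)


def part_unit(variant: str, oh1: str, oh2: str) -> str:
    """Build one >BsaI>-OH1--V--OH2-<BsaI< unit (top strand).

    BsaI cuts 1 nt past its site on the top strand and 5 nt past it on the
    bottom strand, so one filler base sits between each site and its overhang.
    """
    return f"{BSAI_FWD}A{oh1}{variant}{oh2}T{BSAI_REV}"


def library_insert(variants, gaps, oh1, oh2):
    """Concatenate units and gaps into one tandem array."""
    if len(variants) != len(gaps):
        raise ValueError("need exactly one gap per variant")
    return "".join(part_unit(v, oh1, oh2) + g for v, g in zip(variants, gaps))


def released_parts(insert):
    """Idealized complete BsaI digest: return each OH1--V--OH2 stretch.

    Assumes BsaI sites occur only where part_unit() placed them.
    """
    parts, pos = [], 0
    while True:
        site = insert.find(BSAI_FWD, pos)
        if site == -1:
            break
        start = site + len(BSAI_FWD) + 1      # skip the site and the filler base
        end = insert.find(BSAI_REV, start)
        if end == -1:
            break
        parts.append(insert[start:end - 1])   # drop the filler base before the site
        pos = end + len(BSAI_REV)
    return parts


# Toy example: two made-up RBS-like variants and two made-up gaps.
variants = ["AAAGGAGGTAAAT", "AGGAGGACAACT"]
gaps = ["CTAGCATGCA", "TGCATCGATC"]
insert = library_insert(variants, gaps, "TACT", "AATG")
print(released_parts(insert))
```

Each released string starts with OH1 and ends with OH2, which is why every variant from the same library plasmid can drop into the same position of a downstream assembly.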

 

Library plasmids could solve several problems.

First, they reduce the need to separately store a large number of genetic parts that are designed to perform variants of the same function. Instead, large numbers of part variants can be stored on a few library plasmids in a few strains.

Second, they reduce the number of plasmid purifications that must be performed in order to run a combinatorial genetic assembly for screening the function of many part variants. Instead, purifying a couple of library plasmids and dropping them into a MoClo GoldenGate reaction should accomplish the combinatorial assembly.

Third, they make small genetic parts more synthesizable. Some companies won’t synthesize and clone genes smaller than 300 bp, but arranging many part variants in tandem ensures that even tiny parts like RBSs can be synthesized by Twist.

Potential Library Plasmid Challenges

We have struggled to find prior examples of genetic parts with this architecture in the literature. We can think of several challenges that might inhibit the development and use of library plasmids, and we’ve considered ways to overcome these challenges.

First, if restriction digestion does not run to completion, the library plasmid user could end up assembling multiple tandem part variants into a device, rather than only one. This could be addressed by running restriction digests longer and/or at higher restriction enzyme concentrations. It could also be addressed by inserting the recognition site for a blunt-cutting restriction enzyme into the gap between part variants. Cutting at this site would cleave the tandem arrays apart, leaving blunt ends with no overhangs to re-anneal and ligate. As many FreeGenes parts have already been synthesized without being domesticated to remove the recognition sites of any blunt-cutting restriction enzymes, it is important to select a highly specific blunt-cutting enzyme for this task. We selected PmeI as the gap-cutting enzyme. With its 8 bp recognition sequence (GTTT|AAAC), which is expected only about once per 65,536 bp (4^8) of random sequence, PmeI is unlikely to have cut sites within the vast majority of already-synthesized Golden Gate wetware, from FreeGenes or other sources. Moreover, PmeI is most active at 37°C in NEB’s CutSmart buffer, making dual BsaI/PmeI digestion reactions possible. Finally, PmeI’s patents expired in 2011-2012, so it is free to use and manufacture.
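Whether an existing part is compatible with a PmeI gap cut can be checked with a simple scan. The sketch below is illustrative only: the function names are hypothetical, the example sequence is made up, and the expected-frequency estimate assumes uniformly random sequence with equal base frequencies, which real genomes and parts do not strictly follow.

```python
PMEI_SITE = "GTTTAAAC"  # PmeI cuts blunt in the middle: GTTT|AAAC

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sites(seq: str, site: str = PMEI_SITE):
    """Return 0-based start positions of `site` on either strand of `seq`."""
    seq = seq.upper()
    hits = set()
    # PmeI's site is palindromic, so the set holds one motif; the loop keeps
    # the function usable for non-palindromic sites as well.
    for motif in {site, reverse_complement(site)}:
        start = seq.find(motif)
        while start != -1:
            hits.add(start)
            start = seq.find(motif, start + 1)
    return sorted(hits)


def expected_sites(length: int, site_len: int = len(PMEI_SITE)) -> float:
    """Expected count of one specific site in random sequence of `length` bp."""
    return max(length - site_len + 1, 0) / 4 ** site_len


# Made-up example sequence with one PmeI site starting at position 4.
print(find_sites("ACGTGTTTAAACGGA"))
# Rough expectation for a 5,000 bp insert of random sequence.
print(round(expected_sites(5000), 3))
```

A part that returns an empty list from `find_sites` would be left intact by the PmeI cut, so only the gaps of the library plasmid are cleaved.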

A second potential challenge for library plasmids is that they could be too repetitive to synthesize de novo with polymerase cycling assembly. For smaller parts like RBSs or signal peptides, we imagine addressing this problem by adding sequence-diverse spacer regions between each part, with the length of the sequence-diverse spacers tuned to whatever our sequence checking/fixing software and the DNA manufacturing company deem synthesizable. For tandem arrays of larger part variants that can’t be de novo synthesized even with spacer elements, library plasmids can be constructed through Golden Gate assemblies. In such an assembly, BbsI cut sites and optimized Golden Gate overhangs can be used to efficiently assemble at least 20 parts (and possibly more) into a single vector backbone, like so:

>BbsI>-OH_a--Vec_backbone--OH_b-<BbsI<

+

>BbsI>-OH_a-->BsaI>-OH1--V1--OH2-<BsaI<--Gap1a-PmeI-Gap1b--OH_c-<BbsI<

+

>BbsI>-OH_c-->BsaI>-OH1--V2--OH2-<BsaI<--Gap2a-PmeI-Gap2b--OH_b-<BbsI<

||  

BbsI Golden Gate

||

Vec_5’-->BsaI>-OH1--V1--OH2-<BsaI<--Gap1a-PmeI-Gap1b-->BsaI>-OH1--V2--OH2-<BsaI<--Gap2a-PmeI-Gap2b--...--Vec_3’
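The ordering logic of this BbsI assembly can be sketched as a walk along matching overhangs, from OH_a through OH_c to OH_b. The sketch below is a simplified model, not a simulation of the reaction: the three 4-nt overhang sequences are hypothetical placeholders (the text does not specify OH_a, OH_b, or OH_c), the function names are made up, and each fragment is reduced to its name plus its left and right overhang in top-strand notation.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(_COMPLEMENT)[::-1]


def check_overhangs(overhangs):
    """Return a list of problems that would make junctions ambiguous."""
    problems, seen = [], set()
    for oh in overhangs:
        rc = reverse_complement(oh)
        if oh == rc:
            problems.append(f"{oh} is palindromic")
        if oh in seen:
            problems.append(f"{oh} is used twice")
        if rc in seen and rc != oh:
            problems.append(f"{oh} can pair with {rc}")
        seen.add(oh)
    return problems


def order_fragments(fragments, start_oh, end_oh):
    """Order inserts by following overhangs from the vector's start_oh to end_oh.

    fragments maps a name to (left_overhang, right_overhang).
    """
    by_left = {}
    for name, (left, right) in fragments.items():
        if left in by_left:
            raise ValueError(f"two fragments start with {left}")
        by_left[left] = (name, right)
    order, current = [], start_oh
    while current != end_oh:
        if current not in by_left:
            raise ValueError(f"no fragment continues from {current}")
        name, current = by_left.pop(current)
        order.append(name)
    if by_left:
        raise ValueError(f"unused fragments: {sorted(n for n, _ in by_left.values())}")
    return order


# Hypothetical overhangs standing in for OH_a, OH_c and OH_b.
OH_A, OH_C, OH_B = "AGGA", "CTTC", "GCAA"
fragments = {"V2_unit": (OH_C, OH_B), "V1_unit": (OH_A, OH_C)}
print(check_overhangs([OH_A, OH_C, OH_B]))
print(order_fragments(fragments, OH_A, OH_B))
```

Extending the array to more variants only requires one new, mutually compatible overhang per added junction, which is what limits how many parts a single reaction can order reliably.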

We designed, synthesized, and made freely available a set of six Bacillus subtilis SecTag library plasmids (Bsub_SecTag_Lps). Each Bsub_SecTag_Lp has a pOpen_v3 backbone, and an insert consisting of many SecTag variants (peptide sequences borrowed from Brockmeier et al.), each preceded by a custom RBS designed in Salis Lab’s Ribosome Binding Site Calculator to maximize expression in Bacillus subtilis (assuming a RiboJ-type insulator upstream, which is not included on the library plasmid, but is included in our modified pHT43 lactose/IPTG-inducible promoter).

Each RBS-SecTag pair is bounded by BsaI cut sites, with overhangs defining the tag as an ‘RBS/Loc’ part in the extended FreeGenes Protein Cloning (FG ProClo) assembly standard. Between each variant is 60 bp of randomly generated DNA sequence to increase the sequence diversity and synthesizability of the library plasmid. In the middle of each random DNA spacer is a PmeI blunt cut site, which can be cleaved during assembly reactions to reduce the chances of erroneous incorporation of two or more tandem RBS-SecTags into the finished plasmid. The 148 RBS-SecTag variants are spread across 6 library plasmids, so that each insert is less than 5,000 bp and therefore (hopefully) synthesizable by Twist. The sequences have had all commonly used Type IIS restriction sites (except the flanking BsaI sites) removed, and have had Type II restriction sites used in a number of older assembly standards (BioBricks, BglBricks, SEVA, etc.) removed. Any Dam/Dcm methylation sites that could interfere with BsaI or PmeI digestion have been removed as well.
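The spacer design and the split across plasmids can be illustrated with a short sketch. It is not the pipeline actually used for the Bsub_SecTag_Lps. Several assumptions are made here: the 8 bp PmeI site is counted as part of the 60 bp spacer, only BsaI and BbsI sites are screened out (the real design removed many more sites, and junctions with neighboring sequence would also need checking), units are packed greedily in order, and the 190 bp unit length is a placeholder rather than a measured value.

```python
import random

PMEI_SITE = "GTTTAAAC"
# BsaI (GGTCTC) and BbsI (GAAGAC) sites plus their reverse complements.
FORBIDDEN = ("GGTCTC", "GAGACC", "GAAGAC", "GTCTTC")


def make_spacer(rng: random.Random, length: int = 60, max_tries: int = 1000) -> str:
    """Random spacer of `length` bp with exactly one PmeI site at its center."""
    flank = (length - len(PMEI_SITE)) // 2
    if 2 * flank + len(PMEI_SITE) != length:
        raise ValueError("length must leave equal flanks around the PmeI site")
    for _ in range(max_tries):
        left = "".join(rng.choice("ACGT") for _ in range(flank))
        right = "".join(rng.choice("ACGT") for _ in range(flank))
        spacer = left + PMEI_SITE + right
        if spacer.count(PMEI_SITE) == 1 and not any(m in spacer for m in FORBIDDEN):
            return spacer
    raise RuntimeError("could not generate a clean spacer")


def pack_units(unit_lengths, max_len: int = 5000):
    """Greedily group unit indices so every insert stays strictly below max_len."""
    plasmids, current, used = [], [], 0
    for idx, length in enumerate(unit_lengths):
        if length >= max_len:
            raise ValueError(f"unit {idx} alone reaches the length limit")
        if current and used + length >= max_len:
            plasmids.append(current)
            current, used = [], 0
        current.append(idx)
        used += length
    if current:
        plasmids.append(current)
    return plasmids


rng = random.Random(0)  # fixed seed so the sketch is reproducible
spacer = make_spacer(rng)
print(len(spacer), spacer[26:34])

# Placeholder: 148 units of 190 bp each (part + BsaI sites + spacer).
plasmids = pack_units([190] * 148)
print([len(p) for p in plasmids])
```

With real, variable unit lengths the grouping would differ, but the same length check applies to each insert.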

We plan to use the Bsub_SecTag_Lps in ProClo assembly reactions that append both the RBS/SecTag pairs and a fluorescent protein (FP) tag to an enzyme of interest, assembling a library of FP-tagged enzymes with different SecTags in a single Golden Gate reaction. The assembled plasmid library will then be transformed into B. subtilis, and secretion will be screened either by imaging for a fluorescent ‘halo’ of secreted, diffusing enzyme around colonies on agar plates, or by high-throughput culturing of colonies in 96-deep-well plates, followed by centrifugation/flocculation and measurement of supernatant fluorescence with a plate reader. Either way, we hope the collection will enable us and others to do low-cost screening for efficient SecTag/enzyme pairs, through a one-pot Golden Gate reaction requiring six or fewer variant plasmids instead of 148.

We submitted the sequences for the Bsub_SecTag_Lps, ~30 kb in all, to the FreeGenes project at the BioBricks Foundation, along with a description of what they are, why they are useful, and how we hope to use them. After a screening process of several weeks during which the parts were vetted for biosecurity and IP concerns, our parts were approved and FreeGenes ordered their synthesis from Twist. Today, they are synthesized and available to the world, for free, under an OpenMTA, in the FreeGenes store.
