Selection of the Project
Our journey as a team in this year's iGEM competition began in January. To settle on a project quickly, we started with extensive brainstorming and compiled all the ideas. In the following meetings, we filtered them for feasibility. Three project ideas emerged, which we divided among ourselves and developed further. At the end of February, we presented the more mature ideas to our supervisors and received positive feedback on two of them. The final decision was made on March 12, 2021, when we chose "Storagene" as our project for this year's iGEM competition.
The idea of using the impressive capacity and properties of DNA to store data fascinated us the most. DNA stands out for its information density, longevity, and stability 12. Our project builds on the work of George Church, a world-renowned scientist in the field of synthetic biology and genome engineering. His idea of enzymatic DNA synthesis with the terminal deoxynucleotidyl transferase (TdT) for storing digital information caught our attention. However, storing data this way requires the synthesis of many unique, long DNA sequences. For this purpose, Church developed a photon-directed method that requires a laser, which is costly to acquire and complex to handle 3.
The specific goal of our project “Storagene” is to synthesize DNA enzymatically with the TdT for long-term data storage. To this end, we developed the DIP method, in which a primer immobilized on a magnetic stick is dipped into a solution containing TdT and the respective nucleotides.
We envision a future in which everybody has their own DIPsy at home, with which DNA can be synthesized, stored, and sequenced in one place. This would allow personal data to be stored on DNA for the long term. For example, a family history could be collected and passed down through many generations. Read our Implementation for more information.
Characteristics of the Terminal Deoxynucleotidyl Transferase
The terminal deoxynucleotidyltransferase (TdT) is an enzyme that possesses the unusual ability to incorporate nucleotides in a template-independent manner using only single-stranded DNA as the nucleic acid substrate 3.
Like all DNA polymerases, the TdT also requires divalent metal ions for catalysis of the phosphoryl transfer reaction associated with nucleotide incorporation. However, the TdT is uniquely capable of using a variety of divalent cations such as $Co^{2+}$, $Mn^{2+}$, $Zn^{2+}$, and $Mg^{2+}$ 4.
Another advantage of the TdT is that it accepts a wide variety of nucleotide analogs such as 2′,3′-dideoxynucleotides, p-nitrophenylethyl triphosphate, p-nitrophenyl triphosphate, 2′-deoxy-L-ribonucleoside 5′-triphosphates, and dinucleoside 5′,5′-tetraphosphates 4. This could be used to increase the capacity of DNA data storage, an idea we elaborate on in our Outlook.
Since we wanted to incorporate nucleotides in a controlled manner, we had to be able to stop the standard TdT reaction. TdT activity can be inhibited by heat and by many different metal chelators, such as o-phenanthroline, EDTA, triethylene-tetramine, α:α'-bipyridyl, and cysteine 5.
DNA Synthesis Methods to Date
Without synthetic DNA or RNA, almost nothing would work in the laboratory: no PCR, no adapters, no linkers and other cloning elements, no CRISPR-Cas - not to mention synthetic genes 6.
Currently, DNA is still synthesized using decades-old chemical phosphoramidite methods, which involve expensive and harsh reactions in many time-consuming steps 3. The synthesized DNA is usually only a few hundred nucleotides long, for two reasons. First, not every step is one hundred percent efficient. Second, long sequences form secondary structures that interfere with further construction 6. This limits chemical synthesis as the demand for longer DNA oligonucleotides in larger quantities increases 3.
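Why imperfect step efficiency caps the strand length can be illustrated with a short calculation. The sketch below assumes a simple model in which every coupling step succeeds independently with the same probability, so the fraction of full-length strands is that probability raised to the number of steps. The function name `full_length_yield` and the efficiency of 0.99 are illustrative assumptions, not values taken from the cited sources.

```python
def full_length_yield(coupling_efficiency: float, length: int) -> float:
    """Fraction of strands reaching full length under a simple model.

    Assumption: each of the (length - 1) coupling steps succeeds
    independently with the same probability `coupling_efficiency`.
    """
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must be between 0 and 1")
    if length < 1:
        raise ValueError("length must be at least 1 nucleotide")
    return coupling_efficiency ** (length - 1)


if __name__ == "__main__":
    # Hypothetical per-step efficiency of 99 %, chosen only for illustration.
    for n in (50, 200, 1000):
        print(f"{n:>5} nt: {full_length_yield(0.99, n):.4%} full-length product")
```

Under this model the yield falls exponentially with length, which is why even a small per-step loss becomes limiting for strands of several hundred nucleotides.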
Church’s team developed an enzymatic DNA synthesis method for DNA data storage using the TdT and the cofactor $Co^{2+}$. The photolabile chelator DMNP-EDTA locks the cation in place and releases it only after irradiation with UV light, which activates the TdT. The light beams precisely target the individual positions on a chip where nucleotides are to be attached to a short oligo anchor. Even if the entire chip is flooded with dATP, for example, only the TdT enzymes at the points currently illuminated can incorporate nucleotides. Subsequently, the excess dATP is flushed away and the TdT incorporates the next dNTP at the illuminated positions of the chip 6.
Our Idea of Enzymatic DNA Synthesis
The project focuses on the enzymatic synthesis of ssDNA strands using the terminal deoxynucleotidyl transferase (TdT). Our goal is to synthesize DNA strands in which an unspecified number of nucleotides is incorporated in each synthesis cycle. We aim for as many transitions as possible, because our ternary system encodes and restores information through them: the information is stored not in a sequence of specific bases but in the transitions between bases. A synthesis that adds a defined number of nucleotides per cycle is therefore not necessary. This makes the system more accessible and robust, and it even allows for the addition of proofreading features. To achieve this, we developed our own Software.
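The principle of storing information in transitions can be sketched in a few lines of Python. This is a minimal illustration and not our actual software: the mapping of ternary digits to bases, the six-digit representation of each byte, the assumed primer end base `A`, and all function names are hypothetical choices made for this example. After any base, three other bases remain, so each transition can carry one ternary digit, and repeated identical bases carry no information.

```python
BASES = "ACGT"


def bytes_to_trits(data: bytes) -> list[int]:
    """Write each byte as 6 ternary digits (3**6 = 729 covers 0..255)."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(6):
            digits.append(byte % 3)
            byte //= 3
        trits.extend(reversed(digits))  # most significant digit first
    return trits


def trits_to_bytes(trits: list[int]) -> bytes:
    """Inverse of bytes_to_trits; expects a multiple of 6 digits."""
    out = bytearray()
    for i in range(0, len(trits), 6):
        value = 0
        for t in trits[i:i + 6]:
            value = value * 3 + t
        out.append(value)
    return bytes(out)


def encode(trits: list[int], start: str = "A") -> str:
    """Pick, for each digit, one of the three bases differing from the last."""
    prev, seq = start, []
    for t in trits:
        choices = [b for b in BASES if b != prev]
        prev = choices[t]
        seq.append(prev)
    return "".join(seq)


def decode(strand: str, start: str = "A") -> list[int]:
    """Read digits from transitions only; homopolymer runs are skipped."""
    prev, trits = start, []
    for base in strand:
        if base == prev:
            continue  # same base again: no transition, no information
        choices = [b for b in BASES if b != prev]
        trits.append(choices.index(base))
        prev = base
    return trits


if __name__ == "__main__":
    target = encode(bytes_to_trits(b"iGEM"))
    # Simulate cycles that add a varying number of identical nucleotides.
    synthesized = "".join(b * (1 + i % 3) for i, b in enumerate(target))
    print(trits_to_bytes(decode(synthesized)))
```

Because the decoder ignores how often a base is repeated, a strand in which each cycle adds one, two, or three copies of the same nucleotide decodes to the same message as the ideal strand.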
The Necessity of Our Project
DNA synthesis is currently carried out using the phosphoramidite method. However, this method is not very environmentally friendly due to its harsh reactions, its use of solvents, and its toxic by-products. Enzymatic synthesis avoids these solvents and toxic by-products, so the environmental footprint can be significantly reduced. In addition, only a few hundred nucleotides can be synthesized with the current method because the individual steps are not fully efficient and secondary structures can form. Enzymatic synthesis is not subject to this length limit and can enable strand extension by thousands of bases. Nevertheless, secondary structures can also form during enzymatic DNA synthesis, but there are various ways to avoid them. On the one hand, raising the temperature and varying the ion concentration of the cofactor can reduce secondary structures 7. On the other hand, single-strand binding proteins, hybridization with short oligonucleotides, or photolabile protection groups can be used to prevent them.
One option is to optimize the phosphoramidite method so that less solvent is used and longer strands can be synthesized. This could be done with a programmable microfluidic synthesis platform containing integrated valves, allowing individual manipulation and product removal 8. Moving to enzymatic synthesis, however, would remove these problems altogether. Unfortunately, enzymatic synthesis brings other challenges of its own.
To increase the data storage capacity, the aim is to create as many transitions as possible, so that only a few identical nucleotides are incorporated one after the other. For this purpose, standardized reaction conditions must be found, since the TdT incorporates the individual nucleotides at different rates. For example, deoxythymidine triphosphate is incorporated faster than deoxyguanosine triphosphate in a standard TdT reaction (see Results). In addition, reaction conditions should be chosen such that DNA strand synthesis is reproducible and all strands receive the same nucleotide sequence. Furthermore, a system could be developed that synthesizes many different DNA strands on one surface to save time and costs.
To facilitate laboratory work, our hardware should be put into use in the laboratory in the future. A dedicated nanopore sequencer for analyzing the DNA strands directly after synthesis would also be beneficial. Looking ahead, a specific synthesis in which a defined number of nucleotides, or even only a single nucleotide, is appended per cycle would be advantageous, as it would allow a much more precise procedure. In addition, the costs of enzymatic DNA synthesis will fall as reaction times are reduced and fewer nucleotides are used.