Team:SUNY Oneonta/Description

Project Overview


Figure 1: Summary of the workflow for the SNflaPs genetic testing system. Genomic DNA samples from dairy cattle will be extracted using a cellulose paper dipstick. The isothermal amplification technique RPA will then be employed to amplify target gene fragments. We also created positive control DNAs and a heat block system for use with this system. Polymorphisms are then detected using the Flappase assay and fluorescently tagged oligos, the cleavage of which is detected using a Raspberry Pi.

Preparing New Targets

Expanding on the Ca2LF Concept to Develop SNflaPs

In the project Ca2LF, we developed the Flappase assay to detect a single nucleotide polymorphism (SNP) in one gene, the beta-casein gene. Detecting this SNP tells us only which version of the beta-casein allele is present, A1 or A2. For this year’s project, we wanted to expand the usefulness of our genetic testing system so that it can characterize the genetic profile of a cow across a range of traits, allowing farmers to make informed breeding decisions about both advantageous and deleterious traits.

Identifying potential target genes

We began by interviewing two experts in cattle genetics and breeding, Drs. Dechow and Huson. During these interviews, we learned about genetic conditions that are not desirable in cattle, such as diseases, conditions that negatively impact fertility, and physical traits (for example, being horned) that farmers routinely attempt to breed out of their herds. We also learned about genetic traits that are advantageous and routinely selected for by farmers when breeding, including genes involved in promoting fertility, milk production, and milk composition. We compiled a list of these genes and researched their different alleles (Table 1).

Trait Class | Phenotype | Gene | Description of Polymorphism | Chromosome | gDNA location (c. if in cDNA)
Fertility | Brachyspina | FANCI | Deletion of 3328 bp | 21 | 21184870 - 21188198
Fertility | Spontaneous abortion | APAF1 | C → T substitution in exon 11, produces a nonsense stop codon | 5 | 63150400
Fertility | Unclear, reduces calving success | GART | A → C substitution, creates a missense amino acid substitution | 1 | 1277227
Fertility | Spontaneous abortion | TFB1M | Deletion of 138 kbp | 9 | 93,233 kb to 93,371 kb
Disease states | Premature death due to infection | ITGB2 | T → C substitution, produces a missense amino acid substitution | 1 | 144770078
Disease states | cxf | CVM/SLC35A3 | G → T substitution, produces a missense amino acid substitution | 3 | 43412427
Disease states | Embryonic lethal | DUMPS | C → T substitution, produces a nonsense stop codon | 1 | 43412427
Disease states | Syndactyly | LRP4 | CG → AT substitution of 2 nucleotides in exon 33 | 15 | 76800972 - 76800973
Disease states | Chondrospinoplasia and stillbirth | COL2A1 | G → A substitution, creates an altered splice site |  | 32473300
Disease states | Citrullinaemia | ASS1 | C → T substitution, produces a nonsense stop codon | 11 | 100802781
Disease states | Cholesterol deficiency | APOB | Insertion of a transposable LTR element (ERV2-1), located between nucleotides 24 and 25 of APOB exon 5 | 11 | 77958994
Disease states | Progressive degenerative myeloencephalopathy (Weaver syndrome) | PNPLA8 | G → A substitution, creates a missense amino acid substitution | 4 | 49878773
Disease states | Spinal muscular atrophy | KDSR | G → A substitution, creates a missense amino acid substitution | 24 | 62138763
Other traits | Presence of horns | No specific gene involved | 80 kb duplication | 1 | 2629113 - 2709240
Other traits | BSE resistance | PRNP | 23 bp deletion on the - strand -1594bp and 12 bp deletion on the + strand +300 bp |  | 46754 - 51993, includes promoter
Milk composition | alphaS1-casein milk protein | CSN1S1 B | Type sequence | 6 | 
Milk composition | alphaS1-casein milk protein | CSN1S1 C | A → G substitution in exon 17 creating a missense amino acid substitution | 6 | c.619A > G
Milk composition | alphaS1-casein milk protein | CSN1S1 I | A → G substitution in exon 17 and A → T in exon 11. Both create missense amino acid substitutions | 6 | c.619A > G and c.296A > T
Milk composition | alphaS1-casein milk protein | CSN1S1 J | A → G and C → G substitutions in exon 17 creating missense amino acid substitutions | 6 | c.619A > G and c.543G > T
Milk composition | alphaS2-casein milk protein | CSN1S2 A | Type sequence exon 3 | 6 | 
Milk composition | alphaS2-casein milk protein | CSN1S2 B | C → T substitution in exon 3, creates a missense amino acid substitution | 6 | c.68C>T
Milk composition | alphaS2-casein milk protein | CSN1S2 D | G → T substitution, leading to skipping of exon 8 (8 amino acid deletion from the protein) | 6 | c.221G>T
Milk composition | alphaS2-casein milk protein | CSN1S2 E | G → A substitution in exon 3, creates a missense amino acid substitution | 6 | c.64G>A
Milk composition | beta-casein milk protein | CSN2 A1 | Type sequence exon 7 | 6 | 
Milk composition | beta-casein milk protein | CSN2 A2 | A → C substitution in exon 7, creates a missense amino acid substitution | 6 | c.245A>C
Milk composition | beta-casein milk protein | CSN2 A3 | A → C and C → A substitutions in exon 7, creates missense amino acid substitution | 6 | c.245A>C, c.363C>A
Milk composition | beta-casein milk protein | CSN2 B | C → G substitution in exon 7, creates a missense amino acid substitution | 6 | c.411C>G
Milk composition | beta-casein milk protein | CSN2 C | G → A substitution in exon 6, creates a missense amino acid substitution | 6 | c.154G>A
Milk composition | beta-casein milk protein | CSN2 F | C → T substitution in exon 7, creates a missense amino acid substitution | 6 | c.500C>T
Milk composition | beta-casein milk protein | CSN2 I | A → C and A → C substitutions in exon 7, creates two missense amino acid substitutions | 6 | c.245A>C, c.322A>C
Milk composition | beta-casein milk protein | CSN2 J | G → A substitution in exon 4, creates a missense amino acid substitution | 6 | c.103G>A
Milk composition | beta-casein milk protein | CSN2 K | A → C and C → G substitutions in exon 7, creates two missense amino acid substitutions | 6 | c.245A>C, c.580C>G
Milk composition | beta-casein milk protein | CSN2 L | T → C substitution in exon 7, creates a missense amino acid substitution | 6 | c.635T>C
Milk composition | kappa-casein milk protein | CSN3 A | Type sequence exon 4 | 6 | 
Milk composition | kappa-casein milk protein | CSN3 B | C → T and A → C substitutions in exon 4, creates two missense amino acid substitutions | 6 | c.470C>T, c.506A>C
Milk composition | kappa-casein milk protein | CSN3 E | A → G substitution in exon 4, creates missense amino acid substitution | 6 | c.526A>G
Milk composition | kappa-casein milk protein | CSN3 H | C → T substitution in exon 4, creates missense amino acid substitution | 6 | c.467C>T
Milk composition | Beta-lactoglobulin | PAEP | C → A substitution at position 215 bp upstream of the translation initiation site, leads to aberrant low expression | 11 | 103301704
Milk composition | Beta-lactoglobulin | PAEP 1 | G → C substitution, creates missense amino acid substitution | 11 | 103302553
Milk composition | Beta-lactoglobulin | PAEP 4 | T → C substitution, creates missense amino acid substitution | 11 | 103304757
Milk composition | Alpha-lactalbumin | LAA A | G → A substitution, His at amino acid 10 | 5 | c.851G>A
Milk composition | Alpha-lactalbumin | LAA B | Type sequence, Arg at amino acid 10 |  | c.851G

Another important piece of feedback we received from the interviews with Drs. Dechow and Huson was that most alleles are created by multiple SNPs, or by larger changes to the gene, such as insertions, deletions, inversions, or duplications. So, to be useful, our SNflaPs genetic testing system needs to detect multiple types of polymorphisms. We decided to focus our efforts on selecting a few genes, each of which has a different type of polymorphism, to serve as proof of concept that the Flappase-based detection system can discriminate between these alleles.

Table 2. Genes selected for proof-of-concept testing of the Flappase assay for detecting polymorphisms. Each selected gene contains a different type of polymorphism, or multiple SNPs found in close, medium, or far proximity to each other.

Phenotype | Gene | Description of the Polymorphism | Chromosome | gDNA location (c. if in cDNA)
Brachyspina | FANCI | Deletion of 3328 bp | 21 | 21184870 - 21188198
Syndactyly | LRP4 | CG → AT substitution of 2 nucleotides in exon 33 | 15 | 76800972 - 76800973
Cholesterol deficiency | APOB | Insertion of a transposable LTR element (ERV2-1), located between nucleotides 24 and 25 of APOB exon 5 | 11 | 77958994
alphaS1-casein milk protein | CSN1S1 I | A → G substitution in exon 17 and A → T in exon 11. Both create missense amino acid substitutions | 6 | c.619A > G and c.296A > T
alphaS1-casein milk protein | CSN1S1 J | A → G and C → G substitutions in exon 17 creating missense amino acid substitutions | 6 | c.619A > G and c.543G > T
Beta-lactoglobulin | PAEP | C → A substitution at position 215 bp upstream of the translation initiation site, leads to aberrant low expression | 11 | 103301704

When deciding which genes to use for testing the detection system we used the following criteria:

  1. The length of any polymorphism should be less than 5,000 bp.
  2. If multiple SNPs are to be detected, the distance between them should be large enough to fit both oligos and to accommodate the Flappase protein, which spans approx. 22 nucleotides, or about 75 Angstroms (see modeling page).
  3. For initial testing, any gene selected should be of a length suitable for synthesis, so that positive control DNAs can be created.

Based on these characteristics, we decided to begin by pursuing LRP4 and APOB as model genes.
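The first two criteria come down to simple arithmetic, which can be sketched as follows. This is an illustration only: the function names are hypothetical, and the conversion assumes the textbook B-form DNA rise of about 3.4 Angstroms per base pair, a value not stated on this page. The polymorphism sizes are taken from the tables and figure captions on this page.

```python
# Illustrative check of selection criteria 1 and 2 (helper names are hypothetical).

MAX_POLYMORPHISM_BP = 5_000    # criterion 1: polymorphism shorter than 5,000 bp
RISE_PER_BP_ANGSTROM = 3.4     # assumption: B-form DNA rise per base pair


def passes_length_criterion(size_bp: int) -> bool:
    """Criterion 1: the polymorphism must be shorter than 5,000 bp."""
    return size_bp < MAX_POLYMORPHISM_BP


def span_in_angstroms(n_nucleotides: int) -> float:
    """Approximate length of a stretch of duplex DNA in Angstroms."""
    return n_nucleotides * RISE_PER_BP_ANGSTROM


# Sizes as listed on this page.
polymorphism_sizes_bp = {
    "FANCI deletion": 3_328,
    "TFB1M deletion": 138_000,
    "horn-associated duplication": 80_000,
    "APOB ERV2-1 insertion": 302,
    "LRP4 CG to AT substitution": 2,
}

for name, size in polymorphism_sizes_bp.items():
    verdict = "within limit" if passes_length_criterion(size) else "too long"
    print(f"{name}: {size} bp, {verdict}")

# Criterion 2: 22 nucleotides correspond to roughly 75 Angstroms.
print(f"22 nucleotides span about {span_in_angstroms(22):.1f} Angstroms")
```

Under that assumption, 22 nucleotides work out to roughly 75 Angstroms, which agrees with the figure quoted in criterion 2.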

Designing and cloning new targets and oligos for use with the Flappase system

New Target #1- LRP4

The LRP4 gene encodes LDL receptor related protein 4 (7). Two well-characterized allelic variants of this gene have been identified: the normal (wildtype) version and a syndactylous version. Inheritance of the syndactylous version leads to congenital syndactyly. This condition, also known as mulefoot, refers to the fusion or non-division of the two developed digits of the bovine foot (2).

Figure 3: Sequence variation between the normal and syndactyly alleles of LRP4. The two versions differ by two SNPs, where nucleotides CG are substituted by AT at positions 4863 and 4864.

We found the genomic DNA sequence of the bovine LRP4 gene using the National Center for Biotechnology Information (NCBI) reference assembly for Bos taurus (5). We located the LRP4 gene (Gene ID 504317), downloaded the sequence, and then prepared the sequence for synthesis. To do this, we selected a segment of this gene which includes exon 33 (the location of the two SNPs) and additional gDNA upstream and downstream of the SNPs. We checked this sequence for RFC10 compatibility and found no illegal restriction sites. We located the sites of the two SNPs and used this information to construct a wild-type and a syndactyly version of the gene fragment. We added the RFC10 prefix and suffix and had the DNA synthesized by IDT. These DNAs are for use as positive control DNAs for testing the Flappase system.
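The compatibility check and allele construction described above can be sketched in a few lines of Python. This is a minimal illustration rather than the script the team used: the helper names are hypothetical, the fragment is a short made-up stand-in for the real exon 33 sequence, and the prefix shown is the standard BioBrick prefix for parts that do not begin with a start codon.

```python
# Sketch of a BioBrick standard 10 compatibility check and a two-base substitution.
# Toy sequence and helper names are hypothetical.

# Restriction sites that must not occur inside a part under this standard.
# All five sites are palindromic, so scanning one strand is sufficient.
ILLEGAL_SITES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
    "PstI": "CTGCAG",
    "NotI": "GCGGCCGC",
}

# Standard prefix (for parts not starting with ATG) and suffix.
PREFIX = "GAATTCGCGGCCGCTTCTAGAG"
SUFFIX = "TACTAGTAGCGGCCGCTGCAG"


def find_illegal_sites(seq):
    """Return {enzyme: [0-based start positions]} for every illegal site found."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in ILLEGAL_SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


def substitute(seq, start, old, new):
    """Replace `old` with `new` at 0-based `start`, verifying the expected bases."""
    if seq[start:start + len(old)] != old:
        raise ValueError("expected bases not found at this position")
    return seq[:start] + new + seq[start + len(old):]


def add_flanks(seq):
    """Attach the prefix and suffix to a checked part sequence."""
    return PREFIX + seq + SUFFIX


# Toy stand-in for the gene fragment; NOT the real LRP4 sequence.
wildtype = "GCTGGCTTACGGATCCAGCGTTAGC"
snp_start = wildtype.index("CG")  # position of the CG pair in this toy sequence only
syndactyly = substitute(wildtype, snp_start, "CG", "AT")

for label, fragment in [("wildtype", wildtype), ("syndactyly", syndactyly)]:
    print(label, find_illegal_sites(fragment) or "no illegal sites")
    print(add_flanks(fragment))
```

Checking each allele before adding the flanks matters because the prefix and suffix themselves contain the restricted sites by design.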

Figure 4: The wildtype LRP4 positive control DNA. The nucleotides highlighted in yellow indicate the location of the CG to AT base substitutions that create the syndactyly allele. The wildtype and the syndactyly LRP4 positive control constructs have been entered into the iGEM parts registry as BBa_K3952003 and BBa_K3952004.

We are currently in the process of cloning these synthetic DNAs into pSB1C3 for use as positive control DNAs in our genetic testing system.

We also designed oligos to detect each allele of the LRP4 gene. In this case, the spacer oligo will be complementary to the two target bases and the Flappase oligo will be complementary to the two target bases on the spacer oligo.
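To make the idea of allele-specific complementarity concrete, here is a small sketch. It only tests whether an oligo is a perfect reverse complement of a stretch of the target; it does not model the spacer and flap geometry or the Flappase reaction itself. The sequences and function names are invented for illustration and are not the team's actual oligos.

```python
# Toy illustration of allele-specific complementarity (hypothetical names and sequences).

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def anneals_perfectly(oligo, target):
    """True if the oligo is perfectly complementary to some stretch of the target."""
    return reverse_complement(oligo) in target.upper()


# Made-up target regions that differ only at the two polymorphic bases (CG vs AT).
wildtype_site = "GGCTTACGGATCCAG"
syndactyly_site = "GGCTTAATGATCCAG"

# An oligo designed against the syndactyly version, covering the two target bases.
oligo = reverse_complement(syndactyly_site[3:12])

print("oligo:", oligo)
print("matches syndactyly allele:", anneals_perfectly(oligo, syndactyly_site))
print("matches wildtype allele:", anneals_perfectly(oligo, wildtype_site))
```

In this toy case the oligo matches only the allele it was designed against, which is the property the detection oligos rely on.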

Figure 5: Sequences of the spacer and target oligos to detect polymorphisms in the LRP4 gene. The spacer and flap oligos are designed to anneal to the syndactylous version of the gene. They are expected to anneal in a way that creates Holliday junctions, which are recognized by Flappase.

New Target #2- APOB

The APOB gene codes for Apolipoprotein B (6), a protein involved in the transport of cholesterol in the bloodstream. Studies have reported a heritable allele of the APOB gene that causes cholesterol deficiency in cattle (3). The problematic allele is created by the insertion of a transposable element (also known as a jumping gene) into the APOB gene. This transposable element, ERV2-1, is inserted into exon 5 of APOB between the 24th and 25th nucleotides, resulting in an insertion polymorphism of approximately 300 bp. This polymorphism disrupts APOB function, resulting in cholesterol deficiency (CD).

Figure 6: Differences between the normal and CD alleles of APOB. The two versions differ due to the insertion of a 302 bp transposon, ERV2-1, between nucleotides 24 and 25 within exon 5 of the APOB gene.

We found the genomic DNA sequence of the bovine APOB gene using the NCBI reference assembly for Bos taurus (5). We located the APOB gene (Gene ID 494004), downloaded the sequence, and then prepared the sequence for synthesis. To do this, we selected a segment of this gene which includes exon 5 (the location of the insertion) and additional gDNA upstream and downstream of the insertion site. We checked this sequence for RFC10 compatibility and found no illegal restriction sites.

We annotated the site of the insertion on the sequence and used this information to construct a wild-type version of the gene fragment. We were, however, initially unable to construct the CD allele, because we could not find a published sequence of this allele or a copy of the ERV2-1 sequence. Fortunately, we found that the ERV2-1 sequence is available on Repbase, a database of repetitive DNA elements (8). We reached out to Team Rochester, as their university maintains a subscription to Repbase, and they downloaded and provided us with the sequence for ERV2-1.

To construct the CD allele, we inserted this sequence into the correct location. We checked the RFC10 compatibility of this new sequence and found an illegal SpeI site within the ERV2-1 insertion. To remove it, we altered the site in silico from ACTAGT to GCTAGT. Because this DNA serves only as a positive control, the change will not affect its functionality as long as the RPA primers and oligos do not overlap this area, and given the site's location this should not be an issue. We added the RFC10 prefix and suffix and had the DNA synthesized by IDT. These DNAs are for use as positive control DNAs for testing the Flappase system.
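The two in silico steps described here, inserting the element and then removing the SpeI site, can be sketched as below. The sequences are short invented placeholders (the real ERV2-1 sequence from Repbase is not reproduced here) and the function names are hypothetical; only the ACTAGT to GCTAGT change and the insertion point between nucleotides 24 and 25 come from the text.

```python
# Sketch of building the insertion allele and removing an internal SpeI site.
# Toy sequences and helper names are hypothetical.

SPEI_SITE = "ACTAGT"


def insert_element(seq, element, after_nt):
    """Insert `element` immediately after the 1-based nucleotide `after_nt`."""
    return seq[:after_nt] + element + seq[after_nt:]


def remove_spei_sites(seq):
    """Change the first base of every SpeI site (ACTAGT -> GCTAGT)."""
    return seq.replace(SPEI_SITE, "G" + SPEI_SITE[1:])


# Invented placeholders: a 32 nt "exon" and a 14 nt "element" carrying a SpeI site.
toy_exon = "GCCGTTAGGCTTACCGATGGCATTCGAGGTCA"
toy_element = "TGTTACTAGTCCAA"

# Insert between nucleotides 24 and 25, as described for exon 5.
cd_allele = insert_element(toy_exon, toy_element, after_nt=24)
cd_allele_fixed = remove_spei_sites(cd_allele)

print("SpeI site present before fix:", SPEI_SITE in cd_allele)
print("SpeI site present after fix: ", SPEI_SITE in cd_allele_fixed)
print(cd_allele_fixed)
```

A single-base change keeps the sequence length the same, so the positions of the surrounding exon sequence are unaffected by the fix.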

Figure 7: The CD APOB positive control DNA. The nucleotides highlighted in yellow indicate the location of the 24th and 25th nucleotides in exon 5. The ERV2-1 transposon is indicated in blue text, and the altered illegal SpeI restriction site is shown in bold blue. The wildtype APOB positive control is the same sequence, minus the insert. The wildtype and the CD APOB positive control constructs have been entered into the iGEM parts registry as BBa_K3952000 and BBa_K3952001.

We are currently in the process of cloning these synthetic DNAs into pSB1C3. Once cloned, these DNAs will be used as positive controls in our genetic testing system.

We also designed oligos to detect each allele of the APOB gene. The spacer oligo will be complementary to the two target bases, and the Flappase oligo will be complementary to the two target bases on the spacer oligo.

Figure 8: Sequences of the spacer and target oligos to detect polymorphisms in the APOB gene. The spacer and flap oligos are designed to anneal to the CD version of the gene within the ERV2-1 insert. They are expected to anneal in a way that creates Holliday junctions, which are recognized by Flappase. To detect the wildtype version of the gene, a different set of oligos was designed to form Holliday junctions directly at the location of the insertion.

References

  1. Haplotype tests for recessive disorders that affect fertility and other traits. Haplotype tests. (n.d.). Retrieved October 17, 2021, from https://aipl.arsusda.gov/reference/recessive_haplotypes_ARR-G3.html.
  2. Leipold, H. W., Adrian, R. W., Huston, K., Trotter, D. M., Dennis, S. M., & Guffy, M. M. (n.d.). Anatomy of hereditary bovine syndactylism. I. Osteology. Journal of Dairy Science. Retrieved October 17, 2021, from https://pubmed.ncbi.nlm.nih.gov/4312935/.
  3. Menzi, F., Besuchet-Schmutz, N., Fragnière, M., Hofstetter, S., Jagannathan, V., Mock, T., Raemy, A., Studer, E., Mehinagic, K., Regenscheit, N., Meylan, M., Schmitz-Hsu, F., & Drögemüller, C. (2016, January 13). A transposable element insertion in APOB causes cholesterol deficiency in Holstein cattle. Wiley Online Library. Retrieved October 17, 2021, from https://onlinelibrary.wiley.com/doi/10.1111/age.12410.
  4. Ogorevc, J., Kunej, T., Razpet, A., & Dovc, P. (2009, June 8). Database of cattle candidate genes and genetic markers for milk production and mastitis. Wiley Online Library. Retrieved October 17, 2021, from https://onlinelibrary.wiley.com/doi/10.1111/j.1365-2052.2009.01921.x.
  5. Pruitt, K. D., Tatusova, T., & Maglott, D. R. (2006, November 27). NCBI Reference sequences (refseq): A curated non-redundant sequence database of genomes, transcripts and proteins. OUP Academic. Retrieved October 17, 2021, from https://academic.oup.com/nar/article/35/suppl_1/D61/1099759.
  6. UniProt Consortium. (2021, June 2). Apolipoprotein B. UniProt. Retrieved October 17, 2021, from https://www.uniprot.org/uniprot/E1BNR0.
  7. UniProt Consortium. (2021, June 2). LDL receptor related protein 4. UniProt. Retrieved October 17, 2021, from https://www.uniprot.org/uniprot/Q00KA9.
  8. Bao, W., Kojima, K. K., & Kohany, O. (2015, June 2). Repbase update, a database of repetitive elements in eukaryotic genomes. Mobile DNA. Retrieved October 17, 2021, from https://mobilednajournal.biomedcentral.com/articles/10.1186/s13100-015-0041-9.