Project Overview | iGEM SUNY_Oneonta

Project Overview

Figure 1: Summary of the workflow for the SNflaPs genetic testing system. Genomic DNA samples from dairy cattle will be extracted using a cellulose paper dipstick. The isothermal amplification technique RPA will then be employed to amplify target gene fragments. We also created positive control DNAs and a heat block system for use with this system. Polymorphisms are then detected using the Flappase assay and fluorescently tagged oligos, cleavage of which is detected using a Raspberry Pi.

Preparing New Targets

Expanding on the Ca2LF Concept to Develop SNflaPs

In the project Ca2LF, we developed the Flappase assay to detect a single nucleotide polymorphism (SNP) in one gene, the beta-casein gene. Detecting this SNP tells us only which beta-casein allele is present, A1 or A2. For this year’s project we wanted to expand the usefulness of our genetic testing system. We decided to extend the system so that it can characterize the genetic profile of a cow across a variety of traits, allowing farmers to make informed breeding decisions about both advantageous and deleterious traits.

Identifying potential target genes

We began by interviewing two experts in cattle genetics and breeding, Drs. Dechow and Huson. During these interviews, we learned about genetic conditions that are undesirable in cattle, such as diseases, conditions that negatively impact fertility, and physical traits (for example, being horned) that farms routinely attempt to breed out of their herds. We also learned about genetic traits that are advantageous and routinely selected for by farmers when breeding, including genes involved in promoting fertility, milk production, and milk composition. We compiled a list of these genes and researched their different alleles (Table 1).

Trait Class	Phenotype	Gene	Description of Polymorphism	Chromosome	gDNA location (c. if in cDNA)
Fertility	Brachyspina	FANCI	deletion of 3328 bp	21	21184870 -21188198
	Spontaneous abortion	APAF1	C →T substitution in exon 11, produces a nonsense stop codon	5	63150400
	unclear, reduces calving success	GART	A → C substitution, creates a missense amino acid substitution	1	1277227
	spontaneous abortion	TFB1M	deletion of 138kbp	9	93,233kb to 93,371kb
Disease states	premature death due to infection	ITGB2	T →C substitution, produces a missense amino acid substitution	1	144770078
	Complex vertebral malformation	CVM/SLC35A3	G →T substitution, produces a missense amino acid substitution	3	43412427
	embryonic lethal	DUMPS	C →T substitution, produces a nonsense stop codon	1	43412427
	syndactyly	LRP4	CG→ AT substitution of 2 nucleotides in exon 33	15	76800972 - 76800973
	Chondrospinoplasia and stillbirth	COL2A1	G →A substitution, creates an altered splice site		32473300
	Citrullinaemia	ASS1	C →T substitution, produces a nonsense stop codon	11	100802781
	Cholesterol deficiency	APOB	Insertion of a transposable LTR element (ERV2-1), located between nucleotides 24 and 25 of APOB exon 5	11	77 958 994
	Progressive degenerative myeloencephalopathy (Weaver syndrome)	PNPLA8	G → A substitution, creates a missense amino acid substitution	4	49878773
	Spinal muscular atrophy	KDSR	G → A substitution, creates a missense amino acid substitution	24	62138763
Other traits	presence of horns	no specific gene involved	80 kb duplication	1	2629113 - 2709240
	BSE resistance	PRNP	23 bp deletion on the - strand -1594bp and 12 bp deletion on the + strand +300 bp		46754 -51993, includes promoter
Milk composition	alphaS1 - casein milk protein	CSN1S1 B	type sequence	6
		CSN1S1 C	A →G substitution in exon 17 creating a missense amino acid substitution	6	c.619A > G
		CSN1S1 I	A →G substitution in exon 17 and A →T in exon 11. Both create missense amino acid substitutions	6	c.619A > G and c.296A > T
		CSN1S1 J	A →G and C →G substitutions in exon 17 creating missense amino acid substitutions	6	c.619A > G and c.543G > T
	alphaS2 - casein milk protein	CSN1S2 A	Type sequence exon 3	6
		CSN1S2 B	C →T substitution in exon 3, creates a missense amino acid substitution	6	c.68C>T
		CSN1S2 D	G →T substitution, leading to skipping of exon 8 (8 amino acid deletion from the protein)	6	c.221G>T
		CSN1S2 E	G →A substitution in exon 3, creates a missense amino acid substitution	6	c.64G>A
	beta-casein milk protein	CSN2 A1	Type sequence exon 7	6
		CSN2 A2	A →C substitution in exon 7, creates a missense amino acid substitution	6	c.245A>C
		CSN2 A3	A →C and C →A substitutions in exon 7, creates missense amino acid substitution	6	c.245A>C, c.363C>A
		CSN2 B	C →G substitution in exon 7, creates a missense amino acid substitution	6	c.411C>G
		CSN2 C	G →A substitution in exon 6, creates a missense amino acid substitution	6	c.154G>A
		CSN2 F	C →T substitution in exon 7, creates a missense amino acid substitution	6	c.500C>T
		CSN2 I	A →C and A →C substitutions in exon 7, creates two missense amino acid substitutions	6	c.245A>C, c.322A>C
		CSN2 J	G →A substitution in exon 4, creates a missense amino acid substitution	6	c.103G>A
		CSN2 K	A →C and C →G substitutions in exon 7, creates two missense amino acid substitutions	6	c.245A>C, c.580C>G
		CSN2 L	T →C substitutions in exon 7, creates a missense amino acid substitution	6	c.635T>C
	kappa - casein milk protein	CSN3 A	Type sequence exon 4	6
		CSN3 B	C →T and A →C substitutions in exon 4, creates two missense amino acid substitutions	6	c.470C>T, c.506A>C
		CSN3 E	A →G substitution in exon 4, creates missense amino acid substitution	6	c.526A>G
		CSN3 H	C →T substitution in exon 4, creates missense amino acid substitution	6	c.467C>T
	Beta-lactoglobulin	PAEP	C →A substitution at position 215 bp upstream of the translation initiation site, leads to aberrantly low expression	11	103301704
		PAEP 1	G →C substitution, creates missense amino acid substitution	11	103302553
		PAEP 4	T →C substitution, creates missense amino acid substitution	11	103304757
	Alpha-lactalbumin	LAA A	G →A substitution, His at amino acid 10	5	c.851G>A
		LAA B	type sequence, Arg at amino acid 10	5	c.851G

Another important piece of feedback we received from the interviews with Drs. Dechow and Huson is that most alleles are created by multiple SNPs, or by larger changes to the gene, such as insertions, deletions, inversions, or duplications. To be useful, our SNflaPs genetic testing system therefore needs to detect multiple types of polymorphisms. We decided to focus our efforts on a few genes, each of which has a different type of polymorphism, to serve as proof of concept that the Flappase-based detection system can discriminate between these alleles.

Table 2. Genes selected for proof-of-concept testing of the Flappase assay for detecting polymorphisms. Each selected gene contains a different type of polymorphism, or multiple SNPs located at close, medium, or far distances from each other.

Phenotype	Gene	Description of the Polymorphism	Chromosome	gDNA location (c. if in cDNA)
Brachyspina	FANCI	deletion of 3328 bp	21	21184870 -21188198
syndactyly	LRP4	CG→ AT substitution of 2 nucleotides in exon 33	15	76800972 - 76800973
Cholesterol deficiency	APOB	Insertion of a transposable LTR element (ERV2-1), located between nucleotides 24 and 25 of APOB exon 5	11	77958994
alphaS1 - casein milk protein	CSN1S1 I	A →G substitution in exon 17 and A →T in exon 11. Both create missense amino acid substitutions	6	c.619A > G and c.296A > T
alphaS1 - casein milk protein	CSN1S1 J	A →G and C →G substitutions in exon 17 creating missense amino acid substitutions	6	c.619A > G and c.543G > T
Beta-lactoglobulin	PAEP	C →A substitution at position 215 bp upstream of the translation initiation site, leads to aberrantly low expression	11	103301704

When deciding which genes to use for testing the detection system we used the following criteria:

The length of any polymorphism should be less than 5,000 bp.
If multiple SNPs are to be detected, the distance between them should be large enough to fit both oligos and to accommodate the Flappase protein, which spans approximately 22 nucleotides, or about 75 Angstroms (see modeling page).
For initial testing, any gene fragment selected should be of a length suitable for synthesis, so that positive control DNAs can be created.
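
The first two criteria can be checked with simple arithmetic. The sketch below is illustrative only: it assumes the standard B-DNA rise of 3.4 Angstroms per base pair to relate the 22-nucleotide Flappase footprint to the 75 Angstrom figure, and it applies the 5,000 bp limit to polymorphism sizes taken from Table 1 and Figure 6. The function and variable names are hypothetical and are not part of our software.

```python
# Assumption: standard B-DNA helical rise per base pair, in Angstroms.
RISE_PER_BP_ANGSTROM = 3.4
# Approximate Flappase footprint, from the second criterion.
FLAPPASE_FOOTPRINT_NT = 22
# Upper limit on polymorphism length, from the first criterion.
MAX_POLYMORPHISM_BP = 5000


def footprint_in_angstroms(nucleotides: int) -> float:
    """Convert a length in nucleotides to an approximate length in Angstroms."""
    return nucleotides * RISE_PER_BP_ANGSTROM


def passes_length_criterion(polymorphism_bp: int) -> bool:
    """Return True when the polymorphism is shorter than the 5,000 bp limit."""
    return polymorphism_bp < MAX_POLYMORPHISM_BP


# Polymorphism sizes in bp, taken from Table 1 (APOB insert size from Figure 6).
candidates = {
    "FANCI deletion": 3328,
    "TFB1M deletion": 138_000,
    "APOB ERV2-1 insertion": 302,
    "LRP4 CG>AT substitution": 2,
}

print(f"Flappase footprint: about {footprint_in_angstroms(FLAPPASE_FOOTPRINT_NT):.1f} Angstroms")
for name, size_bp in candidates.items():
    verdict = "passes" if passes_length_criterion(size_bp) else "fails"
    print(f"{name} ({size_bp} bp) {verdict} the length criterion")
```

Under these assumptions, 22 nucleotides corresponds to roughly 75 Angstroms, and the 138 kbp TFB1M deletion is the only listed candidate that exceeds the length limit.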

Based on these criteria, we decided to begin by pursuing LRP4 and APOB as model genes.

Designing and cloning new targets and oligos for use with the Flappase system

New Target #1- LRP4

The LRP4 gene encodes LDL receptor related protein 4 (3). Two well-characterized allelic variants of this gene have been identified: the normal (wildtype) version and a syndactylous version. Inheritance of the syndactylous version leads to congenital syndactyly. This condition, also known as mulefoot, refers to the fusion or non-division of the two developed digits of the bovine foot (4).

Figure 3: Sequence variation between the normal and syndactyly alleles of LRP4. The two versions differ by two SNPs, where nucleotides CG are substituted by AT at positions 4863 and 4864.

We found the genomic DNA sequence of the bovine LRP4 gene using the National Center for Biotechnology Information (NCBI) reference assembly for Bos taurus (5). We located the LRP4 gene (Gene ID 504317), downloaded the sequence, and prepared it for synthesis. To do this, we selected a segment of the gene that includes exon 33 (the location of the two SNPs) along with additional gDNA upstream and downstream of the SNPs. We checked this sequence for RFC10 compatibility and found no illegal restriction sites. We located the sites of the two SNPs and used this information to construct a wild-type and a syndactyly version of the gene fragment. We added the RFC10 prefix and suffix and had the DNA synthesized by IDT. These DNAs are intended as positive controls for testing the Flappase system.
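
The compatibility check and the construction of the variant allele described above can be sketched in a few lines. This is a minimal illustration, not the procedure we actually ran: the site list assumes the restriction sites that the BioBrick RFC10 standard reserves (EcoRI, XbaI, SpeI, PstI, and NotI), the function names are hypothetical, and the short sequence is a made-up stand-in for the real LRP4 fragment.

```python
# Assumption: restriction sites reserved by the BioBrick RFC10 standard.
# All five sites are palindromic, so scanning one strand is sufficient.
RFC10_ILLEGAL_SITES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
    "PstI": "CTGCAG",
    "NotI": "GCGGCCGC",
}


def find_illegal_sites(seq: str) -> list[tuple[str, int]]:
    """Return (enzyme, 1-based start position) for every illegal site found."""
    seq = seq.upper()
    hits = []
    for enzyme, site in RFC10_ILLEGAL_SITES.items():
        start = seq.find(site)
        while start != -1:
            hits.append((enzyme, start + 1))
            start = seq.find(site, start + 1)
    return hits


def substitute(seq: str, position: int, ref: str, alt: str) -> str:
    """Replace `ref` with `alt` at a 1-based position, checking that `ref` is there."""
    index = position - 1
    found = seq[index:index + len(ref)]
    if found != ref:
        raise ValueError(f"expected {ref!r} at position {position}, found {found!r}")
    return seq[:index] + alt + seq[index + len(ref):]


# Hypothetical toy fragment (not the real LRP4 sequence); CG sits at position 6.
toy_wildtype = "TTGACCGTAAGC"
toy_syndactyly = substitute(toy_wildtype, 6, "CG", "AT")

print(toy_syndactyly)
print(find_illegal_sites(toy_wildtype), find_illegal_sites(toy_syndactyly))
```

Checking that the reference bases are present before substituting guards against off-by-one errors when the position is counted from a different starting point than the fragment itself.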

Figure 4: The wildtype LRP4 positive control DNA. The nucleotides highlighted in yellow indicate the location of the CG to AT base substitutions that create the syndactyly allele. The wildtype and the syndactyly LRP4 positive control constructs have been entered into the iGEM parts registry as BBa_K3952003 and BBa_K3952004.

We are currently in the process of cloning these synthetic DNAs into pSB1C3 for use as positive control DNAs in our genetic testing system.

We also designed oligos to detect each allele of the LRP4 gene. In this case, the spacer oligo will be complementary to the two target bases and the Flappase oligo will be complementary to the two target bases on the spacer oligo.
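
As a rough illustration of what "complementary to the target bases" means in practice, the sketch below takes a window of the target strand around the variant bases and returns its reverse complement. It is only a sketch under simplifying assumptions: the function names, the flank length, and the toy sequence are hypothetical, and it does not model the flap, the spacer overlap, or the junction geometry that the real oligo design has to account for.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def complementary_oligo(target: str, variant_start: int, variant_len: int, flank: int) -> str:
    """Return an oligo complementary to the variant bases plus `flank` bases per side.

    `variant_start` is the 1-based position of the first variant base on `target`.
    """
    start = max(variant_start - 1 - flank, 0)
    end = variant_start - 1 + variant_len + flank
    return reverse_complement(target[start:end])


# Hypothetical toy target carrying an AT dinucleotide at positions 6 and 7.
toy_target = "TTGACATTAAGC"
print(complementary_oligo(toy_target, variant_start=6, variant_len=2, flank=3))
```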

Figure 5: Sequences of the spacer and target oligos to detect polymorphisms in the LRP4 gene. The spacer and flap oligos are designed to anneal to the syndactylous version of the gene. They are expected to anneal in a way that creates Holliday junctions, which are recognized by Flappase.

New Target #2- APOB

The APOB gene codes for Apolipoprotein B (6), a protein involved in the transport of cholesterol in the bloodstream. Studies have reported a heritable allele of the APOB gene that causes cholesterol deficiency in cattle (7). The problematic allele is created by the insertion of a transposable element (also known as a jumping gene) into the APOB gene. This transposable element, ERV2-1, is inserted into exon 5 of the APOB gene between the 24th and 25th nucleotides, resulting in an insertion polymorphism of approximately 300 bp. This polymorphism disrupts APOB function, resulting in cholesterol deficiency (CD).

Figure 6: Differences between the normal and CD alleles of APOB. The two versions differ due to the insertion of a 302 bp transposon, ERV2-1, between nucleotides 24 and 25 within exon 5 of the APOB gene.

We found the genomic DNA sequence of the bovine APOB gene using the NCBI reference assembly for Bos taurus (5). We located the APOB gene (Gene ID 494004), downloaded the sequence, and prepared it for synthesis. To do this, we selected a segment of the gene that includes exon 5 (the location of the insertion) along with additional gDNA upstream and downstream of the insertion site. We checked this sequence for RFC10 compatibility and found no illegal restriction sites.

We annotated the site of the insertion on the sequence and used this information to construct a wild-type version of the gene fragment. We were, however, initially unable to construct the CD allele, because we could not find a published sequence of this allele or a copy of the ERV2-1 sequence. Fortunately, we found that the ERV2-1 sequence is available on Repbase, a database of repetitive DNA elements (8). We reached out to Team Rochester, as their university maintains a subscription to Repbase, and they downloaded and provided us with the sequence for ERV2-1.

To construct the CD allele, we inserted this sequence into the correct location. We checked the RFC10 compatibility of the new sequence and found an illegal SpeI site within the ERV2-1 insertion. To remove it, we altered the site in silico from ACTAGT to GCTAGT. This change will not affect the functionality of the DNA as a positive control, provided that we use RPA primers and oligos that do not overlap this area. Given its location, this should not be an issue. We added the RFC10 prefix and suffix and had the DNA synthesized by IDT. These DNAs are intended as positive controls for testing the Flappase system.
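
The two in silico steps described above, inserting the element after nucleotide 24 and changing ACTAGT to GCTAGT, can be sketched as follows. The sequences here are short, made-up placeholders rather than the real exon 5 or ERV2-1 sequences, and the function names are hypothetical.

```python
def insert_element(seq: str, after_position: int, element: str) -> str:
    """Insert `element` immediately after the given 1-based position in `seq`."""
    return seq[:after_position] + element + seq[after_position:]


def remove_spei_site(seq: str) -> str:
    """Disrupt every SpeI site by changing its first base (ACTAGT -> GCTAGT)."""
    return seq.replace("ACTAGT", "GCTAGT")


# Hypothetical placeholders: a 32 nt "exon" and a 10 nt "element" with one SpeI site.
toy_exon = "GCAT" * 8
toy_element = "CCACTAGTGG"

# Insert between nucleotides 24 and 25, as described for exon 5.
toy_cd_allele = insert_element(toy_exon, 24, toy_element)
toy_cd_fixed = remove_spei_site(toy_cd_allele)

print(toy_cd_allele)
print(toy_cd_fixed)
```

Because the substitution changes a single base without changing the length, all downstream coordinates in the construct stay the same, which makes it easy to confirm that primers and oligos do not overlap the altered position.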

Figure 7: The CD APOB positive control DNA. The nucleotides highlighted in yellow indicate the location of the 24th and 25th nucleotides in exon 5. The ERV2-1 transposon is indicated in blue text, and the altered illegal SpeI restriction site is shown in bold blue. The wildtype APOB positive control is the same sequence, minus the insert. The wildtype and the CD APOB positive control constructs have been entered into the iGEM parts registry as BBa_K3952000 and BBa_K3952001.

We are currently in the process of cloning these synthetic DNAs into pSB1C3. Once cloned, these DNAs will be used as positive controls in our genetic testing system.

We also designed oligos to detect each allele of the APOB gene. As with LRP4, the spacer oligo will be complementary to the two target bases, and the Flappase oligo will be complementary to the two target bases on the spacer oligo.

Figure 8: Sequences of the spacer and target oligos to detect polymorphisms in the APOB gene. The spacer and flap oligos are designed to anneal to the CD version of the gene within the ERV2-1 insert. They are expected to anneal in a way that creates Holliday junctions, which are recognized by Flappase. To detect the wildtype version of the gene, a different set of oligos was designed to form Holliday junctions directly at the location of the insertion.
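
The logic behind using two oligo sets can be illustrated with a simplified sketch: a site inside the insert exists only in the CD allele, while the uninterrupted sequence spanning the insertion point exists only in the wildtype allele. The example below uses plain string matching on hypothetical toy sequences as a stand-in for oligo annealing. It does not model hybridization, Flappase cleavage, or fluorescence detection, and all names are illustrative.

```python
def call_genotype(amplicons: list[str], junction_site: str, insert_site: str) -> str:
    """Classify a sample from which of the two allele-specific sites are present."""
    has_wildtype = any(junction_site in a for a in amplicons)
    has_cd = any(insert_site in a for a in amplicons)
    if has_wildtype and has_cd:
        return "heterozygous"
    if has_cd:
        return "CD"
    if has_wildtype:
        return "wildtype"
    return "undetermined"


# Hypothetical toy sequences (not the real APOB or ERV2-1 sequences).
upstream, downstream = "GCATGCAT", "TTGGCCAA"
toy_element = "CCACGAGTGG"

wildtype_amplicon = upstream + downstream
cd_amplicon = upstream + toy_element + downstream

# Wildtype-specific site spans the insertion point; CD-specific site lies in the insert.
junction_site = upstream[-4:] + downstream[:4]
insert_site = toy_element[2:8]

print(call_genotype([wildtype_amplicon], junction_site, insert_site))
print(call_genotype([cd_amplicon], junction_site, insert_site))
print(call_genotype([wildtype_amplicon, cd_amplicon], junction_site, insert_site))
```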

References

Haplotype tests for recessive disorders that affect fertility and other traits. Haplotype tests. (n.d.). Retrieved October 17, 2021, from https://aipl.arsusda.gov/reference/recessive_haplotypes_ARR-G3.html.
Leipold, H. W., Adrian, R. W., Huston, K., Trotter, D. M., Dennis, S. M., & Guffy, M. M. (n.d.). Anatomy of hereditary bovine syndactylism. I. Osteology. Journal of Dairy Science. Retrieved October 17, 2021, from https://pubmed.ncbi.nlm.nih.gov/4312935/.
Menzi, F., Besuchet-Schmutz, N., Fragnière, M., Hofstetter, S., Jagannathan, V., Mock, T., Raemy, A., Studer, E., Mehinagic, K., Regenscheit, N., Meylan, M., Schmitz-Hsu, F., & Drögemüller, C. (2016, January 13). A transposable element insertion in APOB causes cholesterol deficiency in Holstein cattle. Wiley Online Library. Retrieved October 17, 2021, from https://onlinelibrary.wiley.com/doi/10.1111/age.12410.
Ogorevc, J., Kunej, T., Razpet, A., & Dovc, P. (2009, June 8). Database of cattle candidate genes and genetic markers for milk production and mastitis. Wiley Online Library. Retrieved October 17, 2021, from https://onlinelibrary.wiley.com/doi/10.1111/j.1365-2052.2009.01921.x.
Pruitt, K. D., Tatusova, T., & Maglott, D. R. (2006, November 27). NCBI Reference sequences (refseq): A curated non-redundant sequence database of genomes, transcripts and proteins. OUP Academic. Retrieved October 17, 2021, from https://academic.oup.com/nar/article/35/suppl_1/D61/1099759.
UniProt Consortium. (2021, June 2). Apolipoprotein B. UniProt. Retrieved October 17, 2021, from https://www.uniprot.org/uniprot/E1BNR0.
UniProt Consortium. (2021, June 2). LDL receptor related protein 4. UniProt. Retrieved October 17, 2021, from https://www.uniprot.org/uniprot/Q00KA9.
Bao, W., Kojima, K. K., & Kohany, O. (2015, June 2). Repbase update, a database of repetitive elements in eukaryotic genomes. Mobile DNA. Retrieved October 17, 2021, from https://mobilednajournal.biomedcentral.com/articles/10.1186/s13100-015-0041-9.
