Team:Estonia TUIT/Model


To support the SALSASMILE project idea, we used several in silico approaches. All simulations and constants are documented by our team so that the results can be reproduced.


First, we applied Multiple Sequence Alignment (MSA) to find similarities between the repetitive domains in the SALSA protein (Ranganathan et al., 2019) (Supplementary MSA). We then applied solvent-accessible surface area (SASA) analysis to the aligned domain sequences to classify SALSA's amino acid residues as hidden from or exposed to the solvent, and thus also to the protease (Supplementary SASA 1, SASA 2). These predictions helped us find accessible sites and narrow down the list of candidate protease target sequences (Ali et al., 2014).

Following the MSA and SASA, several systems of ordinary differential equations (ODEs) were implemented (Hughes et al., 2011): first, to analyze the SALSA concentration oscillations in the mouth; second, to simulate the activity of the candidate proteases; and third, to model the protease neutralization and diffusion in the saliva (Zheng and Sriram, 2010).

Analysis of solvent accessibility and potential target sequences

The MSA approach was helpful in screening the regions critical for the attachment of bacteria to the tooth surface: the SRCR 1-13 and SID domains of the SALSA protein (Supplementary MSA). For this purpose, we used the Clustal Omega software (Madeira et al., 2019) and the SALSA sequence obtained from the UniProt database (Uniprot). The MSA revealed that the bacteria-binding site, the Ca2+ cofactor binding site, and the hydroxyapatite binding site are conserved in all SRCR 1-13 domains (Supplementary MSA: alignment 1). The SID linkers appear to be less conserved (Supplementary MSA: alignment 2).

The predicted three-dimensional structure of SALSA (Figure 1) (Jumper et al., 2021) was used as input to the GetArea software (Fraczkiewicz & Braun, 1998) for the SASA analysis, which estimates the solvent accessibility of SALSA residues (RSA) (Supplementary SASA 1, SASA 2).

Figure 1. The structural model of SALSA, predicted by AlphaFold. SRCR - Scavenger Receptor Cysteine-Rich (SRCR) domains. SID - SRCR-interspersed domains. CUB - C1r/C1s, urchin embryonic growth factor and bone morphogenetic protein-1 domain.

Afterward, the MSA of the SALSA regions was analyzed together with the RSA. This revealed that every bacteria-binding site in the SRCR domains contains seven residue types with maximal accessibility for a protease: Glycine (G), Asparagine (N), Proline (P), Glutamine (Q), Arginine (R), Serine (S), and Threonine (T) (Supplementary MSA: alignment 3). Every SID contains three residue types with maximal accessibility: Glutamic Acid (E), Proline (P), and Serine (S) (Supplementary MSA: alignment 4).

Search for potential proteases

Candidate protease search was performed in the MEROPS protease database (Rawlings et al., 2013). Experimentally verified SALSA peptidases and other candidates from the MEROPS database were analyzed with respect to their specificity to the accessible sequences, safe dosage, possible toxic effects, and the possibility to neutralize the enzyme’s activity safely. Further, candidate proteases’ cleaving efficiency (Michaelis constant, turnover number, and optimal salivary conditions), and feasibility (storage conditions, estimated cost, and existing applications) were studied.

Specificity, toxicity, and neutralization possibilities are essential criteria for eliminating proteases that cannot be used for medical purposes. The parameters gathered from the literature are given in Table 1 and Table 2.

Table 1. SALSA candidate proteases' specificity, toxicity, and neutralization options. Specificity data are based on the amino acid sequences targeted by the protease. Toxicity data are based on therapeutic applications. For protease neutralization, the least toxic options were analyzed.

Candidate protease | Specificity | Toxicity | Neutralization
Trypsin (Gunput et al., 2015) | SRCR: R/K (Rawlings et al., 2013) | Non-toxic trypsin 1,2 with mild side effects (Gökçen et al., 2014) | Non-toxic alpha1-antitrypsin (Serres & Blanco, 2014)
TCEP (Gunput et al., 2015) | N/A | Toxic (Toxic-Free Future) | N/A
Lys-C (Bikker et al., 2002) | SRCR: K (Bikker et al., 2002) | No use found in human trials | N/A
StcE (Grys et al., 2020) | N/A | N/A | N/A
Beta-lytic metallopeptidase | SRCR: G | Biofilm-degrading, can be toxic (Gökçen et al., 2013) | N/A
Staphylolysin | SRCR: G | Therapeutic, non-toxic (Barequet et al., 2011) | Antibodies (Dajcs et al., 2002)
Stem bromelain | SRCR and SID: G, S, N, R | Therapeutic, non-toxic (Pavan et al., 2012) | N/A
Actinidin | SRCR: G, R | Therapeutic, non-toxic (Mugita et al., 2017) | N/A
Legumain (plant beta form) | SRCR: N | Plays physiological and pathological role (Dall & Brandstetter, 2016) | Cystatin (Rotari et al., 2001)
Prolyl peptidase | SRCR and SID: P | No harmful side effects (Castillo & Leffler, 2014) | Trypsin (Bethune & Khosla, 2012)

After the initial research, trypsin and prolyl peptidase were found to be the most suitable candidates for SALSA cleavage. We found that trypsin from the Atlantic cod was previously tested in a product called ColdZyme (Huijghebaert et al., 2021). Further investigation showed several advantages of the Atlantic cod trypsin, including high catalytic efficiency and low cost. To conclude the selection of the optimal protease for our project, we specifically analysed the efficiency of two trypsin variants and prolyl peptidase. The kinetic constants listed in Table 2 were later used for the ODE model of SALSA proteolytic degradation. The salivary pH was taken from the study by Seethalakshmi et al. (2016), and the temperature in the oral cavity from Geneva et al. (2019).

Table 2. The parameters of the candidate proteases: KM Michaelis constant; kcat catalytic rate constant; working efficiency in salivary conditions (normalized from 0 to 1) depending on the protease optimal pH and temperature.

Candidate | Substrate cleaved to measure kinetic parameters | KM | kcat | Efficiency in salivary conditions (pH, temperature)
Trypsin 2, Human | DABCYL–LQVRTDVT–Glu(EDANS)-NH2 | 25.2 ± 9.8 μM (Schilling et al., 2018) = (25.2 ± 9.8) × 10^6 pM | 3.59 ± 0.57 s^-1 (Schilling et al., 2018) = 215.4 min^-1 | ~0.8 (Slichter), 1 (Biocompare)
Trypsin, Atlantic cod | Z-Gly-Pro-Arg-p-nitroanilide | 0.017 ± 0.003 mM (Ásgeirsson & Cekan, 2006) = 17 × 10^6 pM | 172.8 ± 35.4 s^-1 (Ásgeirsson & Cekan, 2006) = 10368 min^-1 | 0.35, 0.35 (Sandholt et al., 2019)
Prolyl peptidase | Suc-Ala-Pro-p-nitroanilide | 400 ± 30 μM (Shan et al., 2004) = (400 ± 30) × 10^6 pM | 46 ± 5 s^-1 (Shan et al., 2004) = 2760 min^-1 | (Kalwant & Porter, 1991), ~0.3 (Heinis et al., 2004)


SALSA oral fluctuations model

ODE models offer a simple representation of dynamic natural phenomena. Their relatively robust and modular nature allows us to implement and modify them according to our specific conditions. The first ODE model describes the natural oral fluctuation of SALSA, dS(t), as constant SALSA production (a) with periodic SALSA removal due to involuntary swallowing, which occurs every 0.75 min (45 s), that is, whenever t/0.75 ∈ N with t in minutes. The removal term, a(v0 - v1)/v0, is the production rate (a) multiplied by the ratio of the swallowed saliva volume (v0 - v1) to the total saliva volume (v0).

$$
dS(t) =
\begin{cases}
a - a\,\dfrac{v_0 - v_1}{v_0}, & \text{if } t/0.75 \in \mathbb{N} \\
a, & \text{if } t/0.75 \notin \mathbb{N}
\end{cases}
$$

The involuntary swallowing period was estimated to be 0.75 min. This value was derived from the spontaneous swallowing frequency of 1.32 times/min (1/1.32 ≈ 0.76 min) reported by Afkari (2007).

The SALSA production rate was obtained by multiplying the unstimulated saliva flow rate (0.3-0.4 ml/min) by the SALSA salivary concentration (0.5 μg/ml), which gives 0.15-0.2 μg/min. Dividing by the SALSA molecular weight (340 kDa) converts this to 0.44118-0.58824 pmol/min (Iorgulescu, 2009; Reichhardt, 2012). The time needed for saliva swallowing is estimated to be 1 second (Soares et al., 2015).

The ratio of SALSA after swallowing to SALSA before swallowing is estimated to be 0.72, obtained by dividing the saliva volume after swallowing (0.77 ± 0.23 ml) by the saliva volume before swallowing (1.07 ± 0.39 ml) (Lagerlöf & Dawes, 1984).

Table 5. SALSA natural oral fluctuation data.

SALSA production rate (pmol/min) | Duration of swallowing (min) | Ratio of SALSA after to before swallowing
0.44118-0.58824 | 0.016 | 0.72
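
The values in this table follow from simple unit arithmetic. The Python sketch below reproduces them from the figures quoted in the text; the function and variable names are illustrative assumptions and are not taken from the team's own scripts.

```python
# Worked arithmetic for the SALSA fluctuation parameters quoted in the text.
# Function and variable names are illustrative assumptions.

SALSA_MW_G_PER_MOL = 340_000.0  # 340 kDa


def production_rate_pmol_per_min(flow_ml_per_min, conc_ug_per_ml):
    """Saliva flow (ml/min) times SALSA concentration (ug/ml), in pmol/min."""
    ug_per_min = flow_ml_per_min * conc_ug_per_ml
    mol_per_min = ug_per_min * 1e-6 / SALSA_MW_G_PER_MOL
    return mol_per_min * 1e12


low = production_rate_pmol_per_min(0.3, 0.5)
high = production_rate_pmol_per_min(0.4, 0.5)

# 1.32 swallows/min gives a period of about 0.76 min (0.75 min in the text).
swallow_period_min = 1 / 1.32
# A 1 s swallow expressed in minutes (listed as 0.016 in the table).
swallow_duration_min = 1 / 60
# Saliva volume after swallowing divided by volume before swallowing.
after_before_ratio = 0.77 / 1.07

print(f"production rate: {low:.5f}-{high:.5f} pmol/min")
print(f"swallowing period: {swallow_period_min:.2f} min")
print(f"swallow duration: {swallow_duration_min:.4f} min")
print(f"after/before ratio: {after_before_ratio:.2f}")
```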

The SALSA natural oral fluctuation model estimates that the equilibrium amount of SALSA, reached after tooth brushing and spitting and maintained until the next tooth brushing, is approximately 0.7393 pmol (Figure 2). The approximate concentration of SALSA is therefore estimated to be 0.739 pmol/ml or 739 pmol/l (pM), as the average volume of saliva in the oral cavity is 1 ml (Lagerlöf & Dawes, 1984).

Figure 2. Model of the natural fluctuations of SALSA protein. Time 0 indicates the time of tooth brushing. Fluctuations are caused by involuntary swallowing (approx. every 45 seconds). Maximal and minimal concentrations are 1070 and 739 pM, respectively.

SALSA proteolytic degradation model

To simulate the proteolytic degradation of SALSA, we used Michaelis-Menten kinetic parameters for the different proteases (Table 2). The Michaelis-Menten model describes a one-substrate enzyme-catalyzed reaction: the reaction rate is the product of the turnover number (kcat), the enzyme concentration (E), and the substrate concentration (S), divided by the sum of the Michaelis constant (KM) and the substrate concentration (S) (Doran, 2013). We considered SALSA successfully removed when less than 5% of it remained undegraded. The change in the protein concentration (dS) was calculated as the difference between the constant SALSA production (a) and its continuous proteolytic degradation between involuntary swallows (t ∈ [0, 0.75]). In the proteolytic degradation term, we used the protease efficiency (p) relative to salivary conditions (pH 6.7-7.3, 30-37 °C) (Table 2) (Seethalakshmi et al., 2016; Geneva et al., 2019).
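
The degradation step described above, dS/dt = a - p·kcat·E·S/(KM + S), can be sketched in a few lines of Python. This is a minimal illustration rather than the team's implementation: the fixed-step Runge-Kutta integrator, the function name, the choice p = 0.8 for Human trypsin 2, and the conversion of the production rate to pM/min (assuming 1 ml of saliva) are all assumptions.

```python
def simulate_degradation(s0, enzyme, kcat, km, efficiency, production,
                         t_end=0.75, steps=7500):
    """Integrate dS/dt = a - p*kcat*E*S/(KM + S) with classical RK4.

    Concentrations are in pM, time in min, kcat in 1/min and the
    production rate a in pM/min. Returns S at t_end.
    """
    def rate(s):
        return production - efficiency * kcat * enzyme * s / (km + s)

    dt = t_end / steps
    s = s0
    for _ in range(steps):
        k1 = rate(s)
        k2 = rate(s + 0.5 * dt * k1)
        k3 = rate(s + 0.5 * dt * k2)
        k4 = rate(s + dt * k3)
        s += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return s


# Example: Human trypsin 2 with the Table 2 constants.
# efficiency=0.8 and production=441.18 pM/min (0.44118 pmol/min in 1 ml)
# are assumptions made for this sketch.
s0 = 960.0  # pM
s_end = simulate_degradation(s0=s0, enzyme=1.288e6, kcat=215.4,
                             km=25.2e6, efficiency=0.8, production=441.18)
print(f"fraction of initial SALSA remaining: {s_end / s0:.3f}")
```

Under these assumptions, a 1.288 μM dose leaves roughly 5% of the starting SALSA after 0.75 min, most of which is SALSA produced during the interval rather than undegraded starting material.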

After the model simulation, we found that the effective concentrations of the candidate proteases are 1.288 μM for Human trypsin 2, 0.041 μM for Atlantic cod trypsin, and 4.29 μM for prolyl peptidase. These concentrations are enough to degrade 95% of the average SALSA concentration (960 pM) in the oral cavity during the time between swallows (0.75 min) (Figure 3).
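
A quick consistency check on these doses: the SALSA concentration (about 10^3 pM) is far below every KM in Table 2 (10^7 to 10^8 pM), so the Michaelis-Menten term is close to first order in S, with rate constant k ≈ p·kcat·E/KM. The sketch below evaluates k for the three doses. The efficiencies p = 0.8, 0.35 and 0.3 are read from the efficiency column of Table 2; which of the listed values entered the team's model is an assumption.

```python
def pseudo_first_order_rate(p, kcat_per_min, enzyme_um, km_um):
    """Approximate degradation rate constant (1/min) valid when S << KM."""
    return p * kcat_per_min * enzyme_um / km_um


# (p, kcat in 1/min, effective dose in uM, KM in uM); p values are assumed.
candidates = {
    "Human trypsin 2": (0.8, 215.4, 1.288, 25.2),
    "Atlantic cod trypsin": (0.35, 10368.0, 0.041, 17.0),
    "Prolyl peptidase": (0.3, 2760.0, 4.29, 400.0),
}

rates = {name: pseudo_first_order_rate(*args)
         for name, args in candidates.items()}

for name, k in rates.items():
    print(f"{name}: k = {k:.2f} per min")
```

With these assumed efficiencies, all three doses give k close to 8.8 per min, which suggests that the three effective concentrations correspond to nearly the same degradation rate.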

Figure 3. SALSA proteolytic degradation by three candidate proteases at their effective dose (capable of degrading >95% of total SALSA amount during 0.75 minutes).

According to our proteolytic degradation model, the most efficient protease is the Atlantic cod trypsin, despite its low activity in salivary conditions. We need the smallest amount of this protease to degrade SALSA protein, which minimizes the risks of toxic effects of proteases in the human body.

Neutralizing the protease activity and protease elimination

To simulate effective inhibition of the proteases by neutralizers (I), we used a system of ODEs. Inhibition was considered successful when less than 5% of the protease was still active. The change in free protease (dE) is the dissociation of the protease-neutralizer complex EI (kdEI) minus the association of free protease (E) with neutralizer I (kaIE). The same expression gives the change in free neutralizer (dI), and the complex formation (dEI) has the opposite sign. This process occurs between involuntary swallows (t ∈ [0, 0.75]).

$$
\begin{aligned}
dE(t) &= -k_a\,I\,E + k_d\,EI, & t \in [0, 0.75] \\
dI(t) &= -k_a\,I\,E + k_d\,EI, & t \in [0, 0.75] \\
dEI(t) &= k_a\,I\,E - k_d\,EI, & t \in [0, 0.75]
\end{aligned}
$$
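
These three rate equations can be integrated numerically. The sketch below uses a simple explicit Euler scheme; the rate constants and starting concentrations in the example are hypothetical placeholders chosen only to show the behaviour, not the values used by the team.

```python
def simulate_inhibition(e0, i0, ka, kd, t_end=0.75, steps=10000):
    """Integrate E + I <-> EI with explicit Euler steps.

    e0, i0: initial free protease and neutralizer (uM)
    ka: association constant (1/(uM*min)), kd: dissociation constant (1/min)
    Returns (E, I, EI) at t_end (min).
    """
    e, i, ei = e0, i0, 0.0
    dt = t_end / steps
    for _ in range(steps):
        # Net amount of complex formed during this time step.
        net_binding = (ka * i * e - kd * ei) * dt
        e -= net_binding
        i -= net_binding
        ei += net_binding
    return e, i, ei


# Hypothetical example values (not from the source).
e_free, i_free, ei_complex = simulate_inhibition(e0=1.0, i0=2.0,
                                                 ka=50.0, kd=0.01)
print(f"active protease remaining: {e_free / 1.0:.4%}")
```

Because every step moves the same amount out of E and I and into EI, the totals E + EI and I + EI stay constant, which is a useful sanity check on any implementation of these equations.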

The ka and kd constants were unavailable for the Atlantic cod trypsin, so protease inhibition was modelled for its closest relative with available constants, Human trypsin 2. For the neutralizer-induced inactivation, the effective Human trypsin 2 concentration was estimated to be 1.88 μM, with an α1-antitrypsin inhibitor concentration of 1.288 μM (Figure 4).

Figure 4. Inactivation of the Human trypsin 2 (blue line) by the α1-antitrypsin (orange line) and increase in inhibited Human trypsin 2 proteases (yellow line).

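The neutralized candidate protease is eliminated by periodic swallowing, which occurs every 0.75 min (45 s), that is, whenever t/0.75 ∈ N with t in minutes. At each swallow, the protease concentration (S) decreases by S multiplied by the ratio of the swallowed saliva volume (v0 - v1) to the whole saliva volume (v0) present in the oral cavity.

$$
dS(t) =
\begin{cases}
-S\,\dfrac{v_0 - v_1}{v_0}, & \text{if } t/0.75 \in \mathbb{N} \\
0, & \text{if } t/0.75 \notin \mathbb{N}
\end{cases}
$$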
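
Because each swallow removes the same fraction of whatever is present, this elimination rule reduces to a geometric decay. The sketch below assumes the after/before swallowing ratio of 0.72 quoted earlier in the text and one swallow every 0.75 min; the function name is illustrative.

```python
def remaining_fraction(n_swallows, after_before_ratio=0.72):
    """Fraction of neutralized protease left after n swallows.

    Each swallow keeps v1/v0 of the protease, so S_n = S_0 * (v1/v0)**n.
    """
    return after_before_ratio ** n_swallows


swallow_period_min = 0.75
for n in (1, 5, 9):
    t = n * swallow_period_min
    print(f"t = {t:.2f} min: {remaining_fraction(n):.3f} remaining")
```

Under these assumptions, about 5% of the neutralized protease remains after nine swallows, that is, at 6.75 min.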

The time of neutralized candidate protease elimination was calculated to be approximately 6.75 minutes (Figure 5).

Figure 5. Neutralized candidate protease elimination, due to swallowing.


We used several modelling techniques to identify the most effective, compatible, and feasible proteases for SALSA protein degradation. We also simulated the fluctuations of the protein in the oral cavity after brushing or enzymatic cleavage and calculated its amount before and after swallowing. Using the determined concentrations, we found the optimal quantity of the candidate proteases. Atlantic cod trypsin showed the most promising efficiency in cutting the SRCR domains in SALSA at a minimal concentration. This negligible amount of Atlantic cod trypsin (0.041 μM) should not be toxic for the organism, a conclusion that was also supported in consultations with Professor Seppo Meri. His deep knowledge of immune signalling clarified the risks and benefits of our product and helped us design and develop it (see Integrated Human Practices for more details). Nevertheless, we also modelled trypsin inactivation by the relatively safe and non-toxic α1-antitrypsin inhibitor. The modelling indicates that trypsin is an optimal candidate for further modification to improve specific targeting of SALSA. Thus, we will use trypsin and increase its specificity by fusing the protease with a SALSA-binding peptide and by further improving its activity towards SALSA SID linkers using a high-throughput protease engineering method developed in this project.

Pharmacology, 162(6), 1239.

SA, A., MI, H., A, I., & F, A. (2014). A review of methods available to estimate solvent-accessible surface areas of soluble proteins in the folded and unfolded states. Current Protein & Peptide Science, 15(5), 456–476.

Shoba Ranganathan, Michael Gribskov, Kenta Nakai, & Christian Schönbach. (2019). Applications, Volume 3. Encyclopedia of Bioinformatics and Computational Biology, 3, 938–952.

Clustal Omega < Multiple Sequence Alignment < EMBL-EBI. Retrieved October 15, 2021, from

Zheng, Y., & Sriram, G. (2010). Mathematical Modeling: Bridging the Gap between Concept and Realization in Synthetic Biology. Journal of Biomedicine and Biotechnology, 2010.

Dajcs, J. J., Thibodeaux, B. A., Hume, E. B. H., Zheng, X., Sloop, G. D., & O’Callaghan, R. J. (2001). Lysostaphin is effective in treating methicillin-resistant Staphylococcus aureus endophthalmitis in the rabbit. Current Eye Research, 22(6), 451–457.

ST, G., AJ, L., B, T., M, B., EC, V., & D, W. (2015). Complement activation by salivary agglutinin is secretor status dependent. Biological Chemistry, 396(1), 35–43.

Grys, T. E., Siegel, M. B., Lathem, W. W., & Welch, R. A. (2005). The StcE protease contributes to intimate adherence of enterohemorrhagic Escherichia coli O157:H7 to host cells. Infection and Immunity, 73(3), 1295–1303.

FJ, B., AJ, L., K, N., EC, V., W, van’t H., JG, B., A, P., AV, N. A., & J, M. (2002). Identification of the bacteria-binding peptide domain on salivary agglutinin (gp-340/DMBT1), a member of the scavenger receptor cysteine-rich superfamily. The Journal of Biological Chemistry, 277(35), 32109–32115.

TCEP: A toxic flame retardant - Toxic-Free Future. Retrieved October 15, 2021, from

[PDF] Exact and efficient analytical calculation of the accessible surface areas and their gradients for macromolecules | Semantic Scholar. Retrieved October 15, 2021, from

Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S. A. A., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., … Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583–589.

DMBT1 - Deleted in malignant brain tumors 1 protein precursor - Homo sapiens (Human) - DMBT1 gene & protein. Retrieved October 15, 2021, from

Geneva, I. I., Cuzzo, B., Fazili, T., & Javaid, W. (2019). Normal Body Temperature: A Systematic Review. Open Forum Infectious Diseases, 6(4).

O, S., ML, B., B, M., B, E., H, B., P, G., UH, S., & H, K. (2018). Specificity profiling of human trypsin-isoenzymes. Biological Chemistry, 399(9), 997–1007.

Ásgeirsson, B., & Cekan, P. (2006). Microscopic rate-constants for substrate binding and acylation in cold-adaptation of trypsin I from Atlantic cod.

Shan, L., Mathews, I. I., & Khosla, C. (2005). Structural and mechanistic analysis of two prolyl endopeptidases: Role of interdomain dynamics in catalysis and specificity. PNAS March, 8(10), 2021.

Review of Proteins & Enzymes. Retrieved October 15, 2021, from

Trypsin | Biocompare. Retrieved October 15, 2021, from

Sandholt, G. B., Stefansson, B., Scheving, R., & Gudmundsdottir, Á. (2019). Biochemical characterization of a native group III trypsin ZT from Atlantic cod (Gadus morhua). International Journal of Biological Macromolecules, 125, 847–855.

Kalwant, S., & Porter, A. G. (1991). Purification and characterization of human brain prolyl endopeptidase. Biochemical Journal, 276(Pt 1), 237.

C, H., P, A., & D, N. (2004). Engineering a thermostable human prolyl endopeptidase for antibody-directed enzyme prodrug therapy. Biochemistry, 43(20), 6293–6303.

N, M., T, N., K, T., PL, W., & Y, K. (2017). Proteases, actinidin, papain and trypsin reduce oral biofilm on the tongue in elderly subjects and in vitro. Archives of Oral Biology, 82, 233–240.

Dall, E., & Brandstetter, H. (2016). Structure and function of legumain in health and disease. Biochimie, 122, 126–150.

Rotari, V. I., Dando, P. M., & Barrett, A. J. (2001). Legumain Forms from Plants and Animals Differ in Their Specificity. 382(6), 953–959.

Bethune, M. T., & Khosla, C. (2012). Oral enzyme therapy for celiac sprue. Methods in Enzymology, 502, 241–271.

Hörmannsperger, G., Schillde, M.-A. von, & Haller, D. (2013). Lactocepin as a protective microbial structure in the context of IBD. Gut Microbes, 4(2), 152.

Nandan, A., & Nampoothiri, K. M. (2020). Therapeutic and biotechnological applications of substrate specific microbial aminopeptidases. Applied Microbiology and Biotechnology, 104(12), 5243–5257.

IS, B., N, B., YN, P., M, S., D, Y., DE, O., M, R., & E, K. (2012). Staphylolysin is an effective therapeutic agent for Staphylococcus aureus experimental keratitis. Graefe’s Archive for Clinical and Experimental Ophthalmology = Albrecht von Graefes Archiv Fur Klinische Und Experimentelle Ophthalmologie, 250(2), 223–229.

Gökçen, A., Vilcinskas, A., & Wiesner, J. (2014). Biofilm-degrading enzymes from Lysobacter gummosus. Virulence, 5(3), 378–387. http://dx.doi.org/10.4161/viru.27919

Coward, C., & Onaral, B. (2005). Computational cell biology and complexity. Introduction to Biomedical Engineering, 833–855.

Iorgulescu, G. (2009). Saliva between normal and pathological. Important factors in determining systemic and oral health. Journal of Medicine and Life, 2(3), 303.

MP, R., V, L., S, T., J, F., S, M., & H, J. (2012). The salivary scavenger and agglutinin binds MBL and regulates the lectin pathway of complement in solution and on surfaces. Frontiers in Immunology, 3(JUL).

Soares, T. J., Moraes, D. P., Medeiros, G. C. de, Sassi, F. C., Zilberstein, B., & Andrade, C. R. F. de. (2015). Oral transit time: a critical review of the literature. Arquivos Brasileiros de Cirurgia Digestiva : ABCD = Brazilian Archives of Digestive Surgery, 28(2), 144.

F, L., & C, D. (1984). The volume of saliva in the mouth before and after swallowing. Journal of Dental Research, 63(5), 618–621.

Seethalakshmi, C., Reddy, R. C. J., Asifa, N., & Prabhu, S. (2016). Correlation of Salivary pH, Incidence of Dental Caries and Periodontal Status in Diabetes Mellitus Patients: A Cross-sectional Study. Journal of Clinical and Diagnostic Research : JCDR, 10(3), ZC12.
