Why is modeling necessary for our project?

We wanted to assess the immunogenic potential of the proposed vaccine construct before performing any experimental procedures. The immunogenicity of Spike-based SARS-CoV-2 vaccines is tightly linked to the conformation adopted by the Spike protein construct. The Spike protein in SARS-CoV-2 virions consists of two subunits: S1, which interacts with the angiotensin-converting enzyme 2 (ACE2) receptor on host cells and is part of our protein construct, and S2, which aids in the homotrimerization of the Spike protein1. In vivo, this protein can exist in two states. When not bound to ACE2, it adopts the closed state, with three Spike monomers tightly packed in a homotrimer; binding to ACE2 is facilitated by the open state, with the Spikes existing as monomers and the S1 subunit free to interact with ACE2. Of these two states, the closed state has been recognized as antigenically preferred, and some commercial vaccine manufacturers modify their Spike construct to keep it locked in that state2.
Our construct includes only the Receptor-Binding Domain (RBD) of the S1 subunit, not the S2 subunit, and is thus incapable of forming a closed trimer through the same mechanism as the natural Spike protein. However, it includes the intrinsically disordered FLS2 linker helix, which we hypothesized could assist in the formation of a trimer topology within the construct. Thus, the goal of our model was twofold: first, to check whether there were any steric impediments within the three-dimensional structure of our construct that could prevent the RBDs from folding, and second, to appraise the general topology adopted by the protein. Neither of these tasks requires single-Angstrom precision, so both can be addressed by computational modeling.

Modeling procedure and limitations

We constructed our model using the AlphaFold2 neural network-based model3. Simply put, this model works by first searching sequence databases for sequences related to the protein to be modeled (target protein) and assembling them into a multiple sequence alignment. This alignment is then fed to a deep, attention-based neural network, which predicts the probable spatial relations between the residues of the target protein. This relationship network is then used to derive the atomic coordinates of the protein, and the structure is finally relaxed by energy minimization with the AMBER force field, giving rise to the final model prediction. AlphaFold also computes a confidence score for each residue in the model, known as the predicted Local Distance Difference Test (pLDDT). On a scale from 0 to 100, it estimates how well the local inter-atomic distances around that residue would agree with those of the native structure, were it to be experimentally determined4.
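To make the confidence score more concrete, the following is a minimal Python sketch of the lDDT score described by Mariani et al.4, computed on Cα atoms only. It is an illustration and not AlphaFold's own code: pLDDT is the network's prediction of this quantity, made without a reference structure, whereas the sketch needs both a model and a reference. The function name, the Cα-only simplification and the input format are assumptions made for this example; the 15 Å inclusion radius and the 0.5, 1, 2 and 4 Å thresholds follow the lDDT publication.

```python
import math


def per_residue_lddt(model, reference, radius=15.0,
                     thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Per-residue C-alpha lDDT on a 0-100 scale.

    `model` and `reference` are equal-length lists of (x, y, z) C-alpha
    coordinates, one entry per residue (hypothetical input format).
    No superposition is needed: only intra-structure distances are compared.
    """
    if len(model) != len(reference):
        raise ValueError("model and reference must have the same length")

    scores = []
    for i in range(len(reference)):
        preserved = 0   # distance checks that pass a threshold
        considered = 0  # distance checks that were made
        for j in range(len(reference)):
            if i == j:
                continue
            d_ref = math.dist(reference[i], reference[j])
            if d_ref >= radius:
                continue  # only local neighbours in the reference count
            d_model = math.dist(model[i], model[j])
            diff = abs(d_ref - d_model)
            considered += len(thresholds)
            preserved += sum(diff < t for t in thresholds)
        scores.append(100.0 * preserved / considered if considered else 0.0)
    return scores
```

Because the score is built from distances inside each structure, a residue that sits in a displaced loop lowers its own score without penalizing a well-predicted core, which is why it is useful as a local confidence measure.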
Due to computational constraints, we were unable to use the full AlphaFold2 pipeline and instead used the version published as “AlphaFold Colab”, which is based on a subset of the BFD sequence database and does not use any templates of homologous structures to compute the final model. Since we were modeling an artificial construct, we did not consider the lack of templates a major limitation, as no template would exactly match our construct. We also considered the reduced sequence database sufficient, since we were only interested in a general topological prediction. We generated five models, selected the best model of the ensemble, and subjected it to AMBER relaxation. The AMBER-relaxed model was then used for visualization and analysis.
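The selection step can be sketched in a few lines of Python. This is an illustration under a stated assumption, namely that the "best" model is the one with the highest mean pLDDT; the function name and the dictionary layout are hypothetical and are not taken from the AlphaFold Colab code.

```python
from statistics import mean


def rank_models_by_mean_plddt(plddt_by_model):
    """Return model names ordered from highest to lowest mean pLDDT.

    `plddt_by_model` maps a model name to its list of per-residue pLDDT
    values (hypothetical layout, assumed for this sketch).
    """
    return sorted(
        plddt_by_model,
        key=lambda name: mean(plddt_by_model[name]),
        reverse=True,
    )
```

The first element of the returned list would then be the model passed on to relaxation. A mean over all residues is a coarse criterion, since long disordered linkers lower it even when the folded domains are predicted well.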
We visualized the finished model using the UCSF Chimera molecular graphics software5. We used a ribbon representation colored by the average per-residue pLDDT score. To compare the structure of the RBD subunits to the known native structure, we used Chimera’s built-in MatchMaker tool.
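For readers who want to reproduce the per-residue coloring values outside Chimera, the sketch below shows one way to obtain them in Python. It relies on the fact that AlphaFold writes pLDDT into the B-factor column of its PDB output, and on the fixed-column layout of PDB ATOM records. The function names are hypothetical, and the confidence bands follow the convention used by the AlphaFold Protein Structure Database, which is an assumption here and need not match the color scheme of our figures.

```python
from collections import defaultdict


def per_residue_plddt(pdb_lines):
    """Average the B-factor column (pLDDT) over the atoms of each residue.

    Returns a dict keyed by (chain_id, residue_number).
    """
    values = defaultdict(list)
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue  # skip HETATM, TER, REMARK and other records
        chain_id = line[21]            # PDB column 22
        res_number = int(line[22:26])  # PDB columns 23-26
        b_factor = float(line[60:66])  # PDB columns 61-66
        values[(chain_id, res_number)].append(b_factor)
    return {key: sum(v) / len(v) for key, v in values.items()}


def confidence_band(plddt):
    """Map a pLDDT value to a coarse label (assumed AlphaFold DB bands)."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"
```

With a relaxed model saved as, for example, model.pdb (a hypothetical file name), calling per_residue_plddt(open("model.pdb")) would give the values that a per-residue coloring is based on.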

Modeling results

Figure 1: Best AMBER-relaxed AlphaFold2 model. The model is shown in a ribbon representation and is colored according to the residue-averaged pLDDT score.
As is evident from the coloring of Figure 1, AlphaFold2 produced a model of generally medium to high confidence (pLDDT 70-90 within the RBDs). The inter-subunit NAAIRS loops and the FLS2 linker have lower prediction confidences, which may correspond to intrinsic disorder in these regions. The FLS2 linker appears neither to sterically impede the folding of the rest of the construct nor to interact with it. As such, the construct will probably adopt the monomeric RBD conformation. This was to be expected, given that the RBDs do not carry any non-covalently interacting regions that promote trimer formation (all the residues required for trimerization are part of the S2 subunit, which is not present in this construct).
We also used Chimera’s MatchMaker tool to align the RBD subunits of the model to the experimentally determined RBD structure in the closed conformation (PDB ID: 6Z97)6. We found that the pLDDT predictions corresponded well to the root-mean-square deviation (RMSD) from the native conformation, which is quantified in Table 1: core regions with high pLDDTs (painted green in Figure 1) aligned almost exactly with the native conformation, while loop regions diverged significantly. Regions covalently linked to the NAAIRS loops also showed significant divergence (Figure 2). This may be due either to model inaccuracies, since these regions have relatively low pLDDTs, or to steric impediments caused by the covalent linkage. The model precision is not sufficient to differentiate between these possibilities. As such, the biological activity of the construct must be experimentally evaluated.
Figure 2: Native prefusion RBD structure superimposed on the construct’s RBD subunits.
Table 1: MatchMaker RMSD between best-matching (pruned) atom pairs and all atom pairs of the construct RBDs and the native prefusion RBD structure.
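The difference between the two kinds of RMSD named in the Table 1 caption can be illustrated with a short Python sketch. It is a simplification and not MatchMaker's implementation: it assumes the two structures are already superimposed and matched residue by residue, and it prunes in a single pass, whereas MatchMaker refits the superposition after each pruning round. The function names and the 2.0 Å cutoff are assumptions made for this example.

```python
import math


def rmsd(pairs):
    """RMSD over (model_xyz, native_xyz) pairs that share one frame."""
    if not pairs:
        raise ValueError("at least one atom pair is required")
    squared = sum(math.dist(a, b) ** 2 for a, b in pairs)
    return math.sqrt(squared / len(pairs))


def pruned_and_full_rmsd(model, native, cutoff=2.0):
    """Return (pruned RMSD, all-pairs RMSD) for superimposed coordinates.

    `model` and `native` are equal-length lists of (x, y, z) tuples.
    Pairs further apart than `cutoff` (assumed value, in Angstrom) are
    dropped for the pruned value; it is None if no pair survives.
    """
    if len(model) != len(native):
        raise ValueError("coordinate lists must have the same length")
    pairs = list(zip(model, native))
    kept = [(a, b) for a, b in pairs if math.dist(a, b) <= cutoff]
    return (rmsd(kept) if kept else None), rmsd(pairs)
```

A small pruned RMSD combined with a much larger all-pairs RMSD is the pattern expected when a well-matched core is surrounded by divergent loops, because the loops are excluded from the first value but dominate the second.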

Impact on our lab procedures

As mentioned above, the model, in which the protein remains in the open, monomeric conformation, did not give us any direct indication of the biological activity of the construct. However, the fact that the RBDs of the S1 subunit fold close to their native structure indicated that antibodies against SARS-CoV-2 could be raised using this protein. Therefore, the lab procedures themselves were not greatly altered, other than prompting us to speed up our procedures so that we could test the construct in mice, although we did not make it that far due to time constraints.


In conclusion, based on AlphaFold modeling, the construct has a relatively high probability of correctly folding its RBD subunits in a monomeric form, although their biological activity must still be assessed. Since the monomeric form is not immunogenically preferred, the efficacy of a vaccine against the novel coronavirus based on this construct may be limited, as it may yield a low amount of neutralizing antibodies against the closed state. All in all, the impact of this model on our lab work was to assure us that our construct, in the current state of the project, did not need to be altered, and to give us additional confidence that it could work. If you want to know more about our lab work, click here (link).
  1. Huang Y, Yang C, Xu X, Xu W, Liu S. Structural and functional properties of SARS-CoV-2 spike protein: potential antivirus drug development for COVID-19. Acta Pharmacol Sin. 2020;41(9):1141-1149. doi:10.1038/s41401-020-0485-4
  2. Riley TP, Chou H-T, Hu R, et al. Enhancing the Prefusion Conformational Stability of SARS-CoV-2 Spike Protein Through Structure-Guided Design. Front Immunol. 2021;12:660198. doi:10.3389/fimmu.2021.660198
  3. Jumper J, Evans R, Pritzel A, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596(7873):583-589. doi:10.1038/s41586-021-03819-2
  4. Mariani V, Biasini M, Barbato A, Schwede T. lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests. Bioinformatics. 2013;29(21):2722-2728. doi:10.1093/bioinformatics/btt473
  5. Pettersen EF, Goddard TD, Huang CC, et al. UCSF Chimera--a visualization system for exploratory research and analysis. J Comput Chem. 2004;25(13):1605-1612. doi:10.1002/jcc.20084
  6. Huo J, Zhao Y, Ren J, et al. Neutralization of SARS-CoV-2 by Destruction of the Prefusion Spike. Cell Host & Microbe. 2020;28(3):445-454.e6. doi:10.1016/j.chom.2020.06.010