Team:UNSW Australia/Model/Heat Shock Protein

iGEM UNSW

Aims

While the general behaviour of heat shock proteins is understood, there is much we don’t know about the structure and physics of the interactions that enable them to dynamically associate and dissociate between oligomers and subunits as they bind to denatured proteins. For this reason, we performed some structural prediction for our heat shock protein of interest, HSP22E, in order to better determine what kind of oligomers it might form and hence provide some insight into its mechanism of action.

The structure of HSP22E, which comes from the thermotolerant microalga C. reinhardtii, has not yet been thoroughly characterised. Hence, our approach involved starting from the gene sequence, using this to predict a monomeric or dimeric unit, investigating how these subunits might form a larger oligomer, and then modelling its interaction with denatured proteins.

Fig. 1: Process of Structural Modelling HSP22E

Sequence

Choices, Choices - selecting a sequence

Fig. 2: Gene sequence of HSP22E. The putative transit peptide motif is highlighted in pink.

In 2020, the Phase One Team identified the sequence for the small heat-shock proteins HSP22E and HSP22F from C. reinhardtii, our candidates for synthetically alleviating heat stress in coral-symbiont algae. These two HSPs are known to form a heterodimer (Rütgers et al., 2017), but preliminary experiments in the 2020 lab suggested that HSP22E was also able to form a homodimer. This is not uncommon in HSPs since they often have high sequence similarity and are predicted to be found in so many forms due to gene duplication events (de Jong et al., 1998). For example, the oligomeric form of HSP21 (a known chloroplastic HSP) has two distinct geometries, a “T” tetrahedral dodecamer constructed of six homodimers (Yu et al., 2021) and a “D3” dihedral dodecamer of six heterodimers (Rutsdottir et al., 2017), both of which are functional. With this in mind, our Phase Two team chose to work with only HSP22E for simplicity.

The First Hurdle - cellular localisation

The Phase One Team also attempted some structural modelling of the selected HSPs, but they ran into significant steric clashes in their model: there were simply too many atoms to fit in the space. In collaboration with them, the Phase Two team took a closer look and found that the protein sequences provided contained a putative transit peptide motif. This is a “tag” sequence that labels the protein for transport to the chloroplast: the organelle that produces reactive oxygen species (ROS) when under heat stress, and where HSP22E acts. The question then was whether this tag stays attached to the protein or is cleaved off once the protein reaches its destination. After some research we found that the process of importing such proteins across an organelle membrane (that is, into a mitochondrion or chloroplast) involves cleaving off the targeting sequence (Kunze and Berger, 2015). We performed a Clustal Omega Multiple Sequence Alignment (MSA) of the transit peptide motif from HSP22E with other known plastid localisation motifs (Holbrook et al., 2016), and obtained a good alignment. In addition, a protein BLAST (BLASTp) search of the full HSP22E sequence against the PDB database produced no more hits than a search with just the “mature” sequence, as no alignments were produced in the transit peptide motif region. Hence we decided to focus our attention on a mature form of the sequence that does not contain the transit peptide.
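As a minimal sketch of the trimming step described above, the mature sequence can be obtained by removing the N-terminal transit peptide from the precursor. The function name, the toy sequence and the cleavage position are all hypothetical placeholders; they are not the HSP22E sequence or its actual cleavage site.

```python
def mature_sequence(precursor: str, transit_len: int) -> str:
    """Return the precursor with its first `transit_len` residues removed."""
    if not 0 <= transit_len < len(precursor):
        raise ValueError("transit peptide must be shorter than the sequence")
    return precursor[transit_len:]


# Hypothetical toy precursor (NOT the HSP22E sequence): a 12-residue
# "transit peptide" followed by a 10-residue "mature" region.
toy_precursor = "MASRLLAAPVRA" + "GDKTVEWSPR"
toy_mature = mature_sequence(toy_precursor, transit_len=12)
print(toy_mature)
```

The trimmed string is what would then be passed to the structure prediction tools in place of the full-length precursor.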

In Perspective - comparison to known sHSPs

As a quick sanity check, we performed a simple PSI-BLAST search against the PDB database to compare our proposed sequence against homologs with known structure. Below is a T-COFFEE MSA of HSP22E (last row) with the homologs found (preceding rows). As you can see, there are two red sections which are highly conserved - these correspond to the core alpha-crystallin domain that characterises sHSPs. The other parts of the sequence show more variation, and it has been hypothesised that these regions are involved in oligomerisation and substrate binding (Caspers et al., 1995).

Fig. 3: Multiple Sequence Alignment of HSP22E (bottom row)
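To illustrate how conserved blocks such as the alpha-crystallin domain stand out in an alignment, the sketch below scores each alignment column by the fraction of sequences that share its most common residue. This is a simple stand-in for the colouring produced by T-COFFEE, not its actual scoring scheme, and the toy alignment is invented for the example.

```python
from collections import Counter


def column_conservation(alignment):
    """Per-column fraction of rows sharing the most common residue.

    Gaps ('-') are never counted as the conserved residue, but they do
    count towards the number of rows, so gappy columns score low.
    """
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        residues = [c for c in column if c != "-"]
        if not residues:
            scores.append(0.0)
            continue
        most_common_count = Counter(residues).most_common(1)[0][1]
        scores.append(most_common_count / len(column))
    return scores


# Invented toy alignment; the last row plays the role of the query.
toy_alignment = [
    "MK-GLVEV",
    "MR-GLVEV",
    "MKAGLIEV",
    "-K-GLVDV",
]
scores = column_conservation(toy_alignment)
fully_conserved = [i for i, s in enumerate(scores) if s == 1.0]
print(fully_conserved)
```

Runs of high-scoring columns correspond to the conserved sections of the alignment, while low-scoring columns correspond to the variable terminal and loop regions.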

Monomer

We provided our mature sequence data to several modelling programs, which used either comparative modelling techniques or deep-learning methods to predict the structure. These included I-TASSER, Rosetta Comparative Modelling, RoseTTAFold and AlphaFold, each of which produced 5-10 models from the input data.

All of these different programs predicted the alpha-crystallin domains with a high degree of confidence, but there was a lot of variation in the N- and C-terminal regions and the middle loop. These are the regions with most variation, as seen in the T-COFFEE MSA above. Since the terminal regions are likely to be involved in oligomerisation (Caspers et al., 1995), it seemed reasonable to conclude these regions would be dynamic, and a more confident prediction of their structure would have to wait until the subunits were placed together.

Nevertheless, to obtain one structure to move forward with we submitted each of them to MolProbity. The most favourable score was obtained from Rosetta Comparative Model 1 (scoring shown below). In addition, this model provided an explicit residue-by-residue score for the certainty of the predicted coordinates, which we felt would be valuable moving forward for the future remodelling of terminal regions once we had an oligomeric scaffold.

Fig. 4: Structural Validation of the Rosetta Comparative Model 1 (from MolProbity)

Oligomer

Variations on a Theme - diversity and dynamism of HSPs

Small heat shock proteins are known to form complex higher order structures, in the realm of 12-48mers (Basha et al., 2012). Many of these large structures are known to dissociate into dimers under thermal stress in order to interact with denatured proteins, though there is also evidence of non-dissociative interaction (Yu et al., 2021). The large complexes are also known to regularly perform subunit exchange - new HSP dimers swapping with old ones to ‘refresh’ the large complex. As such, HSP oligomers are highly dynamic, and it is predicted that this dynamic process of interactions is governed by the more variable terminal regions of the sequence (Caspers et al., 1995).

The table linked below shows the variety of oligomeric structures formed by homologs of HSP22E:

Table 1: Oligomerization and geometric symmetry of experimentally derived sHSP structures.

Fig. 5: HSP22E Phylogenetic Tree

The Family Tree

For this reason, we generated a phylogenetic tree (Fig. 5) to compare the sequence similarity of HSP22E with different HSPs of known structure. Each HSP is colour-coded by the type of oligomer it is known to form. It can be seen that while some clusters share the same oligomeric order, others with very high sequence similarity do not. This is most noticeable across 4ZJA, 4ZJD and 4ZJ9, which are a 24mer, 18mer and 2mer respectively, but are derived from the same sequence with only very minor mutations.

Diving into Detail

Given the ambiguity of the sequence information processed in this way, we decided to approach the problem using more structural information.

The T-COFFEE Expresso program incorporates structural information to align sequences and produce a Multiple Sequence Alignment. We divided our template HSPs into known 2mers, 4mers, 12mers and 24mers, and constructed an MSA for each of these groups with HSP22E. These particular oligomeric forms were picked because there was a reasonable number of sequences available for each. From these comparisons, it was clear that HSP22E aligned most closely to dimers and dodecamers. As such, we aimed to first produce a viable dimer for HSP22E, and then use the dimer structure to form the larger 12-mer.

Fig. 6: T-COFFEE MSA scores for alignments by oligomeric order.

The dodecamer alignment in particular provided an interesting insight. Most of the other alignments were only able to pick up the generalised alpha-crystallin domains as conserved regions, but the dodecamer alignment showed an extra piece of sequence that was well conserved, in between the two large alpha-crystallin domains. Inspection of this small conserved region revealed that it was the region of sequence corresponding to β6 for the known HSPs.

Fig. 7: Alignment of HSP22E to known dodecamers. The approximate location of β-strands across the known structures are marked by a box.
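The grouping-and-comparison step described in this section can be sketched with a much simpler proxy than the T-COFFEE alignment score: the mean pairwise identity between the query and the templates in each oligomer group. This proxy is an assumption made for illustration only, and the aligned toy sequences and group names below are invented.

```python
def percent_identity(a, b):
    """Identity (%) over columns where both aligned sequences have a residue."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def mean_identity_by_group(groups):
    """groups maps a label to (aligned_query, [aligned_templates])."""
    return {
        label: sum(percent_identity(query, t) for t in templates) / len(templates)
        for label, (query, templates) in groups.items()
    }


# Invented toy alignments: one query/template set per oligomeric order.
toy_groups = {
    "2mer": ("ACDEFGHK", ["ACDEFGHR", "ACDQFGHK"]),
    "4mer": ("ACDEFGHK", ["AC-EYGHR", "TCDQFAHK"]),
}
means = mean_identity_by_group(toy_groups)
best_group = max(means, key=means.get)
print(best_group, round(means[best_group], 1))
```

A consistency-based score such as the one reported by T-COFFEE also reflects how well the alignment agrees with the underlying pairwise alignments, so the ranking from this proxy would only be a rough first check.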

Dimer

We pursued several methods to predict the dimeric structure of HSP22E:

Manual alignment via ChimeraX to the template 1GME

1GME, as seen above, is a sHSP from wheat which assembles into a dodecamer, and as such was considered a viable template. ChimeraX provides a Matchmaker tool that allows structural alignment of two models. The HSP22E monomeric structure was duplicated, and the two copies were then aligned to the A and B chains of 1GME. The resulting dimer appeared reasonable, with few steric clashes between atoms and a large interaction interface.

Modeller alignment via ChimeraX

The Modeller tool, accessible as an extension through ChimeraX, builds a protein model by matching a target sequence to known structures provided as an MSA. We provided the 12mer MSA generated in the previous step together with the HSP22E sequence. Of all the dimers produced, this was the only one with a β6 strand involved in the dimeric interaction, but the terminal regions were highly disordered.

AlphaFold

The online AlphaFold server allows the input sequence to be classified as a monomer or dimer before the run. The resulting dimer had a good interaction interface, but the loops and terminal regions were left loose and unmodelled because the software could not predict these regions with high accuracy.

ROSIE’s Symmetric Docking Tool

The Rosetta Online Server that Includes Everyone (ROSIE) provides a Symmetric Docking tool which takes a monomeric PDB file as input and enables the user to specify both the number of subunits and whether cyclic or dihedral symmetry is expected for the resulting homomer. Ten models were produced with C2 cyclic symmetry.

Comparing each of the models, the second model produced by ROSIE was selected for further work. This model was energetically favourable and aligned well with the homologous 1GME dimer.
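To make the idea of C2 cyclic symmetry concrete, the sketch below generates the symmetry-related copies of a subunit by rotating its coordinates about a single axis. It only illustrates the geometric constraint that a symmetric docking tool enforces; it is not the ROSIE algorithm, which also searches for the placement of the subunit relative to the axis. The choice of the z axis and the toy coordinates are assumptions.

```python
import math


def cyclic_copies(coords, n):
    """Return n copies of `coords`; copy k is rotated by k*360/n degrees about z."""
    copies = []
    for k in range(n):
        angle = 2.0 * math.pi * k / n
        c, s = math.cos(angle), math.sin(angle)
        copies.append([(c * x - s * y, s * x + c * y, z) for x, y, z in coords])
    return copies


# Invented toy "subunit" of two points (e.g. two alpha-carbon positions, in angstrom).
toy_subunit = [(5.0, 0.0, 1.0), (6.0, 1.0, -2.0)]
chain_a, chain_b = cyclic_copies(toy_subunit, 2)
for point in chain_b:
    print(tuple(round(v, 3) for v in point))
```

For a C2 dimer the second chain is simply the first rotated by 180 degrees, so every x and y coordinate changes sign while z is unchanged. Passing n = 3 or higher gives the corresponding cyclic trimer or larger ring.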

Dodecamer

We again used several methods to dock our dimeric subunits into a large oligomer, including ROSIE’s Symmetric Docking, Galaxy’s Homomer, and Bonvin Lab’s HADDOCK, as well as some manual alignment methods in Chimera. Each of these programs accepts different parameters as constraints, and so we explored multiple combinations of these to find a model that was biologically as well as energetically favourable.
Each method produced 5-10 models per set of inputs. In the table below, only selected models are shown for each method to illustrate general trends. Any residues coloured in red represent steric clashes in the model.

Manual alignment via ChimeraX

Image: Manually aligned dodecamer; clashing residues in pink

This was performed as per the dimer method. However, with all 12 chains present, the model contained many unresolvable clashes involving the N-terminal tails, and so was deemed unfeasible. Since 1GME is a hetero-12-mer, in which one of the chains is shorter than the other, it was hypothesised that the full-length N-terminal tail of HSP22E on all chains was not compatible with the template.

Homomer

Images: Dodecamer with cyclic symmetry; 24mer from 5ZS3 template

Inputs

A sequence or monomer structure
An order for oligomer assembly (i.e., dimer, trimer, …, 12mer, etc.)
Disordered termini or loops for refinement (a maximum of 3 regions, each less than 20 residues long).

Drawbacks

The fields for loop refinement were not reliably processed. For some runs, this information was provided but the server did not use it. In addition, the central loop and terminal tails of HSP22E that would have benefited from refinement during oligomerisation were all considerably longer than 20 residues.
If the input is structural, the subunit must contain only one chain. As such, we were only able to provide our monomeric structure as input, not our dimer.
Out of the 15 models produced during various runs, 13 had cyclic symmetry rather than dihedral symmetry, which is not consistent with the known higher-order structures formed by HSPs.

Results

Both the sequence and a monomeric unit were provided as input in different runs, with 12 selected as the desired oligomeric order. This produced a total of 10 models, 9 of which were cyclic, and one of which was a dihedral 24-mer based on the template 5ZS3. An example cyclic model is shown adjacent.
The monomeric unit was also provided to predict a tetramer, in the hope that the resulting tetramer could be re-submitted with order 3 to produce a 12-mer with more complex symmetry. The 5 predicted tetramers did not resemble any known HSP structure.

Symmetric Docking

Image: Dihedral structure, showing a trimer of tetramers

Inputs

A PDB structure containing the subunit. This subunit can involve multiple chains.
The desired oligomeric order
The type of geometric symmetry to use: a choice of ‘cyclic’ or ‘dihedral’.

Drawbacks

Beyond ‘dihedral’, no further symmetry constraints can be provided (e.g. to select between D3 or T symmetry). In addition, the default symmetry is ‘cyclic’, so the correct form of symmetry must be selected explicitly when submitting a job to the server.
No fields are given for regions of the structure that are disordered.

Results

We submitted several jobs to ROSIE, using the dimers produced by manual alignment, AlphaFold and the Symm2 dimer (the second ROSIE Symmetric Docking model, selected above). The dihedral models produced from manual alignment and Symm2 looked unusual compared with known sHSP structures, but not unfeasible. The dihedral models produced from the AlphaFold dimer were of poor quality, because the large loop and tails had not been remodelled at all during oligomerisation.

HADDOCK

Image: Incomplete symmetry restraints, unable to fully define tetrahedral symmetry

Inputs

HADDOCK is by far the most complex docking tool of those listed here. It provides a considerable array of modifiable parameters, but those used for this project include:
Active residues and dynamic residues: these allow you to specify particular amino acids that you know to be involved in interaction interfaces, or that you know to be disordered.
Symmetry constraints: these allow you to enforce specific geometric symmetry on the generated models. However, rather than specifying axes of symmetry, this is defined by ‘symmetry pairs’.

Drawbacks

Complex, and hence involves a steep learning curve (though the provided tutorials and documentation are very helpful).
Providing symmetry information in the form of C2, C3, C4, C5, C6 and S3 symmetry pairs is not very scalable. A tetrahedral symmetry (T) is composed of 4 C3 axes and 3 C2 axes; for a complex with 12 subunits, this would require the definition of 4x4=16 C3 symmetry triplets and 6x3=18 C2 symmetry pairs. Not only is this arduous, but the interface also allows a maximum of only 10 symmetry pairs of each type to be defined.

Results

While the flexibility of the HADDOCK server initially seemed suitable, the geometric symmetry could not be fully defined, and so the server was not able to produce any viable models.

Modeller alignment via ChimeraX

Images: From 5ZUL template; From 1GME template: refinement of clashes

Inputs

Performed as per the dimer method.
One model was produced using the 1GME dodecamer template (heteromeric with D3 symmetry), and a second model was produced using the 5ZUL dodecamer template (homomeric with T symmetry).

Results

While the 5ZUL template was expected to produce better results, the nature of the input file proved difficult for the software, which produced only a hexamer rather than a dodecamer. Some attempts were made to duplicate the hexamer and rearrange it into a dodecamer, but these were not successful, as the particular arrangement of subunits in the hexamer could not be reflected to complete the assembly.
The dodecamer produced from the 1GME template had 5191 clashing residues, but the basic arrangement of core domains looked fairly reasonable. As such, this model was submitted to HADDOCK for refinement using molecular dynamics. Four structures were returned from this simulation: the best had its clashes reduced to 16, and the other three had theirs reduced to around 70-75. This was a remarkable result. The refined model also appeared to have the C-terminal tails facing out of the oligomer, which would allow them to interact with denatured proteins.
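The restraint-counting argument given under the HADDOCK drawbacks can be checked with a short calculation. Each rotation axis of order k divides the n subunits into n/k sets of k symmetry-related subunits, and every such set needs its own restraint. The function and variable names below are illustrative only; the limit of 10 per symmetry type is the interface limit mentioned above.

```python
def restraint_groups(n_subunits, axes):
    """Count the symmetry restraints needed for each axis order.

    `axes` maps a rotation order k (2 for C2, 3 for C3, ...) to the number
    of axes of that order in the point group.
    """
    counts = {}
    for order, n_axes in axes.items():
        if n_subunits % order:
            raise ValueError(f"{n_subunits} subunits cannot be split into sets of {order}")
        counts[order] = n_axes * (n_subunits // order)
    return counts


# Tetrahedral (T) point group: four C3 axes and three C2 axes.
tetrahedral_axes = {3: 4, 2: 3}
counts = restraint_groups(12, tetrahedral_axes)
interface_limit = 10
too_many = {order: n > interface_limit for order, n in counts.items()}
print(counts, too_many)
```

For a 12-subunit tetrahedral complex this gives 16 C3 triplets and 18 C2 pairs, both above the limit of 10, which is why the symmetry could not be fully defined.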

From the results above, two models are proposed as the most likely potential high-order structures for HSP22E.

The 24-mer produced by Homomer from alignment to the template 5ZS3. This model had no clashes, and displayed oligomeric domain exchange through β-sheet interactions.
The refined 12-mer produced by Modeller with a 1GME template.

24-mer structure predicted by Homomer from the template 5ZS3

12-mer structure predicted by Modeller from the template 1GME, refined by Haddock

Since sHSPs are known to form dynamic complexes, it is quite possible that HSP22E is able to form both a 12-mer and a 24-mer from dimeric subunits. Further analysis of oligomeric order could be performed with laboratory access, by measuring the molecular weight of purified HSP22E on a gel.

Interactions

The Scaffold

The wheat heat shock protein Hsp16.9 (PDB: 1GME), a homolog of HSP22E, is believed to form a large dodecamer under normal cell conditions, but to dissociate into dimers at elevated temperatures (van Montfort et al., 2001). These dimers become the active units that interact with unfolded proteins. Hydrophobic regions of the dimers are buried in the oligomeric structure, but after dissociation, these regions are exposed on the surface and can interact with the hydrophobic areas of denatured proteins. In this way the dimers bind to denatured proteins, preventing the formation of larger aggregates caused by the hydrophobic regions of denatured proteins binding to each other.

We decided to investigate if HSP22E might also function via this mechanism.

Fig. 8: A surface view of our HSP22E dimer, representing relative hydrophobicity by colour.
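One way to look for hydrophobic patches at the sequence level is a sliding-window average over a residue hydropathy scale. The sketch below uses the Kyte-Doolittle scale; this is an assumption chosen for illustration and is not necessarily the scale used to colour the surface in Fig. 8. The toy sequence is invented.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=5):
    """Mean hydropathy of each full window along the sequence."""
    values = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Invented toy sequence: a hydrophobic stretch followed by a charged stretch.
toy_sequence = "LLVIVDKEKR"
profile = hydropathy_profile(toy_sequence, window=5)
print([round(v, 2) for v in profile])
```

Windows with strongly positive averages mark stretches that would be expected to be buried in the oligomer or to contact the hydrophobic areas of a denatured substrate. A surface view additionally accounts for which residues are actually solvent exposed in the 3D model.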

In Action

To view our modelled dimer in action, we simulated its interaction with a denatured form of citrate synthase (den-CS), commonly used as a model substrate for molecular chaperones. The structure for den-CS was provided to us by the Glover Lab. We first positioned our dimeric unit next to a den-CS unit in Chimera, before submitting this file for molecular dynamics simulation using GROMACS. This was submitted as a job on NCI’s Gadi, a high performance computing cluster in Australia.

All-atom simulations compute the interactions between every single atom in the model, which can be quite time consuming. Molecular dynamics expert Brian Ee introduced us to the concept of coarse-grained simulations, which define larger pseudo-atoms to represent the system being simulated. While coarse-grained simulations are a less accurate measure of the dynamics of the system, they significantly reduce the computational load required to run the experiment, and are valuable when simulating larger systems and longer periods of time. We elected to perform a coarse-grained molecular dynamics experiment due to the relatively large size of our dimer and denatured protein system.

We performed the simulation at a temperature of 313 K, at which proteins have the potential to denature, and at which the wheat heat shock protein 1GME has been observed to dissociate from its larger oligomeric storage form into active dimers (van Montfort et al., 2001). Our procedure was based on a tutorial and sample script written by Wunna Kyaw and Stephanie Xu of the Lee Lab, Single Molecule Science, EMBL Australia (2018), which Brian provided. The main steps involved were:

Convert the PDB structure (which we configured in Chimera) into a coarse-grained structure, defined using the MARTINI22 force field.
Define a box around the system, the boundaries of which are at least 1.0 nm away from the molecule.
Solvate the system with water. A coarse-grained water model was used as input for this.
Add ions to the system to produce an overall neutral charge.
Perform energy minimization on the system to fix poor rotamers and side-chain clashes. Minimization of our system took 7640 steps.
Equilibrate the system to relax side chain rotamers. The equilibration was set at a reference temperature of 313K.
Perform the molecular dynamics simulation, also set to a reference temperature of 313K. The Boltzmann distribution from which velocities were sampled was also defined using a temperature of 313K. We performed the simulation for 1.02μs.
Convert the trajectory to an alpha-carbon version and align the molecule so that the simulation can be viewed in VMD (Visual Molecular Dynamics).

Fig. 9: Molecular dynamics simulation between the HSP22E dimer (yellow and red chains) and the denatured form of citrate synthase (dark chains). Simulation lasted 1.02μs and was performed at 313K.
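The temperature and run-length settings in the steps above translate into a handful of GROMACS .mdp options. The sketch below derives the number of integration steps for a 1.02 μs run and prints the corresponding settings. The 0.02 ps time step, the v-rescale thermostat, the single "System" coupling group and the tau_t value are assumptions typical of coarse-grained runs; they are not taken from the team's actual mdp files.

```python
def md_settings(sim_time_us, dt_ps, temperature_k):
    """Build a dict of GROMACS .mdp options for a production run."""
    nsteps = round(sim_time_us * 1_000_000 / dt_ps)  # 1 us = 1,000,000 ps
    return {
        "integrator": "md",
        "dt": dt_ps,                # time step in ps (assumed value)
        "nsteps": nsteps,
        "tcoupl": "v-rescale",      # assumed thermostat
        "tc-grps": "System",        # assumed single coupling group
        "tau_t": 1.0,               # assumed coupling constant in ps
        "ref_t": temperature_k,     # reference temperature
        "gen_vel": "yes",           # sample initial velocities...
        "gen_temp": temperature_k,  # ...from a Boltzmann distribution at this T
    }


def render_mdp(settings):
    """Format the options as 'key = value' lines."""
    return "\n".join(f"{key:<12}= {value}" for key, value in settings.items())


settings = md_settings(sim_time_us=1.02, dt_ps=0.02, temperature_k=313)
mdp_text = render_mdp(settings)
print(mdp_text)
```

With the assumed 0.02 ps step, a 1.02 μs trajectory corresponds to 51,000,000 steps. A smaller time step would increase the step count proportionally.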

Download the scripts, mdp files, and trajectory files produced in the molecular dynamics simulation, along with our final monomer, dimeric, and large oligomeric structures here.

Looking Forward

A next step for this project would be to perform a molecular dynamics simulation of the large oligomers at both regular temperatures and under heat stress. This would test whether the oligomers are able to dissociate into dimers when under heat stress, or if further stimulus is required to initiate dissociation. A simulation could also be performed with an HSP22E oligomer in proximity to denatured proteins under heat stress, to determine if the oligomer can associate with denatured proteins without dissociating.

The dimer MD model could also be extended. Firstly, the MD simulation could be tested with different starting positions and orientations between the HSP and the denatured protein. Then, further modelling could be performed with multiple dimeric units and multiple denatured proteins in the one system. This could provide information on the relative binding affinity of HSP22E to denatured protein compared with the denatured proteins’ binding affinity for each other, and provide an indication of how effectively the HSP22E dimer prevents aggregation. This could also provide insight into the various purposes of different oligomeric forms. It has been observed that for some heat shock proteins, including the homologous heat shock protein 1GME, the large oligomeric form functions as a storage box to keep HSPs contained until they are needed during heat stress (at which point they dissociate into dimers), while other heat shock proteins use their larger oligomeric form as a core to which unfolded proteins attach (Rütgers et al., 2017).

References

Basha, E., O’Neill, H., Vierling, E. (2012). Small heat shock proteins and α-crystallins: dynamic proteins with flexible functions. Trends in Biochemical Sciences. 37(3). doi:10.1016/j.tibs.2011.11.005.

Caspers, G.-J., Leunissen, J.A.M., De Jong, W.W. (1995). The expanding small heat-shock protein family, and structure predictions of the conserved ‘α-crystallin domain’. Journal of Molecular Evolution. 40(3). doi:10.1007/BF00163229.

De Jong, W.W., Caspers, G.-J., Leunissen, J.A.M. (1998). Genealogy of the α-crystallin—small heat-shock protein superfamily. International Journal of Biological Macromolecules. 22(3–4). doi:10.1016/S0141-8130(98)00013-0.

Holbrook, K., Subramanian, C., Chotewutmontri, P., Reddick, L.E., Wright, S., Zhang, H., Moncrief, L., Bruce, B.D.(2016). Functional Analysis of Semi-conserved Transit Peptide Motifs and Mechanistic Implications in Precursor Targeting and Recognition. Molecular Plant. 9(9). pg 1286-1301. doi:10.1016/j.molp.2016.06.004.

Kunze, M., Berger, J. (2015). The similarity between N-terminal targeting signals for protein import into different organelles and its evolutionary relevance. Frontiers in Physiology. 6. doi:10.3389/fphys.2015.00259.

Rütgers, M., Muranaka, L.S., Mühlhaus, T., Sommer, F., Thomas, S., Schurig, J., Willmund, F., Schulz-Raffelt, M., Schroda, M. (2017). Substrates of the chloroplast small heat shock proteins 22E/F point to thermolability as a regulative switch for heat acclimation in Chlamydomonas reinhardtii. Plant Mol Biol. 95(6). pg 579-591. doi:10.1007/s11103-017-0672-y

Rutsdottir, G., Hallmark, J., Weide, Y., Hebert, H. (2017). Structural model of dodecameric heat-shock protein Hsp21: Flexible N-terminal arms interact with client proteins while C-terminal tails maintain the dodecamer and chaperone activity. Journal of Biological Chemistry. 292(19). doi:10.1074/jbc.M116.766816.

Stamler, R. et al. (2005). Wrapping the α-Crystallin Domain Fold in a Chaperone Assembly. Journal of Molecular Biology. 353(1). doi:10.1016/j.jmb.2005.08.025.

Van Montfort, R., Basha, E., Friedrich, K., Slingsby, C., Vierling, E. (2001). Crystal structure and assembly of a eukaryotic small heat shock protein. Nat Struct Mol Biol. 8(12). pg 1025–1030. https://doi.org/10.1038/nsb722

Yu, C., Leung, S.K.P., Zhang, W., Lai, L.T.F., Chan, Y.K., Wong, M.C., Benlekbir, S., Cui, Y., Jiang, L., Lau, W.C.Y. (2021). Structural basis of substrate recognition and thermal protection by a small heat shock protein. Nature Communications. 12(1). doi:10.1038/s41467-021-23338-y.

Software Acknowledgements

BLAST

Camacho, C., Coulouris, G., Avagyan, V., Ma, N., Papadopoulos, J., Bealer, K., Madden T.L. (2009). BLAST+: architecture and applications. BMC Bioinformatics. 10:421. DOI: 10.1186/1471-2105-10-421

T-COFFEE

Notredame, C., Higgins, D.G., Heringa, J. (2000). T-Coffee: A novel method for fast and accurate multiple sequence alignment. Journal of Molecular Biology. 302(1). pg 205-217. DOI: 10.1006/jmbi.2000.4042

ClustalOmega

Sievers F., Wilm A., Dineen D., Gibson T.J., Karplus K., Li W., Lopez R., McWilliam H., Remmert M., Söding J., Thompson J.D., Higgins D.G. (2011). Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol. Syst. Biol. 7:539 DOI: 10.1038/msb.2011.75

I-TASSER

Yang, J., Zhang, Y. (2015). I-TASSER server: new development for protein structure and function predictions. Nucleic Acids Research. 43:W174-W181.

Zhang, C., Freddolino, P.L., Zhang, Y. (2017). COFACTOR: improved protein function prediction by combining structure, sequence and protein–protein interaction information. Nucleic Acids Research. 45:W291-W299.

Robetta

Baek, M., DiMaio, F., Anishchenko, I., Dauparas, J., Ovchinnikov, S., Lee, G.R., Wang, J., Cong, Q., Kinch, L.N., Schaeffer, R.D., Millán, C., Park, H., Adams, C., Glassman, C.R., DeGiovanni, A., Pereira, J.H., Rodrigues, A.V., van Dijk, A.A., Ebrecht, A.C., Opperman, D.J., Sagmeister, T., Buhlheller, C., Pavkov-Keller, T., Rathinaswamy, M.K., Dalwadi, U., Yip, C.K., Burke, J.E., Garcia, K.C., Grishin, N.V., Adams, P.D., Read, R.J., Baker, D. (2021). Accurate prediction of protein structures and interactions using a 3-track network. Science. doi:10.1126/science.abj8754.

AlphaFold

Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Zidek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S.A.A., Ballard, A.J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., Back, T., Petersen, S., Reiman, D., Clancy, E., Zielinski, M., Steinegger, M., Pacholska, M., Berghammer, T., Bodenstein, S., Silver, D., Vinyals, O., Senior, A.W., Kavukcuoglu, K., Kohli, P., Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature. 596. pg 583-589.

MolProbity

Williams, C.J., Headd, J.J., Moriarty, N.W., Prisant, M.G., Videau, L.L., Deis, L.N., Verma, V., Keedy, D.A., Hintze, B.J., Chen, V.B., Jain, S., Lewis, S.M., Arendall 3rd, B.W., Snoeyink, J., Adams, P.D., Lovell, S.C., Richardson, J.S., Richardson, D.C. (2018). MolProbity: More and better reference data for improved all-atom structure validation. Protein Science 27: 293-315.

Chen, V.B., Arendall III, W.B., Headd, J.J., Keedy, D.A., Immormino, R.M., Kapral, G.J., Murray, L.W., Richardson, J.S., Richardson, D.C. (2010). MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallographica D66: 12-21.

Davis, I.W., Leaver-Fay, A., Chen, V.B., Block, J.N., Kapral, G.J., Wang, X., Murray, L.W., Arendall III, W.B., Snoeyink, J., Richardson, J.S., Richardson, D.C. (2007) MolProbity: all-atom contacts and structure validation for proteins and nucleic acids. Nucleic Acids Research 35: Web Server issue, W375-W383.

ROSIE SymmDock

Andre, I., Bradley, P., Wang, C., Baker, D. (2007). Prediction of the structure of symmetrical protein assemblies. Proc Natl Acad Sci USA. 104(45). pg 17656-61. Link: http://www.pnas.org/content/104/45/17656.long [this is the primary citation for the algorithm].

Lyskov, S., Chou, FC., Conchúir, S.Ó., Der, B.S., Drew, K., Kuroda, D., Xu, J., Weitzner, BD., Renfrew, P.D., Sripakdeevong, P., Borgo, B., Havranek, J.J., Kuhlman, B., Kortemme, T., Bonneau, R., Gray, J.J., Das, R. (2013). Serverification of Molecular Modeling Applications: The Rosetta Online Server That Includes Everyone (ROSIE). PLoS One. 8(5):e63906. doi: 10.1371/journal.pone.0063906. Print 2013.

GalaxyRefine

Heo, L., Park, H., Seok, C. (2013). GalaxyRefine: Protein structure refinement driven by side-chain repacking. Nucleic Acids Res. 41. W384-8. doi: 10.1093/nar/gkt458.

GalaxyHomomer

Baek, M., Park, T., Heo, L., Park, C., Seok, C. (2017). GalaxyHomomer: A web server for protein homo-oligomer structure prediction from a monomer sequence or structure. Nucleic Acids Research. DOI: 10.1093/NAR/GKX246

GROMACS

Apol, E., Apostolov, R., Bauer, P., Berendsen, H.J.C., Bjelkmar, P., Blau, C., Bolnykh, V., Boyd, K., van Buuren, A., van Drunen, R., Feenstra, A., Groenhof, G., Hamuraru, A., Hindriksen, V., Irrgang, M.E., Lupinov, A., Junghans, C., Jordan, J., Karkoulis, D., Kasson, P., Kraus, J., Kutzner, C., Larsson, P., Lemkul, J.A., Lindahl, V., Lundborg, M., Marklund, E., Merz, P., Meulenhoff, P., Murtola, T., Pall, S., Pronk, S., Schulz, R., Shirts, M., Shvetsov, A., Sijbers, A., Tieleman, P., Virolainen, T., Wennberg, C., Wolf, M., Zhmurov, A. (2021). GROMACS Documentation. Royal Institute of Technology.

A. Bondi. van der Waals Volumes and Radii. J. Phys. Chem. 68 (1964) pp. 441-451

MARTINIZE

MARTINIZE, script version 2.4: de Jong et al., J. Chem. Theory Comput., 2013, DOI:10.1021/ct300646g

VMD

Humphrey, W., Dalke, A., Schulten, K., (1996). VMD - Visual Molecular Dynamics. J. Molec. Graphics. 14.1. pg 33-38.

ChimeraX

Molecular graphics and analyses performed with UCSF ChimeraX, developed by the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco, with support from National Institutes of Health R01-GM129325 and the Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases.

Pettersen, E.F., Goddard, T.D., Huang, C.C., Meng, E.C., Couch, G.S., Croll, T.I., Morris, J.H., Ferrin, T.E. (2021). UCSF ChimeraX: Structure visualization for researchers, educators, and developers. Protein Sci. 30(1). pg 70-82. doi: 10.1002/pro.3943.

PyMOL

The PyMOL Molecular Graphics System, Version 1.2r3pre, Schrödinger, LLC.

Haddock

Honorato, R.V., Koukos, P.I., Jimenez-Garcia, B., Tsaregorodtsev, A., Verlato, M., Giachetti, A., Rosato, A., Bonvin, A.M.J.J. (2021). Structural biology in the clouds: The WeNMR-EOSC Ecosystem. Frontiers Molecular Biosciences 8. fmolb.2021.729513.

Van Zundert, G.C.P., Rodrigues, J.P.G.L.M., Trellet, M., Schmitz, C., Kastritis, P.L., Karaca, E., Melquiond, A.S.J., Van Dijk, M., De Vries, S.J., Bonvin, A.M.J.J. (2016). The HADDOCK2.2 webserver: User-friendly integrative modeling of biomolecular complexes. Journal of Molecular Biology. 428. pg 720-725.

The FP7 WeNMR (project# 261572), H2020 West-Life (project# 675858), the EOSC-hub (project# 777536) and the EGI-ACE (project# 101017567) European e-Infrastructure projects are acknowledged for the use of their web portals, which make use of the EGI infrastructure with the dedicated support of CESNET-MCC, INFN-PADOVA-STACK, INFN-LNL-2, NCG-INGRID-PT, TW-NCHC, CESGA, IFCA-LCG2, UA-BITP, SURFsara and NIKHEF, and the additional support of the national GRID Initiatives of Belgium, France, Italy, Germany, the Netherlands, Poland, Portugal, Spain, UK, Taiwan and the US Open Science Grid.

PDB ID	Organism	Name	Chains	Number of subunits	Symmetry
4ZJA	Salmonella typhimurium	AgsA	homomeric	24	octahedral (O)
4ZJD			homomeric	18	dihedral (D3)
4ZJ9			homomeric	2	cyclic (C2)
3N3E	Danio rerio	α-crystallin	homomeric	2	cyclic (C2)
4YLB	Sulfolobus solfataricus	Hsp14.1, A102D mutant	homomeric	4	cyclic (C2)
4YLC		Hsp14.1, Del-C4 mutant	homomeric	8	dihedral (D2)
4YL9		Hsp14.1, wildtype	homomeric	4	asymmetric (C1)
3W1Z	Schizosaccharomyces pombe	Hsp16.0	homomeric	16	dihedral (D4)
2BYU	Triticum aestivum	Hsp16.3	homomeric	12	tetrahedral (T)
1SHS	Methanococcus jannaschii	Hsp16.5	homomeric	24	octahedral (O)
4ELD	Methanococcus jannaschii	Hsp16.5, activated variant	homomeric	48	octahedral (O)
1GME	Triticum aestivum	Hsp16.9	heteromeric	12	dihedral (D3)
5DS1	Pisum sativum	Hsp17.7	homomeric	12
5DS2	Pisum sativum	Hsp18.1	homomeric	12
6L6M	Entamoeba histolytica	Hsp18.5	homomeric	4	cyclic (C2)
5NMS	Arabidopsis thaliana	Hsp21	heteromeric	12	dihedral (D3)
7BZW	Arabidopsis thaliana	Hsp21	homomeric	12	tetrahedral (T)
2H50	Saccharomyces cerevisiae	Hsp26	homomeric	24	tetrahedral (T)
	Saccharomyces cerevisiae	Hsp42	homomeric	12	likely dihedral (D3)
	Saccharomyces cerevisiae	Hsp42	homomeric	24	likely dihedral (D3)
6EWN	Thermosynechococcus vulcanus	HspA	homomeric	12
			homomeric	14
			homomeric	24
5ZS6	Mycobacterium marinum	MMAR sHSP	heteromeric	24	tetrahedral (T)
5ZS3			heteromeric	24	tetrahedral (T)
5ZUL			homomeric	12	tetrahedral (T)
4YDZ	Caenorhabditis elegans	Sip1	homomeric	32	dihedral (D8)
4YEO	Caenorhabditis elegans	Sip1	homomeric	2	cyclic (C2)
3GT6	Xanthomonas axonopodis	SpA	homomeric	36	dihedral
2BOL	Taenia saginata	Tsp36	homomeric	2	cyclic (C2)
5JZN	Xylella fastidiosa	XfsHSP17.9	homomeric		mesh with square holes
