Contribution

A guide to free online predictive protein modeling software

iTasser

A forewarning about using predictive protein modeling software:
1) Today's protein modeling software is a very powerful tool, but it has its limitations. Without understanding those limitations, the data produced can look incredibly promising while actually being meaningless.
2) Some online protein modeling software, including the tools explained later on this page, uses an online database to determine 3-D structure. This can be a limitation, as discussed later under iTasser.
3) Another limitation of the software is its inability to predict complicated amino acid folding, such as fluorophore colour motifs.
Of course, these limitations do not devalue the usefulness of predictive protein modeling. With a good sequence and a bit of luck, protein modeling can be a nice addition to your project!

On how to use iTasser:

iTasser is an online tool that differs from other protein modelling software by allowing you to choose a .pdb file to compare your sequence against.
Most protein modelling software compares the amino acid sequence to a database of proteins to generate a 3-dimensional model. This is generally very useful; however, when trying to ascertain the 3-D structure of a chimeric protein, this method falls short. These programs try to model the entire protein as if it were a single protein from a single organism, which causes problems when modelling chimeric proteins. This created errors in our model: the program recognized the fluorophore amino acid sequence and folded the entire protein using it as a template, even though the vitamin D binding domain folds under different conditions. The result was properly folded fluorophores but an improperly folded vitamin D binding domain.
Therefore, iTasser is useful because you can take the .pdb file from Chimera, or from the software you used to create the chimeric protein, and use it to make further predictive models if you have edited the sequence or if mutations have changed it.
Link to iTasser Webpage. Click Here!

Here's how to use iTasser:

1) Add your amino acid sequence here

2) Add the .pdb file you want to compare against here and click Run.
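Before pasting a sequence into the form, it can help to tidy it and check it for unexpected characters. The snippet below is a minimal sketch of that check, not part of iTasser: the helper names `clean_sequence` and `invalid_residues` are hypothetical, the example sequence is illustrative, and it assumes you intend to submit only the 20 standard one-letter amino acid codes.

```python
# Hypothetical helpers (not an iTasser API): tidy and check a sequence before submission.
# Assumption: only the 20 standard one-letter amino acid codes are expected.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(raw: str) -> str:
    """Drop FASTA header lines and whitespace, and upper-case the residues."""
    lines = [line.strip() for line in raw.splitlines()]
    residues = "".join(line for line in lines if line and not line.startswith(">"))
    return "".join(residues.split()).upper()


def invalid_residues(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position, character) for every non-standard letter."""
    return [
        (position, aa)
        for position, aa in enumerate(sequence, start=1)
        if aa not in STANDARD_AMINO_ACIDS
    ]


# Illustrative input only; replace with your own sequence.
example = ">my_chimera\nMSKGEELFTG\nVVPILVELDG\n"
sequence = clean_sequence(example)
print(sequence, len(sequence), invalid_residues(sequence))
```

An empty list from `invalid_residues` means every character is one of the 20 standard codes; anything else tells you which positions to look at before submitting.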

Phyre2

Phyre2 is a useful protein modeling tool when you need to determine the tertiary structure of a protein. As the successor to the original Phyre project, Phyre2 offers a user-friendly service with both high speed and high quality, as well as many useful settings.

Search Modes

Phyre2 has two search modes: Normal, which allows for quick but rough structure determination, and Intensive, which permits more thorough structure determination at the cost of time.

Descriptive Results

In either search mode, Phyre2 searches over a range of genomes, which guides its decision when outputting the final predicted protein structure. When it has finished, you will be emailed a link to the Results page.

Summary

The first section of the results page appears under a heading titled “Summary”:

Here, you are shown information regarding the top model of your protein sequence’s predicted structure. This includes:

-An image of your sequence’s predicted structure
-A link to the template that the prediction is based on
-Information about the template (including its PDB Entries)
-The confidence in the accuracy of the model as well as template coverage

If you click on the three-dimensional image of your predicted protein structure, you will be prompted to download it in .pdb format.
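Once you have the .pdb file, a quick sanity check is to count the residues it actually contains. The sketch below reads the alpha-carbon (CA) `ATOM` records using the fixed column positions of the PDB format. It assumes the downloaded file uses standard `ATOM` records; the function names and the demo coordinates are made up for illustration, and `make_atom_line` exists only to build the demo input.

```python
# A minimal sketch, assuming standard fixed-column PDB ATOM records.
# Function names and demo coordinates are hypothetical, for illustration only.

def read_ca_atoms(pdb_text: str) -> list[dict]:
    """Collect one entry per alpha-carbon (CA) ATOM record."""
    atoms = []
    for line in pdb_text.splitlines():
        # Atom name is in columns 13-16 (0-based slice 12:16).
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            atoms.append({
                "res_name": line[17:20].strip(),   # columns 18-20
                "chain": line[21],                 # column 22
                "res_seq": int(line[22:26]),       # columns 23-26
                "xyz": (float(line[30:38]),        # columns 31-38
                        float(line[38:46]),        # columns 39-46
                        float(line[46:54])),       # columns 47-54
            })
    return atoms


def make_atom_line(serial, name, res_name, chain, res_seq, x, y, z):
    """Build a demo ATOM line with the fields in their fixed columns."""
    return (f"ATOM  {serial:>5}  {name:<3} {res_name:>3} {chain}{res_seq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Made-up coordinates standing in for a downloaded model.
demo_pdb = "\n".join([
    make_atom_line(1, "N", "MET", "A", 1, 27.340, 24.430, 2.614),
    make_atom_line(2, "CA", "MET", "A", 1, 26.266, 25.413, 2.842),
    make_atom_line(9, "CA", "SER", "A", 2, 24.805, 27.722, 5.529),
])
ca_atoms = read_ca_atoms(demo_pdb)
print(len(ca_atoms), "residues, numbered",
      ca_atoms[0]["res_seq"], "to", ca_atoms[-1]["res_seq"])
```

For a real file you would pass the file's contents, for example `read_ca_atoms(open("model.pdb").read())`, where `model.pdb` is a placeholder name. Comparing the residue count with the length of the sequence you submitted shows how much of it the model covers.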

Sequence Analysis

Below the summary tab is the sequence analysis tab, which is a short tab that looks like:

If you click on the View PSI-Blast Pseudo-Multiple Sequence Alignment button, a new window will pop up:

This window shows the amino acid alignments of your reference sequence to the top matching sequences determined by Phyre2.
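To make it concrete what such an alignment lets you compare, the sketch below computes percent identity for one pair of aligned sequences. It uses one common convention (identical residues divided by the number of columns where neither sequence has a gap); this is an assumption for illustration and not necessarily how Phyre2 calculates the values it reports. The function name and the two sequences are hypothetical.

```python
# A minimal sketch; the convention used here is an assumption, not Phyre2's method.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over columns with no gap ('-') in either row."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only the columns where both sequences have a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Illustrative aligned pair: one gap column, two mismatches, eight matches.
query = "MSKGEEL-FTG"
match = "MSRGEELAFSG"
print(percent_identity(query, match))  # 8 identical out of 10 compared columns
```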

Secondary Structure and Disorder Prediction

Moving down to the next tab, you will see secondary structure and disorder prediction. Clicking on Show will give you:

Here you are presented with the amino acid sequence of your predicted structure color coded by amino acid classification:
A,S,T,G,P - small/polar - orange
M,I,L,V - hydrophobic - green
K,R,E,N,D,H,Q - charged - red
W,Y,F,C - aromatic + cysteine - purple
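The colour key above can be written as a small lookup table, which is handy if you want to summarise the composition of your own sequence outside the results page. This is a sketch that only restates the four classes listed above; the names `RESIDUE_CLASSES`, `LOOKUP` and `classify` are hypothetical, and the example sequence is illustrative.

```python
from collections import Counter

# The four classes and colours exactly as listed in the key above.
RESIDUE_CLASSES = {
    "small/polar": ("ASTGP", "orange"),
    "hydrophobic": ("MILV", "green"),
    "charged": ("KRENDHQ", "red"),
    "aromatic + cysteine": ("WYFC", "purple"),
}

# Invert the table: one-letter code -> (class name, colour).
LOOKUP = {
    aa: (name, colour)
    for name, (letters, colour) in RESIDUE_CLASSES.items()
    for aa in letters
}


def classify(sequence: str) -> list[tuple[str, str]]:
    """Return (class, colour) per residue; unknown letters are marked unclassified."""
    return [LOOKUP.get(aa, ("unclassified", "none")) for aa in sequence.upper()]


# Illustrative sequence only.
composition = Counter(name for name, _ in classify("MSKGEELFTG"))
print(composition)
```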

Under the amino acid sequence, you are shown the predicted secondary structure, with arrows denoting beta sheets and coils denoting alpha helices. Each region of secondary structure prediction is accompanied by a confidence value, represented by rainbow (ROYGBIV) coloration, with red indicating high confidence and violet indicating low confidence. This page also indicates regions of disordered amino acids that are predicted not to mediate protein folding. These indications are also accompanied by confidence values. Lastly, the bottom of the page shows the percentages of the amino acids in your sequence that contribute to disordered, beta sheet, and alpha helical folding.
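The percentages at the bottom of that page are simple per-residue fractions. The sketch below shows the arithmetic on a hand-written example. The per-residue strings are an assumed encoding chosen for illustration ("H" for alpha helix, "E" for beta strand, "D" for disordered, "-" for anything else); they are not a Phyre2 output format, and the function name is hypothetical.

```python
# A minimal sketch of the arithmetic only. The string encoding is an assumption:
# "H" = alpha helix, "E" = beta strand, "D" = disordered, "-" = none of these.

def state_percentages(states: str, symbols: str) -> dict[str, float]:
    """Percentage of residues assigned to each symbol in `symbols`."""
    total = len(states)
    if total == 0:
        return {symbol: 0.0 for symbol in symbols}
    return {symbol: 100.0 * states.count(symbol) / total for symbol in symbols}


# Hand-written 20-residue example: 6 helix, 4 strand, 5 disordered residues.
secondary = "--HHHHHH--EEEE------"
disorder = "DDD---------------DD"

print(state_percentages(secondary, "HE"))
print(state_percentages(disorder, "D"))
```

Secondary structure and disorder are kept as separate strings because a residue can be both disordered and assigned a secondary structure state, so the percentages need not add up to 100.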

Domain Analysis

Following the secondary structure and disorder prediction tab is the domain analysis tab, which looks like:

This tab gives a rough display of the region of alignment that your template has with its top 20 matches. Hovering over one of these regions and clicking will take you down to the next tab.

Detailed Template Information

After clicking on any search sequence from the domain analysis tab you will be scrolled down to see:

Here, you are shown information regarding the template that you selected, although in total, results are shown for the top 100 matches. For each match, you are able to investigate the Phyre2 sequence data as well as the PDB entries. If you click on the Alignment button under the Alignment Coverage tab, a new window will pop up, which displays:

Here you are shown a more detailed alignment between the reference sequence and the template sequence, which includes secondary structure comparisons.

Binding Site Prediction

At the bottom of the page is a tab called binding site prediction, which looks like:

If you click on HERE, you will be taken to another webpage that looks like:

On this webpage, you can paste the same sequence that you entered for your Phyre2 modeling, or upload the .pdb file that was generated by Phyre2. 3DLigandSite will then superimpose your predicted structure onto a library of known ligand-bound structures in order to determine potential binding sites of your predicted structure.

Overall

Phyre2 is a powerful tool that can be used to predict the folding of an unknown protein from only its sequence. Since its library of protein folding information is ever-growing, it is a tool that becomes more relevant and useful for structure determination over time. Phyre2 Webpage

References:

1) A Roy, A Kucukural, Y Zhang. I-TASSER: a unified platform for automated protein structure and function prediction. Nature Protocols, 5: 725-738 (2010)
2) J Yang, R Yan, A Roy, D Xu, J Poisson, Y Zhang. The I-TASSER Suite: Protein structure and function prediction. Nature Methods, 12: 7-8 (2015)
3) J Yang, Y Zhang. I-TASSER server: new development for protein structure and function predictions. Nucleic Acids Research, 43: W174-W181 (2015)
4) LA Kelley et al. The Phyre2 web portal for protein modeling, prediction and analysis. Nature Protocols, 10: 845-858 (2015)
5) MN Wass, LA Kelley, MJ Sternberg. 3DLigandSite: predicting ligand-binding sites using similar structures. Nucleic Acids Research, 38: W469-W473 (2010)

Team:Northern BC/Contribution