In the ab initio modeling section, we saw that nature uses only a relatively small number of the possible polypeptide chain conformations, namely those that are energetically favorable. A protein chain folds either by condensation around a nucleation site or through intermediate stages rich in secondary structure. Searching through all possible conformations is therefore wasteful.
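To give a feel for why exhaustive search is wasteful, the following minimal sketch counts conformations under two loudly simplified assumptions that do not come from the text: each residue can adopt only 3 backbone states, and a hypothetical search could test 10^13 conformations per second. The function names are illustrative only.

```python
SECONDS_PER_YEAR = 3600 * 24 * 365


def conformation_count(n_residues, states_per_residue=3):
    """Number of chain conformations if every residue independently
    picks one of `states_per_residue` states (an assumed toy model)."""
    return states_per_residue ** n_residues


def exhaustive_search_years(n_residues, states_per_residue=3,
                            samples_per_second=1e13):
    """Years needed to visit every conformation once at an assumed,
    hypothetical sampling rate."""
    seconds = conformation_count(n_residues, states_per_residue) / samples_per_second
    return seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    n = 100  # a small protein
    print(f"{conformation_count(n):.2e} conformations")
    print(f"{exhaustive_search_years(n):.2e} years to enumerate them")
```

Even for a 100-residue chain this toy model gives roughly 5 x 10^47 conformations, which is why methods that restrict the search to known folds are attractive.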
Fold recognition methods detect folds that can be used for structural modeling even when homology at the sequence level is weak or undetectable. The principle of fold recognition is to identify folds that are compatible with a given query sequence; that is, instead of the sequence being used to predict a fold, known folds are fitted to the sequence. This involves:
- searching for known folds
- scoring folds
- identifying candidates that best fit the sequence
- aligning the query and the best-scoring proteins
Once a suitable template has been identified, the rest of the process is the same as in comparative modeling. Fold recognition methods are based on both sequence similarity searches and structural information.
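The four steps listed above can be sketched as a toy threading routine. This is only an illustration under stated assumptions, not how any particular fold recognition program works: the fold library, the two-state buried/exposed environments, the hydrophobic residue class, the +1/-1 fit score, and the flat gap penalty are all hypothetical choices made for the example.

```python
# Toy fold recognition (threading) sketch. All data and scores are illustrative.

HYDROPHOBIC = set("AVLIMFWC")  # assumption: crude hydrophobic residue class

# Hypothetical fold library: each known fold is reduced to a string of
# per-position environments, 'B' = buried, 'E' = exposed.
FOLD_LIBRARY = {
    "fold_alpha": "BEEBBEEBBEEB",
    "fold_beta": "BEBEBEBEBEBE",
    "fold_coil": "EEEEEEBEEEEE",
}

GAP_PENALTY = -2  # assumption: flat cost per unaligned residue or position


def fit(residue, environment):
    """Score one residue in one environment: +1 if hydrophobic-in-buried
    or polar-in-exposed, otherwise -1."""
    prefers_buried = residue in HYDROPHOBIC
    return 1 if prefers_buried == (environment == "B") else -1


def thread(query, fold):
    """Globally align a query sequence to a fold's environment string.

    Returns (best_score, aligned_pairs), where aligned_pairs is a list of
    (query_index, fold_index) tuples.
    """
    n, m = len(query), len(fold)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * GAP_PENALTY
    for j in range(1, m + 1):
        score[0][j] = j * GAP_PENALTY
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + fit(query[i - 1], fold[j - 1]),
                score[i - 1][j] + GAP_PENALTY,  # query residue left unaligned
                score[i][j - 1] + GAP_PENALTY,  # fold position left unaligned
            )
    # Trace back from the corner to recover which positions were paired.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if score[i][j] == score[i - 1][j - 1] + fit(query[i - 1], fold[j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + GAP_PENALTY:
            i -= 1
        else:
            j -= 1
    return score[n][m], pairs[::-1]


def recognize_fold(query, library=FOLD_LIBRARY):
    """Search the library, score every fold, and rank candidates best first.

    Each result is (fold_name, score, aligned_pairs).
    """
    results = [(name, *thread(query, fold)) for name, fold in library.items()]
    return sorted(results, key=lambda r: r[1], reverse=True)


if __name__ == "__main__":
    for name, score, pairs in recognize_fold("LKKLLKKLLKKL"):
        print(name, score, len(pairs))
```

In this sketch the top-ranked entry plays the role of the template, and its aligned pairs are what a comparative modeling step would use to copy coordinates. Real methods replace the toy fit score with statistically derived potentials or profile comparisons.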
Summary of Structure Prediction
| Method | Knowledge | Approach | Difficulty | Usefulness |
| --- | --- | --- | --- | --- |
| Secondary structure prediction | Sequence-structure statistics | Ignores the 3D arrangement; predicts only where helices (H) and strands (E) occur | Medium | Useful |
| Homology modeling | Proteins of known structure | Identify related structures with sequence methods, copy 3D coordinates and adjust | Easy | Very useful; if sequence identity is greater than 40%, suitable for drug design |
| Ab initio | Energy functions, statistics | Simulate folding or generate many candidate structures | Very difficult | Not yet useful |
| Fold recognition | Proteins of known structure | Identify folds, compare sequence, copy 3D coordinates and adjust | Medium | Limited, depending on the quality of the models |