Protein level predictions 2 - Sequence alignment methods
This subset contains tools for multiple sequence alignment (MSA) of proteins. Important applications of sequence alignments include secondary and tertiary structure prediction and functional prediction, all of which belong to the set of prediction tools we are interested in. These tools depend on accurate MSAs. For a perspective on recent trends in MSA methods, see the article by Kemena and Notredame.
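To make the notion of alignment quality concrete, the following minimal sketch scores a toy MSA with a sum-of-pairs measure. The function name `sum_of_pairs`, the match/mismatch/gap values and the example sequences are illustrative assumptions, not taken from any of the tools listed here; real tools use substitution matrices and affine gap penalties.

```python
from itertools import combinations


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-2):
    """Score an MSA by summing a toy pairwise score over every column."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned rows must have the same length")
    total = 0
    for column in zip(*alignment):
        # Compare every pair of rows within this column.
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue  # gap against gap carries no information
            if a == "-" or b == "-":
                total += gap
            elif a == b:
                total += match
            else:
                total += mismatch
    return total


# Hypothetical aligned protein fragments ('-' marks a gap).
toy_msa = [
    "MKT-AY",
    "MKTLAY",
    "MRT-AF",
]
print(sum_of_pairs(toy_msa))  # 3
```

A higher score means more agreement between rows under this toy scheme; it is only a rough stand-in for the objective functions that the listed programs optimise.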
*Categorisation according to Kemena and Notredame into C (consistency-based) and T (template-based) methods.
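The idea behind consistency-based (C) methods can be sketched as follows: a pairing of two residues gains weight whenever a residue in a third sequence is paired with both of them. The code below is a simplified illustration in the spirit of the library extension used by tools such as T-Coffee, not their actual implementation. The function `extend_library`, the residue labels and the weights are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def _key(x, y):
    """Order-independent key for a residue pair."""
    return tuple(sorted((x, y)))


def extend_library(primary):
    """Add triplet support to pairwise residue weights.

    `primary` maps ((seq, pos), (seq, pos)) -> weight, where the two
    residues come from different sequences (an assumption of this sketch).
    """
    # Symmetric neighbour map: residue -> {paired residue: weight}.
    neighbours = defaultdict(dict)
    for (x, y), w in primary.items():
        neighbours[x][y] = w
        neighbours[y][x] = w

    extended = defaultdict(int)
    for (x, y), w in primary.items():
        extended[_key(x, y)] += w

    # A residue z paired with both x and y supports the pairing x~y
    # with the weaker of its two weights.
    for z, partners in neighbours.items():
        for (x, wx), (y, wy) in combinations(partners.items(), 2):
            if x[0] != y[0]:  # x and y must lie in different sequences
                extended[_key(x, y)] += min(wx, wy)
    return dict(extended)


# Hypothetical primary weights: residue 0 of A is paired with two
# competing residues of B, but only one pairing is supported via C.
primary = {
    (("A", 0), ("B", 0)): 50,
    (("A", 0), ("B", 1)): 60,
    (("A", 0), ("C", 0)): 90,
    (("B", 0), ("C", 0)): 80,
}
extended = extend_library(primary)
print(extended[(("A", 0), ("B", 0))])  # 130
print(extended[(("A", 0), ("B", 1))])  # 60
```

In this toy case the pairing of A0 with B1 starts with the higher weight, but after extension A0 with B0 wins because sequence C agrees with it.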
Reference: Wallace et al. M-Coffee: combining multiple sequence alignment methods with T-Coffee. Nucleic Acids Res., 2006, 34, 6, 1692-1699. doi:10.1093/nar/gkl091
References: Katoh et al. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res., 2002, 30, 14, 3059-3066. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=12136088
Katoh et al. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res., 2005, 33, 2, 511-518. doi:10.1093/nar/gki198
"MISTRAL (Multiple STRuctural ALignment) is a novel strategy for multiple protein alignment based on the minimization of an energy function over the low-dimensional space of the relative rotations and translations of the molecules. The energy minimization avoids combinatorial searches and returns pairwise alignment scores for which a reliable a priori statistical significance can be given" (citation from Micheletti and Orland).
Reference: Micheletti and Orland. MISTRAL: a tool for energy-based multiple structural alignment of proteins. Bioinformatics, 2009, 25, 20, 2663-2669. doi:10.1093/bioinformatics/btp506
Reference: Do et al. ProbCons: Probabilistic consistency-based multiple sequence alignment. Genome Res., 2005, 15, 2, 330-340. doi:10.1101/gr.2821705
Reference: Pei et al. PROMALS web server for accurate multiple protein sequence alignments. Nucleic Acids Res., 2007, 35, Web Server issue, W649-W652. doi:10.1093/nar/gkm227
Reference: Larkin et al. Clustal W and Clustal X version 2.0. Bioinformatics, 2007, 23, 21, 2947-2948. doi:10.1093/bioinformatics/btm404
Reference: Edgar. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics, 2004, 5, 113. doi:10.1186/1471-2105-5-113
- PDB (Protein Data Bank, a database currently containing around 60 000 protein structures)
- Kemena and Notredame. Upcoming challenges for multiple sequence alignment methods in the high-throughput era. Bioinformatics, 2009, 25, 19, 2455-2465. doi:10.1093/bioinformatics/btp452
- Thusberg and Vihinen. Pathogenic or Not? And If So, Then How? Studying the Effects of Missense Mutations Using Bioinformatics Methods. Hum. Mutat., 2009, 30, 5, 703-714. doi:10.1002/humu.20938