Tools predicting problems in mRNA splicing
Pre-messenger RNA (pre-mRNA) splicing defects are likely to have an impact on clinical practice, as they seem to play a role in almost all known diseases with a genetic aetiology (Tazi et al. 2009). Approximately 15% of pathogenic mutations are thought to cause disease through the defect they introduce in the splicing mechanism (Baralle et al. 2009). It is therefore important to have software tools that can predict whether a given variation would interfere with mRNA splicing. This site collects such tools, based mainly on the review by Baralle et al. (2009). It must be noted, however, that these programs are not yet considered reliable, and any result obtained by in silico methods should therefore be confirmed by wet lab experiments. This section also contains tools and articles associated with other RNA-level consequences of mutations/SNPs.
An extensive overview of mRNA splicing tools can be found in the review article by Baralle et al. (2009).
The program categorisation used here follows the one in the review article.
- MaxEntScan (MES)
- Analyzer Splice Tool (AST)
- Automated Splice-Site Analyses (ASSA)
- Human Splicing Finder (HSF, formerly known as SSF)
These methods tend to be more accurate than programs that evaluate splicing regulatory elements (SREs), because the donor and acceptor elements are more conserved than SREs (Baralle et al. 2009).
MES cannot scan a sequence to locate the splice site itself. The user has to indicate the exon-intron junction in the short nucleotide sequence being tested, so knowledge of the putative splice site is a prerequisite when working with MES. However, the algorithm can also be used through Alamut, which circumvents this problem.
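Because the junction has to be supplied by the user, it can help to cut the test windows out of a longer sequence programmatically. The sketch below assumes the window sizes commonly used with MaxEntScan (a 9-mer of 3 exonic plus 6 intronic bases for a donor site, and a 23-mer of 20 intronic plus 3 exonic bases for an acceptor site). The function names and the 0-based coordinate convention are hypothetical and chosen only for illustration; the windows would still need to be scored with MES itself.

```python
def donor_window(seq: str, first_intron_base: int) -> str:
    """Return the 9-mer around a 5' (donor) junction: 3 exonic + 6 intronic bases.

    first_intron_base is the 0-based index of the first intronic nucleotide
    (an assumed convention, not something MES prescribes).
    """
    start, stop = first_intron_base - 3, first_intron_base + 6
    if start < 0 or stop > len(seq):
        raise ValueError("sequence too short around the donor junction")
    return seq[start:stop].upper()


def acceptor_window(seq: str, first_exon_base: int) -> str:
    """Return the 23-mer around a 3' (acceptor) junction: 20 intronic + 3 exonic bases.

    first_exon_base is the 0-based index of the first exonic nucleotide
    (again an assumed convention).
    """
    start, stop = first_exon_base - 20, first_exon_base + 3
    if start < 0 or stop > len(seq):
        raise ValueError("sequence too short around the acceptor junction")
    return seq[start:stop].upper()


if __name__ == "__main__":
    # Toy sequence: exon "AAACAG" followed by an intron starting with "GTAAGT".
    toy = "AAACAGGTAAGTCCC"
    print(donor_window(toy, 6))
```

A variant can then be tested by building the wild-type and mutant windows with the same coordinates and comparing the two scores that MES returns.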
Reference: Yeo et al. Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals. J. Comput. Biol., 2004, 11, 2-3, 377-394. doi:10.1089/1066527041410418
Reference: Rogozin I.B. and Milanesi L. Analysis of donor splice signals in different organisms. J. Mol. Evol., 1997, 45, 50-59.
Reference: Dogan et al. SplicePort: An Interactive Splice-Site Analysis Tool. Nucleic Acids Research, 2007. doi:10.1093/nar/gkm407
Reference: Brunak et al. Prediction of Human mRNA Donor and Acceptor Sites from the DNA Sequence. J. Mol. Biol., 1991, 220, 49-65. doi:10.1016/0022-2836(91)90380-O
Reference: Desmet et al. Human Splicing Finder: an online bioinformatics tool to predict splicing signals. Nucleic Acids Res., 2009, 37, 9, e67. doi:10.1093/nar/gkp215
Also available through Alamut.
Reference: Reese et al. Improved Splice Site Detection in Genie. J. Comput. Biol., 1997, 4, 3, 311-323.
ESEfinder (Cartegni et al. 2003) is a web-based resource that facilitates rapid analysis of exon sequences to identify putative exonic splicing enhancers (ESEs) and to predict whether exonic mutations disrupt such elements. It implements motif-scoring matrices. A possible drawback of this program is that it searches only for ESE motifs corresponding to four SR proteins. Sequences matching the RNA-binding specificities of other SR proteins, or of other types of proteins, will therefore be missed by the program.
Reference: Cartegni et al. ESEfinder: A web resource to identify exonic splicing enhancers. Nucleic Acids Res., 2003, 31, 13, 3568-3571. doi:10.1093/nar/gkg616
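The idea of scanning a sequence with a motif-scoring matrix can be sketched as follows. This is a minimal illustration only: the 4-position matrix and the threshold are invented for the example and are not the ESEfinder matrices or cut-offs, and the function name is hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical 4-position scoring matrix (one dict per motif position).
# These numbers are made up for illustration; they are NOT ESEfinder values.
TOY_MATRIX: List[Dict[str, float]] = [
    {"A": 1.0, "C": -1.0, "G": 0.5, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -0.5, "T": -1.0},
    {"A": 0.5, "C": -1.0, "G": 1.0, "T": -1.0},
    {"A": 1.0, "C": -0.5, "G": -1.0, "T": -1.0},
]


def scan_motif(seq: str, matrix: List[Dict[str, float]],
               threshold: float) -> List[Tuple[int, str, float]]:
    """Slide the matrix along seq and return (position, window, score) hits.

    A window's score is the sum of the per-position scores of its bases;
    bases missing from the matrix (e.g. N) contribute 0.
    """
    seq = seq.upper()
    width = len(matrix)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        score = sum(col.get(base, 0.0) for col, base in zip(matrix, window))
        if score >= threshold:
            hits.append((i, window, score))
    return hits


if __name__ == "__main__":
    wild_type = "TTACGATT"
    mutant = "TTATGATT"  # C>T change inside the toy motif
    print(scan_motif(wild_type, TOY_MATRIX, threshold=3.0))
    print(scan_motif(mutant, TOY_MATRIX, threshold=3.0))
```

Comparing the hit lists for the wild-type and mutant sequences shows whether a substitution pushes a motif below the threshold, which is the kind of question the text describes for exonic mutations.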
RESCUE-ESE predicts candidate ESE motifs from a statistical analysis of differences in hexamer frequencies between exons and introns, and between exons with weak and strong splice sites.
Reference: Fairbrother et al. RESCUE-ESE identifies candidate exonic splicing enhancers in vertebrate exons. Nucleic Acids Res., 2004, 32, Web Server issue, W187-W190. doi:10.1093/nar/gkh393
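The exon-versus-intron part of this comparison can be sketched in a few lines. The example below is a deliberately simplified assumption: it only ranks hexamers by the raw difference in relative frequency and does not reproduce the statistical testing of the published method or the weak-versus-strong splice site comparison. The function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def kmer_frequencies(seqs: Iterable[str], k: int = 6) -> Dict[str, float]:
    """Relative frequency of every overlapping k-mer across a set of sequences."""
    counts: Counter = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def exon_enriched_kmers(exons: Iterable[str], introns: Iterable[str],
                        k: int = 6) -> List[Tuple[str, float]]:
    """Rank k-mers by (exon frequency - intron frequency), keeping positive values."""
    exon_freq = kmer_frequencies(exons, k)
    intron_freq = kmer_frequencies(introns, k)
    diffs = ((kmer, f - intron_freq.get(kmer, 0.0)) for kmer, f in exon_freq.items())
    # Sort by decreasing difference, then alphabetically for a stable order.
    return sorted((d for d in diffs if d[1] > 0), key=lambda d: (-d[1], d[0]))


if __name__ == "__main__":
    # Toy input; real analyses would use large exon and intron datasets.
    for kmer, diff in exon_enriched_kmers(["GAAGAAGAAG"], ["TTTTTTTTTT"]):
        print(kmer, round(diff, 3))
```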
website: Splicing Rainbow
website: Splice Signal Analysis
Sroogle is a webserver for visualization of splicing signals. It provides a graphic display of splicing-related data on DNA segments, including splice site scores based on different metrics, mapping of putative exonic and intronic splicing regulatory sequences (SRSs), data on SRSs that would arise as a result of point mutations, and percentile scores comparing your target exon to precompiled datasets of constitutive and alternative exons.
As input, the server requests an exon along with the two introns flanking it. The server accepts either consecutive stretches of DNA, or stretches of DNA separated by spaces and numbers, as obtained from the UCSC browser.
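For use with tools that only accept a consecutive stretch of DNA, a sequence copied with spaces and position numbers can be normalised first. The helper below is a small sketch with a hypothetical name; it assumes the input contains only nucleotide letters, digits and whitespace, and it leaves letter case untouched.

```python
import re


def clean_sequence(raw: str) -> str:
    """Remove digits and whitespace from a copied DNA sequence.

    Letter case is preserved. Anything other than A, C, G, T or N
    (in either case) after cleaning is reported as an error.
    """
    cleaned = re.sub(r"[\d\s]+", "", raw)
    if not re.fullmatch(r"[ACGTNacgtn]*", cleaned):
        raise ValueError("unexpected characters in sequence")
    return cleaned


if __name__ == "__main__":
    copied = "1 acgtacgtac gtacgtacgt\n21 ACGT"
    print(clean_sequence(copied))
```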
Reference: Schwartz et al. SROOGLE: webserver for integrative, user-friendly visualization of splicing signals. Nucleic Acids Res. 2009 Jul 1;37(Web Server issue):W189-92. Epub 2009 May 8. doi: 10.1093/nar/gkp320
Download: mfold version 3.4
Reference: Zuker M. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res., 2003, 31, 13, 3406-3415. doi:10.1093/nar/gkg595
pFOLD is an RNA secondary structure prediction program that uses stochastic context-free grammars. It takes an alignment of RNA sequences as input and predicts a common structure for all sequences.
Reference: Knudsen et al. Pfold: RNA secondary structure prediction using stochastic context-free grammars. Nucleic Acids Res., 2003, 31, 13, 3423-3428.
For some observed associations, the disease phenotype is caused by a structural rearrangement in a regulatory region of the RNA transcript. UTR and SNP combinations identified by the method described in the article are postulated to constitute a “RiboSNitch”, that is, a regulatory RNA in which a specific SNP has a structural consequence that results in a disease phenotype. The SNPfold algorithm can help identify RiboSNitches by leveraging GWAS data and an analysis of the mRNA structural ensemble.
Reference: Halvorsen et al. Disease-Associated Mutations That Alter the RNA Structural Ensemble. PLoS Genet., 2010, 6, 8, e1001074. doi:10.1371/journal.pgen.1001074
PupaSuite finds all the SNPs mapping in locations that might cause a loss of functionality in the genes.
Reference: Conde et al. PupaSuite: finding functional SNPs for large-scale genotyping purposes. Nucleic Acids Res., 2006, 34, W621-W625. doi:10.1093/nar/gkl071
Databases are listed extensively in the review article by Baralle et al. (2009). The categorisation of databases follows the one used in that article. A few databases not mentioned in the review have been added.
Splicing mutations database
- Human Gene Mutation Database (HGMD)
- DBASS3 and DBASS5
- Alternative Splicing Mutation Database (ASMD)
- list of specialised databases
Alternative splicing database
- ASTD (Alternative Splicing and Transcript Diversity, successor of ASD and ATD)
- Alternative Splicing Prediction Database (ASPicDB)
- Alternative Splicing database
- MAASE (Manually Annotated Alternatively Spliced Events)
- ASDB (database of alternatively spliced genes)
- ASG (Alternative Splicing Gallery)
- Baralle et al. Missed threads. The impact of pre-mRNA splicing defects on clinical practice. EMBO reports, 2009, 10, 8, 810–816. doi:10.1038/embor.2009.170
- Baralle et al. Splicing in action: assessing disease causing sequence changes. J. Med. Genet., 2005, 42, 10, 737-748. doi:10.1136/jmg.2004.029538
- Hartmann et al. Diagnosis of pathogenic splicing mutations: does bioinformatics cover all the bases? Front. Biosci., 2008, 13, 3252-3272. doi:10.2741/2924
- Hellen, E. Splice Site analysis tools. NGRL report, 2010 (PDF).
- Houdayer et al. Evaluation of in silico splice tools for decision-making in molecular diagnosis. Hum. Mutat., 2008, 29, 7, 975-982. doi:10.1002/humu.20765