SpliceRover is a tool for predicting donor and acceptor splice sites in both human and Arabidopsis. By making use of convolutional neural networks, we achieve state-of-the-art accuracy.
All implementation details can be found in our publication.
The input for this tool must be in FASTA format. A minimum sequence length of 398 is required for the human models, and 402 for the Arabidopsis models. The maximum length is limited to 30000.
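The input requirements above can be checked before submitting sequences. The following is a minimal sketch, not the tool's own code: the names `parse_fasta`, `check_length`, `MIN_LENGTH`, and `MAX_LENGTH` are hypothetical, and it assumes the maximum of 30000 is inclusive.

```python
# Length limits taken from the text above; the dictionary keys are illustrative.
MIN_LENGTH = {"human": 398, "arabidopsis": 402}
MAX_LENGTH = 30000  # assumed to be an inclusive upper bound


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one first.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def check_length(sequence, organism):
    """Return True if the sequence length is within the stated limits."""
    return MIN_LENGTH[organism] <= len(sequence) <= MAX_LENGTH
```

For example, `check_length(seq, "human")` can be applied to every sequence yielded by `parse_fasta` to filter out records that are too short or too long.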
All four available models are trained on datasets described in the aforementioned publication.
The human donor and acceptor models are trained on the GWH dataset as described in Sonnenburg et al. (2007), with a pos:neg subsampling of 1:10. The sequence length is 398.
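To illustrate what a 1:10 pos:neg subsampling means, here is a minimal sketch that keeps all positive examples and randomly draws ten negatives per positive. This is an assumption about the general technique, not the exact procedure from the publication; the function name `subsample_negatives` and its parameters are hypothetical.

```python
import random


def subsample_negatives(positives, negatives, ratio=10, seed=0):
    """Randomly pick up to `ratio` negatives per positive example.

    Assumption: uniform sampling without replacement; the publication
    may use a different scheme.
    """
    rng = random.Random(seed)
    # Never request more negatives than are available.
    target = min(len(negatives), ratio * len(positives))
    return rng.sample(negatives, target)
```

With 5 positives and 1000 negatives, this returns 50 negatives, giving the 1:10 ratio.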
The Arabidopsis donor and acceptor models are trained on the Arabidopsis dataset as described in Degroeve et al. (2005), retaining the original pos:neg ratio from the publication. The sequence length is 402.
TISRover is a tool for predicting translation initiation sites in human. By making use of convolutional neural networks, we achieve state-of-the-art accuracy.
All implementation details can be found in our publication.
Predictions are made for sequences of length 203, which is therefore the minimum length required. The maximum length is limited to 30000.
The model is trained on the CCDS dataset (excluding chromosome 21), as described in the aforementioned publication. It predicts the probability for a positive classification of an ATG triplet, based on 60 nucleotides upstream and 140 nucleotides downstream.
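The window size follows from the context described above: 60 upstream nucleotides, the 3-nucleotide ATG itself, and 140 downstream nucleotides give 60 + 3 + 140 = 203. The sketch below shows how such windows could be extracted from a longer sequence. It is an illustration only: `atg_windows` is a hypothetical helper, and skipping ATG triplets that lack full context on either side is an assumption, not documented tool behavior.

```python
UPSTREAM = 60
DOWNSTREAM = 140
WINDOW = UPSTREAM + 3 + DOWNSTREAM  # 203, the stated sequence length


def atg_windows(sequence):
    """Yield (position, window) for each ATG with full surrounding context.

    Assumption: ATG triplets too close to either end are skipped.
    """
    seq = sequence.upper()
    pos = seq.find("ATG")
    while pos != -1:
        start = pos - UPSTREAM
        end = pos + 3 + DOWNSTREAM
        if start >= 0 and end <= len(seq):
            yield pos, seq[start:end]
        # Continue searching one position further to catch every ATG.
        pos = seq.find("ATG", pos + 1)
```

Each yielded window has length 203 and contains the candidate ATG at offsets 60 to 62.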