On Linux (Debian- or Ubuntu-based distributions, which use apt-get), you can install the latest version of Java by running the following commands in a terminal:
sudo apt-get update
sudo apt-get install default-jre
sudo apt-get install default-jdk
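On Mac, you can install the latest version of Java with Homebrew by running the following commands in a terminal (recent Homebrew releases have replaced "brew cask install" with "brew install --cask"):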
brew update
brew cask install java
On Linux, you can install the latest version of g++ by running the following commands in a terminal:
sudo apt-get update
sudo apt-get install build-essential
On Mac, run the following command in a terminal. If g++ is missing, macOS prompts you to install the command line developer tools, which include it:
g++
If g++ is already installed, the terminal prints an error message containing "no input files".
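Before continuing, it can help to confirm that both prerequisites are actually on your PATH. The following is a minimal sketch; check_tool is a hypothetical helper name introduced here for illustration and is not part of ECPred.

```shell
# Report whether a command is available on PATH.
# check_tool is a hypothetical helper, not something shipped with ECPred.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found" >&2
        return 1
    fi
}

# Usage:
#   check_tool java && java -version
#   check_tool g++  && g++ --version
```

If either check reports "not found", repeat the corresponding installation step above.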
ECPred.tar.gz
The above file (around 3 GB) should be downloaded from:
Extract the files using:
tar -xvf ECPred.tar.gz
After extraction, the total size of the folder will be around 10 GB.
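Since the extracted folder needs around 10 GB, you may want to check the free space on the target disk before running tar. This is a rough sketch using the POSIX output format of df; has_free_gb is a hypothetical helper name, and the threshold you pass is your own choice.

```shell
# Succeed if the filesystem holding DIR has at least NEED_GB gigabytes free.
# has_free_gb is a hypothetical helper, not part of ECPred.
has_free_gb() {
    need_gb=$1
    dir=${2:-.}
    # df -P prints one line per filesystem; -k reports sizes in 1024-byte blocks.
    # Column 4 of the second line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}

# Usage:
#   has_free_gb 10 . && tar -xvf ECPred.tar.gz
```

If you keep the 3 GB archive after extraction, allow for that space as well.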
Run runLinux.sh or runMac.sh from the terminal, depending on your OS, using one of these commands:
./runLinux.sh
or
./runMac.sh
These bash scripts will install the necessary libraries and tools.
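If you automate the installation, the choice between the two scripts can be made from the output of uname -s, which prints "Linux" on Linux and "Darwin" on macOS. A minimal sketch, with pick_run_script as a hypothetical helper name:

```shell
# Print the setup script matching the given OS name.
# With no argument, the OS name is taken from `uname -s`.
# pick_run_script is a hypothetical helper, not part of ECPred.
pick_run_script() {
    os=${1:-$(uname -s)}
    case "$os" in
        Linux)  echo "./runLinux.sh" ;;
        Darwin) echo "./runMac.sh" ;;
        *)      echo "unsupported OS: $os" >&2; return 1 ;;
    esac
}

# Usage (from inside the extracted ECPred folder):
#   script=$(pick_run_script) && "$script"
```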
Change into the ECPred installation folder (cd) and run ECPred with the following syntax:
java -jar ECPred.jar method inputFile libraryDir tempDir outputFile
method argument can be one of the following: blast, spmap, pepstats, weighted.
inputFile argument is the file that contains the protein sequences in FASTA format.
libraryDir is the path to the directory where the "lib" folder is located.
tempDir is the path to the directory where the temporary files are stored. You may wish to delete these files after you complete your prediction runs.
outputFile argument is optional. If you don't specify the output file name, the results will be printed to standard output.
Sample run
java -jar ECPred.jar weighted sample.fasta /full/path/to/ECPred/ temp/ results.tsv
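A small wrapper can catch a mistyped method name or a missing input file before Java starts. The sketch below assumes it is run from the ECPred installation folder; valid_method and run_ecpred are hypothetical helper names, and the list of methods is taken from the argument description above.

```shell
# Succeed only for the four method names listed above.
# valid_method and run_ecpred are hypothetical helpers, not part of ECPred.
valid_method() {
    case "$1" in
        blast|spmap|pepstats|weighted) return 0 ;;
        *) echo "unknown method: $1 (expected blast, spmap, pepstats or weighted)" >&2
           return 1 ;;
    esac
}

# Check the method and input file, then pass all arguments through unchanged.
run_ecpred() {
    valid_method "$1" || return 1
    [ -r "$2" ] || { echo "cannot read input file: $2" >&2; return 1; }
    java -jar ECPred.jar "$@"
}

# Usage:
#   run_ecpred weighted sample.fasta /full/path/to/ECPred/ temp/ results.tsv
```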
There is no limit on the number of protein sequences; however, predicting a single protein takes about one minute on average on an Intel 2.70 GHz i7 processor.
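To get a rough idea of how long a run will take, you can count the sequences in the input file: each FASTA record starts with a header line beginning with ">". The sketch below uses count_sequences, a hypothetical helper name, and relies only on the one-minute-per-protein average quoted above, so the result is an estimate for comparable hardware, not a guarantee.

```shell
# Print the number of FASTA records (header lines starting with '>') in a file.
# count_sequences is a hypothetical helper, not part of ECPred.
count_sequences() {
    grep -c '^>' "$1"
}

# Usage:
#   n=$(count_sequences sample.fasta)
#   echo "$n sequences, roughly $n minutes at about one minute per protein"
```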
"ECNumberList.txt":
A text file containing the list of EC numbers that ECPred can predict.
"sample.fasta":
An example input fasta file.
"results.tsv":
An example output prediction file (for sample.fasta).
ECPred: a tool for the prediction of enzymatic properties of protein sequences based on the EC Nomenclature Copyright (C) 2018 CanSyL
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.
If you find ECPred useful, please consider citing our publication:
Dalkiran, A., Rifaioglu, A. S., Martin, M. J., Cetin-Atalay, R., Atalay, V., & Doğan, T. (2018). ECPred: a tool for the prediction of the enzymatic functions of protein sequences based on the EC nomenclature. BMC Bioinformatics, 19(1), 334. https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-018-2368-y