Tutorial

How to use the Web Server
The web server provides all of the options available in the standalone version. The instructions for using the web server are provided below.
Application Page
This page allows users to submit protein sequences in FASTA format. The MP3 software analyzes these sequences and segregates them into pathogenic and non-pathogenic sequences. Users can submit either protein sequences from a completed genome, which are assumed to be complete, or translated metagenomic ORFs, which may be complete or partial. The parameters that can be specified are highlighted in the following figure.
Sample file: The sample file can be uploaded using the 'Upload Sample File' option. For the genomic dataset, the sample file contains 100 complete proteins in FASTA format. For the metagenomic dataset, it contains 100 partial proteins in FASTA format.
Results Page Link
After the file has been uploaded successfully to the server, a link to retrieve the results will appear as shown below. For a large input that may take a while to process, users can save the link and access their results later.
Results Page
The results page provides a summary of the analysis and links to download the result files, as shown below.
HOW TO USE THE STANDALONE PROGRAM
How to install:
The installation instructions are provided on the Download page.
Command line usage for running MP3
./mp3 <infile> <dataset> <minimum length of protein sequences> <threshold>

infile: Input file containing protein sequences in FASTA format.

dataset: Specify 1 or 2. "1" refers to genomic proteins and "2" refers to metagenomic proteins.

minimum length of protein sequences: Please specify the estimated minimum length of protein sequences in the input file.

threshold: Please specify the threshold at which the results will be classified as pathogenic or non-pathogenic.
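
When MP3 has to be run on many files, the four positional arguments described above can be assembled in a script. The following is a minimal Python sketch and is not part of MP3 itself. The helper names `build_mp3_command` and `run_mp3`, the file name `proteins.fasta`, and the values 30 and 0.2 are placeholders chosen for illustration, not recommended settings. It assumes the `mp3` executable is in the current directory, as in the usage line.

```python
import subprocess


def build_mp3_command(infile, dataset, min_length, threshold, executable="./mp3"):
    """Return the argument list for:
    ./mp3 <infile> <dataset> <minimum length of protein sequences> <threshold>
    """
    # The tutorial allows only 1 (genomic) or 2 (metagenomic).
    if dataset not in (1, 2):
        raise ValueError("dataset must be 1 (genomic) or 2 (metagenomic)")
    # Assumption: the minimum length is a positive whole number of residues.
    if int(min_length) < 1:
        raise ValueError("minimum length must be a positive integer")
    return [executable, infile, str(dataset), str(int(min_length)), str(threshold)]


def run_mp3(command):
    """Run MP3 and raise an error if it exits with a non-zero status."""
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    command = build_mp3_command("proteins.fasta", 2, 30, 0.2)
    print(" ".join(command))
    # run_mp3(command)  # uncomment once MP3 is installed
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with unusual file names.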

FILE FORMATS AND DESCRIPTION
Input file format: The sequences should be in FASTA format.
>r1|info1|info2
SQKLILDKLSFSVPKNSITSILAPNGSGKTTLLKCLLGLLKPLEETEIKACNKDILPLKPYEKAKLI
The ‘info’ fields in the above example stand for any information the user wishes to include in the annotation line. If the sequences are not in FASTA format or contain any character other than the usual amino acid letter codes, the file will not be accepted and an error message will appear.

NOTE: THE UPLOADED FILE NAME SHOULDN'T HAVE ANY BLANK SPACES (e.g. "sample file" is not allowed)
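
The input rules above can be checked locally before a file is uploaded or passed to the standalone program. The following Python sketch is an illustration only and does not reproduce the checks MP3 performs. It assumes that the "usual amino acid letter codes" are the 20 standard one-letter codes and that lower-case letters are acceptable; MP3 may be stricter or more lenient. The function names are hypothetical. As a by-product, it reports the shortest sequence length, which may help when choosing the minimum length argument.

```python
import os

# Assumption: only the 20 standard one-letter amino acid codes are allowed.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def check_file_name(path):
    """Raise ValueError if the file name contains a blank space."""
    name = os.path.basename(path)
    if " " in name:
        raise ValueError(f"file name {name!r} contains a blank space")
    return True


def check_fasta_text(text):
    """Return (number of sequences, shortest sequence length).

    Raises ValueError if the text is not FASTA or contains unexpected characters.
    """
    lengths = []
    current = None  # length of the record being read; None before the first header
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
            continue
        if current is None:
            raise ValueError(f"line {line_no}: sequence data before any '>' header")
        # Assumption: letters are compared case-insensitively.
        bad = set(line.upper()) - STANDARD_AMINO_ACIDS
        if bad:
            raise ValueError(f"line {line_no}: unexpected characters {sorted(bad)}")
        current += len(line)
    if current is not None:
        lengths.append(current)
    if not lengths:
        raise ValueError("no FASTA records found")
    return len(lengths), min(lengths)


if __name__ == "__main__":
    example = (
        ">r1|info1|info2\n"
        "SQKLILDKLSFSVPKNSITSILAPNGSGKTTLLKCLLGLLKPLEETEIKACNKDILPLKPYEKAKLI\n"
    )
    count, shortest = check_fasta_text(example)
    print(f"{count} sequence(s), shortest length {shortest}")
```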
OUTPUT FILES

Descriptions of the output files.

1. *.HMM.result: This file provides the list of all proteins which are classified as Pathogenic or Nonpathogenic by the HMM module.

2. *.HMM.summary: This file provides the summary of the HMM module results.

3. *.Hybrid.result: This file provides the list of all proteins which are classified as Pathogenic or Nonpathogenic by the Hybrid module. It also provides the SVM prediction value, the type of domain present in the protein and the final assignment by the Hybrid module.

4. *.Hybrid.summary: This file provides the summary of the Hybrid module results.

5. *.SVM.result: This file provides the list of all proteins which are classified as Pathogenic or Nonpathogenic by the SVM module along with the SVM prediction value.

6. *.SVM.summary: This file provides the summary of the SVM module results.
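
When the standalone program is run from a script, it can be useful to confirm that all six files listed above were produced. The following Python sketch assumes that the `*` in the file names stands for a common prefix and that the files are written to a known directory; the tutorial does not state what the prefix is or where the files are written, so both must be checked against an actual run. The helper names are hypothetical.

```python
import os

MODULES = ("HMM", "Hybrid", "SVM")
KINDS = ("result", "summary")


def expected_output_files(prefix):
    """Return the six output file names for a given prefix (assumed naming)."""
    return [f"{prefix}.{module}.{kind}" for module in MODULES for kind in KINDS]


def missing_output_files(prefix, directory="."):
    """Return the expected output files that are not present in the directory."""
    return [
        name
        for name in expected_output_files(prefix)
        if not os.path.isfile(os.path.join(directory, name))
    ]


if __name__ == "__main__":
    # "sample" is a placeholder prefix for illustration.
    for name in missing_output_files("sample"):
        print(f"missing: {name}")
```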