Download

The Woods package is available at the link below.
Package          Platform
woods.tar.bz2    Linux or Unix
We recommend a system with at least 48 GB of RAM and a 2 GHz processor for analyzing large metagenomic datasets.
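On Linux, the memory recommendation can be compared against the machine before a large run. The sketch below is only illustrative: `meminfo_gb`, `meets_recommendation`, and `RECOMMENDED_GB` are hypothetical names that are not part of Woods, and it assumes a `/proc/meminfo` file is available. The kernel reports slightly less memory than is physically installed, so a result just under 48 is not necessarily a problem.

```shell
#!/bin/sh
# Sketch: compare installed RAM with the 48 GB recommendation (Linux only).
# meminfo_gb, meets_recommendation and RECOMMENDED_GB are hypothetical names,
# not part of the Woods package.

RECOMMENDED_GB=48

# Print total RAM in whole GiB, read from a /proc/meminfo-style file.
meminfo_gb() {
    # MemTotal is reported in kB; 1 GiB = 1048576 kB. The value is truncated.
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1048576 }' "${1:-/proc/meminfo}"
}

# Return 0 if the reported RAM reaches the recommendation, 1 otherwise.
meets_recommendation() {
    [ "$(meminfo_gb "${1:-/proc/meminfo}")" -ge "$RECOMMENDED_GB" ]
}
```

Usage: `meets_recommendation || echo "Less than 48 GB RAM reported; large datasets may not fit in memory."`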

Woods development
The algorithm was written in Perl (version 5.10.1). Random Forest (RF) is implemented with an R package (http://cran.r-project.org/). RAPSearch2 (http://omics.informatics.indiana.edu/mg/RAPSearch2/) is used as the similarity-search algorithm in Woods. The software was packaged with the PAR Packager (PAR-Packer-1.002), and executables were generated for Linux.
How to install
We have pre-compiled the executable for Linux operating systems.
After downloading, uncompress the package:
tar -xjvf woods.tar.bz2

A directory named Woods will be created, containing the sample file and executables listed below.
Woods/Databases
Woods/RF.Rdata
Woods/random.sh
Woods/woods
Woods/test.fasta
Woods/dataprep.sh
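After extraction, the directory contents can be checked against the files listed above. This is a minimal sketch: `check_woods_dir` is a hypothetical helper that is not shipped with Woods, and it only tests that each entry exists, not that it is intact.

```shell
#!/bin/sh
# Sketch: confirm that the extracted Woods directory holds the listed entries.
# check_woods_dir is a hypothetical helper, not part of the Woods package.

check_woods_dir() {
    dir="${1:-Woods}"
    missing=0
    for name in Databases RF.Rdata random.sh woods test.fasta dataprep.sh; do
        # -e accepts both files and directories.
        if [ ! -e "$dir/$name" ]; then
            echo "missing: $dir/$name" >&2
            missing=1
        fi
    done
    # 0 when everything is present, 1 when at least one entry is missing.
    return "$missing"
}
```

Usage: `check_woods_dir Woods && echo "Woods directory looks complete"`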

Other downloads

1. Download RAPSearch2 from the link below and copy the executables into the Woods directory.

( http://sourceforge.net/projects/rapsearch2/files/ )

2. Download R from the link below, choosing the version that matches your system:

( http://cran.r-project.org/ )

OR install it directly by typing:

For CentOS:

yum install R    # R installation requires super-user privileges

OR

For Debian:

apt-get update

apt-get install r-base r-base-dev

3. Install the randomForest package by starting R and typing:

R

install.packages('randomForest')

(choose an appropriate CRAN mirror when prompted)

q()

4. Download the README file
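The randomForest installation in step 3 can also be run without an interactive R session, which avoids the mirror prompt. The sketch below assumes `Rscript` is on the PATH (it is installed together with R) and that the R library directory is writable by the current user. `install_randomforest`, `check_randomforest`, and the `CRAN_MIRROR` variable are hypothetical names introduced here for illustration; the default mirror URL is the CRAN cloud mirror and can be replaced with any other.

```shell
#!/bin/sh
# Sketch: non-interactive installation of the randomForest R package.
# install_randomforest, check_randomforest and CRAN_MIRROR are hypothetical
# names, not part of Woods or R.

install_randomforest() {
    # Passing repos explicitly means R does not ask for a mirror.
    mirror="${CRAN_MIRROR:-https://cloud.r-project.org}"
    Rscript -e "install.packages(\"randomForest\", repos = \"$mirror\")"
}

# Return 0 if the package loads, non-zero otherwise.
check_randomforest() {
    Rscript -e 'library(randomForest)' >/dev/null 2>&1
}
```

Usage: `install_randomforest && check_randomforest && echo "randomForest is ready"`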

Optional download

A model with 100 trees is also available and can be downloaded from the following link.

RF_t100.tar.bz2

After downloading, uncompress the package by typing:
tar -xjvf RF_t100.tar.bz2

Move the uncompressed files into the Woods directory.

Command line usage for running Woods
sh dataprep.sh (Note: Run this command only once for database preparation)
./woods <infile> <dataset> <RF model> <method>

infile: Input file containing protein sequences in FASTA format.

dataset: Specify 1 or 2. "1" refers to genomic proteins and "2" refers to metagenomic proteins.

RF model: Specify 100 to use the RF model with 100 trees. The default model has 500 trees.

method: Specify the method of prediction. Use 'r' for the Random Forest model only; Woods is the default.
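Before a long run, it can help to check the first two arguments described above. The sketch below is a hypothetical pre-flight check (`validate_woods_input` is not part of Woods): it confirms only that the input file is readable, that it contains at least one FASTA header line, and that the dataset value is 1 or 2. It does not verify that the sequences are proteins.

```shell
#!/bin/sh
# Sketch: pre-flight check of the <infile> and <dataset> arguments.
# validate_woods_input is a hypothetical helper, not part of Woods.

validate_woods_input() {
    infile="$1"
    dataset="$2"
    if [ ! -r "$infile" ]; then
        echo "cannot read input file: $infile" >&2
        return 1
    fi
    # FASTA header lines begin with '>'.
    if ! grep -q '^>' "$infile"; then
        echo "no FASTA header found in: $infile" >&2
        return 1
    fi
    case "$dataset" in
        1|2) ;;
        *) echo "dataset must be 1 (genomic) or 2 (metagenomic)" >&2
           return 1 ;;
    esac
    return 0
}
```

Usage, with all four arguments given explicitly (metagenomic proteins, 100-tree model, Random Forest only): `validate_woods_input test.fasta 2 && ./woods test.fasta 2 100 r`. How the defaults are selected on the command line is not stated here, so check the README for that.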