Installing and executing stand-alone BLAST software on Linux.
Stand-alone BLAST is a local installation of the NCBI BLAST suite of programs. NCBI provides binaries for various platforms. The programs are the same as the NCBI BLAST programs available online, except that they run on the local machine.
The local version is useful when there is a large set of sequences to BLAST: it is not affected by Internet speed or traffic, and it can be automated.
Stand-alone BLAST can be downloaded from the NCBI FTP site. The link is in the toolbar at the bottom of the NCBI main page (“FTP Site -> Blast -> executables -> Latest”).
The file should be transferred in binary mode. Filenames are of the following form:
Program-version-architecture-os.extension
Remember to choose the appropriate architecture (32 bit or 64 bit). Download the file and extract the contents of the gzip'ed tar archive. The ‘.gz’ file extension indicates that the file has been compressed with gzip (a standard Unix compression utility); the ‘.tar’ extension indicates that the file is a tape archive created with tar (a standard Unix archiving tool).
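As an illustration of the filename form Program-version-architecture-os.extension, the following sketch splits an archive name into its parts with shell parameter expansion and prints the machine architecture reported by uname -m. The function name parse_blast_archive is made up for this example, and the sketch assumes the extension is .tar.gz.

```shell
# Split a name of the form program-version-architecture-os.tar.gz into its parts.
# parse_blast_archive is a hypothetical helper, not part of the BLAST package.
parse_blast_archive() {
    name=${1##*/}          # drop any leading directory
    base=${name%.tar.gz}   # drop the .tar.gz extension (assumed)
    program=${base%%-*}    # text before the first '-'
    rest=${base#*-}
    version=${rest%%-*}
    rest=${rest#*-}
    arch=${rest%%-*}
    os=${rest#*-}
    printf 'program=%s version=%s arch=%s os=%s\n' "$program" "$version" "$arch" "$os"
}

parse_blast_archive blast-2.2.18-ia32-linux.tar.gz
# prints: program=blast version=2.2.18 arch=ia32 os=linux

# Show the architecture of this machine, to help choose a 32 bit or 64 bit build.
uname -m
```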
To uncompress the archive with ‘gunzip’ and extract the files into the current working directory with ‘tar’, use the commands given below.
jk@jk:~/Desktop$ gunzip blast-2.2.18-ia32-linux.tar.gz # uncompress
jk@jk:~/Desktop$ tar -xpf blast-2.2.18-ia32-linux.tar # extract
For more information on the options, see the manual pages ($ man tar and $ man gunzip).
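With GNU tar, uncompressing and extracting can also be done in one step, since the -z option passes the archive through gzip. The sketch below wraps this in a small helper. The name extract_archive is hypothetical, and the -z and -C options are assumed to be supported by the installed tar.

```shell
# Extract a gzip'ed tar archive in one step.
# -x extract, -z filter through gzip, -p preserve permissions,
# -f archive file, -C directory to extract into.
# extract_archive is a hypothetical helper name used only in this example.
extract_archive() {
    archive=$1
    dest=${2:-.}   # default: current working directory
    mkdir -p "$dest" && tar -xzpf "$archive" -C "$dest"
}

# Usage (run in the directory where the archive was downloaded):
#   extract_archive blast-2.2.18-ia32-linux.tar.gz .
```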
Inside the extracted directory there are three subdirectories (bin, data, doc). The doc directory contains the README files for each program. The data directory contains the scoring matrices. The bin directory contains all the executables for running the various BLAST searches.
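To confirm that the extraction produced the expected layout, a short check along these lines can be used. It is only a sketch: check_blast_layout is a hypothetical helper, and the directory passed to it is whatever directory the archive was extracted into.

```shell
# Report whether the bin, data and doc subdirectories exist under a directory.
# Returns 0 if all three are present, 1 otherwise.
check_blast_layout() {
    root=$1
    missing=0
    for sub in bin data doc; do
        if [ -d "$root/$sub" ]; then
            echo "found: $sub"
        else
            echo "missing: $sub"
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   check_blast_layout ~/Desktop/blast-2.2.18
```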
How to execute bl2seq (BLAST 2 Sequences):
Bl2seq compares two sequences using, for example, the blastn or blastp algorithm. For blastn both sequences must be nucleotide, and for blastp both must be protein.
The input files to any BLAST program should always be in FASTA format.
Example:
>gi|229673|pdb|1ALC| Alpha-Lactalbumin
KQFTKCELSQNLYDIDGYGRIALPELICTMFHTSGYDTQAIVENDESTEYGLFQISNALWCKSSQSPQSR
NICDITCDKFLDDDITDDIMCAKKILDIKGIDYWIAHKALCTEKLEQWLCEKE
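In FASTA format each record starts with a header line beginning with '>', followed by one or more lines of sequence. A quick sanity check on an input file can be sketched as follows. The helper names count_fasta_records and is_fasta are invented for this example, and the check only looks at header lines, not at the sequence characters.

```shell
# Count the records in a FASTA file: one '>' header line per record.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Succeed if the first non-empty line of the file is a '>' header line.
is_fasta() {
    first=$(grep -v '^[[:space:]]*$' "$1" | head -n 1)
    case $first in
        '>'*) return 0 ;;
        *)    return 1 ;;
    esac
}

# Usage (seq1.fasta is a placeholder file name):
#   is_fasta seq1.fasta && count_fasta_records seq1.fasta
```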
Syntax:
jk@jk:~/Desktop/blast-2.2.18/bin$ ./bl2seq - # Displays all options
You can choose the options you need. The required options are -p, -i and -j. The other options can be set explicitly, or else the program will use their default values.
jk@jk:~/Desktop/blast-2.2.18/bin$ ./bl2seq -p blastp -e 0.01 -i
-i First sequence [File In]
-j Second sequence [File In]
-p Program name: blastp, blastn, blastx, tblastn, tblastx. For blastx the first sequence should be nucleotide; for tblastn the second sequence should be nucleotide.
-e E-value (optional)
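The options listed above can be put together in a small wrapper. This is a sketch under assumptions: run_bl2seq and the BL2SEQ variable are names invented here, the file names in the usage line are placeholders, and only the -p, -i, -j and -e options described above are used.

```shell
# Path to the bl2seq executable; override BL2SEQ if it lives elsewhere.
BL2SEQ=${BL2SEQ:-./bl2seq}

# run_bl2seq PROGRAM FIRST_FILE SECOND_FILE [EVALUE]
# -p, -i and -j are required; -e is passed only when an E-value is given.
run_bl2seq() {
    if [ "$#" -lt 3 ]; then
        echo "usage: run_bl2seq program first.fasta second.fasta [evalue]" >&2
        return 2
    fi
    program=$1 first=$2 second=$3 evalue=${4:-}
    if [ -n "$evalue" ]; then
        "$BL2SEQ" -p "$program" -i "$first" -j "$second" -e "$evalue"
    else
        "$BL2SEQ" -p "$program" -i "$first" -j "$second"
    fi
}

# Usage (seq1.fasta and seq2.fasta are placeholder protein FASTA files):
#   run_bl2seq blastp seq1.fasta seq2.fasta 0.01
```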
How to execute Blastall:
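Go to NCBI -> FTP site -> RefSeq -> H_sapiens -> H_sapiens -> chr22 to download sequences that can serve as the database.
Note: a sequence file can hold several sequences in FASTA format, one after another, for example: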
>gi|86438068|gb|AAI12638.1| HGD protein [Bos taurus]
MTELKYISGFGNECASEDPRCPGALPEGQNNPQVCPYNLYAEQLSGSAFTCPRSTNKRSWLYRILPSVSH
KPFEFIDQGHITHNWD
>gi|116283875|gb|AAH44758.1| Hgd protein [Mus musculus]
MSVLQRILAVQVPCPKDSWLYRILPSVSHKPFESIDQGHVTHNWDEVGPDPNQLRWKPFEIPKASEKKVD
FVSGLYTLCGAGDIKSNNGLAVHIFLCNSSMENRCFYNSDGDFLIVPQKGKLLIYTEFGKMSLQPNEICV
>gi|116283724|gb|AAH24369.1| Hgd protein [Mus musculus]
MSVLQRILAVQVPCPKDSWLYRILPSVSHKPFESIDQGHVTHNWDEVGPDPNQLRWKPFEIPKASEKKVD
1. Formatting the database with formatdb:
jk@jk:~/Desktop/blast-2.2.18/bin$ ./formatdb - # displays all options
jk@jk:~/Desktop/blast-2.2.18/bin$ ./blast-2.2.18/bin/formatdb -i
-i Input file(s) for formatting (this parameter must be set) [File In]
-p Type of file: T = protein, F = nucleotide (default = T)
-o Parse options: T = True, parse SeqId and create indexes; F = False, do not parse SeqId (default = F)
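A minimal wrapper around the three formatdb options described above might look like this. The names format_database and FORMATDB are invented for the example, and the file name in the usage line is a placeholder. The defaults in the function mirror the formatdb defaults listed above.

```shell
# Path to the formatdb executable; override FORMATDB if it lives elsewhere.
FORMATDB=${FORMATDB:-./formatdb}

# format_database INPUT_FASTA [IS_PROTEIN] [PARSE_IDS]
format_database() {
    input=$1
    is_protein=${2:-T}   # T = protein (default), F = nucleotide
    parse_ids=${3:-F}    # T = parse SeqId and create indexes, F = do not (default)
    "$FORMATDB" -i "$input" -p "$is_protein" -o "$parse_ids"
}

# Usage (mydb.fasta is a placeholder protein multi-FASTA file):
#   format_database mydb.fasta T T
```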
2. Executing Blastall:
jk@jk:~/Desktop/blast-2.2.18/bin$ ./blastall -i
-p Program Name [String]. Input should be one of "blastp", "blastn", "blastx", "tblastn", or "tblastx".
-d Database [String] (default = nr). The database specified must first be formatted with formatdb.
-i Query File [File In]
-o BLAST report Output File [File Out]
The output file will contain the BLAST output for all the input query sequences.
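Because the local programs can be automated, a search over many query files can be scripted. The following is a sketch only: run_blastall, blast_directory and the BLASTALL variable are hypothetical names, the .fasta and .out file extensions are assumptions, and the database is assumed to have been formatted with formatdb beforehand.

```shell
# Path to the blastall executable; override BLASTALL if it lives elsewhere.
BLASTALL=${BLASTALL:-./blastall}

# run_blastall PROGRAM DATABASE QUERY_FILE REPORT_FILE
run_blastall() {
    # $1 program, $2 formatted database, $3 query file, $4 report file
    "$BLASTALL" -p "$1" -d "$2" -i "$3" -o "$4"
}

# blast_directory PROGRAM DATABASE QUERY_DIR
# Runs one search per *.fasta file and writes the report next to it as *.out.
blast_directory() {
    program=$1 database=$2 query_dir=$3
    for query in "$query_dir"/*.fasta; do
        [ -e "$query" ] || continue   # skip if no .fasta file matched
        run_blastall "$program" "$database" "$query" "${query%.fasta}.out"
    done
}

# Usage (mydb.fasta and queries/ are placeholders):
#   blast_directory blastp mydb.fasta queries
```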