Simple SHELL script for parsing BLAST output
1. Parsing the sequence names from the BLAST output
"grep" is a powerful Unix command for retrieving the lines of a file that match a particular pattern.
Syntax:
grep "pattern" filename
Example: grep ">" Blast_output.txt
In this example, grep retrieves the lines that contain the ">" symbol. In a BLAST output file every sequence name starts with ">", so the command returns all the sequence names in the file.
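As a small sketch of the same idea, the helper below lists the hit names without the leading ">" and counts them. The function name `blast_hit_names` and the sample report are hypothetical, and the sketch assumes the classic plain-text BLAST report in which each hit line begins with ">". The pattern is anchored with "^" so that a ">" appearing elsewhere in a line is not picked up.

```shell
#!/bin/sh
# Hypothetical helper: list the hit names in a plain-text BLAST report,
# dropping the leading ">" and any spaces after it.
blast_hit_names() {
    grep '^>' "$1" | sed 's/^> *//'
}

# Made-up report used only for illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Query= test_query
>gi|111|ref|NP_000001.1| hypothetical protein A
 Score = 100 bits (250), Expect = 1e-20
>gi|222|ref|NP_000002.1| hypothetical protein B
 Score = 90 bits (220), Expect = 1e-15
EOF

blast_hit_names "$sample"    # one hit name per line
grep -c '^>' "$sample"       # number of hit lines
rm -f "$sample"
```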
2. Parsing the sequence names and the sequences from the BLAST output
"egrep" is a powerful command for retrieving the lines of a file that match any one of several patterns.
Syntax:
egrep "pattern1|pattern2|pattern3" filename
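In an extended regular expression, any spaces written around "|" become part of the alternatives. The sketch below illustrates this on three made-up lines; it uses `grep -E`, the POSIX spelling of egrep, and the variable names are only for illustration.

```shell
#!/bin/sh
# Three made-up lines in the style of a BLAST report.
lines=$(printf '%s\n' '>hit1 protein' 'Query: 1 MKV 3' 'Sbjct: 1 MKV 3')

# No spaces around "|": matches lines containing ">" or "Sbjct".
matched=$(printf '%s\n' "$lines" | grep -E '>|Sbjct')
printf '%s\n' "$matched"

# With spaces, the alternatives are "> " and " Sbjct" (spaces included),
# so none of the sample lines match; "|| true" absorbs grep's exit status 1.
spaced=$(printf '%s\n' "$lines" | grep -E '> | Sbjct' || true)
printf 'lines matched with spaces: [%s]\n' "$spaced"
```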
Example:
Below is a combination of a shell command and a Perl script for parsing the BLAST output.
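egrep ">|Sbjct" Blast_output.txt | sed 's/Sbjct://' > output.txt

# Perl: print name lines unchanged and only the sequence from Sbjct lines
open(FH, "<", "output.txt") or die "Cannot open output.txt: $!";
while ($ln = <FH>)
{
if ($ln !~ m/>/)
{
# after "Sbjct:" is removed the fields are: start, sequence, end
@temp = split(' ', $ln);
print "$temp[1]\n";
}
else
{
print $ln;
}
}
close(FH);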
In the above example, egrep retrieves the lines that match ">" or "Sbjct", sed removes the "Sbjct:" label, and the result is stored in output.txt. The Perl script then prints the sequence names as they are and extracts the sequence from each Sbjct line.
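The same result can probably be obtained without the intermediate file or the Perl step. The following is a minimal sketch using awk, not the method described above; the function name `blast_names_and_seqs` is hypothetical, and it assumes each subject line has the layout "Sbjct: start sequence end" separated by whitespace, so that the sequence is the third field.

```shell
#!/bin/sh
# Hypothetical helper: print hit names and their aligned subject sequences
# from a plain-text BLAST report, using awk instead of Perl.
blast_names_and_seqs() {
    awk '
        /^>/     { print; next }   # hit name line: print unchanged
        /^Sbjct/ { print $3 }      # "Sbjct: start sequence end": print sequence
    ' "$1"
}

# Made-up report used only for illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
>gi|111|ref|NP_000001.1| hypothetical protein A
Query: 1 MKVLA 5
Sbjct: 1 MKVIA 5
EOF

blast_names_and_seqs "$sample"
rm -f "$sample"
```

A sequence that spans several alignment blocks is printed as one line per block, just as in the Perl version.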
Thursday, March 27, 2008
SHELL Scripts for Simple Bioinformatics Analysis