Saturday, August 9, 2008

Bioinformatics CPAN Modules

Bioperl is the product of a community effort
to produce Perl code which is useful in biology.

Bioperl Tutorial...


BAMBE: Bayesian Analysis in Molecular
Biology and Evolution.

Perl extension to perform Hidden Markov
Model calculations.

Perl extension for searching in DNA and
Protein sequences.

Write EMBOSS programs in Perl. This module
allows Perl programmers to access functions
of the EMBOSS (European Molecular Biology
Open Software Suite) package.

Transform an NCBI nr databank into FASTA with
annotated headers.

A simple interface for byte indexing a
WU-BLAST multi-part report for
faster access. This module was written
to aid access to specific reports within longer,
multi-part WU-BLAST alignment reports.

Bioperl class for FASTA Sequence database search.

Extract peptide sequences from MEDLINE article abstracts.

ONTO-PERL: a collection of Perl modules for dealing
with the Cell Cycle Ontology (CCO) and, in general,
with OBO ontologies (like the Gene Ontology).

A light-weight parsing module for handling FASTA-formatted
sequences within larger Perl applications.

Database object interface to SwissProt retrieval.

Construct a phylogenetic tree using distance-based methods.

Testing compatibility of phylogenetic trees with
nested taxa.

Implementation of sequence with residue quality
and trace values.

A Perl module for creating and manipulating DNA
Microarray experiment objects.

Converts UniProt native text format (.dat or .seq)
into a FASTA file, reporting varsplic, signal, peptide, PTM, conflicts.

Reads an input FASTA file and produces a shuffled
databank that avoids known cleaved peptides: sequences
are shuffled without producing known tryptic peptides.

Bioperl BLAST sequence analysis object.

Object for the calculation of a multiple sequence
alignment from a set of unaligned sequences or
alignments using the Clustalw program.

Interface for handling web queries and data retrieval
from Entrez Utilities at NCBI.

Object for the calculation of a multiple sequence
alignment from a set of unaligned sequences or
alignments using the TCoffee program.

Object for the calculation of an iterative multiple
sequence alignment from a set of unaligned sequences
or alignments using the Amap (2.0) program.

Create input for and work with the output from the program

Object for identifying low complexity regions in a given
protein sequence.

Wrapper for RepeatMasker Program.

Object for predicting 'pseudogenes' in a given sequence
given a protein and a cDNA sequence.

Wrapper for aligning two sequences using promoterwise.

Searches a DNA database for matches with a set of STS
primers (EMBOSS).

Wrapper around GOR4 protein secondary structure
prediction server.

Wrapper around HNN protein secondary structure prediction.

Wrapper around Sopma protein secondary structure prediction.

Wrapper for the PHYLIP program neighbor, which creates a
phylogenetic tree (through either Neighbor-Joining or UPGMA)
from protein distances based on amino acid substitution rates.

Calculate Protein Alignment statistics (mostly distances).

Finding repeats in protein sequences.

Wrapper around the Scansite server.

Object holding alternative alphabet coding for one
protein sequence.

Convert an input mRNA/cDNA sequence into protein.

Protein Data Bank file format reader/writer.

This module is a wrapper around the ELM server,
which predicts short functional motifs in amino acid sequences.

Wrapper around Mitoprot server.

Object for predicting genes in a given sequence given a protein.

Object for identifying transmembrane helices in a given
protein sequence.

A feature representing an arbitrarily complex structure
of a gene.

Parses a gene annotation file.

Provides an API for retrieving data from a Gene Ontology OBO file.

Compares data from serial analysis of gene expression
(SAGE) libraries.
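
Several of the modules listed above read or write FASTA-formatted sequences. To give a rough idea of what a light-weight FASTA parser does, here is a minimal sketch in Python. It is not the code of any module in this list: the function name `parse_fasta` and the example records are hypothetical, and real parsers handle many more edge cases (comments, duplicate headers, very large files).

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each FASTA header (without the leading '>') to its full sequence."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A '>' line starts a new record; the rest of the line is the header.
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            # Sequence lines are concatenated until the next header.
            records[header] += line
    return records


# Hypothetical two-record example, made up for illustration only.
example = """>seq1 hypothetical example record
ATGGCC
TTAA
>seq2
GGGTTT
"""

records = parse_fasta(example.splitlines())
for name, seq in records.items():
    print(name, len(seq))
```

The parser keeps everything in a dictionary, which is fine for small files; for large databanks one would normally yield records one at a time instead.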

Friday, August 8, 2008

What is Perl?


* Perl is a stable, cross platform programming language.
* Perl is commonly expanded as "Practical Extraction and Report Language",
although the name is not officially an acronym.
* It is used for mission-critical projects in the public and private
sectors.
* Perl is Open Source software, licensed under its Artistic
License or the GNU General Public License (GPL).
* Perl was created by Larry Wall.
* Perl 1.0 was released to Usenet's alt.comp.sources in 1987.
* PC Magazine named Perl a finalist for its 1998 Technical
Excellence Award in the Development Tool category.
* Perl is listed in the Oxford English Dictionary.

Supported Operating Systems:

* Unix systems
* Macintosh - (OS 7-9 and X) see The MacPerl Pages.
* Windows - see ActiveState Tools Corp.
* And many more...

Best Features of Perl:

* Perl takes the best features from other languages, such as C, awk,
sed, sh, and BASIC, among others.
* Perl's database integration interface (DBI) supports third-party databases including Oracle, Sybase, PostgreSQL, MySQL, and others.
* Perl works with HTML, XML, and other mark-up languages.
* Perl supports Unicode.
* Perl is Y2K compliant.
* Perl supports both procedural and object-oriented programming.
* Perl interfaces with external C/C++ libraries through XS or SWIG.
* Perl is extensible. There are over 500 third-party modules available
from the Comprehensive Perl Archive Network (CPAN).
* The Perl interpreter can be embedded into
other systems.

Perl and the Web

* Perl has long been one of the most popular web programming languages,
thanks to its text manipulation capabilities and rapid development cycle.
* Perl is widely known as "the duct tape of the Internet".
* A module in Perl's standard distribution makes
handling HTML forms simple.
* Perl can handle encrypted Web data, including e-commerce transactions.
* Perl can be embedded into web servers to speed up processing by as
much as 2000%.
* mod_perl allows the Apache web server to embed a Perl interpreter.
* Perl's DBI package makes web-database integration easy.


Wednesday, August 6, 2008

Bioinformatics Definition / Bioinformatics Definitions / What is Bioinformatics?

Bioinformatics is a tool for solving biological problems using existing data.

Bioinformatics is a method for predicting biological outcomes from existing experimental results.

Bioinformatics = Biology + Informatics + Statistics + (Biochemistry + Biophysics).

Bioinformatics gives biologists a way to store all their data.

Bioinformatics makes some lab experiments easier by predicting their outcomes.

Sometimes bioinformatics suggests a starting point for a lab experiment based on existing results.

Bioinformatics helps researchers form an idea of a lab experiment before they start it.

Sunday, June 29, 2008

ScalaBLAST - For High End genome analysis

Analyzing a whole genome is still a time-consuming process. A new computational tool developed at the Department of Energy's Pacific Northwest National Laboratory is speeding up our understanding of the machinery of life – bringing us one step closer to curing diseases, finding safer ways to clean the environment and protecting the country against biological threats.

ScalaBLAST is a sophisticated "sequence alignment tool" that can divide the work of analyzing biological data into manageable fragments so large data sets can run on many processors simultaneously. The technology means large-scale problems – such as the analysis of an organism – can be solved in minutes, rather than weeks.
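
The general idea of dividing a workload into fragments and processing them in parallel can be sketched in a few lines of Python. This is only an illustration of the divide-and-distribute pattern, not ScalaBLAST's actual method: the function names, the query strings, and the toy scoring function (a simple count of matching positions, not a BLAST alignment) are all assumptions made up for the example, and threads stand in for the many processors of a real cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def split_into_chunks(items: List[str], n_chunks: int) -> List[List[str]]:
    """Split items into n_chunks contiguous fragments of nearly equal size."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks get one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def toy_score(query: str, target: str = "ATGGCCTTAA") -> int:
    """Toy stand-in for an alignment: count positions where query equals target."""
    return sum(1 for a, b in zip(query, target) if a == b)


def score_chunk(chunk: List[str]) -> List[int]:
    """Process one fragment of the workload independently of the others."""
    return [toy_score(q) for q in chunk]


# Hypothetical query sequences, made up for illustration only.
queries = ["ATGGCC", "TTTTTT", "ATGGCCTTAA", "GGGG", "ATGA"]
chunks = split_into_chunks(queries, 3)

# Each worker handles one fragment; map() returns results in chunk order.
with ThreadPoolExecutor(max_workers=3) as pool:
    per_chunk = list(pool.map(score_chunk, chunks))

scores = [s for part in per_chunk for s in part]
print(scores)
```

Because each fragment is scored independently, the results can simply be concatenated in order at the end, which is what makes this kind of problem easy to spread across many workers.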

Using ScalaBLAST, researchers can manage the large influx of data resulting from new questions that arise during human genome research. Prior to this new tool, it took researchers 10 days to analyze one organism. Now, researchers can analyze 13 organisms within nine hours, making the time-to-solution hundreds of times faster.
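
As a quick sanity check on the "hundreds of times faster" figure, the numbers quoted above can be turned into a rough speedup estimate. This small Python calculation assumes the organisms are of comparable size and that the old 10-day figure applies to each of them; the variable names are just for illustration.

```python
HOURS_PER_DAY = 24

# Before: about 10 days of analysis per organism.
before_hours_per_organism = 10 * HOURS_PER_DAY

# After: 13 organisms analyzed in 9 hours.
after_hours_per_organism = 9 / 13

speedup = before_hours_per_organism / after_hours_per_organism
print(f"approximate speedup: {speedup:.0f}x")
```

Under these assumptions the speedup comes out at roughly 350 times, which is consistent with the claim.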

PyroBayes - Analyze 500,000 Sequences in 10 Mins.

The Human Genome Project and its annotation took more than a decade to complete. Boston College biologist Gabor Marth and his research team have developed software that can analyze half a million DNA sequences in 10 minutes.
The Marth laboratory's proprietary PyroBayes software is one of a new breed of computer programs able to accurately process the mountains of genome data flowing from the latest generation of gene decoding machines, said Marth, an associate professor of biology. These machines have placed a premium on computational speed and accuracy in the data-crunching fields known as bioinformatics and high-throughput biology.

