The BioPerl documentation for ClustalW is great, but I ran into some problems as a beginner. The following code is taken from the ClustalW module docs and modified to make life easier for beginners.
1. Make sure bioperl-run is installed in addition to BioPerl.
2. Make sure clustalw is installed and executable.
3. Set the path using the following command (assuming that clustalw is installed at /usr/local/bin/clustalw2):
export CLUSTALDIR=/usr/local/bin/clustalw2
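Steps 2 and 3 can be checked from Perl before running the alignment script. The sketch below uses only core modules. It assumes that CLUSTALDIR names a directory and that the executable inside it is called clustalw or clustalw2; both assumptions, and the helper name find_clustalw, are illustrative and not part of BioPerl, so adjust them to your installation.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not part of BioPerl): return the path of the first
# executable file in $dir whose name is one of @names, or nothing if none
# is found. The default names are an assumption about your installation.
sub find_clustalw {
    my ($dir, @names) = @_;
    @names = ('clustalw', 'clustalw2') unless @names;
    return unless defined $dir && -d $dir;
    for my $name (@names) {
        my $path = File::Spec->catfile($dir, $name);
        return $path if -f $path && -x $path;
    }
    return;
}

# Report what was found when the file is run as a script.
unless (caller) {
    my $dir = $ENV{CLUSTALDIR};
    if (!defined $dir) {
        print "CLUSTALDIR is not set\n";
    }
    elsif (my $exe = find_clustalw($dir)) {
        print "Found executable: $exe\n";
    }
    else {
        print "No executable clustalw found in $dir\n";
    }
}
```

If this reports that nothing was found, fix the installation or the path before moving on to the script below.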
#!/usr/bin/perl
use Bio::AlignIO;
use Bio::Root::IO;
use Bio::Seq;
use Bio::SeqIO;
use Bio::SimpleAlign;
use Bio::TreeIO;
# The BEGIN block sets CLUSTALDIR before the Clustalw module below is loaded
BEGIN { $ENV{CLUSTALDIR} = '/usr/local/bin/clustalw2/' }
use Bio::Tools::Run::Alignment::Clustalw;
# Build a clustalw alignment factory
my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM');
my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
# Pass the factory a list of sequences to be aligned.
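my $inputfilename = 'blastdump/input.fasta';
my $aln = $factory->align($inputfilename); # $aln is a SimpleAlign object.
# or read the sequences yourself (here, from the same FASTA file)
# into an array of Bio::Seq objects and pass a reference to it
my $seqio = Bio::SeqIO->new(-file => $inputfilename, -format => 'fasta');
my @seq_array;
while (my $seq = $seqio->next_seq) {
    push @seq_array, $seq;
}
my $seq_array_ref = \@seq_array;
$aln = $factory->align($seq_array_ref);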
# Or one can pass the factory a pair of (sub)alignments
# to be aligned against each other, e.g.:
$aln = $factory->profile_align($aln1,$aln2);
# where $aln1 and $aln2 are Bio::SimpleAlign objects.
# Or one can pass the factory an alignment and one or more unaligned
# sequences to be added to the alignment. For example:
$aln = $factory->profile_align($aln1,$seq); # $seq is a Bio::Seq object.
# Get a tree of the sequences
my $tree = $factory->tree(\@seq_array);
# Get both an alignment and a tree
($aln, $tree) = $factory->run(\@seq_array);
# Do a footprinting analysis on the supplied sequences, getting back the
# most conserved sub-alignments
my @results = $factory->footprint(\@seq_array);
foreach my $result (@results) {
    print $result->consensus_string, "\n";
}
# There are various additional options and input formats available.
# See the DESCRIPTION section of the Bio::Tools::Run::Alignment::Clustalw
# documentation for additional details.
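
The script above loads Bio::AlignIO but never writes the alignment anywhere. As a minimal sketch of that last step, the following aligns a FASTA file and saves the result in ClustalW format. The helper name write_alignment and the file names passed on the command line are illustrative assumptions; the factory parameters are the ones used above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::AlignIO;
use Bio::Tools::Run::Alignment::Clustalw;

# Hypothetical helper: write a Bio::SimpleAlign object to $file
# in ClustalW format.
sub write_alignment {
    my ($aln, $file) = @_;
    my $out = Bio::AlignIO->new(-file => ">$file", -format => 'clustalw');
    $out->write_aln($aln);
    $out->close;
    return;
}

# Run only when called as a script with both file names given.
unless (caller) {
    my ($infile, $outfile) = @ARGV;
    if (defined $infile && defined $outfile) {
        my @params  = ('ktuple' => 2, 'matrix' => 'BLOSUM');
        my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
        my $aln     = $factory->align($infile);   # needs a working clustalw
        write_alignment($aln, $outfile);
        print "Alignment written to $outfile\n";
    }
    else {
        print "Usage: $0 input.fasta output.aln\n";
    }
}
```

Saved as, say, align_to_file.pl (a made-up name), it would be run as: perl align_to_file.pl blastdump/input.fasta output.aln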