Creating sequence objects

When working with FASTA or other sequence files in BioPerl, you first have to create sequence objects.

#!/usr/bin/perl
use strict;
use warnings;
use Bio::Perl;
use Bio::SeqIO;

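# open the FASTA file as a Bio::SeqIO input stream
my $s = Bio::SeqIO->new( -file => "pygm.fasta", -format => "fasta");
# read the first record as a sequence object and print its sequence
my $st = $s->next_seq;
print $st->seq, "\n";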

usage:

$ perl create_sequence_object.pl

The script opens a file named pygm.fasta as a Bio::SeqIO input stream. The last two lines read the first record from that stream as a sequence object (next_seq) and print its sequence (seq).

If a file contains multiple FASTA sequences, each call to next_seq returns the next record as its own sequence object, and returns undef once no records are left. To print all the sequences, use a while loop, as in the last line of the following script.

#!/usr/bin/perl
use strict;
use warnings;
use Bio::Perl;
use Bio::SeqIO;

# open the FASTA file named on the command line as an input stream
my $s = Bio::SeqIO->new( -file => $ARGV[0], -format => "fasta");

# print every sequence in the file, one per line
while (my $st = $s->next_seq) { print $st->seq, "\n"; }

usage:

$ perl create_sequence_objects.pl pygm.fasta
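The same loop can report more than the raw sequence. The following is a minimal sketch, not part of the original scripts: it writes a small two-record FASTA file to a temporary location (the records seq1 and seq2 are made up for illustration), then prints the identifier, length and description of each record. The variable names and the %length_of hash are assumptions chosen for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqIO;
use File::Temp qw(tempfile);

# Write a tiny two-record FASTA file (made-up sequences, illustration only).
my ($out, $path) = tempfile(SUFFIX => ".fasta", UNLINK => 1);
print {$out} ">seq1 first example\nATGGCC\n>seq2 second example\nTTGACCAA\n";
close $out;

# Open the file as an input stream, as in the scripts above.
my $in = Bio::SeqIO->new( -file => $path, -format => "fasta");

# Collect the length of each record, keyed by its identifier.
my %length_of;
while (my $seq_obj = $in->next_seq) {
    $length_of{ $seq_obj->display_id } = $seq_obj->length;
    # identifier, length and description, separated by tabs
    printf "%s\t%d\t%s\n",
        $seq_obj->display_id, $seq_obj->length, $seq_obj->desc;
}
```

To run it on a real file instead, replace $path with $ARGV[0] and drop the temporary-file lines.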

Table 1: Sequence Object Methods
Name | Returns | Example | Note
new | Sequence object | $so = Bio::Seq->new(-seq => "MPQRAS") | create a new one, see Bio::Seq for more
seq | sequence string | $seq = $so->seq | get or set the sequence
display_id | identifier | $so->display_id("NP_123456") | get or set an identifier
primary_id | identifier | $so->primary_id(12345) | get or set an identifier
desc | description | $so->desc("Example 1") | get or set a description
accession | identifier | $acc = $so->accession | get or set an identifier
length | length, a number | $len = $so->length | get the length
alphabet | alphabet | $so->alphabet('dna') | get or set the alphabet ('dna','rna','protein')
subseq | sequence string | $string = $seq_obj->subseq(10,40) | Arguments are start and end
trunc | Sequence object | $so2 = $so1->trunc(10,40) | Arguments are start and end
revcom | Sequence object | $so2 = $so1->revcom | Reverse complement
translate | protein Sequence object | $prot_obj = $dna_obj->translate | See the Bioperl Tutorial for more
species | Species object | $species_obj = $so->species | See Bio::Species for more
seq_version | version, if available | $so->seq_version("1") | get or set a version
keywords | keywords, if available | @array = $so->keywords | get or set keywords
namespace | namespace, if available | $so->namespace("Private") | get or set the name space
authority | authority, if available | $so->authority("FlyBase") | get or set the organization
get_secondary_accessions | array of secondary accessions, if available | @accs = $so->get_secondary_accessions | get other identifiers
division | division, if available (e.g. PRI) | $div = $so->division | get division (e.g. "PRI")
molecule | molecule type, if available | $type = $so->molecule | get molecule (e.g. "RNA", "DNA")
get_dates | array of dates, if available | @dates = $so->get_dates | get dates
pid | pid, if available | $pid = $so->pid | get pid
is_circular | Boolean | if $so->is_circular { # } | get or set

Table source: http://www.bioperl.org/wiki/HOWTO:Beginners
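A few of the methods in Table 1 can be tried without any input file by building a sequence object directly with Bio::Seq->new. This is a minimal sketch under stated assumptions: the 12-base DNA string, the identifier "example1" and the variable names are invented for illustration and do not come from pygm.fasta.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::Seq;

# Build a sequence object in memory (made-up sequence and identifier).
my $so = Bio::Seq->new(
    -seq        => "ATGGCCCAGTAA",
    -display_id => "example1",
    -desc       => "Example 1",
    -alphabet   => "dna",
);

print $so->display_id, "\n";     # identifier set above
print $so->length, "\n";         # 12
print $so->subseq(1, 3), "\n";   # ATG (positions are 1-based, inclusive)

# revcom returns a new sequence object, not a string.
my $rc = $so->revcom;
print $rc->seq, "\n";            # TTACTGGGCCAT

# translate also returns a new (protein) sequence object.
my $prot = $so->translate;
print $prot->seq, "\n";
```

Note that subseq returns a plain string, while trunc, revcom and translate return new sequence objects, so their results need ->seq before printing.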