In this article of my BioPerl Tutorial series, we will see how to write a BioPerl program or script. If you are looking for an introduction to BioPerl and its usage, or for help installing BioPerl on Windows, see the other BioPerl tutorials in this series.
NOTE: I assume that you know how to write Perl programs and have a fair knowledge of the object-oriented programming (OOP) paradigm.
How to code and run BioPerl programs?
When trying your hand at BioPerl for the first time, you might be confused about where to write the code and how to run BioPerl modules and programs. The answer is: the same way you do for Perl. Write your program in a text editor (I use Notepad++, which is a free download if you don't have it). Once you have written the program, save it with a .pl extension. I save mine in the directory where BioPerl is installed, although any directory should work as long as Perl can find the BioPerl modules. Then run the program from the command line, for example with perl myscript.pl (myscript.pl is just a placeholder name).
Creating a Sequence Object
We will first create a sequence by creating a sequence object. BioPerl is an object-oriented toolkit built on Perl, so we cannot work with a sequence directly and need to create an object first. If you have not done much programming in C++, Java or another object-oriented language, I would suggest trying your hand at one of these first.
NOTE: Perl is a case-sensitive language. Make sure you type the syntax exactly as written.
Creating BioPerl Object:
BioPerl objects are created with specific BioPerl modules, so it is important to remember which Perl module you need.
In our case, we will be using the Bio::Seq module. Load it as follows:
use Bio::Seq;
This line tells Perl to load a specific module on your system, stored in the file "Bio/Seq.pm". We use the Bio::Seq module to create a Bio::Seq object. This object holds a sequence together with associated features such as identifiers, names and other properties.
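If you are not sure whether Perl can find Bio/Seq.pm on your system, a small check like the following may help. It is only a sketch using core Perl; the helper name module_available is my own and is not part of BioPerl.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): returns 1 if Perl can load
# the named module from its search path (@INC), 0 otherwise.
sub module_available {
    my ($module) = @_;
    # "Bio::Seq" corresponds to the file "Bio/Seq.pm" somewhere on @INC.
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module ('Bio::Seq', 'Bio::SeqIO') {
    printf "%-12s %s\n", $module,
        module_available($module) ? 'found' : 'NOT found';
}
```

If a module is reported as NOT found, the BioPerl installation is probably not on Perl's search path, and scripts containing use Bio::Seq; will fail before they run.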
Now let's create a sequence object named $obj_seq:
$obj_seq = Bio::Seq->new(-seq => "atgcatgc", -alphabet => "dna");
Here we have stored a single sequence, "atgcatgc".
The -alphabet argument tells BioPerl that the sequence being stored is DNA. The other choices are 'protein' and 'rna'. So to store an RNA sequence, write:
$obj_seq = Bio::Seq->new(-seq => "augcaugc", -alphabet => "rna");
Although -alphabet is optional and BioPerl can guess the type of the input sequence on its own, it can guess wrong, for example when the sequence contains a lot of X characters. It is therefore always advisable to tell BioPerl what kind of sequence you are giving it.
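To see the guessing in action, you can leave out -alphabet and ask the object what it decided. This is a minimal sketch that assumes BioPerl is installed; alphabet() is the Bio::Seq method that reports the alphabet, and the variable names are mine.

```perl
use strict;
use warnings;
use Bio::Seq;

# No -alphabet given: BioPerl guesses it from the sequence string.
my $guessed = Bio::Seq->new(-seq => 'atgcatgc');
print 'Guessed alphabet:  ', $guessed->alphabet, "\n";

# -alphabet given explicitly: nothing is left to guesswork.
my $explicit = Bio::Seq->new(-seq => 'augcaugc', -alphabet => 'rna');
print 'Explicit alphabet: ', $explicit->alphabet, "\n";
```

For short or ambiguous sequences the first form is the one that can go wrong, which is why I prefer the second.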
The new() method is used to create an object explicitly. The syntax is: the variable that will hold the object, then =, the name of the module, the -> symbol, the method name new, and finally the arguments, each written as a name followed by => and its value, like
-seq (name of argument) => "atgcatgc" (value of argument)
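The same name => value pattern works for the other arguments that new() accepts. The sketch below assumes BioPerl is installed and adds an identifier and a description with -display_id and -desc; the values themselves are made up for illustration.

```perl
use strict;
use warnings;
use Bio::Seq;

# The identifier and description below are made-up values for illustration.
my $obj_seq = Bio::Seq->new(
    -seq        => 'atgcatgc',
    -alphabet   => 'dna',
    -display_id => 'example_seq_1',
    -desc       => 'a short example sequence',
);

# Each value can be read back through a method of the same object.
print 'ID:          ', $obj_seq->display_id, "\n";
print 'Description: ', $obj_seq->desc,       "\n";
print 'Sequence:    ', $obj_seq->seq,        "\n";
```

Writing one argument per line like this is only a layout choice; it makes a missing => or comma easier to spot.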
Automatic Creation of Bio::Seq Objects
In this specific example, we have manually created Bio::Seq objects but they can be created automatically too.
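One common way this happens is when sequences are read with Bio::SeqIO, which builds the objects for you. The sketch below is only an illustration under a few assumptions: BioPerl is installed, and the FASTA record is a made-up one held in a string so that no file on disk is needed.

```perl
use strict;
use warnings;
use Bio::SeqIO;

# A tiny FASTA record held in memory; the ID and description are made up.
my $fasta = ">example_seq_1 a short example sequence\natgcatgc\n";

# Open a read-only filehandle on the string instead of a file on disk.
open my $fh, '<', \$fasta or die "Cannot open in-memory FASTA: $!";

my $seqio = Bio::SeqIO->new(-fh => $fh, -format => 'fasta');

# next_seq() builds and returns one sequence object per record.
my @seqs;
while (my $obj_seq = $seqio->next_seq) {
    push @seqs, $obj_seq;
}

print $_->display_id, ': ', $_->seq, "\n" for @seqs;
```

Notice that new() is never called on Bio::Seq here; the objects come out of next_seq() ready to use.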
The Bio::Seq module has a method called seq() that returns the sequence stored in a Bio::Seq object, which you can then print. To use it, write:
print $obj_seq->seq();
This will print our sequence: atgcatgc
The -> symbol is used whenever an object accesses or calls one of its methods.
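The -> symbol works the same way for every method of the object, not only seq(). Here is a short sketch, assuming BioPerl is installed, that calls two more Bio::Seq methods, length() and revcom(), on the same object.

```perl
use strict;
use warnings;
use Bio::Seq;

my $obj_seq = Bio::Seq->new(-seq => 'atgcatgc', -alphabet => 'dna');

# seq() returns the sequence as a plain string, so it can be stored or printed.
my $sequence = $obj_seq->seq;
print "Sequence: $sequence\n";

# length() returns the number of residues in the sequence.
print 'Length: ', $obj_seq->length, "\n";

# revcom() returns a new sequence object, so calls can be chained with ->.
my $revcom = $obj_seq->revcom->seq;
print "Reverse complement: $revcom\n";
```

Because revcom() hands back another object rather than a string, a second -> is needed to get at its sequence.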
The complete code for this tutorial is:
use Bio::Seq;
$obj_seq = Bio::Seq->new(-seq => "atgcatgc", -alphabet => "dna");
print $obj_seq->seq();
[Screenshot: the BioPerl program]
[Screenshot: the BioPerl script execution]
I hope you can now code in BioPerl as easily as I did. In the next article, we will see how to write sequence information to a FASTA file using the Bio::SeqIO module.