Perl stands for Practical Extraction and Report Language. It is a very rich language with a huge collection of operators, functions, and a lot more. It was once the most popular language for web programming until PHP came along, and it is now widely used in the biotechnology sector.
Downloading Perl
You can download the latest version from ActiveState.com.
Choose the Windows MSI download. The latest version at the time of
writing is 5.8.8. The downloaded file should have a
.msi extension.
Installing Perl
Double-click on the downloaded file to install the software. The
following two options should be checked:
Add Perl to the PATH environment variable
Create Perl file extension association
Configuring Perl and CGI
Open Apache's httpd.conf file for editing. If you don't know
how, refer to Installing
Apache on Windows XP.
Change the following line:
Options Indexes FollowSymLinks
to
Options Indexes FollowSymLinks ExecCGI Includes
Then uncomment the #AddType text/html .shtml and #AddOutputFilter INCLUDES .shtml lines by removing the # characters. This modification enables CGI and SSI.
Then change:
#AddHandler cgi-script .cgi
to
AddHandler cgi-script .cgi .pl
Then comment out the ScriptAlias line by adding a # character in front. This modification allows you to run CGI scripts with .cgi and .pl extensions outside the cgi-bin folder.
Restart Apache.
Hello Perl!
Now it's time to test the installation and configuration. Use Notepad
to save the following example in Apache's working directory as
hello.pl. My directory is c:/public_html. If you are confused,
refer to Installing
Apache on Windows XP.
#!c:/perl/bin/perl
use CGI qw(:standard);
use strict;
print header;
print "<b>Hello Perl!</b>";
Make sure Apache is running. Use your browser to view http://localhost/hello.pl. You should see Hello Perl! in bold. If you get an error message, refer to your Apache error log. It is located at your_apache_installation_folder/logs/error.log.
Debugging Errors
To debug your Perl scripts, you need to run them from
the command line. To do so, go to Start --> Run --> type cmd and
click on OK. In the command line window, use the cd command to go
to c:/public_html (your working directory). Type perl hello.pl
and hit Enter.
Perl is different from popular languages like C, C++, and Java. In C, a variable is the name of a memory location. You can store a piece of data in it, such as an integer, a floating point number, or a character. Before using a variable, you have to declare it along with its type (whether it is an integer, character, etc.). In Perl you don't have to deal with all that nonsense. All you do is just use a variable.
A Perl variable is different from a C variable. In Perl, a variable can be a scalar, array, hash, subroutine, or a typeglob. Perplexed? Don't be. A scalar performs all the functions of a C variable and a lot more. In Perl, an array is also a variable. You will see how powerful your code becomes when you can treat an array like you treat a variable. A hash used to be called an associative array. It is a different kind of array, an unordered one. A subroutine is much like a C function and a lot more. A typeglob is another interesting topic. Each of these data types is distinguished by the first symbol of the variable name. Everything starting with $ is a scalar. Everything starting with @ is an array, and so on. See the table below.
| $cents | An individual value (number or string) |
| @large | A list of values, keyed by number |
| %interest | A group of values, keyed by string |
| &how | A callable chunk of Perl code |
| *struck | Everything named struck |
Just like C, Perl variables also have types, but you do not have to declare them. When you first use a variable, Perl creates it automatically. A scalar that has not been assigned a value holds the special value undef, which acts as 0 in numeric context and as the empty string in string context. Arrays and hashes start out empty.
A scalar variable holds a single scalar value. The value represents either a number, a string, or a reference to something. Scalar variable names begin with a dollar sign followed by a letter or underscore, and then any number of letters, digits, or underscores. Scalar variable names are case-sensitive. Perl has three contexts in which it will interpret a scalar variable: string context, numeric context, and miscellaneous context.
The scalar data type is the most basic form of data container Perl has. Perl treats strings and numbers in a similar manner. You don't need to declare a scalar, just create it (use it).
$string = "YourString";
$number = 269;
$decimal = 49.42;
Perl figures out by itself whether it is a float, integer, or string. If you want to include a double quote ( " ) or a backslash ( \ ) in a double-quoted string, you need to escape it by writing ( \" ) or ( \\ ). A semicolon inside a string needs no escaping. You can use q() for single quotes and qq() for double quotes as well. Please refer to Perl Quotes and Escape Sequences for more information.
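The quoting rules can be tried out with a short script. This is only a sketch; the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $car = "BMW";

# Double quotes interpolate variables; \" embeds a literal double quote.
my $quoted = "She said \"I love my $car\"";

# Single quotes do not interpolate, so $car stays as literal text.
my $literal = 'The variable is named $car';

# qq() behaves like double quotes and q() like single quotes,
# so quote characters inside the string need no escaping.
my $qq = qq(She said "I love my $car");
my $q  = q(It's named $car);

print "$quoted\n$literal\n$qq\n$q\n";
```

The first and third strings come out identical, which is the main reason to reach for qq() when a string is full of quote characters.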
| Type | Example |
|---|---|
| integer | $answer = 968; |
| real | $pi = 3.14159265; |
| scientific | $avogadro = 6.02e23; |
| string | $car = "BMW"; |
| string with interpolation | $sign = "I love my $car"; |
| string without interpolation | $cost = 'It costs $80000'; |
| another variable | $one = $two; |
| expression | $force = $mass * $acceleration; |
| string output from a command | $cwd = `pwd`; |
| numeric status of a command | $exit = system("vi $x"); |
| an object | $car = new Car "BMW"; |
There is no way to declare a scalar to be of type "number" or "string". Perl converts between the various subtypes as needed, so you can treat a number as a string or a string as a number, and Perl will do the Right Thing. References (pointers), however, are not castable.
| 12345 | integer |
| 12345.67 | floating point |
| 6.02E23 | scientific notation |
| 0xffff | hexadecimal |
| 0377 | octal |
Before performing an operation, a Perl operator determines the type of its operands. If both or all operands (whichever is applicable) are scalars, then the result is a scalar. We will explore what happens when the operands are not all scalars in the following chapters. This section only briefly touches on the topic of operators. Please refer to Perl Operators for more information. Perl supports common arithmetic operators like +, -, *, /, and %.
$a = 4 + 7; # $a = 11
$a = 4.9 + 3.9; # $a = 8.8
$a = 10 / 3; # $a = 3.3333333....
$a = 5 % 3; # $a = 2, remainder
Perl has different comparison operators for strings and numbers.
| Comparison | Numbers | Example | Strings | Example |
|---|---|---|---|---|
| Equal | == | if($one == 5) { # do something; } | eq | if($string1 eq $string2) { # do something; } |
| Not Equal | != | if($one != 5) { # do something; } | ne | if($string1 ne $string2) { # do something; } |
| Less Than | < | if($one < 5) { # do something; } | lt | if($string1 lt $string2) { # do something; } |
| Greater Than | > | if($one > 5) { # do something; } | gt | if($string1 gt $string2) { # do something; } |
| Less Than or Equal to | <= | if($one <= 5) { # do something; } | le | if($string1 le $string2) { # do something; } |
| Greater Than or Equal to | >= | if($one >= 5) { # do something; } | ge | if($string1 ge $string2) { # do something; } |
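Why the two families of operators matter is easiest to see when the same values are compared both ways. A minimal sketch, with illustrative variable names:

```perl
use strict;
use warnings;

my $x = "1.0";
my $y = "1";

# Numeric comparison: both strings convert to the number 1.
my $numeric_equal = ($x == $y) ? "yes" : "no";

# String comparison: "1.0" and "1" are different character sequences.
my $string_equal = ($x eq $y) ? "yes" : "no";

# lt compares character by character, so "10" sorts before "9";
# < compares numeric values, so 10 is not less than 9.
my $string_less  = ("10" lt "9") ? "yes" : "no";
my $numeric_less = (10 < 9)      ? "yes" : "no";

print "==: $numeric_equal, eq: $string_equal\n";   # ==: yes, eq: no
print "lt: $string_less, <: $numeric_less\n";      # lt: yes, <: no
```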
There are two really handy operators for strings only. (.) and (x). The first concatenates strings. The other multiplies them:
"my " . "life."; # my life. This operator concatenates strings
"perl" x 3; # This is same as perlperlperl
From time to time, you will want to convert a string to a number or a number to a string. Perl does the conversion for you, based on the operator you use. Suppose you have $string = "123" and $string2 = "234"; then $string = $string + $string2; would produce 357 (as a number, not as a string), not 123234.
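A small sketch of this automatic conversion, showing that it is the operator, not the variable, that decides whether a value is treated as a number or as a string:

```perl
use strict;
use warnings;

my $string  = "123";
my $string2 = "234";

# + is a numeric operator, so both strings are converted to numbers.
my $sum = $string + $string2;       # 357

# . is a string operator, so the values are joined as text.
my $joined = $string . $string2;    # 123234

# A number used with a string operator is converted to a string.
my $number   = 42;
my $repeated = $number x 2;         # 4242

print "$sum $joined $repeated\n";   # 357 123234 4242
```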
To be fair, Perl also provides some operators which deal with numbers only. Namely autoincrement and autodecrement.
$a = 4; $b = 9;
$r = ++$a; # $r = 5. Increments before assignment
$r = $b++; # $r = 9, $b = 10. Increments after assignment
$b--; # postdecrement
--$b; # predecrement
chop
The chop function chops off the last character of a string scalar and returns the removed character. It seems like a useless function but it does come in handy at times.
$r = "perls";
$last = chop($r); # $last = s, $r = perl
chop($r); # $r = per
chomp
chomp deletes the last character of a string only if it is a newline (\n). That makes it safer than chop for cleaning up lines read from a file or from the keyboard.
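The difference between the two functions shows up when they are applied more than once. A minimal sketch:

```perl
use strict;
use warnings;

my $line = "hello\n";

# chomp returns the number of characters it removed.
my $first  = chomp($line);   # 1, $line is now "hello"
my $second = chomp($line);   # 0, there is no newline left to remove

# chop removes the last character no matter what it is
# and returns that character.
my $word = "hello";
my $last = chop($word);      # "o", $word is now "hell"

print "$line $first $second $word $last\n";   # hello 1 0 hell o
```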
Perl contains numerous variables that have a special meaning. Below is a list of many of them.
| Variable | English Name | Description |
|---|---|---|
| $_ | $ARG | The default input and pattern-searching space |
| $& | $MATCH | The string matched by the last successful pattern match |
| $` | $PREMATCH | The string preceding whatever was matched by the last successful pattern match |
| $' | $POSTMATCH | The string following whatever was matched by the last successful pattern match |
| $+ | $LAST_PAREN_MATCH | The last bracket matched by the last search pattern |
| $* | $MULTILINE_MATCHING | If set to 1, Perl does multi-line matching within a string (the default is 0); deprecated in favor of the /m and /s pattern modifiers |
| $. | $INPUT_LINE_NUMBER | The last current input line number from the last file handle read (an explicit close on a file handle resets the line number) |
| $/ | $INPUT_RECORD_SEPARATOR | The input record separator (newline by default) |
| $| | $OUTPUT_AUTOFLUSH | If set to any nonzero value, forces a flush after every write or print on the currently selected output device (the default is 0) |
| $, | $OUTPUT_FIELD_SEPARATOR | The output field separator for the print function |
| $\ | $OUTPUT_RECORD_SEPARATOR | The output record separator for the print function |
| $" | $LIST_SEPARATOR | The separator placed between the elements of an array interpolated into a double-quoted string (a space by default) |
| $; | $SUBSCRIPT_SEPARATOR | The subscript separator for multidimensional array emulation |
| $# | $OFMT | The output format for printed numbers |
| $% | $FORMAT_PAGE_NUMBER | The current page number of the currently selected output file handle |
| $= | $FORMAT_LINES_PER_PAGE | The current page length (printable lines) of the currently selected output file handle |
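A few of these variables can be seen at work in a short sketch. The data is made up, and local is used so that the changes to the separators do not leak into the rest of a program.

```perl
use strict;
use warnings;

my @words = qw(alpha beta gamma);

# $_ is the default variable: foreach aliases it to each element,
# and uc with no argument operates on it.
my @upper;
foreach (@words) {
    push @upper, uc;
}

# $" separates array elements interpolated into a double-quoted string.
my $joined;
{
    local $" = "-";
    $joined = "@words";          # alpha-beta-gamma
}

# $& holds the text matched by the last successful pattern match.
my $matched = "";
if ("hello world" =~ /o w/) {
    $matched = $&;               # o w
}

# $, goes between the arguments of print, and $\ is appended at the end.
{
    local $, = ", ";
    local $\ = "\n";
    print @upper;                # ALPHA, BETA, GAMMA
    print $joined, $matched;     # alpha-beta-gamma, o w
}
```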
Perl has a whole array of very useful operators. They can generally be classified as follows:
You have to use different operators for numbers and strings to accomplish the same task. Using a string operator on numeric values, or the other way around, compares the operands the wrong way and gives misleading results.
| String | Numeric | Purpose | Syntax |
|---|---|---|---|
| eq | == | equal to | true if $a == $b; true if $s1 eq $s2 |
| ne | != | not equal to | true if $a != $b; true if $s1 ne $s2 |
| lt | < | less than | true if $a < $b; true if $s1 lt $s2 |
| gt | > | greater than | true if $a > $b; true if $s1 gt $s2 |
| le | <= | less than or equal to | true if $a <= $b; true if $s1 le $s2 |
| ge | >= | greater than or equal to | true if $a >= $b; true if $s1 ge $s2 |
| cmp | <=> | comparison with a signed result | 0 if equal, 1 if $a is greater, -1 if $b is greater |
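The signed result of cmp and <=> can be illustrated with a few literal values. This is just a sketch to show the three possible outcomes:

```perl
use strict;
use warnings;

my @results = (
    2 <=> 10,               # -1: 2 is numerically less than 10
    "2" cmp "10",           #  1: the string "2" sorts after the string "10"
    5 <=> 5,                #  0: the numbers are equal
    "apple" cmp "banana",   # -1: "apple" sorts before "banana"
);

print "@results\n";   # -1 1 0 -1
```

Note how 2 and 10 compare in opposite directions depending on whether they are treated as numbers or as strings.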
Perl has a rich collection of string operators:
| . | Concatenate |
| x | Repeat |
The arithmetic operators are:
| + | Addition |
| - | Subtraction |
| * | Multiplication |
| / | Division |
| ** | Raise the left operand to the power of the right operand |
| % | Modulo |
These operators are already defined in the tables above in the context of numeric and string. For example = is an assignment operator and eq is an equivalence operator.
Named Operators
| Operator | Example | Result |
|---|---|---|
| int | int(5.6234) | 5 |
| length | length("nose") | 4 |
| lc | lc("LOWER") | lower |
| uc | uc("upper") | UPPER |
| cos | cos(30) | 0.1543 (the argument is in radians) |
| rand | rand(5) | Returns a random number from 0 to less than its argument |
The following table lists the operators from highest precedence to lowest, along with their associativity:
| Operator | Associativity |
|---|---|
| Terms and list operators (leftward) | Left |
| -> | Left |
| ++ -- | Nonassociative |
| ** | Right |
| ! ~ \ and unary + and - | Right |
| =~ !~ | Left |
| * / % x | Left |
| + - . | Left |
| << >> | Left |
| Named unary operators | Nonassociative |
| < > <= >= lt gt le ge | Nonassociative |
| == != <=> eq ne cmp | Nonassociative |
| & | Left |
| | ^ | Left |
| && | Left |
| || | Left |
| .. | Nonassociative |
| ?: | Right |
| = += -= *= and so on | Right |
| , => | Left |
| List operators (rightward) | Nonassociative |
| not | Right |
| and | Left |
| or xor | Left |
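The autoincrement operator has a little extra magic built in. When it is applied to a string of letters optionally followed by digits, it increments the string, keeping each character within its range and carrying over:
print ++($foo = '99'); # prints '100'
print ++($foo = 'a0'); # prints 'a1'
print ++($foo = 'Az'); # prints 'Ba'
print ++($foo = 'zz'); # prints 'aaa'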
-2**4 is -(2**4), not (-2)**4
Unary ! performs logical negation which is "not"
Unary - performs arithmetic negation if the operand is numeric. If the operand is an identifier, a string consisting of a minus sign concatenated with the identifier is returned. Otherwise, if the string starts with a plus or minus, a string starting with the opposite sign is returned.
Unary ~ performs bitwise negation, that is 1's complement.
Unary + has no semantic effect whatsoever, even on strings. It is syntactically useful for separating a function name from a parenthesized expression which would otherwise be interpreted as the complete list of function arguments.
Unary \ creates a reference to whatever follows.
Binary =~ binds a scalar expression to a pattern match, substitution, or translation. By default, these operations search or modify the string $_.
Binary !~ is just like =~ except the return value is negated in the logical sense. The following expressions are functionally equivalent:
$string !~ /pattern/
not $string =~ /pattern/
*, /, and % work as expected. If you have floating point operands, use fmod() from the POSIX module instead of %, because % converts its operands to integers before finding the remainder according to integer division.
Binary x is the repetition operator. It can be used
as a string replicator
print '-' x 80; # print row of dashes
print "\t" x ($tab/8), ' ' x ($tab%8); # tab over
as a list replicator
@ones = (1) x 80; # a list of 80 1's
@ones = (5) x @ones; # set all elements to 5
to initialize array and hash slices
@keys = qw(perls before swine);
@hash{@keys} = ("") x @keys;
which is equivalent to
$hash{perls} = "";
$hash{before} = "";
$hash{swine} = "";
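The three uses of x can be combined into one runnable sketch. The sizes are kept small here so that the output is easy to read:

```perl
use strict;
use warnings;

# String replicator: repeat a string a given number of times.
my $rule = '-' x 10;

# List replicator: the left operand must be a list in parentheses.
my @ones = (1) x 5;      # (1, 1, 1, 1, 1)
@ones = (5) x @ones;     # @ones in scalar context is 5, so five 5's

# Initialize a hash slice with one empty string per key.
my @keys = qw(perls before swine);
my %hash;
@hash{@keys} = ("") x @keys;

print "$rule\n";                          # ----------
print "@ones\n";                          # 5 5 5 5 5
print join(",", sort keys %hash), "\n";   # before,perls,swine
```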
+ and - convert their arguments from strings to numeric values if necessary and return a numeric result. The "." operator provides string concatenation.
$almost = "Fred" . "Flintstone"; # returns FredFlintstone
Another method of concatenation is
$fullname = "$firstname $lastname";
The bit-shift operators (<< and >>) shift the bits of the left operand to the left or to the right by the number of places given by the right operand:
1 << 4; # returns 16
32 >> 4; # returns 2
Some of the functions described in chapter 3 are really unary operators.
sleep 4 | 3 is equivalent to (sleep 4) | 3
but
print 4 | 3 is equivalent to print (4 | 3)
This is so because sleep is a named unary operator, which binds more tightly than |, while print is a list operator, which binds more loosely. When in doubt, use parentheses. Remember, if it looks like a function then it is a function.
A file test operator is a unary operator that takes one argument, either a filename or a filehandle, and tests the associated file to see if something is true about it.
| Operator | Meaning |
|---|---|
| -r | File is readable by effective uid/gid |
| -w | File is writable by effective uid/gid |
| -x | File is executable by effective uid/gid |
| -o | File is owned by effective uid |
| -R | File is readable by real uid/gid |
| -W | File is writable by real uid/gid |
| -X | File is executable by real uid/gid |
| -O | File is owned by real uid |
| -e | File exists |
| -z | File has zero size |
| -s | File has non-zero size (returns size) |
| -f | File is a plain file |
| -d | File is a directory |
| -l | File is a symbolic link |
| -p | File is a named pipe (FIFO) |
| -S | File is a socket |
| -b | File is a block special file |
| -c | File is a character special file |
| -t | Filehandle is opened to a tty |
| -u | File has setuid bit set |
| -g | File has setgid bit set |
| -k | File has sticky bit set |
| -T | File is a text file |
| -B | File is a binary file (opposite of -T) |
| -M | Age of file (at startup) in days since modification |
| -A | Age of file (at startup) in days since last access |
| -C | Age of file (at startup) in days since inode change |
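A few of the file tests can be tried safely on a scratch file. The sketch below assumes the standard File::Temp module is available and uses it to create a temporary directory and file, so nothing on your system is touched. The variable names are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile tempdir);

# Create a scratch directory (removed at exit) with one small file in it.
my $dir = tempdir(CLEANUP => 1);
my ($fh, $path) = tempfile(DIR => $dir);
print $fh "hello\n";
close $fh;

my $exists  = (-e $path) ? 1 : 0;                  # the file exists
my $is_file = (-f $path) ? 1 : 0;                  # it is a plain file
my $is_dir  = (-d $dir)  ? 1 : 0;                  # the directory is a directory
my $size    = -s $path;                            # size in bytes, non-zero here
my $missing = (-e "$dir/no-such-file") ? 1 : 0;    # 0: nothing by that name

print "exists=$exists file=$is_file dir=$is_dir missing=$missing\n";
print "size is non-zero\n" if $size;
```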
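Bitwise AND, OR, and XOR: &, |, and ^. Both operands should be of
the same type: two numbers are combined bit by bit as integers, while two strings are combined character by character.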
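A short sketch of the bitwise operators on numbers and on strings. It assumes Perl's default behavior, that is, that the optional bitwise feature of newer Perl versions has not been switched on.

```perl
use strict;
use warnings;

# Two numeric operands: the integers are combined bit by bit.
my $and = 12 & 10;     # 1100 & 1010 = 1000 (8)
my $or  = 12 | 10;     # 1100 | 1010 = 1110 (14)
my $xor = 12 ^ 10;     # 1100 ^ 1010 = 0110 (6)

# Two string operands: the characters are combined one by one.
# ORing a capital ASCII letter with a space sets bit 0x20, giving lowercase.
my $lower = "AB" | "  ";

print "$and $or $xor $lower\n";   # 8 14 6 ab
```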
And: $a && $b # $a if $a is false, $b otherwise
Or: $a || $b # $a if $a is true, $b otherwise
open(FILE, "filename") || die "Cannot open filename: $!\n";
The range operator .. performs two different tasks. In a list context,
it returns a list of values counting (by ones) from the left value to the
right value.
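In scalar context, .. returns a Boolean value. It acts like a flip-flop: it becomes true when the left operand is true and stays true until the right operand becomes true. A constant operand is compared against the current input line number ($.).
if (101 .. 200) { print; } # print 2nd hundred lines
next line if (1 .. /^$/); # skip header lines
s/^/> / if (/^$/ .. eof()); # quote body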
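Both behaviors of the range operator can be seen in a small sketch. The list of lines stands in for input read from a file, and the BEGIN and END markers are made up for illustration.

```perl
use strict;
use warnings;

# List context: .. counts from the left value to the right value.
my @digits  = (1 .. 5);
my @letters = ('a' .. 'e');

# Scalar context: .. turns on when the left pattern matches and
# turns off after the right pattern matches.
my @lines = ("header", "BEGIN", "body one", "body two", "END", "footer");
my @selected;
foreach (@lines) {
    push @selected, $_ if /^BEGIN$/ .. /^END$/;
}

print "@digits\n";                   # 1 2 3 4 5
print "@letters\n";                  # a b c d e
print join(", ", @selected), "\n";   # BEGIN, body one, body two, END
```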
The angle operator (<>), sometimes called the diamond operator, is primarily used for reading lines from a filehandle.
Computers are very efficient at making decisions (the decisions they are programmed to make) and at repeating a task. Control structures allow the programmer to define those decisions and to iterate.
The if structure has the following syntax:
if (condition) {Code;}
if (condition) {Code;} else {Code;}
if (condition) {Code;} elsif (condition) {Code;} .... else {Code;}
Unlike C++, {} are not optional under any condition. They must be used even if they are followed by a single statement:
if($color eq "red") {
print "red";
} elsif ($color eq "white") {
print "white";
} else {
print "blue";
}
We just saw ifs and all kinds of elses. Now suppose you don't want the if. You only want the else. You can use unless, which runs its block only when the condition is false:
unless(condition) {Code;}
unless ( $a < 13) {
# do something
}
The while loop repeats a bunch of statements as long as the specified condition is true:
while (Condition)
{
Code;
}
until is to while what unless is to if. In other words, it does the opposite of while. It iterates over a statement block until something is true.
until (condition)
{
Code;
}
In a while loop or an until loop, the condition is tested at the top. This means that if the condition is not true in the first place (or is already true in the case of until), the code inside will never be executed. A lot of times you will want the code to run at least once before the condition is tested. For such a scenario, there are the do while and do until loops. The while and until act like they are supposed to. The only difference is that they are at the bottom instead of at the top.
do {
Code;
} while (Expression);
$stops = 0;
do {
$stops++;
print "Next stop? ";
chomp($location = <STDIN>);
} until $stops > 5 || $location eq 'home';
The perl for loop acts much like C's for loop:
for (Declare / Initialize; Condition; Increment / Decrement)
{
# Code;
}
for( $i = 0; $i <= $#array; $i++ ) {
print $array[$i]; # print each element of array, one per loop
}
If you look carefully, you would see that there are three fields inside the for( ) loop. The leftmost one can be used for initialization. If you do not wish to initialize anything, leave it blank but do not omit the semicolon. You can leave any or all of the three fields blank. The middle field is the condition. As long as this condition is true, the loop would continue to run. As soon as it becomes false, the loop is exited. The rightmost field can be used to increment or decrement values.
The foreach statement takes a list of values and assigns them one at a time to a scalar variable, executing a block of code with each successive assignment.
foreach $i (@list) {
# code
}
@array = (1, 2, 3);
foreach $b (reverse @array) {
print $b;
}
The following is also possible because of an implied $_:
foreach (reverse @array) {
print;
}
The next and last operators allow you to modify the flow of your loop. The next operator would allow you to skip to the end of your current loop iteration, and start the next iteration. The last operator would allow you to skip to the end of your block, as if your test condition had returned false.
@array = ( 1 .. 9 );
foreach $item (@array) {
if ($item == 3 ) {
next;
}
if ($item == 7 ) {
last;
}
print $item, " ";
}
# 1 2 4 5 6
Take a look at the result. The number 3 is missing because next operator interrupted the loop before the print statement. 7, 8, and 9 are missing because last interrupted before the print statement.
Perl also provides labels, but I do not recommend using them.
Perl has a data structure that is strictly known as array of scalars. This structure is more commonly known as an array or a list. Perl's arrays can be used as a simple list, stack, or even the skeleton of a complex data structure. Anything beginning with an @ symbol is an array.
Arrays are closely related to (but not the same as) lists. A Perl list is a sequence of comma separated values usually in a set of parentheses. A Perl array is a container for a sequence of values (that is, a container for a list). Lists are commonly used to initialize arrays. Assigning a list to an array places each item in the list in a consecutive element of the array. Lists can also be used to extract values from arrays.
The most common method of using an array as an indexed list is to directly assign the array all of its values at creation. The following example sets the array variable @months to the months of the year. There are two items to mention regarding the example below: the placeholder JUNK and the keyword qw. Arrays start at index 0: junk is the placeholder so Jan could be 1.
@months = qw ( JUNK Jan Feb March April May June July Aug Sept Oct Nov Dec);
@array = qw (a b c d e);
is equivalent to
@array = ("a", "b", "c", "d", "e");
The keyword qw is a shortened form used to extract individual words from a string. The above example can also be done in the following manner:
$months[0] = "JUNK";
$months[1] = "Jan";
...
$months[12] = "Dec";
@home = ("a", "b", "c");
($m, $n, $o) = @home;
$home[0] = "a";
$home[1] = "b";
$home[2] = "c";
Notice when you assign the array elements directly, you use the $ character, not the @ character.
The list constructor operator could save you the trouble of listing all the values if you are using numbers:
(1 .. 5) # is equivalent to (1, 2, 3, 4, 5)
(2 .. 6, 59, 98) # is equivalent to (2, 3, 4, 5, 6, 59, 98)
($x .. $y) # if $x and $y are two numbers, .. is the range in between
You can assign values to arrays using all the methods discussed above. What we have been doing above is assigning scalar values to an array. Perl also allows you to assign an array to another array.
@onearray = @anotherarray;
You can also mix things up:
@hexcharacters = qw(a b c d e f);
@palindrome = (1 .. 9, @hexcharacters, reverse(@hexcharacters), 9, 8, 7, 6, 5, 4, 3, 2, 1);
# ( 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f, f, e, d, c, b, a, 9, 8, 7, 6, 5, 4, 3, 2, 1 )
Don't be intimidated by the line noise. @hexcharacters is assigned the values a, b, c, d, e, and f. In @palindrome, first we are assigning the values 1, 2, 3, 4, 5, 6, 7, 8, and 9 using the list constructor operator. Then we are assigning the array @hexcharacters. The function reverse() does what you think it does, it reverses the array @hexcharacters. You can take on from there. The point is that you have a lot of ways to assign an array and you can use them simultaneously if you wish or need to do so.
Note that @hexcharacters is not the element at index 9 of @palindrome; that element is a. In fact, @hexcharacters is not even an element of @palindrome. This is because a list cannot contain another list or array as an element: everything is flattened into one long list. If you still want an array to be an element of another array, then use an array reference (which we will talk about in the references lesson).
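The flattening, and the way a reference avoids it, can be checked by counting elements. A minimal sketch using a shortened version of the example above (the names @flat and @nested are illustrative):

```perl
use strict;
use warnings;

my @hexcharacters = qw(a b c d e f);

# The array is flattened: its six values become six separate elements.
my @flat = (1 .. 3, @hexcharacters);

# A reference (note the backslash) is a single scalar element.
my @nested = (1 .. 3, \@hexcharacters);

print scalar(@flat), "\n";      # 9
print scalar(@nested), "\n";    # 4
print $nested[3][0], "\n";      # a
print "@{ $nested[3] }\n";      # a b c d e f
```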
Going back to the paragraph before the previous one, Perl offers you a great degree of flexibility and a lot of ways to do the same thing. Well, most of the new ways of doing things come from your ingenuity, but you can't do much if the language won't allow it. To get a feel of what I mean see below:
($one, $two, $three) = (23, 43, 37); # $one = 23, $two = 43, and $three = 37
@sqr[1..4] = (1, 4, 9, 16); # range of indices
@sqrt[1, 49, 9, 16, 4] = (1, 7, 3, 4, 2); # non-sequential indices
@inverse[@sqr] = (1, 0.25, 0.1111, 0.0625); # indices stored in another array
($one, $two) = ($two, $one); # swaps $one and $two
($d, @array) = ($x, $y, $z); # $d = $x and @array = ($y, $z)
($five, @numbers) = @numbers; # This moves the first element of @numbers to $five
We have seen that scalars can be assigned to arrays. Arrays can also be assigned to arrays. We know from the previous lesson that scalars can be assigned to scalars. Can we assign an array to a scalar? The answer is yes, but don't go away yet. You can assign it, but the scalar would not get the array; it would get the size of the array.
@array = (23, 34, 45);
$scalar = @array; # $scalar = 3, the length of the array
($scalar) = @array; # $scalar = 23, the first element of the array
You can assign values to more than one array at the same time:
@array1 = @array2 = @array3 = (21, 324, 324);
Perl arrays are indexed from 0 to n-1, where n is the number of elements. If @array is an array of 23 elements, then $array[0] is the first element and $array[22] is the last element. To copy an element's value to a scalar:
$scalar = $array[9];
$array[5]++; # increment sixth element of @array
$n = 5;
$array[$n]; # accesses the sixth element of the array
$array[++$n]; # accesses the seventh element of the array
$array[--$n]; # would decrement $n and then use it as an index
$array[$n] += 5; # adds 5 to the element at index $n
($array[0], $array[1]) = ($array[1], $array[0]); # swaps two elements of the array. You can also do this for the entire array
You can also use negative values to access perl arrays. They access the
array in reverse:
@array = (23, 34, 4, 3421, 234);
$array[-2]; # 3421
$array[-3]; # 4
Slicing: the act of accessing a list of elements from an array. Here is
how you do it:
@array[3, 4]; # is equivalent to ($array[3], $array[4])
@array[3, 4] = @array[4, 3]; # slice and swap
@array[3, 10, 15] = (4, 98, 120); # slice and assign values
@array[4, 6, 9] = @array[2, 2, 2]; # assign the value of $array[2] to $array[4], $array[6], and $array[9]
Where does the array end? Every programmer using an array needs to
know the answer to this question, regardless of the language he is using.
Java does not allow a program to access an array element out of bound (Meaning
element which is out of the range of an array. For example the 100th element
is out of bound of a 10 element array). C++ allows you access an element
out of bound but that attempt will return a garbage value. Perl allows
you to access an element out of bound but that element will return the
value undef meaning undefined. Which method do you think is the best? Java
or Perl?
C++ does not allow a program to extend a built-in array dynamically. For example,
you have an array of 10 elements, and then the program adds an element while
running. This is not allowed in C++. It's not allowed in Java either, but Java provides
a Vector which can be resized dynamically. Dynamic resizing is allowed
in Perl. It does not have to be in order, meaning that if there is a 10
element array, you can add the 19th element without having to add the eleventh,
twelfth, ... eighteenth elements. All elements in between would have the value
undef.
@array = (1 .. 4);
$array[6] = "perl"; # (1, 2, 3, 4, undef, undef, perl)
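The automatic growth can be confirmed by checking the array's size before and after. A minimal sketch; the index 100 is just an arbitrary out-of-bound position:

```perl
use strict;
use warnings;

my @array = (1 .. 4);
my $before = scalar(@array);                    # 4

# Assigning past the end extends the array; the gap is filled with undef.
$array[6] = "perl";
my $after = scalar(@array);                     # 7
my $gap   = defined($array[4]) ? "defined" : "undef";

# Merely reading an out-of-bound element returns undef
# and does not extend the array.
my $beyond = $array[100];
my $still  = scalar(@array);                    # 7

print "$before $after $gap $still\n";           # 4 7 undef 7
```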
Occasionally you would have to access the last element of the array. $#arrayname gives
the index of the last element, so you can access the last element with $arrayname[$#arrayname]. You can also
use -1 as an index.
@array = (12, 43, 54, 213);
print $array[-1]; # 213
print $#array; # 3, the index of the last element
print $array[$#array]; # 213
@stuff = ("one", "two", "three");
$stuff = @stuff; # $stuff = 3
$stuff = ("one", "two", "three"); # $stuff = "three" (the comma operator returns its last value)
LISTs do automatic interpolation of sublists. That is, when a LIST is evaluated,
each element of the list is evaluated in a list context, and the resulting
list value is interpolated into LIST just as if each individual element
were a member of LIST. Thus arrays lose their identity in a LIST.
(@foo, @bar, &Somesub)
contains all the elements of @foo, followed by all elements of @bar, followed
by all the elements returned by the subroutine named Somesub when it's
called in a list context. You can use a reference to an array if you do
not want it to interpolate. Null list is represented by ().
@days + 0; # implicitly forces @days into a scalar context
scalar(@days); # explicitly forces @days into a scalar context
@whatever = (); # assigning a null list
$#whatever = -1; # assigning a null list
Using Arrays as Stacks (push and pop):
When I was learning C++, I had to go through a lot of pain to learn
how to create my own stack. I didn't have to go through the same pain in
Java because there is a class by the name of stack defined in the language.
Learning to use it took a little time but was a blessing when compared
to C++. In perl, you can convert an array into a stack in one line and
then back in another line! No wonder a lazy programmer like myself got
hooked to Perl. To utilize an array as a stack, use the push and pop functions:
@LIFO = (1, 2, 3);
push(@myList, @LIFO); # @myList = (1, 2, 3)
$one = 34;
push(@myList, $one); # @myList = (1, 2, 3, 34)
push(@myList, 99, 100); # @myList = (1, 2, 3, 34, 99, 100)
$index = pop(@myList); # $index = 100
The push function takes an array and a list of elements to append to it.
It then appends them and returns the new length of the array. The pop function
removes the last element of an array and returns that element. If the array
is empty, it returns undef.
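The return values described above can be checked with a short sketch (the variable names are illustrative):

```perl
use strict;
use warnings;

my @stack = (1, 2, 3);

# push returns the new length of the array.
my $length = push(@stack, 34);      # 4
push(@stack, 99, 100);              # (1, 2, 3, 34, 99, 100)

# pop removes and returns the last element.
my $top = pop(@stack);              # 100

# pop on an empty array returns undef.
my @empty;
my $nothing = pop(@empty);

print "$length $top @stack\n";      # 4 100 1 2 3 34 99
print "empty pop gives undef\n" unless defined $nothing;
```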
shift and unshift:
The push and pop functions deal with the highest subscripts. This is
sometimes called the right side of an array. Now that we have discovered that
an array can also be treated like a stack, it seems a bit awkward to call
it an array. This is why the word list is often used to refer to an array. The shift
and unshift functions deal with the lowest subscripts. This is sometimes
called the left side of the array:
unshift(@array, $a); # like @array = ($a, @array);
unshift(@array, $a, $b); # like @array = ($a, $b, @array);
$x = shift(@array); # like ($x, @array) = @array;
@array = (5, 6, 7);
unshift(@array, 2, 3, 4); # @array is now (2, 3, 4, 5, 6, 7)
$x = shift(@array); # $x gets 2, @array is now (3, 4, 5, 6, 7)
The unshift and shift functions work just like push and pop respectively,
except that they add elements to the start of an array instead of the end.
splice
The push, pop, shift, and unshift functions are special cases of a
more general function called splice, which changes the elements of an array.
The splice function takes four arguments:
the array to be modified
the index at which it's to be modified
the number of elements to be removed (starting at the index specified in the previous argument)
a list of extra elements to be inserted at the index (after the previous elements are removed)
The function returns a list of the elements removed from the array being
modified.
The following is from programming perl:
splice ARRAY, OFFSET, LENGTH, LIST
splice ARRAY, OFFSET, LENGTH
splice ARRAY, OFFSET
This function removes the elements designated by OFFSET and LENGTH from
an array, and replaces them with the elements of LIST, if any. The function
returns the elements removed from the array. The array grows or shrinks
as necessary. If LENGTH is omitted, the function removes everything from
OFFSET onward. The following equivalences hold (assuming $[ is 0):
| Direct Method | Splice Equivalent |
| push(@a, $x, $y) | splice(@a, $#a+1, 0, $x, $y) |
| pop(@a) | splice(@a, -1) |
| shift(@a) | splice(@a, 0, 1) |
| unshift(@a, $x, $y) | splice(@a, 0, 0, $x, $y) |
| $a[$x] = $y | splice(@a, $x, 1, $y) |
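To get a feel for splice, here is a small sketch that removes, inserts, and replaces elements in one array. The values are made up; the comments show the array after each step.

```perl
use strict;
use warnings;

my @a = (10, 20, 30, 40, 50);

# Remove two elements starting at index 1.
my @removed = splice(@a, 1, 2);             # @removed = (20, 30); @a = (10, 40, 50)

# Insert without removing anything (LENGTH is 0).
splice(@a, 1, 0, "x", "y");                 # @a = (10, x, y, 40, 50)

# Replace one element.
my @replaced = splice(@a, 0, 1, "first");   # @replaced = (10); @a = (first, x, y, 40, 50)

# With LENGTH omitted, everything from OFFSET onward is removed.
my @tail = splice(@a, 3);                   # @tail = (40, 50); @a = (first, x, y)

print "@removed | @replaced | @tail | @a\n";   # 20 30 | 10 | 40 50 | first x y
```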
The splice function is also handy for carving up the argument list passed
to a subroutine. For example, assuming list lengths are passed before lists:
sub list_eq {
# compare two list values
my @a = splice(@_, 0, shift);
my @b = splice(@_, 0, shift);
return 0 unless @a == @b;# same len?
while (@a) {
return 0 if pop(@a) ne pop(@b);
}
return 1;
}
if (list_eq($len, @foo[1..$len], scalar(@bar), @bar)) { ... }
It would probably be cleaner just to use references for this, however.
reverse
The reverse function reverses the order of the elements of its arguments,
returning the resulting list. The original list is left unaltered, because
reverse works on a copy.
@array1 = (234, 89, 36, 98);
@array2 = reverse(@array1); # @array2 = (98, 36, 89, 234)
sort:
The sort function returns its arguments in sorted order. By default it compares
the elements as strings, so numbers are not sorted numerically.
sort("one", "two", "three"); # one three two
sort(1, 2, 12, 24); # 1, 12, 2, 24
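To sort numerically, sort can be given a comparison block that uses the <=> operator. The following is a small sketch with made-up values; the variable names are only illustrative.

```perl
use strict;
use warnings;

my @numbers = (24, 1, 12, 2);

my @as_strings = sort @numbers;                 # default: string comparison
my @ascending  = sort { $a <=> $b } @numbers;   # numeric, smallest first
my @descending = sort { $b <=> $a } @numbers;   # numeric, largest first

print "@as_strings\n";   # 1 12 2 24
print "@ascending\n";    # 1 2 12 24
print "@descending\n";   # 24 12 2 1
```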
chomp:
The chomp function works on an array variable as well as a scalar variable.
It removes the trailing newline, if there is one, from each element.
@stuff = ("one\n", "two\n", "three");
chomp(@stuff); # @stuff is now ("one", "two", "three")
| Expression | Meaning |
| @days | Same as ($days[0], $days[1], ..., $days[n]) |
| @days[3..5] | Same as ($days[3], $days[4], $days[5]) |
| @days[3..5] | Same as @days[3, 4, 5] |
| @days{'Jan', 'Feb'} | Same as ($days{'Jan'}, $days{'Feb'}) |
Associative Arrays (Hashes):
Hashes are also called associative arrays. I will be using the two terms interchangeably. Hashes are indexed by string values instead of an integer index value. Associative arrays, unlike scalar arrays, do not have a sense of order. There is no first addressable element. This is because the indexes of the hashes are strings and information is not stored in a predictable order.
A hash is best thought of as a two-column table, where the left column stores keys and the right column stores their associated scalar values. It's called a hash because a hashing algorithm is used to map each key string to an internal index into the table. To retrieve a value from a hash, you must know the key. If you know a key of the hash %hash and you want to print out its value, you would use the following syntax:
print $hash{'mike'};
This example prints out the value of a key named mike in the hash named %hash. The interior of the curly braces (or the left-hand side of a => operator) of the hash will automatically interpret an identifier as a quoted string. So we can also write:
print $hash{mike}; # notice that there are no quotes.
This is only true, however, if the contents are an unbroken sequence of alphanumerics or underscores. That is, we can't write:
$sound{mike willis}= "son of willis"; # wrong
if we mean:
$sound{"mike willis"} = "son of willis";
Populating a Hash:
Much like a normal array, an associative array can have all its values assigned at once. The following assigns records to the hash %cities:
%cities = ("Toronto" => "East", "Calgary" => "Central", "Vancouver" => "West");
is equivalent to
%cities = ("Toronto", "East" => "Calgary", "Central" => "Vancouver", "West");
is equivalent to
%cities = ("Toronto", "East", "Calgary", "Central", "Vancouver", "West");
is equivalent to
$cities{'Toronto'} = "East";
$cities{'Vancouver'} = "West";
$cities{'Calgary'} = "Central";
Functions:
Data in hashes is stored in key/value pairs. In the example above, Toronto is the key, East is the value. Because the pairs are stored in no particular order, a hash cannot be indexed by position the way an array can. The contents of a hash can be listed by using any of the functions keys, values, and each.
Keys:
The keys function returns a list of the keys of the given associative array when used in a list context, and the number of keys when used in scalar context.
my %cities = ("Toronto" => "East", "Calgary" => "Central", "Vancouver" => "West");
for $key (keys %cities)
{
print "Key: $key Value: $cities{$key} \n";
}
In scalar context, the keys function gives the number of elements (key-value pairs) in the hash. For example:
if (keys(%cities) == 3) {
# do something
}
Values:
The loop in the keys example above prints both the keys and their values. If you want only the values of the hash and not the keys, use the values function.
my %cities = ("Toronto" => "East", "Calgary" => "Central", "Vancouver" => "West");
for $value (values %cities)
{
print "Value: $value \n";
}
or
@array = values(%cities);
Each:
The each function returns one key-value pair each time it is called, so calling it in a loop iterates over the entire hash.
my %cities = ("Toronto" => "East", "Calgary" => "Central", "Vancouver" => "West");
while(($key, $value) = each(%cities))
{
print "Key: $key Value: $value \n";
}
Delete:
How do you delete an element from a hash? It cannot be done by assigning an undefined or empty value, because the key would still exist. It can't be done by chopping it off the end either, since a hash has no order and therefore no end to chop. The delete function exists for this purpose.
delete $cities{"Toronto"};
This will delete the key value pair.
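The difference between clearing a value and deleting a key can be seen with the exists function. This is a minimal sketch using the %cities data from above; the helper variable names are illustrative.

```perl
use strict;
use warnings;

my %cities = (Toronto => "East", Calgary => "Central", Vancouver => "West");

$cities{Calgary} = undef;                 # the value is gone, but the key remains
my $still_there = exists $cities{Calgary} ? "yes" : "no";

my $removed = delete $cities{Toronto};    # delete returns the removed value
my $gone    = exists $cities{Toronto} ? "no" : "yes";

print "Calgary key still exists: $still_there\n";   # yes
print "Toronto ($removed) is gone: $gone\n";        # yes
print "Keys left: ", scalar(keys %cities), "\n";    # 2
```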
Hash Slices:
Like an array, hashes can also be sliced. Observe:
$cities{"Toronto"} = "East";
$cities{"Calgary"} = "Central";
$cities{"Vancouver"} = "West";
This can be simplified to:
($cities{"Toronto"}, $cities{"Calgary"}, $cities{"Vancouver"}) = ("East", "Central", "West");
or
@cities{"Toronto", "Calgary", "Vancouver"} = ("East", "Central", "West");
or
@locations = qw(Toronto Calgary Vancouver);
print "Places are: @cities{@locations}\n";
Hash slices can also be used to merge a smaller hash into a larger one. In this example, the smaller hash takes precedence in the sense that if there are duplicate keys, the value from the smaller hash is used:
@destinations{keys %cities} = values %cities;
or
%destinations = (%destinations, %cities);
The values of %cities are merged into the %destinations hash.
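Here is a small runnable sketch of the slice-based merge. The contents of %destinations are invented for the example. Note that a hash slice is written with the @ sigil, because it denotes a list of values.

```perl
use strict;
use warnings;

my %destinations = (Toronto => "Ontario", Halifax => "Atlantic");
my %cities       = (Toronto => "East",    Calgary => "Central");

# keys and values return their lists in matching order for the same hash,
# so each key of %cities is paired with its own value.
@destinations{ keys %cities } = values %cities;

for my $place (sort keys %destinations) {
    print "$place => $destinations{$place}\n";
}
# Calgary => Central
# Halifax => Atlantic
# Toronto => East
```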
A subroutine is a small user-defined, self-contained subprogram. Like Perl's built-in functions, a subroutine is invoked by name and may have arguments passed to it. A subroutine may return a scalar or list value.
Defining subroutines:
Subroutines are defined using the sub keyword, followed by the subroutine code in curly braces:
sub dictionary_order
{
@ordered = sort @_;
return @ordered;
}
The following is an error because & was used:
sub &dictionary_order # Fatal compile-time error
{
return sort @_;
}
Calling subroutines:
Subroutines are called by specifying their name, followed by a list of arguments:
@sorted = dictionary_order ("eat", "at", "Joes");
@sorted = dictionary_order (@unsorted);
@sorted = dictionary_order (@sheep, @goats, "shepherd", $goatherd);
@sorted = &dictionary_order("eat", "at", "Joes");
You can also call a subroutine without parentheses, provided the subroutine has been declared before the call:
sub make_sequence # args: (from, to, step_size)
{
# to see the arguments, you can do any of the following.
print "@_";
print $_[0], " ", $_[2], "etc";
%arg = @_;
print $arg{min}, " ", $arg{max}, " ", $arg{step_size};
@list = ();
for ($n = $_[0]; $n < $_[1]; $n+=$_[2])
{
push @list, $n;
}
return @list;
}
# then later...
@stepped_sequence = make_sequence $min, $max, $step_size;
Passing arguments:
Just like any other list, if the argument list contains nested lists or arrays, they are "flattened." Therefore, at the start of the third call to dictionary_order above, @_ would contain the contents of the array @sheep, followed by the contents of @goats, the value "shepherd", and finally the scalar value stored in $goatherd. It is possible to pass two or more arrays to a subroutine and keep them "unflattened" by using explicit references.
Refer back to first example in defining subroutines above. The arguments passed to the subroutine are available within its code block via the special @_ array. The built-in function return causes execution of the subroutine to finish immediately and the value specified after the return to be returned as the result. Using a return is optional in a subroutine. If none is specified, the subroutine automatically returns the value of the last statement it actually executed.
Because a subroutine's arguments are passed to it in the special array @_, and because all arrays in Perl are dynamically sized, any subroutine may be passed any number of arguments.
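The effect of flattening can be seen by counting the elements of @_. The sketch below uses a hypothetical count_args subroutine and made-up array contents.

```perl
use strict;
use warnings;

# Hypothetical helper: report how many scalars arrived in @_.
sub count_args {
    return scalar @_;
}

my @sheep = ("Dolly", "Shaun");
my @goats = ("Billy");

my $flattened   = count_args(@sheep, @goats, "shepherd");     # arrays are flattened
my $unflattened = count_args(\@sheep, \@goats, "shepherd");   # each reference is one scalar

print "flattened: $flattened, by reference: $unflattened\n";
# flattened: 4, by reference: 3
```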
Named arguments:
Suppose we want to implement a subroutine called listdir that provides the functionality of our operating system's directory listing command (i.e., dir or ls). Such a subroutine might take arguments specifying which files to list, what type of files to consider, whether to list hidden files, what details of each file should be reported, whether files and directories should be listed recursively, how many columns to use, and whether the output should be paged or just dumped.
But we certainly don't want to have to specify every one of those nine parameters every time we call listdir:
listdir(undef, undef, 1, 1, undef, undef, undef, 4, 1);
Some programming languages provide a mechanism for naming the arguments passed to a subroutine. Perl supports named arguments in a cunning way. If we pretend that a particular subroutine takes a hash, rather than a list, we can use the => operator to associate a name with each argument. For example:
listdir(cols=>4, page=>1, hidden=>1, sep_dirs=>1);
Inside the subroutine, we simply initialize a hash with the resulting contents of the @_ array. We can access the arguments by name, using each name as the key to an entry in the hash. For example, we can define listdir like so:
sub listdir
{
%arg = @_; # Convert argument list to hash
# Use defaults for missing arguments...
$arg{match} = "*" unless exists $arg{match};
$arg{cols} = 1 unless exists $arg{cols};
# etc.
# Use arguments to control behaviour...
@files = get_files( $arg{match} );
push @files, get_hidden_files() if $arg{hidden};
# etc.
}
Since the entries of a hash can be initialized in any convenient order, we no longer need to remember the order of the nine potential arguments, as long as we remember their names. Because hashes are flattened inside lists, if we have several calls that require the same subset of arguments, we can store that subset in a separate hash and reuse it:
%std_listing = (cols=>2, page=>1, sort_by=>"date");
listdir(file=>"*.txt", %std_listing);
listdir(file=>"*.log", %std_listing);
listdir(file=>"*.dat", %std_listing);
We can even override specific elements of the standard set of arguments, by placing an explicit version after the standard set. Then the explicit version will reinitialize (i.e. overwrite) the corresponding entry in the hash:
listdir(file=>"*.exe", %std_listing, sort_by=>"size");
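The following sketch shows named arguments, defaults, and overriding in a runnable form. It uses a hypothetical describe_listing subroutine that only reports the settings it receives, and it supplies defaults by listing them before @_ in the hash initialization, which is a slight variation on the "unless exists" approach shown above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for listdir: it only reports the settings it would use.
sub describe_listing {
    # Defaults come first; any pairs passed by the caller overwrite them.
    my %arg = (cols => 1, match => "*", @_);
    return join ", ", map { sprintf("%s=%s", $_, $arg{$_}) } sort keys %arg;
}

my %std_listing = (cols => 2, page => 1);

print describe_listing(), "\n";                                 # cols=1, match=*
print describe_listing(match => "*.txt", %std_listing), "\n";   # cols=2, match=*.txt, page=1
print describe_listing(%std_listing, cols => 4), "\n";          # cols=4, match=*, page=1
```

Because later pairs overwrite earlier ones, the order in which the defaults, the shared set, and the explicit overrides appear determines which value wins.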
Aliasing of parameters:
Elements of the @_ array are special in that they are not copies of the actual arguments of the function call. Rather they are aliases for those arguments. That means that if values are assigned to $_[0], $_[1], $_[2], etc., each value is actually assigned to the corresponding argument with which the current subroutine was invoked. In other words, it's call-by-reference rather than call-by-value. The following subroutine increments its first argument each time it's called, but keeps the result less than 10 at all times.
sub cyclic_incr
{
$_[0] = ($_[0]+1) % 10;
}
The result would be:
$next_digit = 8;
print $next_digit; # prints 8
cyclic_incr($next_digit);
print $next_digit; # prints 9
cyclic_incr($next_digit);
print $next_digit; # prints 0
Passing an unmodifiable value like 7, as opposed to a variable like $next_digit, would cause a fatal error, because the subroutine would be trying to assign to a constant. If you don't intend to change the values of the original arguments, it's usually a good idea to explicitly copy the @_ array into a set of variables.
sub next_cyclic
{
($number, $modulus) = @_;
$number = ($number+1) % $modulus;
return $number;
}
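The variables $number and $modulus are still global, but copying the arguments into them protects the caller's variables and gives the arguments readable names. To make the variables local to the subroutine, declare them with the my keyword.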
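A minimal sketch contrasting the two styles: one subroutine modifies its argument through the @_ alias, the other copies the argument into a my variable first. The subroutine names are made up for the example.

```perl
use strict;
use warnings;

# Modifies the caller's variable through the alias in @_.
sub incr_in_place { $_[0]++ }

# Works on a lexical copy, so the caller's variable is untouched.
sub incr_copy {
    my ($number) = @_;
    return $number + 1;
}

my $n = 8;
incr_in_place($n);
print "after incr_in_place: n=$n\n";                # n=9

my $result = incr_copy($n);
print "after incr_copy: n=$n result=$result\n";     # n=9 result=10

# Passing a literal to the aliasing version is a run-time error; eval traps it.
my $ok = eval { incr_in_place(7); 1 };
print "literal argument failed: $@" unless $ok;
```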
Calling Context
When a subroutine is called, it's possible to detect whether it was expected to return
* a scalar value
* a list or
* nothing at all
These three possibilities define three contexts in which a subroutine may be called.
listdir(@files); # void context: no return value expected
$listed = listdir(@files); # scalar context: scalar return value expected
@missing = listdir(@files); # list context: list return value expected
($f1, $f2) = listdir(@files); # list context
print( listdir(@files) ); # list context
Wantarray function
Perl has a built-in function, wantarray, that tells a subroutine what kind of value it is expected to return. The function returns
* undef if the current subroutine was not expected to return a value.
* "" if it was expected to return a scalar.
* 1 if it was expected to return a list.
We could use this information to select the appropriate form of return statement (and perhaps optimize for cases where the return value would not be used). For example:
sub listdir
{
# Do file listing, and then:
return @missing_files if wantarray();
return $listed_count if defined(wantarray());
}
If a subroutine is always supposed to return a value, we could issue a warning whenever that return value is ignored:
use Carp;
sub listdir
{
# Do file listing, and then:
return @missing_files if wantarray;
return $listed_count if defined(wantarray);
carp "subroutine &listdir was called in void context";
}
We use the Carp::carp subroutine instead of the built-in warn function, so that the warning reports the location of the call to listdir instead of the location within listdir at which the problem was actually detected.
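The three contexts can be observed directly with a small test subroutine. This is a sketch only; the subroutine name context is made up for the example.

```perl
use strict;
use warnings;

sub context {
    return "list"   if wantarray;            # true: list context
    return "scalar" if defined wantarray;    # false but defined: scalar context
    print "called in void context\n";        # undef: void context
    return;
}

my @list   = context();   # list context
my $scalar = context();   # scalar context
context();                # void context, prints the message

print "$list[0], $scalar\n";   # list, scalar
```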
Determining a subroutine's caller
The Carp module is useful because it reports the location of a subroutine's caller, rather than the location of the subroutine's code.
caller function
Unlike most languages, Perl makes it easy to determine where a subroutine was called. The built-in caller function provides details of the caller. This function works differently in scalar and list context:
1. Scalar Context
In scalar context, caller returns only the name of the package from which the current subroutine was called.
2. List Context
In list context, when called without an argument, caller returns the first three items below. When called with an integer argument (the number of call frames to go back, where 0 means the current subroutine), it returns the remaining items as well:
1. the package from which the current subroutine was called.
2. the name of the file containing the code that called the current subroutine
3. the line in that file from which the current subroutine was called
4. the name of the subroutine
5. whether the subroutine was passed arguments
6. the context in which the subroutine was called (the value returned by wantarray)
7. the actual source code that called the subroutine (but only if the call was part of an eval TEXT statement)
8. whether the subroutine was called as part of a require or use statement.
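A short sketch of caller in use follows. The subroutine name who_called is hypothetical; the package is main because the code runs at the top level of a script.

```perl
use strict;
use warnings;

sub who_called {
    my $package = caller;                    # scalar context: package name only
    my ($pkg, $file, $line) = caller;        # list context, no argument
    my $sub_name = (caller(0))[3];           # with an argument: fourth item is the sub name
    return ($package, $line, $sub_name);
}

my ($package, $line, $sub_name) = who_called();
print "package=$package line=$line sub=$sub_name\n";
```

The line number printed is the line of the script on which who_called() was invoked.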
Prototypes
Subroutines can also be declared with a prototype, which is a series of specifiers that tells the compiler to restrict the type and number of arguments with which the subroutine may be invoked. For example, in the subroutine definition
sub insensitive_less_than ($$)
{
return lc($_[0]) lt lc($_[1]);
}
the prototype is ($$) and specifies that the subroutine insensitive_less_than can only be called with two arguments, each of which will be treated as a scalar -- even if it's actually an array. In other words, a $ prototype causes the corresponding argument to be evaluated in a scalar context. That means, for example, that a call like insensitive_less_than(@a, @b) will treat @a and @b as scalars. The two values passed to insensitive_less_than will be the lengths of @a and @b respectively, not their contents. This kind of introduced subtlety is a good reason to avoid using a prototype, unless you're very confident that you know its full consequences.
Prototypes are only enforced when a subroutine is called using the name(args) syntax. Prototypes are not enforced when a subroutine is called with a leading & or through a subroutine reference. They are also ignored when an object method is called.
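The scalar-context effect of a ($$) prototype, and the way a leading & bypasses it, can be seen in this sketch. The subroutine show_args and the array contents are invented for the example.

```perl
use strict;
use warnings;

# Declared before use so the prototype is known when the calls are compiled.
sub show_args ($$) { return join ",", @_ }

my @a = (10, 20, 30);
my @b = (40, 50);

my $with_prototype = show_args(@a, @b);    # each array is evaluated in scalar context
my $bypassed       = &show_args(@a, @b);   # leading & skips the prototype, arrays flatten

print "prototype applied:  $with_prototype\n";   # 3,2
print "prototype bypassed: $bypassed\n";         # 10,20,30,40,50
```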
In Perl, references are not just pointers, they are data types.
Creating a Reference
A reference is created by placing a backslash in front of a variable (or subroutine) name. The resulting reference is itself a scalar value, so it can be stored in a scalar variable.
# Set up the data types.
my $scalarVar = "Something";
# Create a reference to it.
my $scalarRef = \$scalarVar;
Dereferencing a Reference
In order to access the information that a reference points to, the reference must be dereferenced. Perl's references do not automatically dereference themselves when used. e.g.
# Initialize variables
my $scalarVar = "something";
my @arrayVar = qw(a b c d e);
my %hashVar = ("Toronto" => "East", "Calgary" => "Central", "Vancouver" => "West");
# Create the references
my $scalarRef = \$scalarVar;
my $arrayRef = \@arrayVar;
my $hashRef = \%hashVar;
# Print out the references.
print "$scalarRef \n";
print "$arrayRef \n";
print "$hashRef \n";
# The output of the program is
SCALAR(0xaddc4)
ARRAY(0xadec0)
HASH(0xade30)
So, how do we dereference? Dereferencing is different for scalars, arrays, and so on. Let's look at each one of them.
Scalar References
A scalar reference is a reference to a scalar value. The example below shows how a scalar reference is created, dereferenced, and printed.
# Creating a scalar variable
my $scalarVar = "something";
# Creating a scalar reference
my $scalarRef = \$scalarVar;
# Printing a scalar variable
print "Var: $scalarVar \n";
# Printing a scalar reference
print "Ref: " . $$scalarRef . "\n";
# The output would be
Var: something
Ref: something
Note the two $ signs used to dereference the scalar reference. Also note the difference in the way the scalar variable and the scalar reference are printed.
Array References:
An array reference is created using the \ operator and dereferenced using @$.
# Create the array
my @letters = qw(a b c d e);
# Create the array reference
my $arrayRef = \@letters;
# Print each element through the array reference
for my $letter (@$arrayRef)
{
print "Letters: $letter \n";
}
Hash References:
Hash references are created using the \ operator and dereferenced using %$.
# Create an associative array
my %who = ('Name' => 'Gizmo', 'Age' => 3, 'Height' => '10 cm', 'Weight' => '10 gm');
# Create the hash reference
my $hashRef = \%who;
# Print the associative array
for $key (sort keys %$hashRef)
{
$value = $hashRef->{$key};
printf "Key: %10s Value: %-40s\n", $key, $value;
}
# output of program
Key: Age Value: 3
Key: Height Value: 10 cm
Key: Name Value: Gizmo
Key: Weight Value: 10 gm
Code References
A code reference points to a Perl subroutine. Code references are mainly used for callback functions, where a callback is a function that you ask to have called at a later time. Code references are created with the \ operator and dereferenced with &$.
# define the callback function
sub callBack
{
my ($mesg) = @_;
print "$mesg\n";
}
# Create the code reference
my $codeRef = \&callBack;
# Call the callback function with different parameters.
&$codeRef("Hi someone");
&$codeRef("something");
Anonymous Array References
An anonymous array is an array that has no variable name of its own. The array is created and a reference to it is stored in a scalar, instead of the array being stored in an array variable. This is handy when you want a temporary array but don't want to invent a new array name. When you use an anonymous array, Perl allocates the storage for the array and returns a reference to it. To create an anonymous array, use square brackets around a list of values. The following is an anonymous array inside an anonymous array.
# create the anonymous array reference
my $arrayRef = [[1, 2, 3, 4], 'a', 'b', 'c', 'd', 'e', 'f'];
# Print out some of the array
print $arrayRef->[0][0] . "\n";
print $arrayRef->[0][1] . "\n";
print $arrayRef->[1] . "\n";
# output is
1
2
a
References are particularly useful in creating multidimensional data structures. As we saw earlier, nested lists are automatically flattened, so trying to build a list of lists doesn't work:
@table = (
( 1, 2, 3 ),
( 2, 4, 6 ),
( 3, 6, 9 ),
);
This fails to have the desired effect because flattening makes the above equivalent to:
@table = (1,2,3,2,4,6,3,6,9);
Fortunately, each element in a Perl array can store any kind of scalar value. Since a reference is just a special kind of scalar, it's possible to write:
@row1 = (1,2,3);
@row2 = (2,4,6);
@row3 = (3,6,9);
@cols = (\@row1, \@row2, \@row3);
$table = \@cols;
Now the elements in the "row" arrays can be accessed using the arrow notation:
print "2 x 3 is ", $table->[1]->[2];
Of course, tables like this are very popular, so Perl provides syntactic assistance. If we specify a list of values in square brackets instead of parentheses, the result is not a list, but a reference to a nameless (or anonymous) array. That array is automatically initialized to the specified values. So the above code could be written as:
$row1_ref = [ 1, 2, 3];
$row2_ref = [ 2, 4, 6];
$row3_ref = [ 3, 6, 9];
$table = [$row1_ref, $row2_ref, $row3_ref];
or use nested brackets
my $table =
[
[ 1, 2, 3],
[ 2, 4, 6],
[ 3, 6, 9],
];
And finally
print $table->[1]->[2];
can be replaced with
print $table->[1][2];
Anonymous Hash References
Anonymous hash (associative array) references are created the same way anonymous array references are created, except that curly braces are used instead of square brackets. The hash is created and a reference to it is stored directly in a scalar.
my $hashRef = {'Name' => 'Gizmo', 'Age' => 3, 'Height' => '10 cm'};
print $hashRef->{'Name'} . "\n";
print $hashRef->{'Age'} . "\n";
print $hashRef->{'Height'} . "\n";
# output is
Gizmo
3
10 cm
It is possible to create references to anonymous hashes by replacing the parentheses of a hash-like list:
%association = ( cat=>"nap", dog=>"gone", mouse=>"ball" ); # parentheses
with curly braces:
$association = { cat=>"nap", dog=>"gone", mouse=>"ball" }; # curly braces
Like the [...] array constructor, the {...} hash constructor returns a reference, which must be assigned to a scalar variable ($association), not to a hash (%association). Access to the resulting anonymous hash is only possible through the returned reference:
print $association->{cat};
We can even create multilevel hashes, by nesting anonymous hash references:
$behaviour =
{
cat => { nap => "lap", eat => "meat" },
dog => { prowl => "growl", pool => "drool" },
mouse => { nibble => "kibble" },
};
Accessing the data requires a chain of arrow operators:
print $behaviour->{cat}->{eat};
And, as with multidimensional arrays, any arrows after the first can be omitted:
print $behaviour->{mouse}{nibble};
Anonymous Subroutine References
An anonymous subroutine is a subroutine that has been defined without a name. The sub keyword, used without a name, returns a code reference, and the &$ prefix (or the -> operator) is used to call the routine through that reference. The following script creates a reference to an anonymous function.
my $codeRef = sub { my $mesg = shift; print "$mesg\n"; };
&$codeRef ("hi someone");
&$codeRef("something");
Passing subroutine arguments as explicit references
References also provide a means of passing unflattened arrays or hashes into subroutines. Suppose we want to pass an array of values to a subroutine. We can't call this subroutine in the obvious way:
insert(@ordered, $next_val);
because normal list flattening will squash the contents of @ordered and the value of $next_val into a single list. Instead, we could set up the subroutine insert so that it expected a reference to the array as its first argument:
sub insert
{
($arr_ref, $new_val) = @_;
@{$arr_ref} = sort {$a<=>$b} (@{$arr_ref}, $new_val); # numerical sort
}
We could then call it like so:
insert(\@ordered, $next_val);
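Put together as a runnable sketch, with lexical variables and made-up sample values, the reference-based insert behaves like this:

```perl
use strict;
use warnings;

sub insert {
    my ($arr_ref, $new_val) = @_;
    # Rebuild the caller's array in numeric order, including the new value.
    @{$arr_ref} = sort { $a <=> $b } (@{$arr_ref}, $new_val);
    return;
}

my @ordered = (1, 4, 9);
insert(\@ordered, 5);
insert(\@ordered, 0);

print "@ordered\n";   # 0 1 4 5 9
```

Because the subroutine receives a reference, the assignment through @{$arr_ref} changes the caller's @ordered array itself.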
Identifying a Referent
Because a scalar variable can store a reference to any kind of data, and dereferencing a reference with the wrong prefix leads to fatal errors, it's sometimes convenient to be able to determine the type of referent to which a specific reference refers. Perl provides a built-in function called ref that takes a scalar, such as $slr_ref, and returns a description of the kind of reference it contains.
What ref returns:
| If $slr_ref contains ... | then ref($slr_ref) returns ... |
| a scalar value | "" (the empty string) |
| a reference to a scalar | "SCALAR" |
| a reference to an array | "ARRAY" |
| a reference to a hash | "HASH" |
| a reference to a subroutine | "CODE" |
| a reference to a filehandle | "IO" or "IO::Handle" |
| a reference to a typeglob | "GLOB" |
| a reference to a precompiled pattern | "Regexp" |
| a reference to another reference | "REF" |
The ref function can be used to improve error messages.
die "Expected scalar reference" unless ref($slr_ref) eq "SCALAR";
or to allow a subroutine to automatically dereference any arguments that might be references:
sub trace
{
($prefix, @args) = @_;
foreach $arg ( @args )
{
if (ref($arg) eq 'SCALAR') { print $prefix , ${$arg} }
elsif (ref($arg) eq 'ARRAY') { print $prefix, @{$arg} }
elsif (ref($arg) eq 'HASH') { print $prefix, %{$arg} }
else { print $prefix, $arg }
}
}
The ref function has a vital additional role in object-oriented Perl, where it can be used to identify the class to which a particular object belongs.
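The following sketch runs ref over a handful of values to show what it reports for each. The kind_of helper and the sample values are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: describe what kind of thing a scalar holds.
sub kind_of {
    my ($thing) = @_;
    my $type = ref $thing;
    return $type eq '' ? 'plain scalar' : $type;
}

my @kinds = map { kind_of($_) }
    ("text", \"text", [1, 2], { a => 1 }, sub { 1 }, \\"text", qr/x/);

print join(", ", @kinds), "\n";
# plain scalar, SCALAR, ARRAY, HASH, CODE, REF, Regexp
```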
Regular Expressions
Regular expressions are used to search for patterns in strings of data.
Pattern-Matching Operators:
Pattern-matching operators are the keywords in Perl that perform pattern matches. The difference between regular expression syntax and pattern-matching operators is that regular expressions allow the programmer to build complex expressions, whereas pattern-matching operators deal with how to use them. The syntax used to perform a pattern match on a string is:
$string =~ /regular expression/expression modifier (optional)
The pattern between the slashes is what will be searched for.
The two main pattern matching operators are m//, the match operator, and s///, the substitution operator. There is also a split operator, which takes an ordinary match operator as its first argument but otherwise behaves like a function.
Although we write m// and s/// here, you can pick your own quote characters. On the other hand, for the m// operator only, the m may be omitted if the delimiters you pick are in fact slashes. (You'll often see patterns written this way, for historical reasons.)
Regular Expression Syntax:
Perl's regular expression syntax is very rich. Regular expressions are applied to strings with the pattern-binding operators =~ and !~. The first compares a string to the pattern and succeeds if the two match. The second compares the string to the pattern and succeeds if the match fails. Its syntax is:
$string !~ /regular expression/expression modifier (optional)
Modifiers:
An expression modifier can be added to most regular expressions to modify the behaviour of the expression. The following is an example.
# Create a basic string.
my $string = "Hello World!";
if ($string =~ /Hello World!/)
{
print "Case Match!\n";
}
if ($string =~ /hello WORLD!/i)
{
print "Case insensitive Match!\n";
}
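A compact sketch of the binding operators, the /i modifier, and the substitution operator, using the same sample string; the variable names are illustrative.

```perl
use strict;
use warnings;

my $string = "Hello World!";

my $exact       = $string =~ /Hello World!/  ? "yes" : "no";   # case-sensitive match
my $wrong_case  = $string =~ /hello world!/  ? "yes" : "no";   # fails: case differs
my $ignore_case = $string =~ /hello world!/i ? "yes" : "no";   # /i ignores case
my $no_digits   = $string !~ /\d/            ? "yes" : "no";   # !~ succeeds when there is no match

(my $changed = $string) =~ s/World/Perl/;    # substitute on a copy of the string

print "$exact $wrong_case $ignore_case $no_digits\n";   # yes no yes yes
print "$changed\n";                                      # Hello Perl!
```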
Commandline parameters are values passed to the program from the commandline when calling the program. For example:
$ perl sum.pl 4 5
sum.pl has two commandline arguments, 4 and 5.
All commandline parameters in Perl get inserted into the @ARGV array in the order they are typed. To access the values in the @ARGV array (for this example), we use $ARGV[0] and $ARGV[1].
#!/usr/bin/perl
print $ARGV[0] + $ARGV[1];
This program prints the sum of the two arguments passed on the commandline.
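The same idea extends to any number of arguments. The sketch below uses a hypothetical sum_args helper and falls back to the sample values 4 and 5 when no arguments are given, so that it can be run without a commandline.

```perl
use strict;
use warnings;

# Hypothetical helper: add up a list of numbers.
sub sum_args {
    my $total = 0;
    $total += $_ for @_;
    return $total;
}

# Use the commandline arguments if there are any, otherwise sample values.
my @numbers = @ARGV ? @ARGV : (4, 5);
print sum_args(@numbers), "\n";
```

Run as perl sum.pl 4 5, this prints 9.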
My system is configured as follows:
Web root = c:/www
Apache version = 2.2.6
Apache installation directory = c:/Apache2.2/
Save the following as hello.pl:
#!c:/Perl/bin/perl
print "Hello, World...\n";
Start > Run > cmd > OK
Type perl hello.pl
If you see Hello, World..., Perl is installed correctly; otherwise reinstall ActiveState Perl.
To enable CGI, open Apache's httpd.conf and change
Options Indexes FollowSymLinks
to
Options Indexes FollowSymLinks ExecCGI
and change
#AddHandler cgi-script .cgi
to
AddHandler cgi-script .cgi .pl
#!/Perl/bin/perl
print "Content-type:text/html\n\n";
print "hello world";
Save it as c:/Apache2.2/cgi-bin/hello.cgi. View http://localhost/cgi-bin/hello.cgi in your browser. If you see hello world, CGI is working correctly.
Very often you would not want to run CGI from an Apache directory. To change the cgi-bin, create the folder
c:/www/cgi-bin/
Then, in httpd.conf, point the ScriptAlias line at the new folder:
ScriptAlias /cgi-bin/ C:/www/cgi-bin/
and change
<Directory "C:/Apache2.2/cgi-bin">
to
<Directory "C:/www/cgi-bin">
Perl errors are logged in the Apache error log. To view it, go to Start > Apache HTTP Server 2.2.6 > Review Server Log Files > Review Error Log, or open
C:/Apache2.2/logs/error.log
When using Apache and Perl on Windows, it is common to get a "Couldn't spawn child process" error. The solution is very simple. Verify that you are using the correct interpreter path on the first line of your file.
#!/usr/bin/perl
should be changed to
#!c:/Perl/bin/perl.exe
I am assuming that your perl.exe file is located at c:/Perl/bin/perl.exe. If it is at another address, use that address.
Perl string literals are enclosed in quotation marks. You can use either single quotes ('') or double quotes ("").
print "hello"; print 'hello';
Now suppose you need to print a string which contains a single or double quote such as following strings:
This is Bob's watch.
Alice replied, "It was a birthday gift."
If I do the following,
print 'This is Bob's watch.';
Then I would get an error message stating that the "substitution pattern was not terminated". This error is generated because Perl would read the following string: 'This is Bob'. Then it would not know what to do with "s watch.'".
There are several ways to resolve this problem.
Alternating quotes
The simplest technique is to alternate the quotes.
print "This is Bob's watch.";
print 'Alice replied, "It was a birthday gift."';
If the string contains a single quote, we use double quotes and if the string contains double quotes, we use single quotes. This technique could not be used if you have both single and double quotes inside a string.
You must also understand the differences between single and double quotes before using this approach. Double quotes support variable interpolation, single quotes do not. Suppose I run the following program:
#!/usr/bin/perl
$name = "John";
print 'His name is $name\n';
print "His name is $name\n";
This program would print the following results (the single-quoted string prints both $name and \n literally, so there is no line break after it):
His name is $name\nHis name is John
Adding Backslashes
Another simple technique is to add backslashes before the quotes inside the string.
print 'This is Bob\'s watch.';
print "Alice replied, \"It was a birthday gift\"";
With this technique, you can escape both single and double quotes inside a string. However, there are times when you would find this technique very tedious. For example, when there are 30 quotes to escape, which is quite common when you are printing HTML.
q and qq operators
The q operator quotes a string as if it were in single quotes, and the qq operator quotes it as if it were in double quotes (including variable interpolation), so quotes inside the string need no escaping. These operators are used as follows:
print q(This is Bob's watch.);
print qq(Alice replied, "It was a birthday gift");
The string is placed between the parentheses. You can also use other delimiter pairs, such as:
{ }, < >, or [ ]
Most programmers tend to use backslashes, q, and qq operators. There are several other advanced techniques as well. The best solution, however, is always the one which best suits your needs.
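As a short illustration of the q and qq operators with different delimiters, consider the sketch below. The sample strings and variable names are made up.

```perl
use strict;
use warnings;

my $name = "Alice";

# q() behaves like single quotes: no interpolation, and both kinds of quote are fine.
my $single = q(This is Bob's "watch" and $name is not interpolated.);

# qq{} behaves like double quotes: $name is interpolated.
my $double = qq{$name replied, "It was Bob's birthday gift."};

print "$single\n";   # This is Bob's "watch" and $name is not interpolated.
print "$double\n";   # Alice replied, "It was Bob's birthday gift."
```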