PERL for Biologists

Course by Kurt Stüber

Part 3

Arrays and mean values

Arrays are series of values, for instance the days of the week, the months of the year, daily rainfall measurements or hourly temperatures. Such values can be grouped as shown in the following declarations:

@weekday = ( "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday" );
@month = ( "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec" );
@rain = ( 12, 44, 0, 34, 75, 122, 4 );
@temperature = ( 37.4, 38.5, 39.1, 40.3, 39.8, 37.4 );

Individual values are accessed by their indices. For instance, the following statement prints the first value of the weekday array ("Monday"):

print $weekday[ 0 ];

When individual values are used the @-sign is replaced by the $-sign and the index is enclosed in square brackets []. Please note that the elements in the array are numbered starting with 0. The last day in the week has the index 6. It is also possible to assign values to individual elements of an array:
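As a small illustration of index access, the following sketch reads the first and the last element of the weekday array. The variable names $first and $last are chosen here for the example only.

```perl
# Minimal sketch: reading single elements of an array by index.
@weekday = ( "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday" );

$first = $weekday[ 0 ];   # first element, index 0
$last  = $weekday[ 6 ];   # seventh and last element, index 6

print "The week starts on $first and ends on $last.\n";
```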

$rain[ 7 ] = 17;

The element with index 7 was not defined in the declaration above. As soon as this assignment is executed the element is created and the array @rain is extended by one element.

A simple way to determine the current number of elements in an array is to assign the array to a scalar variable:

$no_of_elements_in_array = @weekday;

The variable $no_of_elements_in_array will then contain the number of elements in the array @weekday, that is 7.
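The two points above, counting elements and extending an array by assignment, can be combined in a short sketch. The names $before and $after are hypothetical and used only for this illustration.

```perl
# Minimal sketch: an array grows when a new index is assigned.
@rain = ( 12, 44, 0, 34, 75, 122, 4 );

$before = @rain;      # array assigned to a scalar gives the element count
$rain[ 7 ] = 17;      # index 7 did not exist yet, so the array is extended
$after = @rain;       # count again after the assignment

print "Elements before: $before, after: $after\n";
```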

The for-loop

The for-loop allows repetitive calculations. The number of repetitions is controlled by a loop variable ($n). In the following example the mean rainfall over seven days is calculated:

$sum = 0;
for( $n = 0; $n < 7; $n = $n + 1 ) {
   $sum = $sum + $rain[ $n ];
}
$mean = $sum / 7;
print "The mean rainfall is $mean mm per day.\n";

$n is the loop variable. The for-statement has three parts: first the loop variable is initialized with 0, then a condition is given which must be true for the following block to be executed, and in the third part the loop variable is incremented.

During execution the loop variable is initialized only once. Then the condition is tested, and if it is true the "block", i.e. the statements between the curly brackets {}, is executed. Control then goes back to the for-statement, the loop variable is incremented (from 0 to 1) and the condition is tested again. If it is still true another round is started, repeating the statements in the block with the new value of $n. This is repeated until the condition is no longer true, that is when $n has reached the value 7. Execution then continues with the first statement after the block, where $sum is divided by the number of days, giving the mean rainfall during this week.

Please note also that the "block" is indented. This is not required by PERL but simply a habit you should develop to mark the statements in a block. It greatly enhances the readability and maintainability of your programs, especially when blocks are "nested", i.e. when blocks contain other blocks and you have to locate the corresponding brackets. Missing or wrongly placed brackets are a common source of programming errors.
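The loop described above uses the fixed number 7. As a variation, and only as a sketch, the number of measurements can be taken from the array itself, so that the loop still works when values are added. The variable $days is an assumption introduced for this example.

```perl
# Minimal sketch: mean rainfall with the loop limit taken from the array.
@rain = ( 12, 44, 0, 34, 75, 122, 4 );

$days = @rain;                          # number of measurements in @rain
$sum  = 0;
for( $n = 0; $n < $days; $n = $n + 1 ) {
   $sum = $sum + $rain[ $n ];           # add one day's rainfall per round
}
$mean = $sum / $days;

print "The mean rainfall is $mean mm per day.\n";
```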

Example: Calculation of standard deviation.

Next we calculate the standard deviation of a set of values. The following formula has been taken from a statistics textbook and gives the variance of a set of values. The standard deviation (s) is the square root of the variance (s²).
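Written out, and assuming the sample form of the variance (division by n - 1, which matches the division by 6 for seven values in the calculation that follows), the formula reads as shown here. The symbols are chosen for this sketch: x_i are the individual values, \bar{x} is their mean and n is the number of values.

```latex
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2 ,
\qquad
s = \sqrt{s^2}
```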

Again we can use a for-loop for the calculation. sqrt is a predefined PERL function for the square root and ** is the exponentiation operator:

$sum_of_squares = 0;
for( $n = 0; $n < 7; $n = $n + 1 ) {
   $sum_of_squares = $sum_of_squares + ( $rain[ $n ] - $mean )**2;
}
# divide by the number of values minus one (7 - 1 = 6)
$standard_deviation = sqrt( $sum_of_squares / 6 );
print "The standard deviation is: $standard_deviation\n";
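The fragment above relies on $mean from the earlier example. Put together as one self-contained script, the two loops might look like the following sketch. The variable $count and the use of printf to round the output to two decimal places are additions made for this illustration.

```perl
# Minimal sketch: mean and standard deviation of the rainfall values.
@rain = ( 12, 44, 0, 34, 75, 122, 4 );
$count = @rain;                                  # number of values

# First loop: the mean
$sum = 0;
for( $n = 0; $n < $count; $n = $n + 1 ) {
   $sum = $sum + $rain[ $n ];
}
$mean = $sum / $count;

# Second loop: sum of squared deviations from the mean
$sum_of_squares = 0;
for( $n = 0; $n < $count; $n = $n + 1 ) {
   $sum_of_squares = $sum_of_squares + ( $rain[ $n ] - $mean )**2;
}
$standard_deviation = sqrt( $sum_of_squares / ( $count - 1 ) );

printf "Mean: %.2f mm, standard deviation: %.2f mm\n", $mean, $standard_deviation;
```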

The while-loop

The while-loop is controlled by a single condition:

$no_of_days = @weekday;
$n = 0;
while ( $n < $no_of_days ) {
   print $weekday[ $n ] . "\n";
   $n = $n + 1;
}

You have to take care of incrementing the loop variable ($n) yourself. If you forget this, the while-loop will never end. In the example above the number of days in the array @weekday is determined first, and then the days of the week are printed one per line.
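A while-loop is also convenient when the number of repetitions is not known in advance. The following sketch adds up daily rainfall until a threshold is reached or the measurements run out. The threshold $limit and the variable $total are hypothetical names for this example, and && combines the two conditions with a logical "and".

```perl
# Minimal sketch: a while-loop that stops on a condition other than a count.
@rain = ( 12, 44, 0, 34, 75, 122, 4 );
$no_of_days = @rain;
$limit = 100;                 # hypothetical threshold in mm

$total = 0;
$n = 0;
while ( $n < $no_of_days && $total < $limit ) {
   $total = $total + $rain[ $n ];
   $n = $n + 1;               # without this line the loop would never end
}

print "After $n days the total rainfall was $total mm.\n";
```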

© 2001-2007, by Kurt Stüber.

This page is part of a PERL course for bioinformatics, written by Kurt Stüber. Copyright 2007 by Kurt Stüber.