Randomizing a list using Rand in perl - perl

Hi i currently am using the List::Util shuffle to randomize a array with CGI however I want to modify the code to use rand instead
here is my code
print "Content-type: text/html\n\n";
use List::Util qw(shuffle);
#haikuone = ('behind', 'the', 'red', 'barn');
#haikutwo = ('prairie', 'grasses', 'reclaiming');
#haikuthree = ('the', 'basketball', 'court');
#randomize1 = shuffle(#haikuone);
#randomize2 = shuffle(#haikutwo);
#randomize3 = shuffle(#haikuthree);
print "<html>\n";
print "<head><title>Haiku_Random</title></head>\n";
print "<body>\n";
print "<pre>\n";
print "RANDOM HAIKU (DISCLAIMER: NONSENSE MAY OCCUR)\n";
print "#randomize1\n";
print "#randomize2\n";
print "#randomize3\n";
How would i modify this code to use rand instead of List::Util
I dont think its much but a novice here
I'm trying to get this working
$haikuone = ('behind', 'the', 'red', 'barn');
$haikutwo = ('prairie', 'grasses', 'reclaiming');
$haikuthree = ('the', 'basketball', 'court');
#random1 = $line1[rand #haikuone];
#random2 = $line2[rand #haikutwo];
#random3 = $line3[rand #haikuthree];
print "RANDOM HAIKU (DISCLAIMER: NONSENSE MAY OCCUR)\n";
print "$line1\n";
Now when i do this
#!/usr/local/bin/perl
#haikuone = ('behind', 'the', 'red', 'barn');
#haikutwo = ('prairie', 'grasses', 'reclaiming');
#haikuthree = ('the', 'basketball', 'court');
#random1 = $line1[rand #haikuone];
#random2 = $line2[rand #haikutwo];
#random3 = $line3[rand #haikuthree];
print "RANDOM HAIKU (DISCLAIMER: NONSENSE MAY OCCUR)\n";
print "#haikuone\n";
It will print haikuone but it wont randomize it

sub fisher_yates_shuffle {
my $deck = shift; # $deck is a reference to an array
return unless #$deck; # must not be empty!
my $i = #$deck;
while (--$i) {
my $j = int rand ($i+1);
#$deck[$i,$j] = #$deck[$j,$i];
}
}
my #randomize1 = #haikuone;
fisher_yates_shuffle(\#randomize1);
print "#randomize1\n";

Always use use strict; use warnings;! You have the following code, but don't have any arrays named #haikuone, #haikutwo, #haikuthree, #line1, #line2 or #line3.
#random1 = $line1[rand #haikuone];
#random2 = $line2[rand #haikutwo];
#random3 = $line3[rand #haikuthree];
It's also really weird that use three arrays with one element each.

Hi i currently am using the List::Util shuffle to randomize a array
with CGI
This makes sense. List::Util::shuffle() is the best way to shuffle a list in Perl - whether or not you're writing a CGI program.
however I want to modify the code to use rand instead
This doesn't make sense. rand() doesn't shuffle a list. It just generates a random number.
It's a good idea to use rand() to extract a single random element from an array.
my $random_element = #array[rand #array];
But that's not what you're trying to do.
If you really want to use rand() then you need to incorporate its use in a function. There's a good function given in the Perl FAQ (in the answer to the question - "How do I shuffle an array randomly?" - so perhaps you should have taken a look at the FAQ before asking here) which looks like this:
sub fisher_yates_shuffle {
my $deck = shift; # $deck is a reference to an array
return unless #$deck; # must not be empty!
my $i = #$deck;
while (--$i) {
my $j = int rand ($i+1);
#$deck[$i,$j] = #$deck[$j,$i];
}
}
But note that this is implemented in Perl. The shuffle() function in List::Util is written in C, so it's going to be faster.
So, all in all, there's really no good reason for not using List::Util::shuffle().

Related

Set::IntervalTree acting weirdly when I add nodes through a loop in perl

I am trying to use the Set::IntervalTree module and I think it is giving me the same node pointers if I insert the elements in a loop.
If I insert them outside a loop sequentially one after the another the nodes are inserted fine and the find / find_window calls works perfectly. But on the nodes which were added in the loop, the find functions give strange results.
#!/usr/bin/perl
use Set::IntervalTree;
my $tree = Set::IntervalTree->new;
$tree->insert("60:70", 60, 70);
$tree->insert("70:80", 70, 80);
$tree->insert("80:90", 80, 90);
for(my $i = 0; $i < 60; $i=$i+10)
{
$j = $i+10;
print "$i".":"."$j\n";
$tree->insert("$i".":"."$j", $i, $i+10);
}
print $tree->str;
my $results1 = $tree->fetch(25, 28);
my $window = $tree->fetch_window(25,250);
my #arr1 = #$results1;
print " #arr1 found.\n";
my $results2 = $tree->fetch(65, 68);
my #arr2 = #$results2;
print " #arr2 found.\n";
This is the output. Check the node pointers.. the ones added from the loop have same pointers and they return wrong interval (which i guess is due to the same pointer values.)
Node:0x9905b20, k=0, h=9, mH=9 l->key=NULL r->key=NULL p->key=10 color=BLACK
Node:0x9905b20, k=10, h=19, mH=29 l->key=0 r->key=20 p->key=30 color=RED
Node:0x9905b20, k=20, h=29, mH=29 l->key=NULL r->key=NULL p->key=10 color=BLACK
Node:0x9905b20, k=30, h=39, mH=89 l->key=10 r->key=70 p->key=NULL color=BLACK
Node:0x9905b20, k=40, h=49, mH=49 l->key=NULL r->key=NULL p->key=50 color=RED
Node:0x9905b20, k=50, h=59, mH=69 l->key=40 r->key=60 p->key=70 color=BLACK
Node:0x98c6270, k=60, h=69, mH=69 l->key=NULL r->key=NULL p->key=50 color=RED
Node:0x98fd138, k=70, h=79, mH=89 l->key=50 r->key=80 p->key=30 color=RED
Node:0x98fd078, k=80, h=89, mH=89 l->key=NULL r->key=NULL p->key=70 color=BLACK
50:60 found.
60:70 found.
It looks like you have to use a Perl scalar variable if you are calling insert from an inner block. If you use an interpolated string it will be discarded and overwritten by the subsequent execution of the block.
This code seems to work fine. Please always use strict and use warnings on all your programs - particularly those you are asking for help with.
#!/usr/bin/perl
use strict;
use warnings;
use Set::IntervalTree;
my $tree = Set::IntervalTree->new;
for (my $i = 0; $i <= 80; $i += 10) {
my $name = sprintf '%02d:%02d', $i, $i+10;
$tree->insert($name, $i, $i+10);
}
my $results1 = $tree->fetch(25, 28);
print "#$results1 found\n";
my $results2 = $tree->fetch(65, 68);
print "#$results2 found\n";
output
20:30 found
60:70 found
i have used this package. the C++ code for insert has arguments (&t, int, int). this means that the C++ code store away a pointer to the perl data
for the first argument. using a scalar is not really a solution. eventually
the value will be overwritten. the solution i used was to put the id into a
hash and pass the hash value as the id. this will keep around the perl data
for the id. the C++ code should copy the id to maintain its value.
$id = "$start:$end";
$rephash1{$id} = $id;
$repeats1{$scaffold} = Set::IntervalTree->new if (!defined($repeats1{$scaffold}));
$repeats1{$scaffold}->insert($rephash1{$id}, $start, $end);

shuffle url parameter behind & in Perl

Hello i was wondering if anyone know how to shuffle an url with perl, but then only all parameters after the &.
here's an example:
anyurl=i&ct=1&cad=1&rsm=6&ei=JthyULClH8fH0QWcooD4Bw&zx=1349703728841
here's what im looking for:
anyurl=i&ei=JthyULClH8fH0QWcooD4Bw&cad=1&ct=1&rsm=6&zx=1349703728841
just so that all parameters behind the & are placed on a different place randomly. So i want all parameters behind the & on a different place every print, is this possible?
thnx in advance.
#!/usr/bin/env perl
use strict;
use warnings;
use List::Util qw(shuffle);
my $url = 'http://www.anyurl.com/userdata?ct=1&cad=1&rsm=6&ei=JthyULClH8fH0QWcooD4Bw&zx=1349703728841';
my #parts = split(/\?/,$url);
my $parms = join('&',shuffle(split(/&/,$parts[1])));
my $shuffled = join('?',$parts[0],$parms);
print $shuffled;
can be shorter, but this is a step-by-step idea of how to do that.
Try converting the query string into an array and then shuffling the array:
my $qryStr = 'ct=1&cad=1&rsm=6&ei=JthyULClH8fH0QWcooD4Bw&zx=1349703728841';
my #init = split('&', $qryStr);
my $i = #init;
my #shfld;
while($i--)
{
my $j = int(rand($i+1));
$shfld[$i] = $init[$j];
splice(#init, $j, 1);
}
my $rslt = join('&', #shfld);
print "\nRESULT = ".$rslt;

Perl: Can't pass an "on-the-fly" array to a sub

strftime(), as per cpan.org:
print strftime($template, #lt);
I just can't figure the right Perl code recipe for this one. It keeps reporting an error where I call strftime():
...
use Date::Format;
...
sub parse_date {
if ($_[0]) {
$_[0] =~ /(\d{4})/;
my $y = $1;
$_[0] =~ s/\d{4}//;
$_[0] =~ /(\d\d)\D(\d\d)/;
return [$2,$1,$y];
}
return [7,7,2010];
}
foreach my $groupnode ($groupnodes->get_nodelist) {
my $groupname = $xp->find('name/text()', $groupnode);
my $entrynodes = $xp->find('entry', $groupnode);
for my $entrynode ($entrynodes->get_nodelist) {
...
my $date_added = parse_date($xp->find('date_added/text()', $entrynode));
...
$groups{$groupname}{$entryname} = {...,'date_added'=>$date_added,...};
...
}
}
...
my $imday = $maxmonth <= 12 ? 0 : 1;
...
while (my ($groupname, $entries) = each %groups) {
...
while (my ($entryname, $details) = each %$entries) {
...
my $d = #{$details->{'date_added'}};
$writer->dataElement("creation", strftime($date_template, (0,0,12,#$d[0^$imday],#$d[1^$imday]-1,#$d[2],0,0,0)));
}
...
}
...
If I use () to pass the required array by strftime(), I get:
Type of arg 2 to Date::Format::strftime must be array (not list) at ./blah.pl line 87, near "))"
If I use [] to pass the required array, I get:
Type of arg 2 to Date::Format::strftime must be array (not anonymous list ([])) at ./blah.pl line 87, near "])"
How can I pass an array on the fly to a sub in Perl? This can easily be done with PHP, Python, JS, etc. But I just can't figure it with Perl.
EDIT: I reduced the code to these few lines, and I still got the exact same problem:
#!/usr/bin/perl
use warnings;
use strict;
use Date::Format;
my #d = [7,13,2010];
my $imday = 1;
print strftime( q"%Y-%m-%dT12:00:00", (0,0,12,$d[0^$imday],$d[1^$imday]-1,$d[2],0,0,0));
Where an array is required and you have an ad hoc list, you need to actually create an array. It doesn't need to be a separate variable, you can do just:
strftime(
$date_template,
#{ [0,0,12,$d[0^$imday],$d[1^$imday],$d[2],0,0,0] }
);
I have no clue why Date::Format would subject you to this hideousness and not just expect multiple scalar parameters; seems senseless (and contrary to how other modules implement strftime). Graham Barr usually designs better interfaces than this. Maybe it dates from when prototypes still seemed like a cool idea for general purposes.
To use a list as an anonymous array for, say, string interpolation, you could write
print "#{[1, 2, 3]}\n";
to get
1 2 3
The same technique provides a workaround to Date::Format::strftime's funky prototype:
print strftime(q"%Y-%m-%dT12:00:00",
#{[0,0,12,$d[0^$imday],$d[1^$imday]-1,$d[2],0,0,0]});
Output:
1900-24709920-00T12:00:00
Normally, it is easy to pass arrays "on-the-fly" to Perl subroutines. But Date::Format::strftime is a special case with a special prototype ($\#;$) that doesn't allow "list" arguments or "list assignment" arguments:
strftime($format, (0,0,12,13,7-1,2010-1900)); # not ok
strftime($format, #a=(0,0,12,13,7-1,2010-1900)); # not ok
The workaround is that you must call strftime with an array variable.
my #time = (0,0,12,13,7-1,2010-1900); # note: #array = ( ... ), not [ ... ]
strftime($format, #time);
I looked again and I see the real problem in this code:
my $d = #{$details->{'date_added'}};
$writer->dataElement("creation", strftime($date_template, (0,0,12,#$d[0^$imday],#$d[1^$imday]-1,#$d[2],0,0,0)));
Specifically #{$details->{'date_added'}} is a dereference. But you're assigning it to a scalar variable and you don't need to dereference in the line below it:
my #d = #{$details->{'date_added'}};
$writer->dataElement("creation", strftime($date_template, (0,0,12,$d[0^$imday],$d[1^$imday]-1,$d[2],0,0,0)));
I've created a regular array for your reference #d and just accessed it as a regular array ( $d[ ... ] instead of #$d[ ... ] )

Converting code to perl sub, but not sure I'm doing it right

I'm working from a question I posted earlier (here), and trying to convert the answer to a sub so I can use it multiple times. Not sure that it's done right though. Can anyone provide a better or cleaner sub?
I have a good deal of experience programming, but my primary language is PHP. It's frustrating to know how to execute in one language, but not be able to do it in another.
sub search_for_key
{
my ($args) = #_;
foreach $row(#{$args->{search_ary}}){
print "#$row[0] : #$row[1]\n";
}
my $thiskey = NULL;
my #result = map { $args->{search_ary}[$_][0] } # Get the 0th column...
grep { #$args->{search_in} =~ /$args->{search_ary}[$_][1]/ } # ... of rows where the
0 .. $#array; # first row matches
$thiskey = #result;
print "\nReturning: " . $thiskey . "\n";
return $thiskey;
}
search_for_key({
'search_ary' => $ref_cam_make,
'search_in' => 'Canon EOS Rebel XSi'
});
---Edit---
From the answers so far, I've cobbled together the function below. I'm new to Perl, so I don't really understand much of the syntax. All I know is that it throws an error (Not an ARRAY reference at line 26.) about that grep line.
Since I seem to not have given enough info, I will also mention that:
I am calling this function like this (which may or may not be correct):
search_for_key({
'search_ary' => $ref_cam_make,
'search_in' => 'Canon EOS Rebel XSi'
});
And $ref_cam_make is an array I collect from a database table like this:
$ref_cam_make = $sth->fetchall_arrayref;
And it is in the structure like this (if I understood how to make the associative fetch work properly, I would like to use it like that instead of by numeric keys):
Reference Array
Associative
row[1][cam_make_id]: 13, row[1][name]: Sony
Numeric
row[1][0]: 13, row[1][1]: Sony
row[0][0]: 19, row[0][1]: Canon
row[2][0]: 25, row[2][1]: HP
sub search_for_key
{
my ($args) = #_;
foreach my $row(#{$args->{search_ary}}){
print "#$row[0] : #$row[1]\n";
}
print grep { $args->{search_in} =~ #$args->{search_ary}[$_][1] } #$args->{search_ary};
}
You are moving in the direction of a 2D array, where the [0] element is some sort of ID number and the [1] element is the camera make. Although reasonable in a quick-and-dirty way, such approaches quickly lead to unreadable code. Your project will be easier to maintain and evolve if you work with richer, more declarative data structures.
The example below uses hash references to represent the camera brands. An even nicer approach is to use objects. When you're ready to take that step, look into Moose.
use strict;
use warnings;
demo_search_feature();
sub demo_search_feature {
my #camera_brands = (
{ make => 'Canon', id => 19 },
{ make => 'Sony', id => 13 },
{ make => 'HP', id => 25 },
);
my #test_searches = (
"Sony's Cyber-shot DSC-S600",
"Canon cameras",
"Sony HPX-32",
);
for my $ts (#test_searches){
print $ts, "\n";
my #hits = find_hits($ts, \#camera_brands);
print ' => ', cb_stringify($_), "\n" for #hits;
}
}
sub cb_stringify {
my $cb = shift;
return sprintf 'id=%d make=%s', $cb->{id}, $cb->{make};
}
sub find_hits {
my ($search, $camera_brands) = #_;
return grep { $search =~ $_->{make} } #$camera_brands;
}
This whole sub is really confusing, and I'm a fairly regular perl user. Here are some blanket suggestions.
Do not create your own undef ever -- use undef then return at the bottom return $var // 'NULL'.
Do not ever do this: foreach $row, because foreach my $row is less prone to create problems. Localizing variables is good.
Do not needlessly concatenate, for it offends the style god: not this, print "\nReturning: " . $thiskey . "\n";, but print "\nReturning: $thiskey\n";, or if you don't need the first \n: say "Returning: $thiskey;" (5.10 only)
greping over 0 .. $#array; is categorically lame, just grep over the array: grep {} #{$foo[0]}, and with that code being so complex you almost certainly don't want grep (though I don't understand what you're doing to be honest.). Check out perldoc -q first -- in short grep doesn't stop until the end.
Lastly, do not assign an array to a scalar: $thiskey = #result; is an implicit $thiskey = scalar #result; (see perldoc -q scalar) for more info. What you probably want is to return the array reference. Something like this (which eliminates $thiskey)
printf "\nReturning: %s\n", join ', ', #result;
#result ? \#result : 'NULL';
If you're intending to return whether a match is found, this code should work (inefficiently). If you're intending to return the key, though, it won't -- the scalar value of #result (which is what you're getting when you say $thiskey = #result;) is the number of items in the list, not the first entry.
$thiskey = #result; should probably be changed to $thiskey = $result[0];, if you want mostly-equivalent functionality to the code you based this off of. Note that it won't account for multiple matches anymore, though, unless you return #result in its entirety, which kinda makes more sense anyway.

Why does my Perl for loop exit early?

I am trying to get a perl loop to work that is working from an array that contains 6 elements. I want the loop to pull out two elements from the array, perform certain functions, and then loop back and pull out the next two elements from the array until the array runs out of elements. Problem is that the loop only pulls out the first two elements and then stops. Some help here would be greatly apperaciated.
my open(infile, 'dnadata.txt');
my #data = < infile>;
chomp #data;
#print #data; #Debug
my $aminoacids = 'ARNDCQEGHILKMFPSTWYV';
my $aalen = length($aminoacids);
my $i=0;
my $j=0;
my #matrix =();
for(my $i=0; $i<2; $i++){
for( my $j=0; $j<$aalen; $j++){
$matrix[$i][$j] = 0;
}
}
The guidelines for this program states that the program should ignore the presence of gaps in the program. which means that DNA code that is matched up with a gap should be ignored. So the code that is pushed through needs to have alignments linked with gaps removed.
I need to modify the length of the array by two since I am comparing two sequence in this part of the loop.
#$lemseqcomp = $lenarray / 2;
#print $lenseqcomp;
#I need to initialize these saclar values.
$junk1 = " ";
$junk2 = " ";
$seq1 = " ";
$seq2 = " ";
This is the loop that is causeing issues. I belive that the first loop should move back to the array and pull out the next element each time it loops but it doesn't.
for($i=0; $i<$lenarray; $i++){
#This code should remove the the last value of the array once and
#then a second time. The sequences should be the same length at this point.
my $last1 =pop(#data1);
my $last2 =pop(#data1);
for($i=0; $i<length($last1); $i++){
my $letter1 = substr($last1, $i, 1);
my $letter2 = substr($last2, $i, 1);
if(($letter1 eq '-')|| ($letter2 eq '-')){
#I need to put the sequences I am getting rid of somewhere. Here is a good place as any.
$junk1 = $letter1 . $junk1;
$junk2 = $letter1 . $junk2;
}
else{
$seq1 = $letter1 . $seq1;
$seq2 = $letter2 . $seq2;
}
}
}
print "$seq1\n";
print "$seq2\n";
print "#data1\n";
I am actually trying to create a substitution matrix from scratch and return the data. The reason why the code looks weird, is because it isn't actually finished yet and I got stuck.
This is the test sequence if anyone is curious.
YFRFR
YF-FR
FRFRFR
ARFRFR
YFYFR-F
YFRFRYF
First off, if you're going to work with sequence data, use BioPerl. Life will be so much easier. However...
Since you know you'll be comparing the lines from your input file as pairs, it makes sense to read them into a datastructure that reflects that. As elsewhere suggested, an array like #data[[line1, line2],[line3,line4]) ensures that the correct pairs of lines are always together.
What I'm not clear on what you're trying to do is:
a) are you generating a consensus
sequence where the 2 sequences are
difference only by gaps
b) are your 2 sequences significantly
different and you're trying to
exclude the non-aligning parts and
then generate a consensus?
So, does the first pair represent your data, or is it more like the second?
ATCG---AAActctgGGGGG--taGC
ATCGcccAAActctgGGGGGTTtaGC
ATCG---AAActctgGGGGG--taGCTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT
ATCGcccAAActctgGGGGGTTtaGCGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
The problem is that you're using $i as the counter variable for both your loops, so the inner loop modifies the counter out from under the outer loop. Try changing the inner loop's counter to $j, or using my to localize them properly.
Don't store your values as an array, store as a two-dimensional array:
my #dataset = ([$val1, $val2], [$val3, $val4]);
or
my #dataset;
push (#dataset, [$val_n1, $val_n2]);
Then:
for my $value (#dataset) {
### Do stuff with $value->[0] and $value->[1]
}
There are lots of strange things in your code: you are initializing a matrix then not using it; reading a whole file into an array; scanning a string C style but then not doing anything with the unmatched values; and finally, just printing the two last processed values (which, in your case, are the two first elements of your array, since you are using pop.)
Here's a guess.
use strict;
my $aminoacids = 'ARNDCQEGHILKMFPSTWYV';
# Preparing a regular expression. This is kind of useful if processing large
# amounts of data. This will match anything that is not in the string above.
my $regex = qr([^$aminoacids]);
# Our work function.
sub do_something {
my ($a, $b) = #_;
$a =~ s/$regex//g; # removing unwanted characters
$b =~ s/$regex//g; # ditto
# Printing, saving, whatever...
print "Something: $a - $b\n";
return ($a, $b);
}
my $prev;
while (<>) {
chomp;
if ($prev) {
do_something($prev, $_);
$prev = undef;
} else {
$prev = $_;
}
}
print STDERR "Warning: trailing data: $prev\n"
if $prev;
Since you are a total Perl/programming newbie, I am going to show a rewrite of your first code block, then I'll offer you some general advice and links.
Let's look at your first block of sample code. There is a lot of stuff all strung together, and it's hard to follow. I, personally, am too dumb to remember more than a few things at a time, so I chop problems into small pieces that I can understand. This is (was) known as 'chunking'.
One easy way to chunk your program is use write subroutines. Take any particular action or idea that is likely to be repeated or would make the current section of code long and hard to understand, and wrap it up into a nice neat package and get it out of the way.
It also helps if you add space to your code to make it easier to read. Your mind is already struggling to grok the code soup, why make things harder than necessary? Grouping like things, using _ in names, blank lines and indentation all help. There are also conventions that can help, like making constant values (values that cannot or should not change) all capital letters.
use strict; # Using strict will help catch errors.
use warnings; # ditto for warnings.
use diagnostics; # diagnostics will help you understand the error messages
# Put constants at the top of your program.
# It makes them easy to find, and change as needed.
my $AMINO_ACIDS = 'ARNDCQEGHILKMFPSTWYV';
my $AMINO_COUNT = length($AMINO_ACIDS);
my $DATA_FILE = 'dnadata.txt';
# Here I am using subroutines to encapsulate complexity:
my #data = read_data_file( $DATA_FILE );
my #matrix = initialize_matrix( 2, $amino_count, 0 );
# now we are done with the first block of code and can do more stuff
...
# This section down here looks kind of big, but it is mostly comments.
# Remove the didactic comments and suddenly the code is much more compact.
# Here are the actual subs that I abstracted out above.
# It helps to document your subs:
# - what they do
# - what arguments they take
# - what they return
# Read a data file and returns an array of dna strings read from the file.
#
# Arguments
# data_file => path to the data file to read
sub read_data_file {
my $data_file = shift;
# Here I am using a 3 argument open, and a lexical filehandle.
open( my $infile, '<', $data_file )
or die "Unable to open dnadata.txt - $!\n";
# I've left slurping the whole file intact, even though it can be very inefficient.
# Other times it is just what the doctor ordered.
my #data = <$infile>;
chomp #data;
# I return the data array rather than a reference
# to keep things simple since you are just learning.
#
# In my code, I'd pass a reference.
return #data;
}
# Initialize a matrix (or 2-d array) with a specified value.
#
# Arguments
# $i => width of matrix
# $j => height of matrix
# $value => initial value
sub initialize_matrix {
my $i = shift;
my $j = shift;
my $value = shift;
# I use two powerful perlisms here: map and the range operator.
#
# map is a list contsruction function that is very very powerful.
# it calls the code in brackets for each member of the the list it operates against.
# Think of it as a for loop that keeps the result of each iteration,
# and then builds an array out of the results.
#
# The range operator `..` creates a list of intervening values. For example:
# (1..5) is the same as (1, 2, 3, 4, 5)
my #matrix = map {
[ ($value) x $i ]
} 1..$j;
# So here we make a list of numbers from 1 to $j.
# For each member of the list we
# create an anonymous array containing a list of $i copies of $value.
# Then we add the anonymous array to the matrix.
return #matrix;
}
Now that the code rewrite is done, here are some links:
Here's a response I wrote titled "How to write a program". It offers some basic guidelines on how to approach writing software projects from specification. It is aimed at beginners. I hope you find it helpful. If nothing else, the links in it should be handy.
For a beginning programmer, beginning with Perl, there is no better book than Learning Perl.
I also recommend heading over to Perlmonks for Perl help and mentoring. It is an active Perl specific community site with very smart, friendly people who are happy to help you. Kind of like Stack Overflow, but more focused.
Good luck!
Instead of using a C-style for loop, you can read data from an array two elements at a time using splice inside a while loop:
while (my ($letter1, $letter2) = splice(#data, 0, 2))
{
# stuff...
}
I've cleaned up some of your other code below:
use strict;
use warnings;
open(my $infile, '<', 'dnadata.txt');
my #data = <$infile>;
close $infile;
chomp #data;
my $aminoacids = 'ARNDCQEGHILKMFPSTWYV';
my $aalen = length($aminoacids);
# initialize a 2 x 21 array for holding the amino acid data
my $matrix;
foreach my $i (0 .. 1)
{
foreach my $j (0 .. $aalen-1)
{
$matrix->[$i][$j] = 0;
}
}
# Process all letters in the DNA data
while (my ($letter1, $letter2) = splice(#data, 0, 2))
{
# do something... not sure what?
# you appear to want to look up the letters in a reference table, perhaps $aminoacids?
}