What does it mean to pre-increment $#array? - perl

I've come across the following line of code. It has issues:
it is intended to do the same as push
it ought to have used push
it's hard to read, understand
I've since changed it to use push
it does something I thought was illegal, but clearly isn't
here it is:
$array [++$#array] = 'data';
My question is: what does it mean to pre-increment $#array? I always considered $#array to be an attribute of an array, and not writable.

perldata says:
"The length of an array is a scalar value. You may find the length of array @days by evaluating $#days, as in csh. However, this isn't the length of the array; it's the subscript of the last element, which is a different value since there is ordinarily a 0th element. Assigning to $#days actually changes the length of the array. Shortening an array this way destroys intervening values. Lengthening an array that was previously shortened does not recover values that were in those elements."
Modifying $#array is useful in some cases, but in this case, clearly push is better.
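To make the perldata passage concrete, here is a minimal sketch. The array contents and variable names are made up for illustration and are not from the original code.

```perl
use strict;
use warnings;

my @days = qw(Mon Tue Wed Thu Fri);

# $#days is the last index, one less than the element count.
my $last_index = $#days;          # 4
my $count      = scalar @days;    # 5

# Shortening: assigning a smaller last index discards the tail.
$#days = 1;
my @after_shrink = @days;         # ('Mon', 'Tue')

# Lengthening again does not bring the old values back; the new slots are undef.
$#days = 3;
my $recovered = defined $days[2] ? 'yes' : 'no';

# push appends without any index arithmetic.
my @list;
push @list, 'data';               # same effect as $list[++$#list] = 'data'

print "last index $last_index, count $count\n";
print "after shrink: @after_shrink\n";
print "old values recovered: $recovered\n";
```

The push line is the readable equivalent of the pre-increment trick from the question.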

A post-increment will return the variable first and then increment it.
If you used post-increment you would be modifying the last element, since it's returned first, and then pushing an empty element onto the end. On the second pass you would be modifying that empty value and pushing a new empty one for later. So it wouldn't work like a push at all.
The pre-increment will increment the variable and then return it. That way your example will always be writing to a new, last element of the array and work like push. Example below:
my (@post, @pre);
# post-increment: the old last index is used, then the array grows
$post[$#post++] = '1';
$post[$#post++] = '2';
$post[$#post++] = '3';
# pre-increment: the array grows first, then the new last index is used
$pre[++$#pre] = '1';
$pre[++$#pre] = '2';
$pre[++$#pre] = '3';
print "post keys: ".@post."\n";
print "post: @post\n";
print "pre keys: ".@pre."\n";
print "pre: @pre\n";
outputs:
post keys: 3
post: 2 3
pre keys: 3
pre: 1 2 3

Assigning a value larger than the current array length to $#array extends the array.

This code works too:
$ perl -le 'my @a; $a[@a]="Hello"; $a[@a]=" world!"; print @a'
Hello world!
Perl arrays are dynamic and grow when you assign beyond the current last index.

First of all, that's foul.
That said, I'm also surprised that it works. I would have guessed that ++$#array would have gotten the "Can't modify constant" error you get when trying to increment a number. (Not that I ever accidentally do that, of course.) But, I guess that's exactly where we were wrong: $#array isn't a constant (a number); it's a variable expression. As such you can mess with it. Consider the following:
my @array = qw/1 2 3/;
++$#array;
$array[$#array] = qw/4/;
print "@array\n";
And even, for extra fun, this:
my @array = qw/1 2 3/;
$#array += 5;
foreach my $wtf (@array) {
    if (defined $wtf) {
        print "$wtf\n";
    }
    else {
        print "undef\n";
    }
}
And, yeah, the Perl Cookbook is happy to mess with $#array to grow or truncate arrays (Chapter 4, recipe 3). I still find it ugly, but maybe that's just a lingering "but it's a number" prejudice.

Related

How to check rows returned by fetchall_arrayref method in Perl?

I am trying to understand following piece of code, in particular what's happening in line 4, 5 and 6.
I have understood most of it, but I can't seem to understand what's being done with @$r != 1; in line 4 (does @$r represent the number of rows returned?), and similarly what's happening with @$r[0] in line 5 and @$rr[0] in line 6:
1 my $sth = $dbh->prepare(" a select statement ");
2 $sth->execute();
3 my $r = $sth->fetchall_arrayref();
4 die "Failed to select " if !defined($r) || @$r != 1;
5 my $rr = @$r[0];
6 my $rec = @$rr[0];
7 print "Rec is $rec\n";
The fetchall_arrayref() returns a reference to an array of all rows, in which each element is also a reference to an array, with that row's elements.
Then the die line checks
that the top-level reference $r is defined (that the call worked), and
that the size of that array, @$r, is exactly 1, i.e. that the array contains exactly one element. This betrays the expectation that the query will return one row, and since the code is prepared to die over this it might as well fetch just one row, with fetchrow_arrayref or fetchrow_array.
The @$r dereferences the $r arrayref, and in the scalar context imposed by != we indeed get the number of elements in the array.
Line 5 is very misleading, to say the least, even though the syntax happens to be legitimate: it extracts the first element and feeds it into a scalar, but using the @$r[0] syntax, which by its sigil @ would normally imply that we are getting a list. It's equivalent to @{$r}[0] and is an abuse of notation.
It should either clearly get the first element, if that's the intent
my $rr = $r->[0];
or also dereference it to get the whole array
my @row = @{ $r->[0] };
if that's wanted.
The last line that you query does the exact same, using the retrieved $rr reference. But the first element of the first arrayref (row) is easily obtained directly
my $rec = $r->[0]->[0]; # or $r->[0][0]
which replaces lines 5 and 6.
See perlreftut and perldsc.
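The points above can be tried without a database. The sketch below uses a hand-built array of arrays as a hypothetical stand-in for what fetchall_arrayref() returns for a one-row result; the row contents are invented.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a one-row result from fetchall_arrayref():
# a reference to an array of rows, each row itself an array reference.
my $r = [ [ 'alice', 42 ] ];

die "Failed to select\n" unless defined $r;
my $row_count = @$r;            # scalar context: number of rows
die "Expected exactly one row\n" unless $row_count == 1;

my $first_row = $r->[0];        # array reference for the first row
my @row       = @{ $r->[0] };   # copy of that row as a plain array
my $rec       = $r->[0][0];     # first column of the first row

print "rows: $row_count, first column: $rec, columns: ", scalar(@row), "\n";
```

Checking defined before dereferencing matters under strict, since dereferencing an undefined value as an array in this position is a fatal error.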
Evaluating an array (here, the dereferenced @$r) in scalar context returns the number of elements in the array. The answer to your other questions is that it's just standard dereferencing using ugly syntax and useless intermediate variables.
I believe in teaching people how to fish, though, so consider this code, which is essentially what you're working with:
use strict;
use warnings;
use Data::Dumper;
my $r = [[1, 2, 3]];
my $rr = @$r[0];
my $rec = @$rr[0];
print Dumper($r, $rr, $rec);
Output:
$VAR1 = [
          [
            1,
            2,
            3
          ]
        ];
$VAR2 = $VAR1->[0];
$VAR3 = 1;
It should be easy to see what's going on now that you can see what each variable holds, right?

Perl: If two elements match print elements, else iterate until match and then print

I'm new to Perl and I'm trying to iterate over two elements of an array with multiple indices in each element and look for a match. If element2 matches element1, I want to print both and move to the next position in element1 and continue the loop looking for the next match. If I don't have a match, loop until I get a match. Here is what I have:
@array = split(',',$row);
foreach $element1(@array[1])
{
    foreach $element2(@array[2])
    {
        if($element1 == $element2)
        {
            print "1 = $element1 : 2 = $element2 \n";
        }
    }
}
I'm not getting the matched output. I've tried multiple iterations with different syntactical changes.
I can get both elements when I do this:
foreach $element1(@array[1])
{
    foreach $element2(@array[2])
    {
        print "1 = $element1 : 2 = $element2 \n";
    }
}
I thought I might not be dereferencing correctly. Any guidance or suggestions would be appreciated. Thanks.
There are a number of issues with your script. Briefly:
You should always use strict and warnings.
Array indices start at 0, not 1.
You get an element of an array with $array[0], not @array[0]. This is a common frustration for new Perl programmers. The thing to remember is that the sigil (the symbol preceding a variable name) indicates the type of value being passed (e.g. $scalar, @array, or %hash) to the left-hand side of the expression, not the type of data structure being accessed on the right-hand side.
As @sp-asic pointed out in the comments on the OP, string comparisons are performed with eq, not ==.
References to data structures are stored in scalars, and you dereference by prepending the sigil of the original data structure. If $foo is a reference to an array, @$foo gets you the original array.
You apparently want to break out of your inner loops when you find a match, but you'll want to make it clear (for people who look at this code in the future, which may include yourself) which loop you're breaking out of.
Most critically, @array will be an array of strings after you split another string (the row) on commas, so it's not clear why you expect to be able to treat the strings in the first and second position as arrays that you can loop through. I have a few guesses about what you're actually trying to do, and what your inputs and expected outputs actually look like, but I'll wait for you to provide some additional information and leave the information above as general guidance in the meantime, along with a lightly-reworked version of your code below.
use strict;
use warnings;
# $row is assumed to be declared and filled earlier in the script
my @array = split(',', $row);
foreach my $element1 ($array[0]) {
    foreach my $element2 ($array[1]) {
        if ($element1 eq $element2) {
            print "1 = $element1 : 2 = $element2\n";
            last;    # leaves the inner loop only
        }
    }
}
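The sigil and comparison points from this answer can be seen in isolation in the following sketch. The input row is hypothetical, since the question does not show its data.

```perl
use strict;
use warnings;

# Hypothetical input row; the original question does not show its data.
my $row   = 'id7,apple,apple,pear';
my @array = split /,/, $row;

my $first = $array[0];        # one element: scalar sigil
my @pair  = @array[1, 2];     # several elements (a slice): array sigil

# eq compares strings; == would compare them as numbers and warn on 'apple'.
my $same = $array[1] eq $array[2] ? 'match' : 'no match';

print "first: $first\n";
print "pair: @pair\n";
print "fields 1 and 2: $same\n";
```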

Using the $# operator

I've just taken over maintenance of part of a Perl system. The machine it used to run on is dead, so I do not know which version of Perl it was using, but it was working. It included the following line to count the lines in a page of ASCII text:
my $lcnt = $#{@{$page{'lines'}}};
In Perl 5.10.1 (we are now running this on CentOS 6.3) the above code no longer works. I instead use the following, which works fine.
my @arr = @{$page{'lines'}};
my $lcnt = $#arr;
I'll admit my Perl isn't great, but from what I can see the first version should never have worked, as it is trying to dereference an array rather than an array ref.
First question - is my guess at why this first line of code doesn't now work correct, and secondly did it work earlier due to a now fixed bug in a prior Perl version?
Thanks!
The first version never worked. Assuming $page{'lines'} is an arrayref, this is what you want:
my $lcnt = $#{$page{'lines'}};
Note that this is going to give you one less than the number of items in your arrayref. The $# operator gives the INDEX of the last item, not the number of items. If you want the number of items in $page{'lines'}, you probably want this:
my $lcnt = scalar(@{$page{'lines'}});
Some things about your code. This:
my $lcnt = $#{@{$page{'lines'}}};
Was never correct. Take a look at the three things going on here:
$page{'lines'}    # presumably an array ref
@{ ... }          # dereference into an array
$#{ ... }         # get last index of an array ref
This is equivalent to (continuing on your own code):
my @arr = @{$page{'lines'}};
my $foo = @arr;       # foo is now the size of the array, e.g. 3
my $lcnt = $#$foo;
If you use
use strict;
use warnings;
Which you should always do, without question (!), you will get the informative fatal error message:
Can't use string ("3") as an ARRAY ref while "strict refs" in use
(Where 3 will be the size of your array)
The correct way to get the size (number of elements) of an array is to put the array in scalar context:
my $size = @{ $page{'lines'} };
The way to get the index of the last element is using the $# sigil:
my $last_index = $#{ $page{'lines'} };
As you'll note, the syntax is the same; it is just a matter of using @ or $# to get what you want, just the same as when using a regular array:
my $size = @array;
my $last = $#array;
So, to refer back to the beginning: using both @ and $# is not and was never correct.
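For a runnable side-by-side of the two forms, here is a small sketch. The %page and %blank hashes are hypothetical and assume, as the answers do, that the 'lines' key holds an array reference.

```perl
use strict;
use warnings;

# Hypothetical data, assuming $page{lines} holds an array reference.
my %page  = ( lines => [ 'first line', 'second line', 'third line' ] );
my %blank = ( lines => [] );

my $size       = @{ $page{lines} };      # element count
my $last_index = $#{ $page{lines} };     # index of the last element, count - 1

my $blank_size = @{ $blank{lines} };     # 0
my $blank_last = $#{ $blank{lines} };    # -1 for an empty array

print "size $size, last index $last_index\n";
print "empty: size $blank_size, last index $blank_last\n";
```

The empty case is the one that tends to bite when a last index is used as a line count.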

= and , operators in Perl

Please explain this apparently inconsistent behaviour:
$a = b, c;
print $a; # this prints: b
$a = (b, c);
print $a; # this prints: c
The = operator has higher precedence than ,.
And the comma operator throws away its left argument and returns the right one.
Note that the comma operator behaves differently depending on context. From perldoc perlop:
Binary "," is the comma operator. In
scalar context it evaluates its left
argument, throws that value away, then
evaluates its right argument and
returns that value. This is just like
C's comma operator.
In list context, it's just the list
argument separator, and inserts both
its arguments into the list. These
arguments are also evaluated from left
to right.
Since eugene's answer seems to leave the OP with some questions, I'll try to explain, building on it:
$a = "b", "c";
print $a;
Here the left argument of the comma is $a = "b": because = has higher precedence than ,, the assignment is evaluated first. After that $a contains "b".
The right argument is "c", and it is what the comma expression returns, as I will show shortly.
At that point, when you print $a, it obviously prints b to your screen.
$a = ("b", "c");
print $a;
Here the term ("b","c") will be evaluated first because of the higher precedence of parentheses. It returns "c" and this will be assigned to $a.
So here you print "c".
$var = ($a = "b","c");
print $var;
print $a;
Here $a contains "b" and $var contains "c".
Once you get the precedence rules this is perfectly consistent
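The three cases discussed above can be checked in one short script. This is only a sketch; the variable names are illustrative, and the 'void' warning category is switched off so the deliberately odd statements run quietly.

```perl
use strict;
use warnings;
no warnings 'void';    # the discarded constants below would otherwise trigger warnings

my ($x, $y, @list);

$x    = "b", "c";      # parsed as ($x = "b"), "c"
$y    = ("b", "c");    # comma operator in scalar context: the last value wins
@list = ("b", "c");    # list context: both values are kept

my $n = @list;         # an array in scalar context gives its element count

print "x=$x y=$y list=@list n=$n\n";
```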
Since eugene and mugen have answered this question nicely with good examples already, I am going to set up some concepts and then ask some conceptual questions of the OP to see if it helps to illuminate some Perl concepts.
The first concept is what the sigils $ and @ mean (we won't discuss % here). @ means multiple items (said "these things"). $ means one item (said "this thing"). To get the first element of an array @a you can do $first = $a[0]; to get the last element, $last = $a[-1]. N.B. not @a[0] or @a[-1]. You can slice by doing @shorter = @longer[1,2].
The second concept is the difference between void, scalar and list context. Perl has the concept of the context in which your containers (scalars, arrays etc.) are used. An easy way to see this is that if you store a list (we will get to this) in an array, @array = ("cow", "sheep", "llama"), and then assign the array to a scalar, $size = @array, we get the length of the array. We can also force this behavior by using the scalar operator, as in print scalar @array. I will say it one more time for clarity: an array (not a list) in scalar context will return not an element (as a list does) but rather the length of the array.
Remember from before that you use the $ sigil when you only expect one item, i.e. $first = $a[0]. In this way you know you are in scalar context. Now when you call $length = @array you can see clearly that you are calling the array in scalar context, and thus you trigger the special property of an array in scalar context: you get its length.
This has another nice feature for testing whether there are elements in the array: print '@array contains items' if @array; print '@array is empty' unless @array. The if/unless tests force scalar context on the array, thus the if sees the length of the array, not its elements. Since all numerical values are 'truthy' except zero, if the array has non-zero length, the statement if @array evaluates to true and you get the print statement.
Void context means that the return value of some operation is ignored. A useful operation in void context could be something like incrementing. $n = 1; $n++; print $n; In this example $n++ (increment after returning) was in void context in that its return value "1" wasn't used (stored, printed etc).
The third concept is the difference between a list and an array. A list is an ordered set of values; an array is a container that holds an ordered set of values. You can see the difference for example in the gymnastics one must do to get a particular element after using sort without storing the result first (try pop sort { $a cmp $b } @array for example, which doesn't work because pop does not act on a list, only an array).
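A short sketch of the second and third concepts, reusing the animals from above. The list slice and the intermediate array are two common ways around the pop restriction; neither is claimed to be what the other answers had in mind.

```perl
use strict;
use warnings;

my @animals = ("cow", "sheep", "llama");

# An array in scalar (or boolean) context gives its element count.
my $count  = @animals;
my $status = @animals ? 'has items' : 'is empty';

# sort returns a list, not an array, so pop cannot be applied to it directly.
# A list slice picks an element out of the list instead.
my $last_sorted = (sort { $a cmp $b } @animals)[-1];

# Alternatively, store the list in an array first; then pop works.
my @sorted = sort { $a cmp $b } @animals;
my $popped = pop @sorted;

print "count: $count, array $status\n";
print "last after sorting: $last_sorted (slice), $popped (pop)\n";
```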
Now we can ask, when you attempt your examples, what would you want Perl to do in these cases? As others have said, this depends on precedence.
In your first example, since the = operator has higher precedence than the ,, you haven't actually assigned a list to the variable, you have done something more like ($a = "b"), ("c") which effectively does nothing with the string "c". In fact it was called in void context. With warnings enabled, since this operation does not accomplish anything, Perl attempts to warn you that you probably didn't mean to do that with the message: Useless use of a constant in void context.
Now, what would you want Perl to do when you attempt to store a list to a scalar (or use a list in a scalar context)? It will not store the length of the list, this is only a behavior of an array. Therefore it must store one of the values in the list. While I know it is not canonically true, this example is very close to what happens.
my @animals = ("cow", "sheep", "llama");
my $return;
foreach my $animal (@animals) {
    $return = $animal;
}
print $return;
And therefore you get the last element of the list (the canonical difference is that the preceding values were never stored then overwritten, however the logic is similar).
There are ways to store something that looks like a list in a scalar, but this involves references. Read more about that in perldoc perlreftut.
Hopefully this makes things a little more clear. Finally I will say, until you get the hang of Perl's precedence rules, it never hurts to put in explicit parentheses for lists and function arguments.
There is an easy way to see how Perl handles both of the examples, just run them through with:
perl -MO=Deparse,-p -e'...'
As you can see, the difference is because the order of operations is slightly different than you might suspect.
perl -MO=Deparse,-p -e'$a = a, b;print $a'
(($a = 'a'), '???');
print($a);
perl -MO=Deparse,-p -e'$a = (a, b);print $a'
($a = ('???', 'b'));
print($a);
Note: you see '???', because the original value got optimized away.

Perl: Can I access two different levels of an array of hashes at the same time by using indices?

I'm completely new to Perl and I need to write a program that clusters found matches if they are at a certain distance from each other. So I got an array of hashes containing on each level the begin position, the end position and the number of matches present in a cluster(1 in the beginning).
If I want to know if the distance between two matches is ok, I do Begin2-End1
my $DEBUG = 1;
my @hitsarray =();
my ($beginarray,$endarray,$aantalarray);
my $hit = { BEGIN => $beginarray, EIND => $endarray, MATCHES => $aantalarray, };
for (my $k = 0;$k <= $#beginarray;$k++)
{
    print $beginarray[$k],"\t",$endarray[$k],"\t",$aantalarray[$k],"\n" if ($DEBUG);
    $hit = ();
    $hit->{BEGIN} = $beginarray[$k];
    $hit->{END} = $endarray[$k];
    $hit->{MATCHES} = $aantalarray[$k];
    push (@hitsarray,$hit);
}
for ( my $m = 0; $m <= $#hitsarray; $m++)
{
    while($hitsarray[$m+1]{BEGIN} - $hitsarray[$m]{END} < 5 && $hitsarray[$m+1]{BEGIN} - $hitsarray[$m]{END} > 3)
    {
        $hitsarray[$m]{END} = $hitsarray[$m+1]{EIND};
        $hitsarray[$m]{MATCHES} +=1;
        delete $hitsarray[$m+1];
        print $beginarray[$m],"\t",$endarray[$m],"\t",$aantalarray[$m],"\n" if ($DEBUG);
    }
}
But it doesn't work! My pc gets in a loop and states "Use of uninitialized value in subtraction (-) at script line 55."
It probably has to do something with using references but I don't really understand those..
I also tried a simpler structure with two non-connected arrays but I've got the same problem;
How do you use elements from different lines (and from different arrays) for subtraction?
Any help is totally welcome!!
I know that this might not seem to be the most helpful, but your code is so wrong that there is not a single problem or a single correction. Here are some of the problems.
Put use warnings; use strict; at the top of your script.
$beginarray, $endarray and $aantalarray are all scalars, not arrays. You might want them to be references to arrays, but they aren't because you never assign them. NOTE: when you write $beginarray[$m], that refers to an array variable called @beginarray, which has the same name but is actually a different variable from the scalar $beginarray.
You aren't showing us everything if you are having a problem on line 55.
This, $hit = ();, actually just sets the scalar variable $hit to undef, because an empty list assigned to a scalar yields undef.
$#beginarray is going to be -1 because @beginarray is never declared or filled. Even if you changed your code to declare it, it still has no data, so the first loop won't run.
delete $hitsarray[$m+1] will remove that value from the array, but that index will just be empty; the items above it will not move down in the array. To remove an item from an array you need either to grep into a new array or to splice the existing array.
You need to make a much smaller example of working with arrays to figure out what you are doing wrong.
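The delete versus splice point is easy to check on a tiny array. A minimal sketch with made-up contents:

```perl
use strict;
use warnings;

my @with_delete = qw(a b c d);
delete $with_delete[1];                 # slot 1 becomes undef; nothing shifts down
my $len_after_delete = @with_delete;    # length is unchanged

my @with_splice = qw(a b c d);
splice @with_splice, 1, 1;              # removes one element at index 1 and closes the gap
my $len_after_splice = @with_splice;

print "after delete: length $len_after_delete, slot 1 is ",
      (defined $with_delete[1] ? 'defined' : 'undef'), "\n";
print "after splice: length $len_after_splice, contents @with_splice\n";
```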
for ( my $m = 0; $m <= $#hitsarray; $m++)
{
while($hitsarray[$m+1]{BEGIN} - $hitsarray[$m]{END} < 5 && $hitsarray[$m+1]{BEGIN} - $hitsarray[$m]{END} > 3)
Here you are using element $m+1, which is beyond the end of the array on the final for iteration. Perhaps your for loop should say $m < $#hitsarray.
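Putting the bound fix together with splice, one possible shape for the merge loop is sketched below. This is not the asker's code: the hit positions are invented, a while loop replaces the for loop so the index only advances when nothing was merged, and the MATCHES counts are added together rather than incremented by one.

```perl
use strict;
use warnings;

# Hypothetical hits; positions are made up for illustration.
my @hits = (
    { BEGIN => 1,  END => 10, MATCHES => 1 },
    { BEGIN => 14, END => 20, MATCHES => 1 },   # gap of 4 after the first hit
    { BEGIN => 40, END => 50, MATCHES => 1 },   # gap of 20: stays separate
);

my $m = 0;
while ($m < $#hits) {                  # stop one short so $hits[$m + 1] always exists
    my $gap = $hits[$m + 1]{BEGIN} - $hits[$m]{END};
    if ($gap > 3 && $gap < 5) {        # same condition as in the question
        $hits[$m]{END}      = $hits[$m + 1]{END};
        $hits[$m]{MATCHES} += $hits[$m + 1]{MATCHES};
        splice @hits, $m + 1, 1;       # remove the merged hit and close the gap
    }
    else {
        $m++;                          # only advance when nothing was merged
    }
}

print "$_->{BEGIN}\t$_->{END}\t$_->{MATCHES}\n" for @hits;
```

Because $#hits is re-read on every pass, the loop bound shrinks along with the array after each splice.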