Where does a Perl subroutine get values missing from the actual parameters? - perl

I came across the following Perl subroutine get_billable_pages while chasing a bug. It takes 12 arguments.
sub get_billable_pages {
my ($dbc,
$bill_pages, $page_count, $cover_page_count,
$domain_det_page, $bill_cover_page, $virtual_page_billing,
$job, $bsj, $xqn,
$direction, $attempt,
) = #_;
my $billable_pages = 0;
if ($virtual_page_billing) {
my #row;
### Below is testing on the existence of the 11th and 12th parameters ###
if ( length($direction) && length($attempt) ) {
$dbc->xdb_execute("
SELECT convert(int, value)
FROM job_attribute_detail_atmp_tbl
WHERE job = $job
AND billing_sub_job = $bsj
AND xqn = $xqn
AND direction = '$direction'
AND attempt = $attempt
AND attribute = 1
");
}
else {
$dbc->xdb_execute("
SELECT convert(int, value)
FROM job_attribute_detail_tbl
WHERE job = $job
AND billing_sub_job = $bsj
AND xqn = $xqn
AND attribute = 1
");
}
$cnt = 0;
...;
But is sometimes called with only 10 arguments
$tmp_det = get_billable_pages(
$dbc2,
$row[6], $row[8], $row[7],
$domain_det_page, $bill_cover_page, $virtual_page_billing,
$job1, $bsj1, $row[3],
);
The function does a check on the 11th and 12th arguments.
What are the 11th and 12th arguments when the function is passed only 10 arguments?
Is it a bug to call the function with only 10 arguments because the 11th and 12th arguments end up being random values?
I am thinking this may be the source of the bug because the 12th argument had a funky value when the program failed.
I did not see another definition of the function which takes only 10 arguments.

The values are copied out of the parameter array #_ to the list of scalar variables.
If the array is shorter than the list, then the excess variables are set to undef. If the array is longer than the list, then excess array elements are ignored.
Note that the original array #_ is unmodified by the assignment. No values are created or lost, so it remains the definitive source of the actual parameters passed when the subroutine is called.
ikegami suggested that I should provide some Perl code to demonstrate the assignment of arrays to lists of scalars. Here is that Perl code, based mostly on his edit
use strict;
use warnings;
use Data::Dumper;
my $x = 44; # Make sure that we
my $y = 55; # know if they change
my #params = (8); # Make a dummy parameter array with only one value
($x, $y) = #params; # Copy as if this is were a subroutine
print Dumper $x, $y; # Let's see our parameters
print Dumper \#params; # And how the parameter array looks
output
$VAR1 = 8;
$VAR2 = undef;
$VAR1 = [ 8 ];
So both $x and $y are modified, but if there are insufficient values in the array then undef is used instead. It is as if the source array was extended indefinitely with undef elements.
Now let's look at the logic of the Perl code. undef evaluates as false for the purposes of conditional tests, but you apply the length operator like this
if ( length($direction) && length($attempt) ) { ... }
If you have use warnings in place as you should, Perl would normally produce a Use of uninitialized value warning. However length is unusual in that, if you ask for the length of an undef value (and you are running version 12 or later of Perl 5) it will just return undef instead of warning you.
Regarding "I did not see another definition of the function which takes only 10 arguments", Perl doesn't have function templates like C++ and Java - it is up to the code in the subroutine to look at what it has been passed and behave accordingly.

No, it's not a bug. The remaining arguments are "undef" and you can check for this situation
sub foo {
my ($x, $y) = #_;
print " x is undef\n" unless defined $x;
print " y is undef\n" unless defined $y;
}
foo(1);
prints
y is undef

Related

Is it possible to match a string $x = "foo" with a variable named $foo and assign a value only if it matches?

What I am trying to ask is easily shown.
Imagine 2 variables named as
my ($foo, $bar) = (0,0);
and
my #a = ("foo","bar","beyond","recognition");
is it possible to string match $a[0] with the variable named $foo, and assign a value to it (say "hi") only if it matches identically?
I am trying to debug this some code (not mine), and I ran into a tough spot. Basically, I have a part of the script where I have a bunch of variables.
my ($p1, $p2, $p3, $p4)= (0,0,0,0); # *Edited*
my #ids = ("p1","p2","p3","p4")
I have a case where I need to pass each of those variables as a hash key to call a certain operation inside a loop.
for (0..3){
my $handle = get_my_stuff(#ids);
my $ret = $p1->do_something(); # <- $p1 is used for the first instance of loop.
...
...
...
}
FOr the first iteration of the loop, I need to use $p1, but for the second iteration of the loop I need to pass (or call)
my $ret = $p2->do_something(); # not $p1
So what I did was ;
my $p;
for (1..4){
my $handle = get_my_stuff(#ids);
no strict 'refs';
my $ret = $p{$_}->do_something();
...
...
...
use strict 'refs';
...
}
But the above operation is not allowed, and I am unable to call my key in such a manner :(. As it turns out, $p1 became a blessed hash as soon after get_my_stuff() was called. And to my biggest surprise, somehow the script in the function (too much and too long to paste here) assign or pass a hash reference to my variables only if they match.
You don't need to try to invent something to deal with variable names. Your idea to use a hash is correct, but your approach is flawed.
It seems your function get_my_stuff takes a list of arguments and transforms them somehow. It then returns a list of objects that correspond to the arguments. Instead of doing that in the loop, do it before you loop through the numbers and build up your hash by assigning each id to an object.
Perl allows you to assign to a hash slice. In that case, the sigil changes to an #. My below implementation uses DateTime with years to show that the objects are different.
use strict;
use warnings;
use feature 'say';
use DateTime;
# for illustration purposes
sub get_my_stuff {
return map { DateTime->new( year => (substr $_, 1) + 2000 ) } #_;
}
my #ids = qw(p1 p2 p3 p4);
my %p;
# create this outside of the loop
#p{#ids} = get_my_stuff(#ids);
foreach my $i ( 1.. 4 ) {
say $p{'p' . $i}->ymd; # "do_something"
}
This will output
2001-01-01
2002-01-01
2003-01-01
2004-01-01

Why I can't do "shift subroutine_name()" in Perl?

Why does this code return an Not an ARRAY reference error?
sub Prog {
my $var1 = 1;
my $var2 = 2;
($var1, $var2);
}
my $variable = shift &Prog;
print "$variable\n";
If I use an intermediate array, I avoid the error:
my #intermediate_array = &Prog;
my $variable = shift #intermediate_array;
print "$variable\n";
The above code now outputs "1".
The subroutine Prog returns a list of scalars. The shift function only operates on an array. Arrays and lists are not the same thing. Arrays have storage, but lists do not.
If what you want is to get the first element of the list that Prog returns, do this:
sub Prog {
return ( 'this', 'that' );
}
my $var = (Prog())[0];
print "$var\n";
I changed the sub invocation to Prog() instead of &Prog because the latter is decidedly old style.
You can also assign the first element to a scalar like others are showing:
my ($var) = Prog();
This is roughly the same as:
my ($var, $ignored_var) = Prog();
and then ignoring $ignored_var. If you want to make it clear that you're ignoring the second value without actually giving it a variable, you can do this:
my ($var, undef) = Prog();
Prog is returning a list, not an array. Operations like shift modify the array and cannot be used on lists.
You can instead do:
my ($variable) = Prog; # $variable is now 1:
# Prog is evaluated in list context
# and the results assigned to the list ($variable)
Note that you don't need the &.

Where is the error "Use of uninitialized value in string ne" coming from?

Where is the uninitialised value in the below code?
#!/usr/bin/perl
use warnings;
my #sites = (undef, "a", "b");
my $sitecount = 1;
my $url;
while (($url = $sites[$sitecount]) ne undef) {
$sitecount++;
}
Output:
Use of uninitialized value in string ne at t.pl line 6.
Use of uninitialized value in string ne at t.pl line 6.
Use of uninitialized value in string ne at t.pl line 6.
Use of uninitialized value in string ne at t.pl line 6.
You can't use undef in a string comparison without a warning.
if ("a" ne undef) { ... }
will raise a warning. If you want to test if a variable is defined or not, use:
if (defined $var) { ... }
Comments about the original question:
That's a strange way to iterate over an array. The more usual way of doing this would be:
foreach my $url (#sites) { ... }
and drop the $sitecount variable completely, and don't overwrite $url in the loop body. Also drop the undef value in that array. If you don't want to remove that undef for some reason (or expect undefined values to be inserted in there), you could do:
foreach my $url (#sites) {
next unless defined $url;
...
}
If you do want to test for undefined with your form of loop construct, you'd need:
while (defined $sites[$sitecount]) {
my $url = $sites[$sitecount];
...
$sitecount++;
}
to avoid the warnings, but beware of autovivification, and that loop would stop short if you have undefs mixed in between other live values.
The correct answers have already been given (defined is how you check a value for definedness), but I wanted to add something.
In perlop you will read this description of ne:
Binary "ne" returns true if the left argument is stringwise not equal
to the right argument.
Note the use of "stringwise". It basically means that just like with other operators, such as ==, where the argument type is pre-defined, any arguments to ne will effectively be converted to strings before the operation is performed. This is to accommodate operations such as:
if ($foo == "1002") # string "1002" is converted to a number
if ($foo eq 1002) # number 1002 is converted to a string
Perl has no fixed data types, and relies on conversion of data. In this case, undef (which coincidentally is not a value, it is a function: undef(), which returns the undefined value), is converted to a string. This conversion will cause false positives, that may be hard to detect if warnings is not in effect.
Consider:
perl -e 'print "" eq undef() ? "yes" : "no"'
This will print "yes", even though clearly the empty string "" is not equal to not defined. By using warnings, we can catch this error.
What you want is probably something like:
for my $url (#sites) {
last unless defined $url;
...
}
Or, if you want to skip to a certain array element:
my $start = 1;
for my $index ($start .. $#sites) {
last unless defined $sites[$index];
...
}
Same basic principle, but using an array slice, and avoiding indexes:
my $start = 1;
for my $url (#sites[$start .. $#sites]) {
last unless defined $url;
...
}
Note that the use of last instead of next is the logical equivalent of your while loop condition: When an undefined value is encountered, the loop is exited.
More debugging: http://codepad.org/Nb5IwX0Q
If you, like in this paste above, print out the iteration counter and the value, you will quite clearly see when the different warnings appear. You get one warning for the first comparison "a" ne undef, one for the second, and two for the last. The last warnings come when $sitecount exceeds the max index of #sites, and you are comparing two undefined values with ne.
Perhaps the message would be better to understand if it was:
You are trying to compare an uninitialized value with a string.
The uninitialized value is, of course, undef.
To explicitely check if $something is defined, you need to write
defined $something
ne is for string comparison, and undef is not a string:
#!/usr/bin/perl
use warnings;
('l' ne undef) ? 0 : 0;
Use of uninitialized value in string ne at t.pl line 3.
It does work, but you get a [slightly confusing] warning (at least with use warnings) because undef is not an "initialized value" for ne to use.
Instead, use the operator defined to find whether a value is defined:
#!/usr/bin/perl
use warnings;
my #sites = (undef, "a", "b");
my $sitecount = 1;
my $url;
while (defined $sites[$sitecount]) { # <----------
$url = $sites[$sitecount];
# ...
$sitecount++;
}
... or loop over the #sites array more conventionally, as Mat explores in his answer.

What perl code samples can lead to undefined behaviour?

These are the ones I'm aware of:
The behaviour of a "my" statement modified with a statement modifier conditional or loop construct (e.g. "my $x if ...").
Modifying a variable twice in the same statement, like $i = $i++;
sort() in scalar context
truncate(), when LENGTH is greater than the length of the file
Using 32-bit integers, "1 << 32" is undefined. Shifting by a negative number of bits is also undefined.
Non-scalar assignment to "state" variables, e.g. state #a = (1..3).
One that is easy to trip over is prematurely breaking out of a loop while iterating through a hash with each.
#!/usr/bin/perl
use strict;
use warnings;
my %name_to_num = ( one => 1, two => 2, three => 3 );
find_name(2); # works the first time
find_name(2); # but fails this time
exit;
sub find_name {
my($target) = #_;
while( my($name, $num) = each %name_to_num ) {
if($num == $target) {
print "The number $target is called '$name'\n";
return;
}
}
print "Unable to find a name for $target\n";
}
Output:
The number 2 is called 'two'
Unable to find a name for 2
This is obviously a silly example, but the point still stands - when iterating through a hash with each you should either never last or return out of the loop; or you should reset the iterator (with keys %hash) before each search.
These are just variations on the theme of modifying a structure that is being iterated over:
map, grep and sort where the code reference modifies the list of items to sort.
Another issue with sort arises where the code reference is not idempotent (in the comp sci sense)--sort_func($a, $b) must always return the same value for any given $a and $b.

What is the difference between the scalar and list contexts in Perl?

What is the difference between the scalar and list contexts in Perl and does this have any parallel in other languages such as Java or Javascript?
Various operators in Perl are context sensitive and produce different results in list and scalar context.
For example:
my(#array) = (1, 2, 4, 8, 16);
my($first) = #array;
my(#copy1) = #array;
my #copy2 = #array;
my $count = #array;
print "array: #array\n";
print "first: $first\n";
print "copy1: #copy1\n";
print "copy2: #copy2\n";
print "count: $count\n";
Output:
array: 1 2 4 8 16
first: 1
copy1: 1 2 4 8 16
copy2: 1 2 4 8 16
count: 5
Now:
$first contains 1 (the first element of the array), because the parentheses in the my($first) provide an array context, but there's only space for one value in $first.
both #copy1 and #copy2 contain a copy of #array,
and $count contains 5 because it is a scalar context, and #array evaluates to the number of elements in the array in a scalar context.
More elaborate examples could be constructed too (the results are an exercise for the reader):
my($item1, $item2, #rest) = #array;
my(#copy3, #copy4) = #array, #array;
There is no direct parallel to list and scalar context in other languages that I know of.
Scalar context is what you get when you're looking for a single value. List context is what you get when you're looking for multiple values. One of the most common places to see the distinction is when working with arrays:
#x = #array; # copy an array
$x = #array; # get the number of elements in an array
Other operators and functions are context sensitive as well:
$x = 'abc' =~ /(\w+)/; # $x = 1
($x) = 'abc' =~ /(\w+)/; # $x = 'abc'
#x = localtime(); # (seconds, minutes, hours...)
$x = localtime(); # 'Thu Dec 18 10:02:17 2008'
How an operator (or function) behaves in a given context is up to the operator. There are no general rules for how things are supposed to behave.
You can make your own subroutines context sensitive by using the wantarray function to determine the calling context. You can force an expression to be evaluated in scalar context by using the scalar keyword.
In addition to scalar and list contexts you'll also see "void" (no return value expected) and "boolean" (a true/false value expected) contexts mentioned in the documentation.
This simply means that a data-type will be evaluated based on the mode of the operation. For example, an assignment to a scalar means the right-side will be evaluated as a scalar.
I think the best means of understanding context is learning about wantarray. So imagine that = is a subroutine that implements wantarray:
sub = {
return if ( ! defined wantarray ); # void: just return (doesn't make sense for =)
return #_ if ( wantarray ); # list: return the array
return $#_ + 1; # scalar: return the count of the #_
}
The examples in this post work as if the above subroutine is called by passing the right-side as the parameter.
As for parallels in other languages, yes, I still maintain that virtually every language supports something similar. Polymorphism is similar in all OO languages. Another example, Java converts objects to String in certain contexts. And every untyped scripting language i've used has similar concepts.