Using a sorting subroutine from another package - perl

I have a script and a package like so:
# file: sortscript.pl
use strict;
use warnings;
use SortPackage;
my @arrays = ([1,"array1"],[10,"array3"],[4,"array2"]);
print "Using sort outside package\n";
foreach (sort SortPackage::simplesort @arrays){
print $_->[1],"\n";
}
print "\nUsing sort in same package\n";
SortPackage::sort_from_same_package(@arrays);
--
# file: SortPackage.pm
use strict;
use warnings;
package SortPackage;
sub simplesort{
return ($a->[0] <=> $b->[0]);
}
sub sort_from_same_package{
my @arrs = @_;
foreach (sort simplesort @arrs){
print $_->[1],"\n";
}
}
1;
Running the script produces the output:
$ perl sortscript.pl
Using sort outside package
Use of uninitialized value in numeric comparison (<=>) at SortPackage.pm line 15.
Use of uninitialized value in numeric comparison (<=>) at SortPackage.pm line 15.
Use of uninitialized value in numeric comparison (<=>) at SortPackage.pm line 15.
Use of uninitialized value in numeric comparison (<=>) at SortPackage.pm line 15.
Use of uninitialized value in numeric comparison (<=>) at SortPackage.pm line 15.
Use of uninitialized value in numeric comparison (<=>) at SortPackage.pm line 15.
array1
array3
array2
Using sort in same package
array1
array2
array3
Why am I not able to correctly use the subroutine to sort with when it is in another package?

As has been mentioned, $a and $b are package globals, so another solution is to temporarily alias the globals at the call site to the ones in package SortPackage:
{
local (*a, *b) = (*SortPackage::a, *SortPackage::b);
foreach (sort SortPackage::simplesort @arrays){
print $_->[1],"\n";
}
}
But this is pretty ugly, of course. I would just have SortPackage export a complete sorting routine, not just a comparator:
package SortPackage;
use strict;
sub _sort_by_first_element_comparator {
return $a->[0] <=> $b->[0];
}
sub sort_by_first_element {
return sort _sort_by_first_element_comparator @_;
}
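As a usage sketch, the routine above could be called from the main script like this. Both packages are placed in one file purely for illustration, and the sample data is the array from the question:

```perl
use strict;
use warnings;

package SortPackage;

# The comparator reads $SortPackage::a and $SortPackage::b, which is
# correct here because sort is called from inside this same package.
sub _sort_by_first_element_comparator {
    return $a->[0] <=> $b->[0];
}

sub sort_by_first_element {
    return sort _sort_by_first_element_comparator @_;
}

package main;

my @arrays = ([1, "array1"], [10, "array3"], [4, "array2"]);

# Call in list context; the caller never touches $a or $b.
my @sorted = SortPackage::sort_by_first_element(@arrays);
print $_->[1], "\n" for @sorted;    # array1, array2, array3
```

Because the sort call lives in SortPackage, the package globals line up no matter which package the caller is in.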

$a and $b are special "package global" variables.
To use the main scope's $a and $b your comparator function would have to refer to $::a or $main::a (and likewise for $b).
However that comparator function then wouldn't work when called from any other package, or even from within its own package.
See perlvar and the perldoc for the sort function. The solution is also given in the latter:
If the subroutine’s prototype is "($$)", the elements to be
compared are passed by reference in @_, as for a normal
subroutine. This is slower than unprototyped subroutines,
where the elements to be compared are passed into the
subroutine as the package global variables $a and $b (see
example below). Note that in the latter case, it is usually
counter‐productive to declare $a and $b as lexicals.

The special variables $a and $b are package globals. Your subroutine expects $SortPackage::a and $SortPackage::b. When you call it from sortscript.pl, the variables $main::a and $main::b are being set by sort.
The solution is to use a prototyped subroutine:
package SortPackage;
sub simplesort ($$) {
return ($_[0]->[0] <=> $_[1]->[0]);
}
It's a little slower (since you have actual parameters being passed, instead of reading pre-set globals), but it allows you to use subroutines from other packages by name, as you are attempting.
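A minimal end-to-end sketch of the prototyped approach, with both packages in a single file for brevity (the question keeps them in separate files). The names $left and $right are only illustrative:

```perl
use strict;
use warnings;

package SortPackage;

# With the ($$) prototype, sort passes the two elements in @_
# instead of setting the calling package's $a and $b.
sub simplesort ($$) {
    my ($left, $right) = @_;
    return $left->[0] <=> $right->[0];
}

package main;

my @arrays = ([1, "array1"], [10, "array3"], [4, "array2"]);

# The comparator is named from a different package and still works.
my @sorted = sort SortPackage::simplesort @arrays;
print $_->[1], "\n" for @sorted;    # array1, array2, array3
```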

Related

in Perl, how to assign the print function to a variable?

I need to control the print method using a variable
My code is below
#!/usr/bin/perl
# test_assign_func.pl
use strict;
use warnings;
sub echo {
my ($string) = @_;
print "from echo: $string\n\n";
}
my $myprint = \&echo;
$myprint->("hello");
$myprint = \&print;
$myprint->("world");
When I ran it, I got the following error for the assignment of the print function:
$ test_assign_func.pl
from echo: hello
Undefined subroutine &main::print called at test_assign_func.pl line 17.
Looks like I need to prefix a namespace to print function but I cannot find the name space. Thank you for any advice!
print is an operator, not a sub.
perlfunc:
The functions in this section can serve as terms in an expression. They fall into two major categories: list operators and named unary operators.
Perl provides a sub for named operators that can be duplicated by a sub with a prototype. A reference to these can be obtained using \&CORE::name.
my $f = \&CORE::length;
say $f->("abc"); # 3
But print isn't such an operator (because of the way it accepts a file handle). For these, you'll need to create a sub with a more limited calling convention.
my $f = sub { print @_ };
$f->("abc\n");
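If the file handle also needs to be selectable, one possible convention (an assumption for illustration, not something print itself provides) is to pass the handle as the first argument of the wrapper:

```perl
use strict;
use warnings;

# Hypothetical wrapper: first argument is the file handle,
# the remaining arguments are the list to print.
my $print_to = sub {
    my ($fh, @items) = @_;
    return print {$fh} @items;
};

$print_to->(\*STDOUT, "to stdout\n");

# An in-memory file handle shows that the target is chosen at run time.
open(my $mem, '>', \my $buffer) or die "Cannot open in-memory handle: $!";
$print_to->($mem, "captured\n");
close($mem);

print "buffer holds: $buffer";    # buffer holds: captured
```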
Related:
What are Perl built-in operators/functions?
As mentioned in CORE, some functions can't be called as subroutines, only as barewords. print is one of them.

Perl sort won't use function from another package

I have a function for case insensitive sorting. It works if it's from the same package, but not otherwise.
This works:
my @arr = sort {lc $a cmp lc $b} @list;
This works (if a function called "isort" is defined in the same file):
my @arr = sort isort @list;
This does not (function exported with Exporter from another package):
my @arr = sort isort @list;
This does not (function referred to explicitly by package name):
my @arr = sort Utils::isort @list;
What is going on? How do I put a sorting function in another package?
What evidence do you have for it not working? Have you put a print() statement in the subroutine to see if it's being called?
I suspect you're being tripped up by this (from perldoc -f sort):
$a and $b are set as package globals in the package the sort() is called from. That means $main::a and $main::b (or $::a and $::b ) in the main package, $FooPack::a and $FooPack::b in the FooPack package, etc.
Oh, and later on it's more specific:
Sort subroutines written using $a and $b are bound to their calling package. It is possible, but of limited interest, to define them in a different package, since the subroutine must still refer to the calling package's $a and $b:
package Foo;
sub lexi { $Bar::a cmp $Bar::b }
package Bar;
... sort Foo::lexi ...
Use the prototyped versions (see above) for a more generic alternative.
The "prototyped versions" are described above like this:
If the subroutine's prototype is ($$) , the elements to be compared are passed by reference in @_, as for a normal subroutine. This is slower than unprototyped subroutines, where the elements to be compared are passed into the subroutine as the package global variables $a and $b (see example below).
So you could try rewriting your subroutine like this:
package Utils;
sub isort ($$) {
my ($a, $b) = @_;
# existing code...
}
And then calling it using one of your last two alternatives.
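For the case-insensitive comparison from the question, a filled-in sketch might look like the following. It keeps both packages in one file and calls the function by its qualified name. The lexicals are named $left and $right here rather than $a and $b, which is only a stylistic choice to avoid shadowing the sort globals:

```perl
use strict;
use warnings;

package Utils;

# Case-insensitive comparator; ($$) makes sort pass both elements in @_.
sub isort ($$) {
    my ($left, $right) = @_;
    return lc($left) cmp lc($right);
}

package main;

my @list = qw(banana Apple cherry);
my @arr  = sort Utils::isort @list;
print "@arr\n";    # Apple banana cherry
```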

Can someone explain why Perl behaves this way (variable scoping)?

My test goes like this:
use strict;
use warnings;
func();
my $string = 'string';
func();
sub func {
print $string, "\n";
}
And the result is:
Use of uninitialized value $string in print at test.pl line 10.
string
Perl allows us to call a function before it has been defined. However when the function uses a variable declared only after the function call, the variable appears to be undefined. Is this behavior documented somewhere? Thank you!
The behaviour of my is documented in perlsub - it boils down to this - perl knows $string is in scope - because the my tells it so.
The my operator declares the listed variables to be lexically confined to the enclosing block, conditional (if/unless/elsif/else), loop (for/foreach/while/until/continue), subroutine, eval, or do/require/use'd file.
It means it's 'in scope' from the point at which it's first 'seen' until the closing bracket of the current 'block'. (Or in your example - the end of the code)
However - in your example my also assigns a value.
This scoping process happens at compile time, when perl checks whether it is valid to use $string or not (thanks to strict). However, it can't know what the value will be, because that might change during code execution (and is non-trivial to analyze).
So if you do this it might be a little clearer what's going on:
#!/usr/bin/env perl
use strict;
use warnings;
my $string; #undefined
func();
$string = 'string';
func();
sub func {
print $string, "\n";
}
$string is in scope in both cases - because the my happened at compile time - before the subroutine has been called - but it doesn't have a value set beyond the default of undef prior to the first invocation.
Note this contrasts with:
#!/usr/bin/env perl
use strict;
use warnings;
sub func {
print $string, "\n";
}
my $string; #undefined
func();
$string = 'string';
func();
Which errors because when the sub is declared, $string isn't in scope.
First of all, I would consider this undefined behaviour since it skips executing my like my $x if $cond; does.
That said, the behaviour is currently consistent and predictable. And in this instance, it behaves exactly as expected if the optimization that warranted the undefined behaviour notice didn't exist.
At compile-time, my has the effect of declaring and allocating the variable[1]. Scalars are initialized to undef when created. Arrays and hashes are created empty.
my $string was encountered by the compiler, so the variable was created. But since you haven't executed the assignment yet, it still has its default value (undefined) during the first call to func.
This model allows variables to be captured by closures.
Example 1:
{
my $x = "abc";
sub foo { $x } # Named subs capture at compile-time.
}
say foo(); # abc, even though $x fell out of scope before foo was called.
Example 2:
sub make_closure {
my ($x) = @_;
return sub { $x }; # Anon subs capture at run-time.
}
my $foo = make_closure("foo");
my $bar = make_closure("bar");
say $foo->(); # foo
say $bar->(); # bar
[1] The allocation is possibly deferred until the variable is actually used.

Difference between a BLOCK and a function in terms of scoping in Perl

I'm a little bit confused. I was playing with scoping in Perl when I encountered this:
#!/usr/bin/perl
use warnings;
use strict;
sub nested {
our $x = "nested!";
}
print $x; # Error "Variable "$x" is not imported at nested line 10."
print our $x; # Doesn't print "nested!"
print our($x) # Doesn't print "nested!"
But when i do this:
{
our $x = "nested";
}
print our($x); # Prints "nested"
print our $x; # Prints "nested"
print $x; # Prints "nested"
Can you explain why some of these work and others do not?
To restate DVK's answer, our is just a handy aliasing tool. Every variable you use in these examples is actually named $main::x. Within any lexical scope you can use our to make an alias to that variable, with a shortened name, in that same scope; the variable doesn't reset or get removed outside, only the alias. This is unlike the my keyword which makes a new variable bound to that lexical scope.
To explain why the block example works the way it does, let's look at the explanation of our from the "Modern Perl" book, chapter 5:
Our Scope
Within a given scope, declare an alias to a package variable with the our builtin.
The fully-qualified name is available everywhere, but the lexical alias is visible only within its scope.
This explains why the first two prints of your second example work (our is re-declared in print's scope), whereas the third one does not (as our only aliases $x to the package variable within the block's scope). Please note that printing $main::x will work correctly - it's only the alias that is scoped to the block, not the package variable itself.
As far as with the function:
print our $x; and print our($x) "don't work" - namely, correctly claim the value is uninitialized - since you never called the function which would initialize the variable. Observe the difference:
c:\>perl -e "use strict; use warnings; sub x { our $x = 1;} print our $x"
Use of uninitialized value $x in print at -e line 1.
c:\>perl -e "use strict; use warnings; sub x { our $x = 1;} x(); print our $x"
1
print $x; won't work for the same reason as with the block - our only scopes the alias to the block (i.e. in this case body of the sub) therefore you MUST either re-alias it in the main block's scope (as per print our $x example), OR use fully qualified package global outside the sub, in which case it will behave as expected:
c:\>perl -e "use strict; use warnings; sub x { our $x = 1;} print $main::x"
Use of uninitialized value $x in print at -e line 1.
c:\>perl -e "sub x { our $x = 1;} x(); print $main::x"
1
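The same point can be put in a single script. This is a small sketch with strict and warnings enabled, showing that only the alias is scoped to the block while the package variable stays reachable everywhere:

```perl
use strict;
use warnings;

{
    our $x = "nested";    # alias to $main::x, visible only inside this block
}

# The package variable itself is reachable by its fully qualified name.
print $main::x, "\n";     # nested

# Re-declaring the alias in the outer scope makes the short name legal again.
our $x;
print $x, "\n";           # nested
```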

Why do I get "uninitialized value" warnings when I use Date::Manip's sortByLength?

Consider this block of code in Date/Manip.pm from the Date::Manip module:
#*Get rid of a problem with old versions of perl
no strict "vars";
# This sorts from longest to shortest element
sub sortByLength {
return (length $b <=> length $a);
}
use strict "vars";
I get this warning:
Use of uninitialized value in length at /perl/lib/perl5.8/Date/Manip.pm line 244.
The problem is not actually located there; the function is just being called with invalid (undef) parameters. To get a better trace of where it came from, try this:
$SIG{__WARN__} = sub {
require Carp;
Carp::confess("Warning: $_[0]");
};
This will print a stacktrace for all warnings.
Either $a or $b is undef. Check the list you are feeding to the sort that uses this subroutine to see if it contains an undefined value.
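If the input list turns out to contain undef, one way to guard the sort is to filter the list first. This is only a sketch: the comparator below is written in the style of the one quoted from Date::Manip, and @list is made-up sample data:

```perl
use strict;
use warnings;

# Comparator in the style of sortByLength: longest element first.
sub sortByLength {
    return length($b) <=> length($a);
}

my @list = ("ab", undef, "abcd", "a");    # hypothetical input with an undef

# Drop undefined elements so the comparator never sees undef.
my @defined = grep { defined } @list;
my @sorted  = sort sortByLength @defined;

print join(",", @sorted), "\n";    # abcd,ab,a
```

Whether silently dropping the undefined entries is appropriate depends on where they come from; finding the source of the undef is usually the better fix.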
How are you using this code?
If warnings for uninitialized diagnostics were enabled (perhaps via blanket -w or use warnings;) and if sortByLength were somehow called as a normal subroutine, rather than as a sort {} function, you would likely see this error:
$ perl -Mwarnings=uninitialized -e 'sub sbl { (length $b <=> length $a) } sbl'
Use of uninitialized value in length at -e line 1.
Use of uninitialized value in length at -e line 1.
Here I get two warnings, because both $a and $b are uninitialized. Hard to say without more context.