Assignment of subroutine references in Perl script - perl

I'm learning Perl from Intermediate Perl by Randal Schwartz. Can somebody explain the assignment of the variables $callback and $getter in the following code?
use File::Find;
sub create_find_callbacks_that_sum_the_size {
my $total_size = 0;
return(sub {$total_size += -s if -f}, sub { return $total_size });
}
my %subs;
foreach my $dir (qw(bin lib man)) {
my ($callback, $getter) = create_find_callbacks_that_sum_the_size( );
$subs{$dir}{CALLBACK} = $callback;
$subs{$dir}{GETTER} = $getter;
}
for (keys %subs) {
find($subs{S_}{CALLBACK}, $_);
for (sort keys %subs) {
my $sum = $subs{$_}{GETTER}->( );
print "$_ has $sum bytes\n";
}
Am I right in thinking that $callback gets the value of the first subroutine reference:
sub {$total_size += -s if -f}
And that $getter gets the second subroutine reference:
sub { return $total_size }
Is this a list assignment?
many thanks

This is a list assignment. The subroutine returns two things. The first thing goes into $callback and the second thing goes into $getter:
my ($callback, $getter) = create_find_callbacks_that_sum_the_size( );
So, yes, your answer is right. Each ends up with one of the anonymous subroutines created in the create_find_callbacks_that_sum_the_size factory.

Related

Perl overload #{} so that you can supply an object to foreach()

Is there a way I can make a class overload the array dereferencer in such a way that I can supply an instance of that class to foreach and have a custom function run at every iteration?
I tried:
Overloading #{} and assigning it to a sub. However, in this case, the sub is only executed once and is supposed to return a reference to the array you want to iterate over. Almost but not quite what I wanted.
Overloading <>. This executes the assigned function with every read attempt. Exactly how I want it, however it works with a while loop and not with foreach.
overloading.pl
use overloaded;
$class = overloaded->new();
$class->add("Hi");
$class->add("Hello");
$class->add("Yeet");
foreach $string (#$class) {
print("NEXT ITEM: ".$string."\n");
}
while ($item = <$class>) {
print("NEXT ITEM: ".$item."\n");
}
overloaded.pm
package overloaded;
use strict;
use warnings;
use Data::Dumper;
use overload '#{}' => \&next, '<>' => \&fetchNext;
sub new {
my ($class, %args) = #_;
$args{fields} = ();
$args{pos} = 0;
return bless { %args }, $class;
}
sub add {
my ($self, $elem) = #_;
push(#{$self->{fields}}, $elem);
}
sub fetchNext {
my ($self) = #_;
print("reading next line...\n");
return $self->{fields}->[$self->{pos}++];
}
sub next {
my ($self) = #_;
print ("getting next array item\n");
return \#{$self->{fields}};
}
1;
output:
$ perl overloading.pl
getting next array item
NEXT ITEM: Hi
NEXT ITEM: Hello
NEXT ITEM: Yeet
reading next line...
NEXT ITEM: Hi
reading next line...
NEXT ITEM: Hello
reading next line...
NEXT ITEM: Yeet
reading next line...
The problem that you're having here is not the difference between #{...} and <...> but the difference between foreach and while.
A while loop acts a bit like an iterator in as much as its execution looks like this:
while (you can run a piece of code and get back a value) {
do something
}
So each time around the loop, it executes the piece of code in the while condition and expects to get a single value back. The code in the while condition is run in scalar context.
A foreach loop, on the other hand, only executes its code once and expects to get a list of values back. The code between parentheses at the start of the loop is executed in list context.
This is why you read a large file a line at a time using:
while (<$file_handle>) {
...
}
This only reads a single record from the file at a time. If you used a foreach loop instead like this:
foreach (<$file_handle>) {
...
}
then you would get all of the records back from the filehandle at once - which, obviously, takes far more memory.
Dereferencing an array reference works like that too. You get all of the values back at the same time. The overridden method (next()) will only be called once and will be expected to return a list of values.
I show some techniques in Object::Iterate. Give your object some methods to supply the next value and go from there.
[ This adds to Dave Cross's answer rather than being an answer on its own ]
Having a single iterator instance per collection cause all kinds of problem. That's why everyone uses keys instead of each, for example. As such, I strongly recommend that your class produces an iterator rather than being one itself, as the following does:
package My::Collection;
use strict;
use warnings;
use Iterator::Simple qw( iarray );
sub new {
my ($class) = #_;
my $self = bless({}, $class);
$self->{fields} = [];
return $self;
}
sub add {
my ($self, $elem) = #_;
push #{ $self->{fields} }, $elem;
}
sub iter {
my ($self) = #_;
return iarray($self->{fields});
}
1;
Use it as follows:
use strict;
use warnings;
use feature qw( say );
use My::Collection qw( );
my $collection = My::Collection->new();
$collection->add("Hi");
$collection->add("Hello");
$collection->add("Yeet");
my $iter = $collection->iter();
while ( my ($item) = $iter->next() ) {
say $item;
}
You can also use either of the following more magical options:
while ( my ($item) = $iter->() )
while ( my ($item) = <$iter> )

Perl hash returning value

I am a newbie in newbies for perl. I am trying to create a function which returns the value of the hash. The following piece of code simply returns the last index of the hash. I googled around and couldnt find what i need. Appreciate if anyone can tell me where I am going wrong.
I am expecting, if I pass "he_1", I should get a return back value of 1, etc.. but all I see is 9.
#!/usr/bin/perl
my %IndexMap = ();
my $MAX_V = 5;
my $MAX_T = 10;
sub InitIndexMap {
foreach my $i (0..$MAX_V-1) {
$IndexMap["he_".$i] = $i;
print "he_".$i;
print $IndexMap["he_".$i];
}
foreach my $i ($MAX_V..$MAX_T-1) {
$IndexMap["un".$i] = $i;
print "un".$i;
print $IndexMap["un".$i];
}
}
sub GetVal {
my ($name) = #_;
return $IndexMap[$name];
}
&InitIndexMap();
my ($index) = &Getval("he_4");
print $index;
To read a hash, use curly braces, not brackets. Try this:
sub InitIndexMap {
foreach my $i (0..$MAX_V-1) {
$IndexMap{ "he_" . $i } = $i;
print "he_".$i;
print $IndexMap{ "he_" . $i };
}
foreach my $i ($MAX_V..$MAX_T-1) {
$IndexMap{ "un" . $i } = $i;
print "un".$i;
print $IndexMap{ "un" . $i };
}
}
sub GetVal {
my ( $name ) = #_;
return $IndexMap{ $name };
}
You should add this to the top of the script:
use strict;
use warnings;
The general rule to always turn those pragmas. They warnings and errors that that cause would have probably led you to the answer to your question.
You should access hashes with curly brackets like { and }.
$hash_name{$key} = $value;
In your example.
$IndexMap{"he_".$i} = $i;
You should consider doing some tutorials.
This is VERY BASIC knowledge in Perl.

"Passing arguments to subroutine"-question?

Is routine2 ok too or shouldn't I do this? (I don't need a copy of #list in the subroutine)
#!/usr/bin/perl
use 5.012;
use warnings;
my #list = 0 .. 9;
sub routine1 {
my $list = shift;
for (#$list) { $_++ };
return $list
}
my $l = routine1( \#list );
say "#$l";
sub routine2 {
for (#list) { $_++ };
}
routine2();
say "#list";
If it works for you, then it's ok too. But the first sub can do the job for any array you pass to it which makes it more general.
P.S. Remember that #_ contains aliases for the parameters passed to the function. So you could also use this:
sub increment { $_++ for #_ }
increment(#list);
If you're worried about making the syntax look nice, try this:
sub routine3 (\#) {
for (#{$_[0]}) { $_++ }
}
my #list = (0 .. 9);
routine3(#list);
say "#list"; # prints 1 .. 10
This declares routine3 with a prototype - it takes an array argument by reference. So $_[0] is a reference to #list, no rather unsightly \ needed by the caller. (Some people discourage prototypes, so take this as you will. I like them.)
But unless this is a simplification for what your actual routine does, what I'd do is this:
my #list = 0 .. 9;
my #new_list = map { $_ + 1 } #list;
say "#new_list";
Unless routine is actually really complicated, and it's vital somehow that you modify the original array, I'd just use map. Especially with map, you can plug in a subroutine:
sub complex_operation { ... }
my #new_list = map { complex_operation($_) } #list;
Of course, you could prototype complex_operation with (_) and then just write map(complex_operation, #list); but I like the bracket-syntax personally.

In Perl, how can I call a method whose name I have in a string?

I'm trying to write some abstract code for searching through a list of similar objects for the first one whose attributes match specific values. In order to do this, I need to call a bunch of accessor methods and check all their values one by one. I'd like to use an abstraction like this:
sub verify_attribute {
my ($object, $attribute_method, $wanted_value) = #_;
if ( call_method($object, $attribute_method) ~~ $wanted_value ) {
return 1;
}
else {
return;
}
}
Then I can loop through a hash whose keys are accessor method names and whose values are the values I'm looking for for those attributes. For example, if that hash is called %wanted, I might use code like this to find the object I want:
my $found_object;
FINDOBJ: foreach my $obj (#list_of_objects) {
foreach my $accessor (keys %wanted) {
next FINDOBJ unless verify_attribute($obj, $accessor, $wanted{$accessor});
}
# All attrs verified
$found_object = $obj;
last FINDOBJ;
}
Of course, the only problem is that call_method does not exsit. Or does it? How can I call a method if I have a string containing its name? Or is there a better solution to this whole problem?
my $found_object;
FINDOBJ: foreach my $obj (#list_of_objects) {
foreach my $accessor (keys %wanted) {
next FINDOBJ unless $obj->$accessor() == $wanted{$accessor};
}
# All attrs verified
$found_object = $obj;
last;
}
Yes, you can call methods this way. No string (or any other) eval involved.
Also, substitute == with eq or =~ depending on the type of the data...
Or, for some extra credits, do it the functional way: (all() should really be part of List::Util!)
use List::Util 'first';
sub all (&#) {
my $code = shift;
$code->($_) || return 0 for #_;
return 1;
}
my $match = first {
my $obj = $_;
all { $obj->$_ == $attrs{$_} }
keys %wanted
} #list_of_objects;
Update: Admittedly, the first solution is the less obfuscated one, so it's preferable. But as somebody answering questions, you have add a little sugar to make it interesting for yourself, too! ;-)
Functional way is cool, but for dummies like me eval rules:
test.pl
#!/usr/bin/perl -l
use F;
my $f = F->new();
my $fun = 'lol'; # method of F
eval '$f->'.$fun.'() '; # call method of F, which name is in $fun var
F.pm
package F;
sub new
{
bless {};
}
sub lol
{
print "LoL";
}
1;
[root#ALT-24 root]# perl test.pl
LoL

Perl: if ( element in list )

I'm looking for presence of an element in a list.
In Python there is an in keyword and I would do something like:
if element in list:
doTask
Is there something equivalent in Perl without having to manually iterate through the entire list?
UPDATE:
The smartmatch family of features are now experimental
Smart match, added in v5.10.0 and significantly revised in v5.10.1, has been a regular point of complaint. Although there are a number of ways in which it is useful, it has also proven problematic and confusing for both users and implementors of Perl. There have been a number of proposals on how to best address the problem. It is clear that smartmatch is almost certainly either going to change or go away in the future. Relying on its current behavior is not recommended.
Warnings will now be issued when the parser sees ~~, given, or when.
If you can get away with requiring Perl v5.10, then you can use any of the following examples.
The smart match ~~ operator.
if( $element ~~ #list ){ ... }
if( $element ~~ [ 1, 2, 3 ] ){ ... }
You could also use the given/when construct. Which uses the smart match functionality internally.
given( $element ){
when( #list ){ ... }
}
You can also use a for loop as a "topicalizer" ( meaning it sets $_ ).
for( #elements ){
when( #list ){ ... }
}
One thing that will come out in Perl 5.12 is the ability to use the post-fix version of when. Which makes it even more like if and unless.
given( $element ){
... when #list;
}
If you have to be able to run on older versions of Perl, there still are several options.
You might think you can get away with using List::Util::first, but there are some edge conditions that make it problematic.
In this example it is fairly obvious that we want to successfully match against 0. Unfortunately this code will print failure every time.
use List::Util qw'first';
my $element = 0;
if( first { $element eq $_ } 0..9 ){
print "success\n";
} else {
print "failure\n";
}
You could check the return value of first for defined-ness, but that will fail if we actually want a match against undef to succeed.
You can safely use grep however.
if( grep { $element eq $_ } 0..9 ){ ... }
This is safe because grep gets called in a scalar context. Arrays return the number of elements when called in scalar context. So this will continue to work even if we try to match against undef.
You could use an enclosing for loop. Just make sure you call last, to exit out of the loop on a successful match. Otherwise you might end up running your code more than once.
for( #array ){
if( $element eq $_ ){
...
last;
}
}
You could put the for loop inside the condition of the if statement ...
if(
do{
my $match = 0;
for( #list ){
if( $element eq $_ ){
$match = 1;
last;
}
}
$match; # the return value of the do block
}
){
...
}
... but it might be more clear to put the for loop before the if statement.
my $match = 0;
for( #list ){
if( $_ eq $element ){
$match = 1;
last;
}
}
if( $match ){ ... }
If you're only matching against strings, you could also use a hash. This can speed up your program if #list is large and, you are going to match against %hash several times. Especially if #array doesn't change, because then you only have to load up %hash once.
my %hash = map { $_, 1 } #array;
if( $hash{ $element } ){ ... }
You could also make your own subroutine. This is one of the cases where it is useful to use prototypes.
sub in(&#){
local $_;
my $code = shift;
for( #_ ){ # sets $_
if( $code->() ){
return 1;
}
}
return 0;
}
if( in { $element eq $_ } #list ){ ... }
if( $element ~~ #list ){
do_task
}
~~ is the "smart match operator", and does more than just list membership detection.
grep is helpful here
if (grep { $_ eq $element } #list) {
....
}
If you plan to do this many times, you can trade-off space for lookup time:
#!/usr/bin/perl
use strict; use warnings;
my #array = qw( one ten twenty one );
my %lookup = map { $_ => undef } #array;
for my $element ( qw( one two three ) ) {
if ( exists $lookup{ $element }) {
print "$element\n";
}
}
assuming that the number of times the element appears in #array is not important and the contents of #array are simple scalars.
List::Util::first
$foo = first { ($_ && $_ eq "value" } #list; # first defined value in #list
Or for hand-rolling types:
my $is_in_list = 0;
foreach my $elem (#list) {
if ($elem && $elem eq $value_to_find) {
$is_in_list = 1;
last;
}
}
if ($is_in_list) {
...
A slightly different version MIGHT be somewhat faster on very long lists:
my $is_in_list = 0;
for (my $i = 0; i < scalar(#list); ++$i) {
if ($list[i] && $list[i] eq $value_to_find) {
$is_in_list = 1;
last;
}
}
if ($is_in_list) {
...
TIMTOWTDI
sub is (&#) {
my $test = shift;
$test->() and return 1 for #_;
0
}
sub in (#) {#_}
if( is {$_ eq "a"} in qw(d c b a) ) {
print "Welcome in perl!\n";
}
List::MoreUtils
On perl >= 5.10 the smart match operator is surely the easiest way, as many others have already said.
On older versions of perl, I would instead suggest List::MoreUtils::any.
List::MoreUtils is not a core module (some say it should be) but it's very popular and it's included in major perl distributions.
It has the following advantages:
it returns true/false (as Python's in does) and not the value of the element, as List::Util::first does (which makes it hard to test, as noted above);
unlike grep, it stops at the first element which passes the test (perl's smart match operator short circuits as well);
it works with any perl version (well, >= 5.00503 at least).
Here is an example which works with any searched (scalar) value, including undef:
use List::MoreUtils qw(any);
my $value = 'test'; # or any other scalar
my #array = (1, 2, undef, 'test', 5, 6);
no warnings 'uninitialized';
if ( any { $_ eq $value } #array ) {
print "$value present\n"
}
P.S.
(In production code it's better to narrow the scope of no warnings 'uninitialized').
Probably Perl6::Junction is the clearest way to do. No XS dependencies, no mess and no new perl version required.
use Perl6::Junction qw/ any /;
if (any(#grant) eq 'su') {
...
}
This blog post discusses the best answers to this question.
As a short summary, if you can install CPAN modules then the best solutions are:
if any(#ingredients) eq 'flour';
or
if #ingredients->contains('flour');
However, a more usual idiom is:
if #any { $_ eq 'flour' } #ingredients
which i find less clear.
But please don't use the first() function! It doesn't express the intent of your code at all. Don't use the "Smart match" operator: it is broken. And don't use grep() nor the solution with a hash: they iterate through the whole list. While any() will stop as soon as it finds your value.
Check out the blog post for more details.
PS: i'm answering for people who will have the same question in the future.
You can accomplish a similar enough syntax in Perl if you do some Autoload hacking.
Create a small package to handle the autoload:
package Autoloader;
use strict;
use warnings;
our $AUTOLOAD;
sub AUTOLOAD {
my $self = shift;
my ($method) = (split(/::/, $AUTOLOAD))[-1];
die "Object does not contain method '$method'" if not ref $self->{$method} eq 'CODE';
goto &{$self->{$method}};
}
1;
Then your other package or main script will contain a subroutine that returns the blessed object which gets handled by Autoload when its method attempts to be called.
sub element {
my $elem = shift;
my $sub = {
in => sub {
return if not $_[0];
# you could also implement this as any of the other suggested grep/first/any solutions already posted.
my %hash; #hash{#_} = ();
return (exists $hash{$elem}) ? 1 : ();
}
};
bless($sub, 'Autoloader');
}
This leaves you with usage looking like:
doTask if element('something')->in(#array);
If you reorganize the closure and its arguments, you can switch the syntax around the other way to make it look like this, which is a bit closer to the autobox style:
doTask if search(#array)->contains('something');
function to do that:
sub search {
my #arr = #_;
my $sub = {
contains => sub {
my $elem = shift or return;
my %hash; #hash{#arr} = ();
return (exists $hash{$elem}) ? 1 : ();
}
};
bless($sub, 'Autoloader');
}