Perl foreach loop with function closure rules

The following code:
#!/usr/bin/env perl
use strict;
use warnings;
my @foo = (0,1,2,3,4);
foreach my $i (@foo) {
sub printer {
my $blah = shift @_;
print "$blah-$i\n";
}
printer("test");
}
does not do what I would expect.
What exactly is happening?
(I would expect it to print out "test-0\ntest-1\ntest-2\ntest-3\ntest-4\n")

The problem is that the sub name {...} construct cannot be nested like that in a for loop.
The reason is that sub name {...} really means BEGIN {*name = sub {...}}, and BEGIN blocks are executed as soon as they are parsed. So the compilation and variable binding of the subroutine happen at compile time, before the for loop ever gets a chance to run.
What you want to do is to create an anonymous subroutine, which will bind its variables at runtime:
#!/usr/bin/env perl
use strict;
use warnings;
my @foo = (0,1,2,3,4);
foreach my $i (@foo) {
my $printer = sub {
my $blah = shift @_;
print "$blah-$i\n";
};
$printer->("test");
}
which prints
test-0
test-1
test-2
test-3
test-4
Presumably in your real use case, these closures will be loaded into an array or hash so that they can be accessed later.
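A minimal sketch of that idea, assuming a hypothetical %printer_for hash keyed by the loop value (the name is illustrative and not from the question). Each iteration stores its own closure, and each closure remembers the $i it was created with:

```perl
use strict;
use warnings;

my @foo = (0, 1, 2, 3, 4);
my %printer_for;    # hypothetical name: maps each value to its own closure

foreach my $i (@foo) {
    # A fresh anonymous sub is created on every iteration,
    # so each one captures that iteration's $i.
    $printer_for{$i} = sub {
        my $blah = shift;
        return "$blah-$i";
    };
}

# The closures can be called long after the loop has finished.
print $printer_for{3}->("test"), "\n";
print $printer_for{0}->("later"), "\n";
```

Returning the string instead of printing inside the closure is only a choice made here to keep the sketch easy to check.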
You can still use bareword identifiers with closures, but you need to do a little extra work to make sure the names are visible at compile time:
BEGIN {
for my $color (qw(red blue green)) {
no strict 'refs';
*$color = sub {"<font color='$color'>@_</font>"}
}
}
print "Throw the ", red 'ball'; # "Throw the <font color='red'>ball</font>"

Eric Strom's answer is correct, and probably what you wanted to see, but doesn't go into the details of the binding.
A brief note about lexical lifespan: lexicals are created at compile time and are actually available even before their scope is entered, as this example shows:
my $i;
BEGIN { $i = 42 }
print $i;
Thereafter, when they go out of scope, they become unavailable until the next time they are in scope:
print i();
{
my $i;
BEGIN { $i = 42 }
# in the scope of `my $i`, but doesn't actually
# refer to $i, so not a closure over it:
sub i { eval '$i' }
}
print i();
In your code, the closure is bound to the initial lexical $i at compile time.
However, foreach loops are a little odd; while the my $i actually creates a lexical, the foreach loop does not use it; instead it aliases it to one of the looped over values each iteration and then restores it to its original state after the loop. Your closure thus is the only thing referencing the original lexical $i.
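The aliasing and the restore-after-the-loop behaviour can be seen in a small sketch. It uses a lexical declared before the loop rather than foreach my $i, purely so that the variable can still be inspected afterwards; the values are made up for illustration:

```perl
use strict;
use warnings;

my @foo = (0, 1, 2);
my $i = 'outer';    # a lexical declared before the loop

foreach $i (@foo) {
    $i += 10;       # $i is an alias here, so this changes the element of @foo
}

print "@foo\n";     # the array elements were modified through the alias
print "$i\n";       # the loop variable has its pre-loop value again
```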
A slight variation shows more complexity:
foreach (@foo) {
my $i = $_;
sub printer {
my $blah = shift @_;
print "$blah-$i\n";
}
printer("test");
}
Here, the original $i is created at compile time and the closure binds to that; the first iteration of the loop sets it, but the second iteration of the loop creates a new $i unassociated with the closure.

Related

How to pass entire subroutine into hashtable data using perl?

I have the following subroutine, which I need to pass as a hash table; that hash table should then be used inside another subroutine. How can I do this in Perl?
Input (from the Linux command bdata):
NAME PEND RUN SUSP JLIM JLIMR RATE HAPPY
achandra 0 48 0 2000 50:2000 151217 100%
agutta 1 5 0 100 50:100 16561 83%
My subroutine:
sub g_usrs_data
{
my($lines) = @_;
my $header_found = 0;
my @headers = ();
my $row_count = 0;
my %table_data = ();
my %row_data = ();
$lines=`bdata`;
#print $lines;
foreach (split("\n",$lines)) {
if (/NAME\s*PEND/) {
$header_found = 1;
@headers = split;
}
elsif (/^\s*$/)
{
$header_found=0;
}
$row_data{$row_count++} = $_;
#print $_;
}
}
My query:
How can I pass my subroutine as a hash into another subroutine?
Example:
g_usrs_data() -> this is my subroutine.
The above subroutine should be passed into another subroutine (i.e. into usrs_hash as a hash table).
Example:
create_db(usrs_hash,$sql1m)
Subroutines can be passed around as code references. See perlreftut and perlsub.
An example with an anonymous subroutine:
use warnings;
use strict;
my $rc = sub {
my @args = @_;
print "\tIn coderef. Got: |@_|\n";
return 7;
}; # note the semicolon!
sub use_rc {
my ($coderef, @other_args) = @_;
my $ret = $coderef->('arguments', 'to', 'pass');
return $ret;
}
my $res = use_rc($rc);
print "$res\n";
This silly program prints
In coderef. Got: |arguments to pass|
7
Notes on code references
The anonymous subroutine is assigned to a scalar $rc, making that a code reference
With an existing (named) sub, say func, a code reference is made by my $rc = \&func;
This $rc is a normal scalar variable, that can be passed to subroutines like any other
The sub is then called by $rc->(); arguments go inside the parentheses
Note that the syntax for creating and using them is just like that for other data types
An anonymous sub is assigned with = sub { }, much like = [ ] (arrayref) and = { } (hashref)
For a named sub use & instead of a sigil, so \& for sub vs. \@ (array) and \% (hash)
They are used by ->(), much like ->[] (arrayref) and ->{} (hashref)
For references in general see perlreftut. Subroutines are covered in depth in perlsub.
See for example this post on anonymous subs, with a number of answers.
For far more see this article from Mastering Perl and this article from The Effective Perler.
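Applied to the question, one possible shape is a hash whose values are code references, handed to the other sub as a hash reference. This is only a sketch: get_users_data stands in for the asker's g_usrs_data, the data is invented, and create_db here merely builds a string rather than touching a database.

```perl
use strict;
use warnings;

# Hypothetical stand-in for g_usrs_data(): returns already-parsed rows.
sub get_users_data {
    return {
        achandra => { PEND => 0, RUN => 48 },
        agutta   => { PEND => 1, RUN => 5 },
    };
}

# Store code references in a hash, keyed by a name of your choosing.
my %handlers = (
    users => \&get_users_data,                                    # named sub
    count => sub { my ($data) = @_; return scalar keys %$data },  # anonymous sub
);

# A sub that receives the hash of code references and calls them.
sub create_db {
    my ($handlers, $label) = @_;
    my $data  = $handlers->{users}->();
    my $count = $handlers->{count}->($data);
    return "$label: $count users";
}

print create_db(\%handlers, 'bdata'), "\n";
```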

perl subroutine argument lists - "pass by alias"?

I just looked in disbelief at this sequence:
my $line;
$rc = getline($line); # read next line and store in $line
I had understood all along that Perl arguments were passed by value, so whenever I've needed to pass in a large structure, or pass in a variable to be updated, I've passed a ref.
Reading the fine print in perldoc, however, I've learned that @_ is composed of aliases to the variables mentioned in the argument list. After reading the next bit of data, getline() returns it with $_[0] = $data;, which stores $data directly into $line.
I do like this - it's like passing by reference in C++. However, I haven't found a way to assign a more meaningful name to $_[0]. Is there any?
You can, but it's not very pretty:
use strict;
use warnings;
sub inc {
# manipulate the local symbol table
# to refer to the alias by $name
our $name; local *name = \$_[0];
# $name is an alias to first argument
$name++;
}
my $x = 1;
inc($x);
print $x; # 2
The easiest way is probably just to use a loop, since loops alias their arguments to a name; i.e.
sub my_sub {
for my $arg ( $_[0] ) {
# code here sees $arg as an alias for $_[0]
}
}
A version of @Steve's code that allows for multiple distinct arguments:
sub my_sub {
SUB:
for my $thisarg ( $_[0] ) {
for my $thatarg ($_[1]) {
# code here sees $thisarg and $thatarg as aliases
last SUB;
}
}
}
Of course this brings multilevel nesting and its own readability issues, so use it only when absolutely necessary.
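For a runnable version of the single-argument loop trick, here is a small sketch. The sub name trim_in_place and the sample string are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: strips leading and trailing whitespace
# from the caller's own variable.
sub trim_in_place {
    for my $text ($_[0]) {    # $text is an alias for the first argument
        $text =~ s/^\s+//;
        $text =~ s/\s+$//;
    }
    return;
}

my $line = "   hello world  \n";
trim_in_place($line);
print "[$line]\n";
```

Because $text aliases $_[0], which in turn aliases $line, no reference has to be passed or dereferenced.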

How to run an anonymous function in Perl?

(sub {
print 1;
})();
sub {
print 1;
}();
I tried various ways, all are wrong...
(sub { ... }) gives you a reference to the function, so you must call it through that reference:
(sub { print "Hello world\n" })->();
The other easy method, as pointed out by Blagovest Buyukliev, is to dereference the code reference with &{ } and call the result:
&{ sub { print "Hello World" }}();
Yay, I didn't expect you folks to come up with that many possibilities. But you're right, this is Perl and TIMTOWTDI: +1 for creativity!
But to be honest, I hardly use any form other than the following:
The Basic Syntax
my $greet = sub {
my ( $name ) = @_;
print "Hello $name\n";
};
# ...
$greet->( 'asker' )
It's pretty straightforward: sub {} returns a reference to a subroutine, which you can store and pass around like any other scalar. You can then call it by dereferencing. There is also a second syntax to dereference: &{ $sub }( 'asker' ), but I personally prefer the arrow syntax, because I find it more readable and it pretty much aligns with dereferencing hashes $hash->{ $key } and arrays $array->[ $index ]. More information on references can be found in perldoc perlref.
I think the other given examples are a bit advanced, but why not have a look at them:
Goto
sub bar {goto $foo};
bar;
Rarely seen and much feared these days. But at least it's a goto &function, which is considered less harmful than its crooked friends, goto LABEL and goto EXPRESSION (using those to jump into a construct has been deprecated since 5.12 and raises a warning). There are actually some circumstances in which you want to use that form, because it is not a usual function call: the calling function (bar in the given example) will not appear in the call stack. And you don't pass your parameters; the current @_ is used instead. Have a look at this:
use Carp qw( cluck );
my $cluck = sub {
my ( $message ) = @_;
cluck $message . "\n";
};
sub invisible {
@_ = ( 'fake' );
goto $cluck;
}
invisible( 'real' );
Output:
fake at bar.pl line 5
main::__ANON__('fake') called at bar.pl line 14
And there is no hint of an invisible function in the stack trace. More info on goto in perldoc -f goto.
Method Calls
''->$foo;
# or
undef->$foo;
If you call a method on an object, the first parameter passed to that method will be the invocant (usually an instance or the class name). Did I already say that TIMTOWTCallAFunction?
# this is just a normal named sub
sub ask {
my ( $name, $question ) = @_;
print "$question, $name?\n";
};
my $ask = \&ask; # lets take a reference to that sub
my $question = "What's up";
'asker'->ask( $question ); # 1: doesn't work
my $meth_name = 'ask';
'asker'->$meth_name( $question ); # 2: doesn't work either
'asker'->$ask( $question ); # 3: this works
The snippet above contains two calls that won't work, because perl will try to find a method called ask in package asker (it would actually work if that code were in said package). But the third one succeeds, because you already give perl the right method and it doesn't need to search for it. As always, more info is in the perldoc. I can't find any reason right now to excuse this in production code.
Conclusion
Originally I didn't intend to write that much, but I think it's important to have the common solution at the beginning of an answer and some explanation of the unusual constructs. I admit to being kind of selfish here: every one of us could end up maintaining the code of someone who found this question and just copied the topmost example.
There is not much need in Perl to call an anonymous subroutine where it is defined. In general you can achieve any type of scoping you need with bare blocks. The one use case that comes to mind is to create an aliased array:
my $alias = sub {\@_}->(my ($x, $y, $z));
$x = $z = 0;
$y = 1;
print "@$alias"; # '0 1 0'
Otherwise, you would usually store an anonymous subroutine in a variable or data structure. The following calling styles work with both a variable and a sub {...} declaration:
dereference arrow: sub {...}->(args) or $code->(args)
dereference sigil: &{sub {...}}(args) or &$code(args)
If you have the coderef in a scalar, you can also use it as a method on regular and blessed values:
my $method = sub {...};
$obj->$method # same as $method->($obj)
$obj->$method(...) # $method->($obj, ...)
[1, 2, 3]->$method # $method->([1, 2, 3])
[1, 2, 3]->$method(...) # $method->([1, 2, 3], ...)
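A concrete sketch of the coderef-as-method call on an unblessed value. The names $sum and $list and the offset argument are invented for illustration:

```perl
use strict;
use warnings;

# The invocant arrives as the first argument, just as in a normal method call.
my $sum = sub {
    my ($numbers, $offset) = @_;
    my $total = defined $offset ? $offset : 0;
    $total += $_ for @$numbers;
    return $total;
};

my $list = [1, 2, 3];             # a plain, unblessed array reference

my $plain   = $list->$sum;        # same as $sum->($list)
my $shifted = $list->$sum(10);    # same as $sum->($list, 10)

print "$plain $shifted\n";
```

No method lookup takes place here, since the coderef is already in hand, which is why the invocant does not need to be blessed.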
I'm endlessly amused by finding ways to call anonymous functions:
$foo = sub {say 1};
sub bar {goto $foo};
bar;
''->$foo; # technically a method, along with the lovely:
undef->$foo;
() = sort $foo 1,1; # if you have only two arguments
and, of course, the obvious:
&$foo();
$foo->();
You need the arrow operator:
(sub { print 1;})->();
You might not even need an anonymous function if you want to run a block of code and there is zero or one input. You can use map instead.
Just for the side effect:
map { print 1 } 1;
Transform data, take care to assign to a list:
my ($data) = map { $_ * $_ } 2;
# ------------------------------------------------------
# perl: filter array using given function
# ------------------------------------------------------
sub filter {
my ($arr1, $func) = @_;
my @arr2=();
foreach ( @{$arr1} ) {
push ( @arr2, $_ ) if $func->( $_ );
};
return @arr2;
}
# ------------------------------------------------------
# get files from dir
# ------------------------------------------------------
sub getFiles{
my ($p) = @_;
opendir my $dir, $p or die "Cannot open directory: $!";
my @files=readdir $dir;
closedir $dir;
#return files and directories that not ignored but not links
return filter \@files, (sub { my $f = $p.(shift);return ((-f $f) || (-d $f)) && (! -l $f) } );
}
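For comparison, Perl's built-in grep already does what the hand-written filter above does: it returns the elements for which a test is true. A small sketch with made-up data and a hypothetical $is_big predicate:

```perl
use strict;
use warnings;

my @numbers = (1 .. 10);

# grep evaluates the block with $_ aliased to each element
# and returns the elements for which the block is true.
my @even = grep { $_ % 2 == 0 } @numbers;

# A code reference can serve as the test as well.
my $is_big = sub { $_[0] > 7 };
my @big = grep { $is_big->($_) } @numbers;

print "@even\n";
print "@big\n";
```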

Declaring a scalar inside an if statement?

Why can't I declare a scalar variable inside an if statement? Does it have something to do with the scope of the variable?
Every block {...} in Perl creates a new scope. This includes bare blocks, subroutine blocks, BEGIN blocks, control structure blocks, looping structure blocks, inline blocks (map/grep), eval blocks, and the bodies of statement modifier loops.
If a block has an initialization section, that section is considered within the scope of the following block.
if (my $x = some_sub()) {
# $x in scope here
}
# $x out of scope
In a statement modifier loop, the initialization section is not contained within the scope of the pseudo block:
$_ = 1 for my ($x, $y, $z);
# $x, $y, and $z are still in scope and each is set to 1
Who says you can't?
#! /usr/bin/env perl
use warnings;
no warnings qw(uninitialized);
use strict;
use feature qw(say);
use Data::Dumper;
my $bar;
if (my $foo eq $bar) {
say "\$foo and \$bar match";
}
else {
say "Something freaky happened";
}
$ ./test.pl
$foo and $bar match
Works perfectly! Of course it makes no sense, since what are you comparing $foo to? It has no value.
Can you give me an example of what you're doing and the results you're getting?
Or, is this more what you mean?:
if (1 == 1) {
my $foo = "bar";
say "$foo"; #Okay, $foo is in scope
}
say "$foo"; #Fail: $foo doesn't exist because it's out of scope
So, which one do you mean?
Just to follow up on my comment: statements such as the following are perfectly legal:
if( my( $foo, $bar ) = $baz =~ /^(.*?)=(.*?)$/ ) {
# Do stuff
}
Courtesy of one of my colleagues.
There's an exception: you may not conditionally declare a variable and then use it under different conditions. perlsyn documents the behaviour of my combined with a statement modifier as undefined, so the following should be avoided:
my $x = ... if ...;
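A common way to sidestep that construct is to separate the declaration from the conditional assignment. A minimal sketch, where $verbose is a hypothetical flag used only for illustration:

```perl
use strict;
use warnings;

my $verbose = 0;          # hypothetical flag, for illustration only

my $x;                    # declare unconditionally ...
$x = 'set' if $verbose;   # ... then assign conditionally

print defined $x ? $x : 'undef', "\n";
```

Written this way, $x always exists in the enclosing scope and is simply left undefined when the condition is false.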

What is the magic behind perl read() function and buffer which is not a ref?

I do not understand how the Perl read($buf) function is able to modify the content of the $buf variable. $buf is not a reference, so the parameter is passed by copy (going by my C/C++ knowledge). So how come the $buf variable is modified in the caller?
Is it a tied variable or something? The C documentation about setbuf is also quite elusive and unclear to me.
# Example 1
$buf=''; # It is a scalar, not a ref
$bytes = $fh->read($buf);
print $buf; # $buf was modified, what is the magic ?
# Example 2
sub read_it {
my $buf = shift;
return $fh->read($buf);
}
my $buf;
$bytes = read_it($buf);
print $buf; # As expected, this scope $buf was not modified
No magic is needed -- all perl subroutines are call-by-alias, if you will. Quoth perlsub:
The array @_ is a local array, but its elements are aliases
for the actual scalar parameters. In particular, if an element $_[0]
is updated, the corresponding argument is updated (or an error occurs
if it is not updatable).
For example:
sub increment {
$_[0] += 1;
}
my $i = 0;
increment($i); # now $i == 1
In your "Example 2", your read_it sub copies the first element of @_ to the lexical $buf, which copy is then modified "in place" by the call to read(). Pass in $_[0] instead of copying, and see what happens:
sub read_this {
$fh->read($_[0]); # will modify caller's variable
}
sub read_that {
$fh->read(shift); # so will this...
}
read() is a built-in function, and so can do magic. You can accomplish something similar with your own functions, though, by declaring a function prototype:
sub myread(\$) { ... }
The argument declaration \$ means that the argument is implicitly passed as a reference.
The only magic in the built-in read is that it works even when called indirectly or as a filehandle method, which doesn't work for regular functions.
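A minimal sketch of the prototype approach. fill_buffer is a made-up name and it just stores a fixed string instead of reading from a filehandle:

```perl
use strict;
use warnings;

# The (\$) prototype makes Perl pass a reference to the scalar argument.
# The sub must be declared before the call is compiled for this to apply.
sub fill_buffer(\$) {
    my ($buf_ref) = @_;
    $$buf_ref = 'data';          # writes into the caller's variable
    return length $$buf_ref;
}

my $buf = '';
my $bytes = fill_buffer($buf);   # called with a plain scalar, not \$buf
print "$bytes: $buf\n";
```

Note that prototypes are ignored when the sub is called as &fill_buffer(...) or through a code reference, so in those cases the reference would have to be passed explicitly.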