How to run an anonymous function in Perl?

(sub {
print 1;
})();
sub {
print 1;
}();
I tried various ways, all are wrong...

(sub { ... }) returns a reference to the function, so you must call it through that reference.
(sub { print "Hello world\n" })->();
The other easy method, as pointed out by Blagovest Buyukliev, would be to dereference the function reference with &{ } and call the result:
&{ sub { print "Hello World" }}();

Yay, I didn't expect you folks to come up with that many possibilities. But you're right, this is Perl and TIMTOWTDI: +1 for creativity!
But to be honest, I hardly use any form other than the following:
The Basic Syntax
my $greet = sub {
my ( $name ) = @_;
print "Hello $name\n";
};
# ...
$greet->( 'asker' )
It's pretty straightforward: sub {} returns a reference to a subroutine, which you can store and pass around like any other scalar. You can then call it by dereferencing. There is also a second syntax to dereference: &{ $sub }( 'asker' ), but I personally prefer the arrow syntax, because I find it more readable and it pretty much aligns with dereferencing hashes $hash->{ $key } and arrays $array->[ $index ]. More information on references can be found in perldoc perlref.
I think the other given examples are a bit advanced, but why not have a look at them:
Goto
sub bar {goto $foo};
bar;
Rarely seen and much feared these days. But at least it's a goto &function, which is considered less harmful than its crooked friends goto LABEL and goto EXPRESSION (using those to jump into a construct has been deprecated since 5.12 and raises a warning). There are actually some circumstances in which you want to use this form, because it is not a usual function call: the calling function (bar in the given example) will not appear in the call stack. And you don't pass your parameters; the current @_ is used instead. Have a look at this:
use Carp qw( cluck );
my $cluck = sub {
my ( $message ) = @_;
cluck $message . "\n";
};
sub invisible {
@_ = ( 'fake' );
goto $cluck;
}
invisible( 'real' );
Output:
fake at bar.pl line 5
main::__ANON__('fake') called at bar.pl line 14
And there is no hint of an invisible function in the stack trace. More info on goto in perldoc -f goto.
Method Calls
''->$foo;
# or
undef->$foo;
If you call a method on an object, the first parameter passed to that method will be the invocant (usually an instance or the class name). Did I already say that TIMTOWTCallAFunction?
# this is just a normal named sub
sub ask {
my ( $name, $question ) = @_;
print "$question, $name?\n";
};
my $ask = \&ask; # lets take a reference to that sub
my $question = "What's up";
'asker'->ask( $question ); # 1: doesn't work
my $meth_name = 'ask';
'asker'->$meth_name( $question ); # 2: doesn't work either
'asker'->$ask( $question ); # 3: this works
In the snippet above there are two calls that won't work, because perl will try to find a method called ask in package asker (actually, they would work if that code were in said package). But the third one succeeds, because you already give perl the right method and it doesn't need to search for it. As always: more info in the perldoc. I can't find any reason right now to excuse this in production code.
Conclusion
Originally I didn't intend to write that much, but I think it's important to have the common solution at the beginning of an answer and some explanation of the unusual constructs. I admit to being kind of selfish here: every one of us could end up maintaining the code of someone who found this question and just copied the topmost example.

There is not much need in Perl to call an anonymous subroutine where it is defined. In general you can achieve any type of scoping you need with bare blocks. The one use case that comes to mind is to create an aliased array:
my $alias = sub {\@_}->(my ($x, $y, $z));
$x = $z = 0;
$y = 1;
print "@$alias"; # '0 1 0'
Otherwise, you would usually store an anonymous subroutine in a variable or data structure. The following calling styles work with both a variable and a sub {...} declaration:
dereference arrow: sub {...}->(args) or $code->(args)
dereference sigil: &{sub {...}}(args) or &$code(args)
If you have the coderef in a scalar, you can also use it as a method on regular and blessed values.
my $method = sub {...};
$obj->$method # same as $method->($obj)
$obj->$method(...) # $method->($obj, ...)
[1, 2, 3]->$method # $method->([1, 2, 3])
[1, 2, 3]->$method(...) # $method->([1, 2, 3], ...)
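The remark above that bare blocks usually give you all the scoping you need can be sketched as follows. This is only an illustration; the variable names are made up, and both forms are shown side by side for comparison.

```perl
use strict;
use warnings;

# A bare block limits the lifetime of $tmp, much like an
# immediately-invoked anonymous sub would.
my $total;
{
    my $tmp = 2 * 21;    # visible only inside this block
    $total = $tmp;
}

# The same thing with an anonymous sub that is called on the spot.
my $total2 = sub {
    my $tmp = 2 * 21;
    return $tmp;
}->();

print "$total $total2\n";    # prints "42 42"
```

The bare block avoids the cost of a sub call and keeps caller() output simpler, which is probably why it is the more common idiom.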

I'm endlessly amused by finding ways to call anonymous functions:
$foo = sub {say 1};
sub bar {goto $foo};
bar;
''->$foo; # technically a method, along with the lovely:
undef->$foo;
() = sort $foo 1,1; # if you have only two arguments
and, of course, the obvious:
&$foo();
$foo->();

You need arrow operator:
(sub { print 1;})->();

You might not even need an anonymous function if you want to run a block of code and there is zero or one input. You can use map instead.
Just for the side effect:
map { print 1 } 1;
Transform data, take care to assign to a list:
my ($data) = map { $_ * $_ } 2;

# ------------------------------------------------------
# perl: filter array using given function
# ------------------------------------------------------
sub filter {
my ($arr1, $func) = @_;
my @arr2=();
foreach ( @{$arr1} ) {
push ( @arr2, $_ ) if $func->( $_ );
};
return @arr2;
}
# ------------------------------------------------------
# get files from dir
# ------------------------------------------------------
sub getFiles{
my ($p) = @_;
opendir my $dir, $p or die "Cannot open directory: $!";
my @files=readdir $dir;
closedir $dir;
# return files and directories, but not symbolic links
return filter \@files, (sub { my $f = $p.(shift);return ((-f $f) || (-d $f)) && (! -l $f) } );
}
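A small usage sketch for a filter routine of this shape, assuming the callback receives each element as its first argument. The sample data is made up, and the built-in grep is shown next to it because it does the same job with a block.

```perl
use strict;
use warnings;

# Same shape as the filter() above: keep the elements for which
# the callback returns true.
sub filter {
    my ($items, $func) = @_;
    my @kept;
    for my $item (@$items) {
        push @kept, $item if $func->($item);
    }
    return @kept;
}

my @numbers = (1 .. 10);
my @even    = filter(\@numbers, sub { $_[0] % 2 == 0 });

# The built-in grep does the same job with a block and $_.
my @even_grep = grep { $_ % 2 == 0 } @numbers;

print "@even\n";         # prints "2 4 6 8 10"
print "@even_grep\n";    # prints "2 4 6 8 10"
```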

Related

How to pass entire subroutine into hashtable data using perl?

I have the following subroutine, which I need to pass as a hash table into another subroutine in Perl.
input file(from linux command bdata):
NAME PEND RUN SUSP JLIM JLIMR RATE HAPPY
achandra 0 48 0 2000 50:2000 151217 100%
agutta 1 5 0 100 50:100 16561 83%
My subroutine:
sub g_usrs_data()
{
my($lines) = @_;
my $header_found = 0;
my @headers = ();
my $row_count = 0;
my %table_data = ();
my %row_data = ();
$lines=`bdata`;
#print $lines;
foreach (split("\n",$lines)) {
if (/NAME\s*PEND/) {
$header_found = 1;
@headers = split;
}
elsif (/^\s*$/)
{
$header_found=0;
}
$row_data{$row_count++} = $_;
#print $_;
}
My query:
How can I pass my subroutine as a hash into another subroutine?
example:
g_usrs_data() -> this is my subroutine .
the above subroutine should be passed into another subroutine (i.e into usrs_hash as hash table)
example:
create_db(usrs_hash,$sql1m)
Subroutines can be passed around as code references. See perlreftut and perlsub.
An example with an anonymous subroutine
use warnings;
use strict;
my $rc = sub {
my @args = @_;
print "\tIn coderef. Got: |@_|\n";
return 7;
}; # note the semicolon!
sub use_rc {
my ($coderef, @other_args) = @_;
my $ret = $coderef->('arguments', 'to', 'pass');
return $ret;
}
my $res = use_rc($rc);
print "$res\n";
This silly program prints
In coderef. Got: |arguments to pass|
7
Notes on code references
The anonymous subroutine is assigned to a scalar $rc, making that a code reference
With an existing (named) sub, say func, a code reference is made by my $rc = \&func;
This $rc is a normal scalar variable, that can be passed to subroutines like any other
The sub is then called by $rc->(); where in parenthesis we can pass it arguments
Note that the syntax for creating and using them is just like that for other data types
As anonymous assign by = sub { }, much like = [ ] (arrayref) and = { } (hashref)
For a named sub use & instead of a sigil, so \& for sub vs. \@ (array) and \% (hash)
They are used by ->(), much like ->[] (arrayref) and ->{} (hashref)
For references in general see perlreftut. Subroutines are covered in depth in perlsub.
See for example this post on anonymous subs, with a number of answers.
For far more see this article from Mastering Perl and this article from The Effective Perler.
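Since the question asks about putting a subroutine into a hash, here is a minimal sketch of a hash of code references (a dispatch table). The handler names and the run_handler helper are hypothetical and only illustrate the idea.

```perl
use strict;
use warnings;

# A named sub; a reference to it can live in a hash like any scalar.
sub count_words {
    my ($text) = @_;
    my @words = split ' ', $text;
    return scalar @words;
}

# Hypothetical dispatch table: hash values are code references.
my %handler = (
    upper  => sub { return uc $_[0] },
    length => sub { return length $_[0] },
    words  => \&count_words,
);

# Look up a handler by name and call it with the remaining arguments.
sub run_handler {
    my ($table, $name, @args) = @_;
    my $code = $table->{$name} or die "No handler named '$name'\n";
    return $code->(@args);
}

print run_handler(\%handler, 'upper', 'hello world'), "\n";    # prints "HELLO WORLD"
print run_handler(\%handler, 'words', 'hello world'), "\n";    # prints "2"
```

The hash itself is passed by reference (\%handler), so the receiving subroutine gets both the data structure and the code it contains.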

How to know if perl subroutine takes parameters

I'm quite new to perl and I can't seem to find any information on how one would know if a subroutine takes a parameter.
In other languages (e.g python, java, etc), it is very clear, a method/function usually looks like this:
def my_func(arg1, arg2):
# do something
But in perl, it's simply:
sub my_func {
my @params = @_;
# do something
}
But I've seen examples where my @params = @_ isn't even included, but the subroutine is called and passed an argument.
e.g
sub my_func {
my $self = shift;
my $fieldstr = 'tbl'.$self->db_func.'*,';
my $sql = "SELECT $fieldstr FROM tbl".$self->db_func.'WHERE'.$self->...;
my $sth = $self->other_func->my_func($sql) or return undef;
}
So I was wondering if there is some sort of guideline to know if a subroutine takes a parameter(s)?
Please take a look at my answer to a similar question.
Perl is hyper-flexible, in that a subroutine can simply ignore excess parameters, provide defaults for missing ones, or die if the parameters aren't exactly what is expected.
Perl's author, Larry Wall, is a linguist, and the rationale behind this is that subroutines behave like imperative verbs; so just as you could say "fetch the wood", "fetch the wood from behind the shed", or "fetch the wood from behind the shed using the tractor", you can equally say
fetch($wood)
fetch($wood, $shed)
fetch($wood, $shed, $tractor)
or even
fetch($wood, {from => 'shed', using => 'tractor'})
and it is up to the subroutine to work out what is required.
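One way a subroutine might "work out what is required" is sketched below. This fetch() is hypothetical: it assumes a required first argument and an optional trailing hash reference of named options, and the default values are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical fetch(): the first argument is required; an optional
# hash reference carries named options, each with a default.
sub fetch {
    my ($what, $opts) = @_;
    die "fetch() needs something to fetch\n" unless defined $what;
    $opts = {} unless ref $opts eq 'HASH';

    my $from  = $opts->{from}  // 'anywhere';
    my $using = $opts->{using} // 'bare hands';
    return "fetching $what from $from using $using";
}

print fetch('wood'), "\n";
# prints "fetching wood from anywhere using bare hands"
print fetch('wood', { from => 'shed', using => 'tractor' }), "\n";
# prints "fetching wood from shed using tractor"
```

The defined-or operator (//) used for the defaults requires Perl 5.10 or later.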
Subroutine prototypes should be avoided, as they have side-effects that change the behaviour of the code in ways that aren't obvious. They are intended only for writing additional language constructs; a good example is Try::Tiny.
Arguments are passed in @_, so you'd need to look for uses of @_. For example,
my ($x, $y) = @_;
shift in a sub is the same as shift(@_)
$_[$i]
&f (but not &f()), as in sub log_warn { unshift @_, 'warn'; &log }
As a sub writer, you should use these as near the top of the sub as possible to make them obvious. The last two are usually used as optimizations or to have by-reference parameters.
The parameters for a subroutine call are contained in the array @_.
Two important functions for array manipulation are shift and pop. Suppose you have a list (1, 2, 3, 4).
shift removes a value from the left side of the list and returns it.
my @list = ( 1, 2, 3, 4 );
my $value = shift @list;
print "$value\n"; # will print "1"
print "@list\n"; # will print "2 3 4"
pop does the same thing from the right side instead of the left:
my @list = ( 1, 2, 3, 4 );
my $value = pop @list;
print "$value\n"; # will print "4"
print "@list\n"; # will print "1 2 3"
When used inside a subroutine, pop and shift will use the @_ array by default:
some_function( 'James', 99 ); # these 2 arguments will be passed to the function in @_
sub some_function {
my $name = shift; # remove 'James' from @_ and store it in $name
my $age = shift; # remove 99 from @_ and store it in $age
print "Your name is $name and your age is $age\n";
}

perl subroutine argument lists - "pass by alias"?

I just looked in disbelief at this sequence:
my $line;
$rc = getline($line); # read next line and store in $line
I had understood all along that Perl arguments were passed by value, so whenever I've needed to pass in a large structure, or pass in a variable to be updated, I've passed a ref.
Reading the fine print in perldoc, however, I've learned that @_ is composed of aliases to the variables mentioned in the argument list. After reading the next bit of data, getline() returns it with $_[0] = $data;, which stores $data directly into $line.
I do like this - it's like passing by reference in C++. However, I haven't found a way to assign a more meaningful name to $_[0]. Is there any?
You can, but it's not very pretty:
use strict;
use warnings;
sub inc {
# manipulate the local symbol table
# to refer to the alias by $name
our $name; local *name = \$_[0];
# $name is an alias to first argument
$name++;
}
my $x = 1;
inc($x);
print $x; # 2
The easiest way is probably just to use a loop, since loops alias their arguments to a name; i.e.
sub my_sub {
for my $arg ( $_[0] ) {
# code here sees $arg as an alias for $_[0]
}
}
A version of @Steve's code that allows for multiple distinct arguments:
sub my_sub {
SUB:
for my $thisarg ( $_[0] ) {
for my $thatarg ($_[1]) {
# code here sees $thisarg and $thatarg as aliases
last SUB;
}
}
}
Of course this brings multilevel nesting and its own code readability issues, so use it only when absolutely necessary.
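To make the aliasing behaviour from the question concrete, here is a minimal sketch. getline_sketch is a hypothetical stand-in for the getline() in the question; it just assigns a fixed string through $_[0].

```perl
use strict;
use warnings;

# Hypothetical stand-in for getline(): writes into the caller's
# variable through the alias in $_[0] and returns a status flag.
sub getline_sketch {
    my @pending = ('first line');
    $_[0] = shift @pending;    # assigns to the caller's variable
    return 1;
}

# Copying @_ first breaks the alias: only the local copy changes.
sub no_effect {
    my ($copy) = @_;
    $copy = 'changed';
    return;
}

my $line = 'untouched';
no_effect($line);
print "$line\n";    # prints "untouched"
getline_sketch($line);
print "$line\n";    # prints "first line"
```

This is why the usual my (...) = @_; idiom behaves like pass-by-value even though @_ itself holds aliases.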

Exploring the uses of anonymous subs

I've always been somewhat confused about the purpose and usage of anonymous subs in perl. I understand the concept, but looking for examples and explanations on the value of this practice.
To be clear:
sub foo { ... } # <--- named sub
sub { ... } # <--- anonymous sub
For example:
$ perl -e 'print sub { 1 }'
CODE(0xa4ab6c)
Tells me that sub returns a scalar value. So, I can do:
$ perl -e '$a = sub { 1 }; print $a'
For the same output as above. This of course holds true for all scalar values, so you can load arrays or hashes with anonymous subs.
The question is, how do I use these subs? Why would I want to use them?
And for a gold star, is there any problem which can only be resolved with an anonymous sub?
Anonymous subroutines can be used for all sorts of things.
Callbacks for event handling systems:
my $obj = Some::Obj->new;
$obj->on_event(sub {...});
Iterators:
sub stream {my $args = \@_; sub {shift @$args}}
my $s = stream 1, 2, 3;
say $s->(); # 1
say $s->(); # 2
Higher Order Functions:
sub apply (&@) {
my $code = shift;
$code->() for my @ret = @_;
@ret
}
}
my #clean = apply {s/\W+/_/g} 'some string', 'another string.';
say $clean[0]; # 'some_string'
Creating aliased arrays:
my $alias = sub {\@_}->(my $x, my $y);
$alias->[0]++;
$alias->[1] = 5;
say "$x $y"; # '1 5'
Dynamic programming with closures (such as creating a bunch of subroutines that only differ by a small amount):
for my $name (qw(list of names)) {
no strict 'refs';
*$name = sub {... something_with($name) ...};
}
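A runnable sketch of that last pattern, assuming the code lives in package main. The %config hash and the get_ prefix are invented for the example.

```perl
use strict;
use warnings;

# Install one accessor-like sub per key; each closure remembers
# its own $name. The hash contents are hypothetical.
my %config = (host => 'localhost', port => 8080);

for my $name (keys %config) {
    no strict 'refs';    # needed to assign to a glob by name
    *{"main::get_$name"} = sub { return $config{$name} };
}

print get_host(), "\n";    # prints "localhost"
print get_port(), "\n";    # prints "8080"
```

The subs are installed at run time, so they have to be called with parentheses (or after the loop has run) for the call to resolve under strict.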
There is no situation where an anonymous subroutine can do anything that a named subroutine can not. The my $ref = sub {...} constructor is equivalent to the following:
sub throw_away_name {...}
my $ref = \&throw_away_name;
without having to bother with deciding on a unique 'throw_away_name' for each sub.
The equivalence also goes the other way, with sub name {...} being equivalent to:
BEGIN {*name = sub {...}}
So other than the name, the code reference created by either method is the same.
To call a subroutine reference, you can use any of the following:
$code->(); # calls with no args
$code->(1, 2, 3); # calls with args (1, 2, 3)
&$code(); # calls with no args
&$code; # calls with whatever @_ currently is
You can even use code references as methods on blessed or unblessed scalars:
my $list = sub {@{ $_[0] }};
say for [1 .. 10]->$list; # which prints 1 .. 10
You can use it to create iterators.
use strict;
use warnings;
use 5.012;
sub fib_it {
my ($m, $n) = (0, 0);
return sub {
my $val = ( $m + $n );
$val = 1 unless $val;
($m, $n) = ($n, $val);
return $val;
}
}
my $fibber = fib_it;
say $fibber->() for (1..3); ### 1 1 2
my $fibber2 = fib_it;
say $fibber2->() for (1..5); ### 1 1 2 3 5
say $fibber->() for (1..3); #### 3 5 8
Anonymous subroutines can be used to create closures.
Closure is a notion out of the Lisp world that says if you define an anonymous function in a particular lexical context, it pretends to run in that context even when it's called outside the context.
perlref
What's a closure?
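A minimal sketch of what that definition means in practice: each anonymous sub keeps using the lexical variable from the call that created it. make_counter is a made-up name for the example.

```perl
use strict;
use warnings;

# make_counter() returns an anonymous sub that keeps using the
# $count variable from the call that created it.
sub make_counter {
    my ($start) = @_;
    my $count = $start;
    return sub { return $count++ };
}

my $from_one = make_counter(1);
my $from_ten = make_counter(10);

print $from_one->(), "\n";    # prints "1"
print $from_one->(), "\n";    # prints "2"
print $from_ten->(), "\n";    # prints "10"
print $from_one->(), "\n";    # prints "3"
```

The two counters do not interfere, because every call to make_counter creates a fresh $count.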
Here's something similar you might have seen before:
@new_list = map { $_ + 1 } @old_list;
And also:
@sorted = sort { $a <=> $b } @unsorted;
Neither of those are anonymous subs, but their behavior can be imitated in your functions with anonymous subs. They don't need the sub keyword because the functions are (essentially) prototyped to have their first argument be a subroutine, and Perl recognizes that as a special case where sub can be left off. (The functions also set the requisite variables to meaningful values before calling the subroutines you provided in order to simplify argument passing, but that's not related.)
You can write your own map-like function:
sub mapgrep (&@) { # make changes and also filter based on defined-ness
my ($func, @list) = @_;
my @new;
for my $i (@list) {
my $j = $func->($i);
push @new, $j if defined $j;
}
return @new; # hand the filtered, transformed list back to the caller
}
The magic to make it work with $_ is a bit much to write here - the above version only works for subs that take arguments.
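For simple cases, one possible way to get the $_ behaviour is to let a foreach loop alias $_ to each element and call the block without arguments. This is only a sketch under that assumption, not the author's version; mapgrep_topic is a made-up name.

```perl
use strict;
use warnings;

# Like mapgrep above, but the block reads the current element from $_
# instead of from its argument list.
sub mapgrep_topic (&@) {
    my ($func, @list) = @_;
    my @new;
    for (@list) {    # foreach aliases $_ to each element in turn
        my $j = $func->();
        push @new, $j if defined $j;
    }
    return @new;
}

my @doubled_evens = mapgrep_topic { $_ % 2 ? undef : $_ * 2 } 1 .. 6;
print "@doubled_evens\n";    # prints "4 8 12"
```

Because @list is a copy of the arguments, a block that modifies $_ changes only the copy, unlike the built-in map.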
Well I wrote a SAX parser for perl that is event driven. You can pass anonymous subs to the begin/end events on an element.
my $str = "<xml><row><data></data></row></xml>";
my $parser = SAXParser->new();
$parser->when('row')->begin(sub {
my ($element) = @_;
push(@rows, $row);
});
$parser->when('row')->end(sub {
## do something like serialize it or whatever
});
$parser->parse($str);
They are generally used when you want to pass a sub to another bit of code. Often this is a case of "When X happens (in third party code) do Y".
For example. When defining an attribute in Moose, you can specify the default value of that attribute using a sub. Given a class which has, as part of its definition:
has 'size' => (
is => 'ro',
default =>
sub { ( 'small', 'medium', 'large' )[ int( rand 3 ) ] },
predicate => 'has_size',
);
Whenever an instance of that class is created without an explicit size being passed, the sub will be called and the return value will be the size for that object.
If we switch to another language to give a different example, you'll find a similar concept in JavaScript.
var b = document.getElementById('my_button');
b.addEventListener('click', function (e) { alert('clicked!'); });
In your example, you haven't actually called the subroutine you created. A call is performed with either the &$a or the $a->() syntax. What you've done is store a reference to the subroutine in $a, then stringify it and print the result. Compare:
my $a = sub {1};
my $b = sub {1};
print join("\n", $a, $a->(), $b, $b->());
These are subs for the lazy programmer. You can use them for local throw-away functions and can save some typing. Instead of
sub x { ... }
my $function_ptr = \&x;
you can now use
my $function_ptr = sub { ... };
The anonymous functions are also private, and can only be accessed through the $function_ptr, so they don't have an entry in the symbol table.

Can you take a reference of a builtin function in Perl?

What syntax, if any, is able to take a reference of a builtin like shift?
$shift_ref = $your_magic_syntax_here;
The same way you could with a user-defined sub:
sub test { ... }
$test_ref = \&test;
I've tried the following, which all don't work:
\&shift
\&CORE::shift
\&{'shift'}
\&{'CORE::shift'}
Your answer can include XS if needed, but I'd prefer not.
Clarification: I am looking for a general purpose solution that can obtain a fully functional code reference from any builtin. This coderef could then be passed to any higher order function, just like a reference to a user defined sub. It seems to be the consensus so far that this is not possible, anyone care to disagree?
No, you can't. What is the underlying problem you are trying to solve? There may be some way to do whatever that is.
Re the added part of the question "Your answer can include XS if needed, but I'd prefer not.",
calling builtins from XS is really hard, since the builtins are set up to assume they are running as part of a compiled optree and have some global variables set. Usually it's much easier to call some underlying function that the builtin itself uses, though there isn't always such a function, so you see things like:
buffer = sv_2mortal(newSVpvf("(caller(%d))[3]", (int) frame));
caller = eval_pv(SvPV_nolen(buffer), 1);
(doing a string eval from XS rather than go through the hoops required to directly call pp_caller).
I was playing around with general purpose solutions to this one, and came up with the following dirty hack using eval. It basically uses the prototype to pull apart @_ and then call the builtin. This has only been lightly tested, and uses the string form of eval, so some may say it's already broken :-)
use 5.10.0;
use strict;
use warnings;
sub builtin {
my ($sub, $my, $id) = ($_[0], '');
my $proto = prototype $sub //
prototype "CORE::$sub" //
$_[1] //
($sub =~ /map|grep/ ? '&@' : '@;_');
for ($proto =~ /(\\?.)/g) { $id++;
if (/(?|(\$|&)|.(.))/) {
$my .= "my \$_$id = shift;";
$sub .= " $1\$_$id,";
} elsif (/([@%])/) {
$my .= "my $1_$id = splice \@_, 0, \@_;";
$sub .= " $1_$id,";
} elsif (/_/) {
$my .= "my \$_$id = \@_ ? shift : \$_;";
$sub .= " \$_$id,"
}
}
eval "sub ($proto) {$my $sub}"
or die "prototype ($proto) failed for '$_[0]', ".
"try passing a prototype string as \$_[1]"
}
my $shift = builtin 'shift';
my @a = 1..10;
say $shift->(\@a);
say "@a";
my $uc = builtin 'uc';
local $_ = 'goodbye';
say $uc->('hello '), &$uc;
my $time = builtin 'time';
say &$time;
my $map = builtin 'map';
my $reverse = builtin 'reverse';
say $map->(sub{"$_, "}, $reverse->(@a));
my %h = (a=>1, b=>2);
my $keys = builtin 'keys';
say $keys->(\%h);
# which prints
# 1
# 2 3 4 5 6 7 8 9 10
# HELLO GOODBYE
# 1256088298
# 10, 9, 8, 7, 6, 5, 4, 3, 2,
# ab
Revised with below and refactored.
You could do this if you patched the internal method first (which would give you the coderef of your patch):
use strict;
use warnings;
BEGIN {
*CORE::GLOBAL::die = sub { warn "patched die: '$_[0]'"; exit 3 };
}
print "ref to patched die: " . \&CORE::GLOBAL::die . "\n";
die "ack, I am slain";
gives the output:
ref to patched die: CODE(0x1801060)
patched die: 'ack, I am slain' at patch.pl line 5.
BTW: I would appreciate if anyone can explain why the override needs to be done as *CORE::GLOBAL::die rather than *CORE::die. I can't find any references for this. Additionally, why must the override be done in a BEGIN block? The die() call is done at runtime, so why can't the override be done at runtime just prior?
You can wrap shift with something that you can reference, but you have to use a prototype to use it, since shift is special.
sub my_shift (\@) { my $ll = shift; return shift @$ll }
The problem is that the prototype system can't magically figure out that when it calls some random ref-to-sub in a scalar, that it needs to take the reference before calling the subroutine.
my @list = (1,2,3,4);
sub my_shift (\@) { my $ll = shift; return shift @$ll }
my $a = shift @list;
my $my_shift_ref = \&my_shift;
my $b = (&{$my_shift_ref} (\@list) ); # see below
print "a=$a, b=$b\n";
for (my $i = 0; $i <= $#list; ++$i) { print "\$list[$i] = ",$list[$i],"\n"; }
If this is called as just @list, perl barfs, because it can't automagically make references the way shift does.
See also Tom Christiansen's article: http://www.perl.com/language/misc/fmproto.html
Of course, for builtins that aren't special like shift, you can always do
sub my_fork { return fork; }
and then &my_fork all you want.
As I understand it, you want a coderef that will be called on some data, and that might point either to one of your own functions or to a builtin.
If I'm right, just put the builtin in closure:
#!/usr/bin/perl -w
use strict;
my $coderef = \&test;
$coderef->( "Test %u\n", 1 );
$coderef = sub { printf @_ };
$coderef->( "Test %u\n", 2 );
exit;
sub test {
print join(' ', map { "[$_]" } @_) . "\n";
}
Doing it with shift is also possible, but remember that shift without explicit array to work on, works on different arrays based on where it was called.
If you want to see what it takes to fake it in production quality code, look at the code for autodie. The meat is in Fatal. Helps if you're a mad pirate Jedi Australian.
The only way I can get it to work is to make a reference to sub{shift}.
perl -e '@a=(1..3); $f=sub{shift}; print($f->(@a), "\n");'
This is functionally equivalent to:
perl -e '@a=(1..3); print(shift(@a), "\n");'
Which could be just perl -e 'print 1, "\n"' but then we wouldn't be talking about a builtin.
For your information I'm surprised that one cannot reference a builtin, and now that it's been made clear to me I can't help but think of it as a deficiency in Perl.
Update: Eric correctly points out that $f=sub{shift}; $f->(@a) leaves @a unchanged. It should be more like:
perl -e '@a=(1..3); $f=sub{shift @{+shift}}; print($f->(\@a), "\n");'
Thanks Eric.
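The closure approach from the answers above can be collected into a small table of wrappers. This is a sketch only: the %wrapped hash is a made-up name, and the array-modifying wrappers assume the caller passes an array reference.

```perl
use strict;
use warnings;

# Each builtin is wrapped in an anonymous sub. Wrappers that modify
# an array take an array reference, because the wrapper cannot reach
# the caller's array otherwise.
my %wrapped = (
    shift => sub { return shift @{ $_[0] } },
    push  => sub { my $array = shift; return push @$array, @_ },
    uc    => sub { return uc $_[0] },
);

my @queue = (1 .. 3);
my $first = $wrapped{shift}->(\@queue);
$wrapped{push}->(\@queue, 4, 5);
my $upper = $wrapped{uc}->('hello');

print "$first\n";    # prints "1"
print "@queue\n";    # prints "2 3 4 5"
print "$upper\n";    # prints "HELLO"
```

The resulting coderefs can be passed to any higher-order function, at the cost of writing one wrapper per builtin by hand.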