Can you take a reference of a builtin function in Perl? - perl

What syntax, if any, is able to take a reference of a builtin like shift?
$shift_ref = $your_magic_syntax_here;
The same way you could to a user defined sub:
sub test { ... }
$test_ref = \&test;
I've tried the following, which all don't work:
\&shift
\&CORE::shift
\&{'shift'}
\&{'CORE::shift'}
Your answer can include XS if needed, but I'd prefer not.
Clarification: I am looking for a general purpose solution that can obtain a fully functional code reference from any builtin. This coderef could then be passed to any higher order function, just like a reference to a user defined sub. It seems to be the consensus so far that this is not possible, anyone care to disagree?

No, you can't. What is the underlying problem you are trying to solve? There may be some way to do whatever that is.
Re the added part of the question "Your answer can include XS if needed, but I'd prefer not.",
calling builtins from XS is really hard, since the builtins are set up to assume they are running as part of a compiled optree and have some global variables set. Usually it's much easier to call some underlying function that the builtin itself uses, though there isn't always such a function, so you see things like:
buffer = sv_2mortal(newSVpvf("(caller(%d))[3]", (int) frame));
caller = eval_pv(SvPV_nolen(buffer), 1);
(doing a string eval from XS rather than go through the hoops required to directly call pp_caller).

I was playing around with general purpose solutions to this one, and came up with the following dirty hack using eval. It basically uses the prototype to pull apart #_ and then call the builtin. This has only been lightly tested, and uses the string form of eval, so some may say its already broken :-)
use 5.10.0;
use strict;
use warnings;
sub builtin {
my ($sub, $my, $id) = ($_[0], '');
my $proto = prototype $sub //
prototype "CORE::$sub" //
$_[1] //
($sub =~ /map|grep/ ? '&#' : '#;_');
for ($proto =~ /(\\?.)/g) { $id++;
if (/(?|(\$|&)|.(.))/) {
$my .= "my \$_$id = shift;";
$sub .= " $1\$_$id,";
} elsif (/([#%])/) {
$my .= "my $1_$id = splice \#_, 0, \#_;";
$sub .= " $1_$id,";
} elsif (/_/) {
$my .= "my \$_$id = \#_ ? shift : \$_;";
$sub .= " \$_$id,"
}
}
eval "sub ($proto) {$my $sub}"
or die "prototype ($proto) failed for '$_[0]', ".
"try passing a prototype string as \$_[1]"
}
my $shift = builtin 'shift';
my #a = 1..10;
say $shift->(\#a);
say "#a";
my $uc = builtin 'uc';
local $_ = 'goodbye';
say $uc->('hello '), &$uc;
my $time = builtin 'time';
say &$time;
my $map = builtin 'map';
my $reverse = builtin 'reverse';
say $map->(sub{"$_, "}, $reverse->(#a));
my %h = (a=>1, b=>2);
my $keys = builtin 'keys';
say $keys->(\%h);
# which prints
# 1
# 2 3 4 5 6 7 8 9 10
# HELLO GOODBYE
# 1256088298
# 10, 9, 8, 7, 6, 5, 4, 3, 2,
# ab
Revised with below and refactored.

You could do this if you patched the internal method first (which would give you the coderef of your patch):
use strict;
use warnings;
BEGIN {
*CORE::GLOBAL::die = sub { warn "patched die: '$_[0]'"; exit 3 };
}
print "ref to patched die: " . \&CORE::GLOBAL::die . "\n";
die "ack, I am slain";
gives the output:
ref to patched die: CODE(0x1801060)
patched die: 'ack, I am slain' at patch.pl line 5.
BTW: I would appreciate if anyone can explain why the override needs to be done as *CORE::GLOBAL::die rather than *CORE::die. I can't find any references for this. Additionally, why must the override be done in a BEGIN block? The die() call is done at runtime, so why can't the override be done at runtime just prior?

You can wrap shift with something that you can reference, but you have to use a prototype to use it, since shift is special.
sub my_shift (\#) { my $ll = shift; return shift #$ll }
The problem is that the prototype system can't magically figure out that when it calls some random ref-to-sub in a scalar, that it needs to take the reference before calling the subroutine.
my #list = (1,2,3,4);
sub my_shift (\#) { my $ll = shift; return shift #$ll }
my $a = shift #list;
my $my_shift_ref = \&my_shift;
my $b = (&{$my_shift_ref} (\#list) ); # see below
print "a=$a, b=$b\n";
for (my $i = 0; $i <= $#list; ++$i) { print "\$list[$i] = ",$list[$i],"\n"; }
If this is called as just #list, perl barfs, because it can't automagically make references the way shift does.
See also: [http://www.perl.com/language/misc/fmproto.html][Tom Christensen's article].
Of course, for builtins that aren't special like shift, you can always do
sub my_fork { return fork; }
and then &my_fork all you want.

As I understand you want to have coderef that will be called on some data, and it might point to some your function or to builtin.
If I'm right, just put the builtin in closure:
#!/usr/bin/perl -w
use strict;
my $coderef = \&test;
$coderef->( "Test %u\n", 1 );
$coderef = sub { printf #_ };
$coderef->( "Test %u\n", 2 );
exit;
sub test {
print join(' ', map { "[$_]" } #_) . "\n";
}
Doing it with shift is also possible, but remember that shift without explicit array to work on, works on different arrays based on where it was called.

If you want to see what it takes to fake it in production quality code, look at the code for autodie. The meat is in Fatal. Helps if you're a mad pirate Jedi Australian.

The only way I can get it to work is to make a reference to sub{shift}.
perl -e '#a=(1..3); $f=sub{shift}; print($f->(#a), "\n");'
This is functionally equivalent to:
perl -e '#a=(1..3); print(shift(#a), "\n");'
Which could be just perl -e 'print 1, "\n"' but then we wouldn't be talking about a builtin.
For your information I'm surprised that one cannot reference a builtin, and now that it's been made clear to me I can't help but think of it as a deficiency in Perl.
Update Eric correctly points out that $f=sub{shift}; $f->(#a) leaves #a unchanged. It should be more like:
perl -e '#a=(1..3); $f=sub{shift #{+shift}}; print($f->(\#a), "\n");
Thanks Eric.

Related

Share lexical scope between successive eval statements in perl

Can I make different snippets of evaled Perl code share the same lexical scope and yet get their return values?
Background
Perl's eval command evaluates a string as Perl code and upon success returns the value of the last statement in that code. However, lexical variables created in that code are dropped at the end of the code. This means when eval of code1 has ended and I have a second code chunk code2, which refers to the lexical variables set in code1, this will fail.
my $code1 = 'my $c = 4';
my $code2 = 'printf "%g\n", $c;';
printf 'evaluated "%s" to %s' . "\n", $code1, eval $code1;
printf 'evaluated "%s"' . "\n", $code2;
yields
evaluated "my $c = 4" to 4
evaluated "printf "%g\n", $c;"
but not a line containing just 4 as I would wish, because $code2 should use the variable $c if lexical scopes are re-used. (I generally agree with the default that lexical scopes are constrained to only one evaled code, so I expect that some conscious modification of code is required to make the above work.)
Approaches considered
I experimented with use PadWalker qw( peek_my ); to save the lexical scope at the end of each code snippet aiming to load it into the scope of the following snippet, but then I realised that this would make inaccessible the return value of the code snippet, which is needed by the calling code.
As another alternative appears to pattern-match (probably using a dedicated parser) all my-declarations in the perl code snippets and essentially translate them on the fly, but this would amount to a considerably bigger task.
Template example for discussion (see comments)
\perlExec{
use PDL;
my $v = vpdl [ 1, 2 ];
my $w = vpdl [ 3, 4 ];
sub list ($) {
my $pdl = shift;
return join ',', map { at( $pdl, $_, 0 ) } 0..1;
}
}
The vector [ \perlValue{ list $v } ]
plus the vector [ \perlValue{ list $w } ]
makes [ \perlValue{ my $s = $v + $w; list $s } ].
Perhaps you would like to use Eval::WithLexicals? It does exactly what you are asking for. It was designed to power REPLs, and it's pure Perl. You simply make a new instance of Eval::WithLexicals, and then you call $ewl->eval($code) instead of eval $code, and the variables will persist between successive calls on the same object.
As I commented, my feeling was that this was an XY Problem in that there may be other solutions to the underlying problem. As it turns out from the discussion in the comments, you seem to be implementing your own templating system, so my first suggestion would be to have a look at existing ones, such as maybe Template::Toolkit.
If you still want to stick with your current approach, then it seems that #hobbs has given an answer that seems to answer your question directly, Eval::WithLexicals (Update: since accepted). As I mentioned, I see two other possible solutions. The first, which feels the most natural to me personally, would be to not use lexicals in the fist place. When I see the code like what you showed:
\perlExec{ my $v = [ 1, 2 ]; }
The vector [ \perlValue{ $v } ]
then, simply because of the braces, I would not be surprised that each of those has their own lexical scope. If you instead were to use package variables, it would seem more natural to me. Just for example, you could use eval qq{ package $packname; no strict "vars"; $code } (of course with the caveat that strict "vars" is disabled), or your could use fully qualified variable names ($package::v) throughout.
The second thing I mentioned was to translate the entire input file into a Perl script and eval that - in other words, write your own templating system. Although I would only recommend reinventing this wheel as your last option, you did ask how to adapt this code I wrote for your purposes, so here it is. One limitation of the following is that any braces in the code blocks must be balanced (but see update below), and since this is a somewhat simplistic demonstration, there are bound to be more limitations. Use at your own risk!
use warnings;
use strict;
use feature qw/say state/;
use Data::Dumper;
use Regexp::Common qw/balanced/;
use Capture::Tiny qw/capture_stdout/;
my $DEBUG = 1;
local $/=undef;
while (my $input = <>) {
my $code = translate($ARGV,$input);
$DEBUG and say ">>>>> Generated Code:\n", $code, "<<<<<";
my ($output, $rv) = capture_stdout { eval $code };
$rv or die "eval failed: ".($#//'unknown error');
say ">>>>> Output:\n", $output, "<<<<<";
}
sub translate {
my ($fn,$input) = #_;
state $packcnt = 1;
$fn =~ tr/A-Za-z0-9/_/cs;
my $pack = "Generated".$packcnt++."_$fn";
my $output = "{ package $pack;\n";
$output.= "no warnings; no strict;\n";
$output.= "#line 1 \"$pack\"\n";
while ( $input=~m{ \G (?<str> .*? ) \\perl(?<type> Exec|Value )
(?<code> $RE{balanced}{-parens=>'{}'} ) }xsgc ) {
my ($str,$type,$code) = #+{qw/str type code/};
$output.= "print ".perlstr($str).";\n" if length($str);
($code) = $code=~/\A\s*\{(.*)\}\s*\z/s or die $code;
$code .= ";" unless $code=~/;\s*\z/;
$code = "print do { $code };" if $type eq 'Value';
$output.= "$code\n";
}
my $str = substr $input, pos($input)//0;
$output.= "print ".perlstr($str).";\n" if length($str);
$output.= "} # end package $pack\n1;\n";
return $output;
}
sub perlstr { Data::Dumper->new([''.shift])
->Terse(1)->Indent(0)->Useqq(1)->Dump }
Input File:
\perlExec{
use warnings; use strict;
print "Hello, World\n";
my $v = [ 1, 2 ];
my $w = [ 3, 4 ];
sub list ($) {
my $pdl = shift;
return join ',', #$pdl;
}
}
The vector [ \perlValue{ list $v } ]
plus the vector [ \perlValue{ list $w } ]
makes [ \perlValue{ my $s = [#$v + #$w]; list $s } ].
Output:
>>>>> Generated Code:
{ package Generated1_input_txt;
no warnings; no strict;
#line 1 "Generated1_input_txt"
use warnings; use strict;
print "Hello, World\n";
my $v = [ 1, 2 ];
my $w = [ 3, 4 ];
sub list ($) {
my $pdl = shift;
return join ',', #$pdl;
}
;
print "\nThe vector [ ";
print do { list $v ; };
print " ]\nplus the vector [ ";
print do { list $w ; };
print " ]\nmakes [ ";
print do { my $s = [#$v + #$w]; list $s ; };
print " ].\n";
} # end package Generated1_input_txt
1;
<<<<<
>>>>> Output:
Hello, World
The vector [ 1,2 ]
plus the vector [ 3,4 ]
makes [ 4 ].
<<<<<
Update: As suggested by #HåkonHægland in the comments, it's possible to use PPR to parse out the blocks. The only changes needed are to replace use Regexp::Common qw/balanced/; by use PPR; and in the regex replace (?<code> $RE{balanced}{-parens=>'{}'} ) by (?<code> (?&PerlBlock) ) $PPR::GRAMMAR - then the parser will handle a case like print "Hello, World }\n"; too!
Here is one approach: Parse the template file twice. On first parse, write Perl statements from the template to a temp file, for example /tmp/MyTemplate.pm, add some header code to this file such as to make it a valid Perl module. Also use a sequential numbering into package variables for \perlValue statements, i.e. translate the first \perlValue{ list $v } into for example: our $perl_value1 = list $v;, the next \perlValue{ list $w } becomes our $perl_value2 = list $w; and so on..
Then require the module: require "/tmp/MyTmplate.pm"; Then parse the template a second time, extracting the correct values corresponding to Perl code in the template from the symbol table of MyTemplate. For example to get the value of \perlValue{ list $v } use $MyTemplate::perl_value1 and so on..

Is it possible to convert a stringified reference from a SCALAR back to a REF? [duplicate]

Is there any way to get Perl to convert the stringified version e.g (ARRAY(0x8152c28)) of an array reference to the actual array reference?
For example
perl -e 'use Data::Dumper; $a = [1,2,3];$b = $a; $a = $a.""; warn Dumper (Then some magic happens);'
would yield
$VAR1 = [
1,
2,
3
];
Yes, you can do this (even without Inline C). An example:
use strict;
use warnings;
# make a stringified reference
my $array_ref = [ qw/foo bar baz/ ];
my $stringified_ref = "$array_ref";
use B; # core module providing introspection facilities
# extract the hex address
my ($addr) = $stringified_ref =~ /.*(0x\w+)/;
# fake up a B object of the correct class for this type of reference
# and convert it back to a real reference
my $real_ref = bless(\(0+hex $addr), "B::AV")->object_2svref;
print join(",", #$real_ref), "\n";
but don't do that. If your actual object is freed or reused, you may very well
end up getting segfaults.
Whatever you are actually trying to achieve, there is certainly a better way.
A comment to another answer reveals that the stringification is due to using a reference as a hash key. As responded to there, the better way to do that is the well-battle-tested
Tie::RefHash.
The first question is: do you really want to do this?
Where is that string coming from?
If it's coming from outside your Perl program, the pointer value (the hex digits) are going to be meaningless, and there's no way to do it.
If it's coming from inside your program, then there's no need to stringify it in the first place.
Yes, it's possible: use Devel::FindRef.
use strict;
use warnings;
use Data::Dumper;
use Devel::FindRef;
sub ref_again {
my $str = #_ ? shift : $_;
my ($addr) = map hex, ($str =~ /\((.+?)\)/);
Devel::FindRef::ptr2ref $addr;
}
my $ref = [1, 2, 3];
my $str = "$ref";
my $ref_again = ref_again($str);
print Dumper($ref_again);
The stringified version contains the memory address of the array object, so yes, you can recover it. This code works for me, anyway (Cygwin, perl 5.8):
use Inline C;
#a = (1,2,3,8,12,17);
$a = \#a . "";
print "Stringified array ref is $a\n";
($addr) = $a =~ /0x(\w+)/;
$addr = hex($addr);
$c = recover_arrayref($addr);
#c = #$c;
print join ":", #c;
__END__
__C__
AV* recover_arrayref(int av_address) { return (AV*) av_address; }
.
$ perl ref-to-av.pl
Stringified array ref is ARRAY(0x67ead8)
1:2:3:8:12:17
I'm not sure why you want to do this, but if you really need it, ignore the answers that use the tricks to look into memory. They'll only cause you problems.
Why do you want to do this? There's probably a better design. Where are you getting that stringified reference from.
Let's say you need to do it for whatever reason. First, create a registry of objects where the hash key is the stringified form, and the value is a weakened reference:
use Scalar::Util qw(weaken);
my $array = [ ... ];
$registry{ $array } = $array;
weaken( $registry{ $array } ); # doesn't count toward ref count
Now, when you have the stringified form, you just look it up in the hash, checking to see that it's still a reference:
if( ref $registry{$string} ) { ... }
You could also try Tie::RefHash and let it handle all of the details of this.
There is a longer example of this in Intermediate Perl.
In case someone finds this useful, I'm extending tobyink's answer by adding support for detecting segmentation faults. There are two approaches I discovered. The first way locally replaces $SIG{SEGV} and $SIG{BUS} before dereferencing. The second way masks the child signal and checks if a forked child can dereference successfully. The first way is significantly faster than the second.
Anyone is welcome to improve this answer.
First Approach
sub unstringify_ref($) {
use bigint qw(hex);
use Devel::FindRef;
my $str = #_ ? shift : $_;
if (defined $str and $str =~ /\((0x[a-fA-F0-9]+)\)$/) {
my $addr = (hex $1)->bstr;
local $#;
return eval {
local $SIG{SEGV} = sub { die };
local $SIG{BUS} = sub { die };
return Devel::FindRef::ptr2ref $addr;
};
}
return undef;
}
I'm not sure if any other signals can occur in an attempt to access illegal memory.
Second Approach
sub unstringify_ref($) {
use bigint qw(hex);
use Devel::FindRef;
use Signal::Mask;
my $str = #_ ? shift : $_;
if (defined $str and $str =~ /\((0x[a-fA-F0-9]+)\)$/) {
my $addr = (hex $1)->bstr;
local $!;
local $?;
local $Signal::Mask{CHLD} = 1;
if (defined(my $kid = fork)) {
# Child -- This might seg fault on invalid address.
exit(not Devel::FindRef::ptr2ref $addr) unless $kid;
# Parent
waitpid $kid, 0;
return Devel::FindRef::ptr2ref $addr if $? == 0;
} else {
warn 'Unable to fork: $!';
}
}
return undef;
}
I'm not sure if the return value of waitpid needs to be checked.

Perl custom syntax for passing function arguments

I have been using perl for some time now.
I want to know how I can run the following operation in perl:
subtract(40)(20)
To get the result:
20
I think I would have to look at custom parsing techniques for Perl.
This is what I am looking at right now:
Devel::Declare
Devel::CallParser
and
http://www.perl.com/pub/2012/10/an-overview-of-lexing-and-parsing.html
Now, I am not sure what to look for or what to do.
Any help on HOW to go about this, WHAT to read would be appreciated. Please be clear.
Thank you.
I recommend trying Parse::Keyword. Parse::Keyword is really great for parsing custom syntax, as it lets you call back various parts of the Perl parser, such as parse_listexpr, parse_block, parse_fullstmt, etc (see perlapi).
It has a drawback in that if you use those to parse expressions that close over variables, these are handled badly, but this can be worked around with PadWalker.
Parse::Keyword (including PadWalker trickery) is what Kavorka uses; and that does some pretty complex stuff! Early versions of p5-mop-redux used it too.
Anyway, here's a demonstration of how your weird function could be parsed...
use v5.14;
use strict;
use warnings;
# This is the package where we define the functions...
BEGIN {
package Math::Weird;
# Set up parsing for the functions
use Parse::Keyword {
add => \&_parser,
subtract => \&_parser,
multiply => \&_parser,
divide => \&_parser,
};
# This package is an exporter of course
use parent 'Exporter::Tiny';
our #EXPORT = qw( add subtract multiply divide );
# We'll need these things from PadWalker
use PadWalker qw( closed_over set_closed_over peek_my );
sub add {
my #numbers = _grab_args(#_);
my $sum = 0;
$sum += $_ for #numbers;
return $sum;
}
sub subtract {
my #numbers = _grab_args(#_);
my $diff = shift #numbers;
$diff -= $_ for #numbers;
return $diff;
}
sub multiply {
my #numbers = _grab_args(#_);
my $product = 1;
$product *= $_ for #numbers;
return $product;
}
sub divide {
my #numbers = _grab_args(#_);
my $quotient = shift #numbers;
$quotient /= $_ for #numbers;
return $quotient;
}
sub _parser {
lex_read_space;
my #args;
while (lex_peek eq '(')
{
# read "("
lex_read(1);
lex_read_space;
# read a term within the parentheses
push #args, parse_termexpr;
lex_read_space;
# read ")"
lex_peek eq ')' or die;
lex_read(1);
lex_read_space;
}
return sub { #args };
}
# In an ideal world _grab_args would be implemented like
# this:
#
# sub _grab_args { map scalar(&$_), #_ }
#
# But because of issues with Parse::Keyword, we need
# something slightly more complex...
#
sub _grab_args {
my $caller_vars = peek_my(2);
map {
my $code = $_;
my $closed_over = closed_over($code);
$closed_over->{$_} = $caller_vars->{$_} for keys %$closed_over;
set_closed_over($code, $closed_over);
scalar $code->();
} #_;
}
# We've defined a package inline. Mark it as loaded, so
# that we can `use` it below.
$INC{'Math/Weird.pm'} = __FILE__;
};
use Math::Weird qw( add subtract multiply );
say add(2)(3); # says 5
say subtract(40)(20); # says 20
say multiply( add(2)(3) )( subtract(40)(20) ); # says 100
If you can live with the additions of a sigil and an arrow, you could curry subtract as in
my $subtract = sub {
my($x) = #_;
sub { my($y) = #_; $x - $y };
};
Call it as in
my $result = $subtract->(40)(20);
If the arrow is acceptable but not the sigil, recast subtract as
sub subtract {
my($x) = #_;
sub { my($y) = #_; $x - $y };
};
Invocation in this case looks like
my $result = subtract(40)->(20);
Please don't tack on broken syntax extensions on your program to solve a solved problem.
What you want are closures, and a technique sometimes called currying.
Currying is the job of transforming a function that takes multiple arguments into a function that is invoked multiple times with one argument each. For example, consider
sub subtract {
my ($x, $y) = #_;
return $x - $y;
}
Now we can create a subroutine that already provides the first argument:
sub subtract1 { subtract(40, #_) }
Invoking subtract1(20) now evaluates to 20.
We can use anonymous subroutines instead, which makes this more flexible:
my $subtract = sub { subtract(40, #_) };
$subtract->(20);
We don't need that variable:
sub { subtract(40, #_) }->(20); # equivalent to subtract(40, 20)
We can write subtract in a way that does this directly:
sub subtract_curried {
my $x = shift;
# don't return the result, but a subroutine that calculates the result
return sub {
my $y = shift;
return $x - $y;
};
}
Now: subtract_curried(40)->(20) – notice the arrow in between, as we are dealing with a code reference (another name for anonymous subroutine, or closures).
This style of writing functions is much more common in functional languages like Haskell or OCaml where the syntax for this is prettier. It allows very flexible combinations of functions. If you are interested in this kind of programming in Perl, you might want to read Higher-Order Perl.
#Heartache: Please forget this challenge as it makes no sense for the parser and the user.
You can think of using fn[x][y] or fn{x}{y} which are valid syntax variants - i.e. you can stack [] and {} but not lists,
or fn(x,y) or fn(x)->(y) which do look nice, are also valid and meaningful syntax variants.
But fn(x)(y) will not know in which context the second list should be used.
For fn(x)(y) the common interpretation would be fn(x); (y) => (y). It returns the 2nd list after evaluating the first call.
You may create the source code filter:
package BracketFilter;
use Filter::Util::Call;
sub import {
filter_add(sub {
my $status;
s/\)\(/, /g if ($status = filter_read()) > 0;
return $status ;
});
}
1;
And use it:
#!/usr/bin/perl
use BracketFilter;
subtract(40)(20);
sub subtract {
return $_[0] - $_[1];
}

How to run an anonymous function in Perl?

(sub {
print 1;
})();
sub {
print 1;
}();
I tried various ways, all are wrong...
(sub { ... }) will give you the pointer to the function so you must call by reference.
(sub { print "Hello world\n" })->();
The other easy method, as pointed out by Blagovest Buyukliev would be to dereference the function pointer and call that using the { } operators
&{ sub { print "Hello World" }}();
Yay, I didn't expect you folks to come up with that much possibilities. But you're right, this is perl and TIMTOWTDI: +1 for creativitiy!
But to be honest, I use hardly another form than the following:
The Basic Syntax
my $greet = sub {
my ( $name ) = #_;
print "Hello $name\n";
};
# ...
$greet->( 'asker' )
It's pretty straight forward: sub {} returns a reference to a sub routine, which you can store and pass around like any other scalar. You can than call it by dereferencing. There is also a second syntax to dereference: &{ $sub }( 'asker' ), but I personally prefer the arrow syntax, because I find it more readable and it pretty much aligns with dereferencing hashes $hash->{ $key } and arrays $array->[ $index ]. More information on references can be found in perldoc perlref.
I think the other given examples are a bit advanced, but why not have a look at them:
Goto
sub bar {goto $foo};
bar;
Rarely seen and much feared these days. But at least it's a goto &function, which is considered less harmful than it's crooked friends: goto LABEL or goto EXPRESSION ( they are deprecated since 5.12 and raise a warning ). There are actually some circumstances, when you want to use that form, because this is not a usual function call. The calling function ( bar in the given example ) will not appear in the callling stack. And you don't pass your parameters, but the current #_ will be used. Have a look at this:
use Carp qw( cluck );
my $cluck = sub {
my ( $message ) = #_;
cluck $message . "\n";
};
sub invisible {
#_ = ( 'fake' );
goto $cluck;
}
invisible( 'real' );
Output:
fake at bar.pl line 5
main::__ANON__('fake') called at bar.pl line 14
And there is no hint of an invisible function in the stack trace. More info on goto in perldoc -f goto.
Method Calls
''->$foo;
# or
undef->$foo;
If you call a method on an object, the first parameter passed to that method will be the invocant ( usually an instance or the class name ). Did i already say that TIMTOWTCallAFunction?
# this is just a normal named sub
sub ask {
my ( $name, $question ) = #_;
print "$question, $name?\n";
};
my $ask = \&ask; # lets take a reference to that sub
my $question = "What's up";
'asker'->ask( $question ); # 1: doesn't work
my $meth_name = 'ask';
'asker'->$meth_name( $question ); # 2: doesn't work either
'asker'->$ask( $question ); # 1: this works
In the snippet above are two calls, which won't work, because perl will try to find a method called ask in package asker ( actually it would work if that code was in the said package ). But the third one succeeds, because you already give perl the right method and it doesn't need to search for it. As always: more info in the perldoc I can't find any reason right now, to excuse this in production code.
Conclusion
Originally I didn't intend to write that much, but I think it's important to have the common solution at the beginning of an answer and some explanations to the unusual constructs. I admit to be kind of selfish here: Every one of us could end up maintaining someones code, who found this question and just copied the topmost example.
There is not much need in Perl to call an anonymous subroutine where it is defined. In general you can achieve any type of scoping you need with bare blocks. The one use case that comes to mind is to create an aliased array:
my $alias = sub {\#_}->(my ($x, $y, $z));
$x = $z = 0;
$y = 1;
print "#$alias"; # '0 1 0'
Otherwise, you would usually store an anonymous subroutine in a variable or data structure. The following calling styles work with both a variable and a sub {...} declaration:
dereference arrow: sub {...}->(args) or $code->(args)
dereference sigil: &{sub {...}}(args) or &$code(args)
if you have the coderef in a scalar, you can also use it as a method on regular and blessed values.
my $method = sub {...};
$obj->$method # same as $method->($obj)
$obj->$method(...) # $method->($obj, ...)
[1, 2, 3]->$method # $method->([1, 2, 3])
[1, 2, 3]->$method(...) # $method->([1, 2, 3], ...)
I'm endlessly amused by finding ways to call anonymous functions:
$foo = sub {say 1};
sub bar {goto $foo};
bar;
''->$foo; # technically a method, along with the lovely:
undef->$foo;
() = sort $foo 1,1; # if you have only two arguments
and, of course, the obvious:
&$foo();
$foo->();
You need arrow operator:
(sub { print 1;})->();
You might not even need an anonymous function if you want to run a block of code and there is zero or one input. You can use map instead.
Just for the side effect:
map { print 1 } 1;
Transform data, take care to assign to a list:
my ($data) = map { $_ * $_ } 2;
# ------------------------------------------------------
# perl: filter array using given function
# ------------------------------------------------------
sub filter {
my ($arr1, $func) = #_;
my #arr2=();
foreach ( #{$arr1} ) {
push ( #arr2, $_ ) if $func->( $_ );
};
return #arr2;
}
# ------------------------------------------------------
# get files from dir
# ------------------------------------------------------
sub getFiles{
my ($p) = #_;
opendir my $dir, $p or die "Cannot open directory: $!";
my #files=readdir $dir;
closedir $dir;
#return files and directories that not ignored but not links
return filter \#files, (sub { my $f = $p.(shift);return ((-f $f) || (-d $f)) && (! -l $f) } );
}

How can I convert the stringified version of array reference to actual array reference in Perl?

Is there any way to get Perl to convert the stringified version e.g (ARRAY(0x8152c28)) of an array reference to the actual array reference?
For example
perl -e 'use Data::Dumper; $a = [1,2,3];$b = $a; $a = $a.""; warn Dumper (Then some magic happens);'
would yield
$VAR1 = [
1,
2,
3
];
Yes, you can do this (even without Inline C). An example:
use strict;
use warnings;
# make a stringified reference
my $array_ref = [ qw/foo bar baz/ ];
my $stringified_ref = "$array_ref";
use B; # core module providing introspection facilities
# extract the hex address
my ($addr) = $stringified_ref =~ /.*(0x\w+)/;
# fake up a B object of the correct class for this type of reference
# and convert it back to a real reference
my $real_ref = bless(\(0+hex $addr), "B::AV")->object_2svref;
print join(",", #$real_ref), "\n";
but don't do that. If your actual object is freed or reused, you may very well
end up getting segfaults.
Whatever you are actually trying to achieve, there is certainly a better way.
A comment to another answer reveals that the stringification is due to using a reference as a hash key. As responded to there, the better way to do that is the well-battle-tested
Tie::RefHash.
The first question is: do you really want to do this?
Where is that string coming from?
If it's coming from outside your Perl program, the pointer value (the hex digits) are going to be meaningless, and there's no way to do it.
If it's coming from inside your program, then there's no need to stringify it in the first place.
Yes, it's possible: use Devel::FindRef.
use strict;
use warnings;
use Data::Dumper;
use Devel::FindRef;
sub ref_again {
my $str = #_ ? shift : $_;
my ($addr) = map hex, ($str =~ /\((.+?)\)/);
Devel::FindRef::ptr2ref $addr;
}
my $ref = [1, 2, 3];
my $str = "$ref";
my $ref_again = ref_again($str);
print Dumper($ref_again);
The stringified version contains the memory address of the array object, so yes, you can recover it. This code works for me, anyway (Cygwin, perl 5.8):
use Inline C;
#a = (1,2,3,8,12,17);
$a = \#a . "";
print "Stringified array ref is $a\n";
($addr) = $a =~ /0x(\w+)/;
$addr = hex($addr);
$c = recover_arrayref($addr);
#c = #$c;
print join ":", #c;
__END__
__C__
AV* recover_arrayref(int av_address) { return (AV*) av_address; }
.
$ perl ref-to-av.pl
Stringified array ref is ARRAY(0x67ead8)
1:2:3:8:12:17
I'm not sure why you want to do this, but if you really need it, ignore the answers that use the tricks to look into memory. They'll only cause you problems.
Why do you want to do this? There's probably a better design. Where are you getting that stringified reference from.
Let's say you need to do it for whatever reason. First, create a registry of objects where the hash key is the stringified form, and the value is a weakened reference:
use Scalar::Util qw(weaken);
my $array = [ ... ];
$registry{ $array } = $array;
weaken( $registry{ $array } ); # doesn't count toward ref count
Now, when you have the stringified form, you just look it up in the hash, checking to see that it's still a reference:
if( ref $registry{$string} ) { ... }
You could also try Tie::RefHash and let it handle all of the details of this.
There is a longer example of this in Intermediate Perl.
In case someone finds this useful, I'm extending tobyink's answer by adding support for detecting segmentation faults. There are two approaches I discovered. The first way locally replaces $SIG{SEGV} and $SIG{BUS} before dereferencing. The second way masks the child signal and checks if a forked child can dereference successfully. The first way is significantly faster than the second.
Anyone is welcome to improve this answer.
First Approach
sub unstringify_ref($) {
use bigint qw(hex);
use Devel::FindRef;
my $str = #_ ? shift : $_;
if (defined $str and $str =~ /\((0x[a-fA-F0-9]+)\)$/) {
my $addr = (hex $1)->bstr;
local $#;
return eval {
local $SIG{SEGV} = sub { die };
local $SIG{BUS} = sub { die };
return Devel::FindRef::ptr2ref $addr;
};
}
return undef;
}
I'm not sure if any other signals can occur in an attempt to access illegal memory.
Second Approach
sub unstringify_ref($) {
use bigint qw(hex);
use Devel::FindRef;
use Signal::Mask;
my $str = #_ ? shift : $_;
if (defined $str and $str =~ /\((0x[a-fA-F0-9]+)\)$/) {
my $addr = (hex $1)->bstr;
local $!;
local $?;
local $Signal::Mask{CHLD} = 1;
if (defined(my $kid = fork)) {
# Child -- This might seg fault on invalid address.
exit(not Devel::FindRef::ptr2ref $addr) unless $kid;
# Parent
waitpid $kid, 0;
return Devel::FindRef::ptr2ref $addr if $? == 0;
} else {
warn 'Unable to fork: $!';
}
}
return undef;
}
I'm not sure if the return value of waitpid needs to be checked.