I have the following Perl code:
sub merge_xml {
foreach my $repository ('repo1', 'repo2') {
my #xml_files;
sub match_xml {
my $filename = $File::Find::dir . '/' . $_;
if ($filename =~ m%/(main|test)\.xml$%) {
push(#xml_files, $filename);
}
}
find(\&match_xml, $repository);
print Dumper(\#xml_files);
}
}
And I am getting the warning:
Variable "#xml_files" will not stay shared at ./script/src/repair.pl line 87.
How does one fix this?
PS find as in File::Find
"Nested" named subs in fact aren't -- they're compiled as separate subroutines, and so having them written as "nested" can only be misleading.
Further, this creates a problem since the "inner" subroutine supposedly closes over the variable #xml_files that it uses, which gets redefined on each new call, being lexical. But the sub, built at compile-time and not being a lexical closure, only keeps the refernce to the value at first call and so it works right only upon the first call to the outer sub (merge_xml here).
We do get the warning though. With use diagnostics; (or see it in perldiag)
Variable "$x" will not stay shared at -e line 1 (#1)
(W closure) An inner (nested) named subroutine is referencing a
lexical variable defined in an outer named subroutine.
When the inner subroutine is called, it will see the value of
the outer subroutine's variable as it was before and during the first
call to the outer subroutine; in this case, after the first call to the
outer subroutine is complete, the inner and outer subroutines will no
longer share a common value for the variable. In other words, the
variable will no longer be shared.
This problem can usually be solved by making the inner subroutine
anonymous, using the sub {} syntax. When inner anonymous subs that
reference variables in outer subroutines are created, they
are automatically rebound to the current values of such variables.
So pull out that "inner" sub (match_xml) and use it normally from the "outer" one (merge_xml). In general you'd pass the reference to the array (#xml_files) to it; or, since in this case one cannot pass to File::Find's find, can have the array in such scope so to be seen as needed.
Or, since the purpose of match_xml is to be find's "wanted" function, can use an anonymous sub for that purpose so there is no need for a separate named sub
find( sub { ... }, #dirs );
Or store that coderef in a variable, as shown in Ed Heal's answer
my $wanted_coderef = sub { ... };
find( $wanted_coderef, #dirs );
With help from zdim I came up with:
sub merge_xml {
foreach my $repository ('repo1', 'repo2') {
my #xml_files;
my match_xml = sub {
my $filename = $File::Find::dir . '/' . $_;
if ($filename =~ m%/(main|test)\.xml$%) {
push(#xml_files, $filename);
}
};
find($match_xml, $repository);
print Dumper(\#xml_files);
}
}
Might I suggest another alternative. By using a factory function, you can eliminate the need to hand write a find subroutine each time.
A factory is a function that generates another function (or subroutine in this case). You feed it some parameters and it creates a custom subroutine with those parameters bolted in. My example uses a closure but you could also build it with a string eval if the closure is costly for some reason.
sub match_factory {
my ($filespec, $array) = #_;
# Validate arguments
die qq{File spec must be a regex, not "$filespec"\n}
unless ref $filespec eq "Regexp";
die qq{Second argument must be an array reference, not "$array"\n}
unless ref $array eq "ARRAY";
# Generate an anonymous sub to perform the match check
# that creates a closure on $filespec and $array
my $matcher = sub {
# Automatically compare against $_
m/$filespec/ and
push(#$array, $File::Find::name);
};
return $matcher;
}
sub merge_xml {
my #repos = #_ # or qw/foo bar .../;
foreach my $repository (#repos) {
my #matched_files;
find(
match_factory( qr/\/(main|test)\.xml$/, \#matched_files ),
$repository
);
print Dumper(\#matched_files);
}
}
HTH
Related
Perl 5.18.2 accepts "local subroutines", it seems.
Example:
sub outer()
{
my $x = 'x'; # just to make a simple example
sub inner($)
{
print "${x}$_[0]\n";
}
inner('foo');
}
Without "local subroutines" I would have written:
#...
my $inner = sub ($) {
print "${x}$_[0]\n";
}
$inner->('foo');
#...
And most importantly I would consider both to be equivalent.
However the first variant does not work as Perl complains:
Variable $x is not available at ...
where ... describes the line there $x is referenced in the "local subroutine".
Who can explain this; are Perl's local subroutines fundamentally different from Pascal's local subroutines?
The term "local subroutine" in the question seems to be referring to lexical subroutines. These are private subroutines visible only within the scope (block) where they are defined, after the definition; just like private variables.
But they are defined (or pre-declared) with my or state, as my sub subname { ... }
Just writing a sub subname { ... } inside of another doesn't make it "local" (in any version of Perl), but it is compiled just as if it were written alongside that other subroutine and is placed in their package's symbol table (main:: for example).
The question mentions closure in the title and here is a comment on that
A closure in Perl is a structure in a program, normally a scalar variable, with a reference to a sub and which carries environment (variables) from its scope at its (runtime) creation. See also a perlfaq7 entry on it. Messy to explain. For example:
sub gen {
my $args = "#_";
my $cr = sub { say "Closed over: $args, my args: #_" }
return $cr;
}
my $f = gen( qw(args for gen) );
$f->("hi closed");
# Prints:
# Closed over: args for gen, my args: hi closed
The anonymous sub "closes over" the variables in scope where it's defined, in a sense that when its generating function returns its reference and goes out of scope those variables still live on, because of the existence of that reference.
Since anonymous subs are created at runtime, every time its generating function is called and lexicals in it remade so is the anon sub, so it always has access to current values.
Thus the returned reference to the anon-sub uses lexical data, which would otherwise be gone. A little piece of magic.†
Back to the question of "local" subs. If we want to introduce actual closures to the question, we'd need to return a code reference from the outer subroutine, like
sub outer {
my $x = 'x' . "#_";
return sub { say "$x #_" }
}
my $f = outer("args");
$f->( qw(code ref) ); # prints: xargs code ref
Or, per the main question, as introduced in v5.18.0 and stable from v5.26.0, we can use a named lexical (truly nested!) subroutine
sub outer {
my $x = 'x' . "#_";
my sub inner { say "$x #_" };
return \&inner;
}
In both cases my $f = outer(...); has the code reference returned from outer which correctly uses the local lexical variables ($x), with their most current values.
But we cannot use a plain named sub inside outer for a closure
sub outer {
...
sub inner { ... } # misleading, likely misguided and buggy
return \&inner; # won't work correctly
}
This inner is made at compile time and is global so any variables it uses from outer will have their values baked from when outer was called the first time. So inner will be correct only until outer is called the next time -- when the lexical environment in outer gets remade but inner doesn't. As an example I can readily find this post, and see the entry in perldiag (or add use diagnostics; to the program).
† And in my view a poor-man's object in a way, as it has functionality and data, made elsewhere at another time and which can be used with data passed to it (and both can be updated)
If you want "local" subs, you can use one of the following based on the level of backward compatibility you want:
5.26+:
my sub inner { ... }
5.18+:
use experimental qw( lexical_subs ); # Safe: Accepted in 5.26.
my sub inner { ... }
"Any" version:
local *inner = sub { ... };
However, you should not, use sub inner { ... }.
sub f { ... }
is basically the same as
BEGIN { *f = sub { ... } }
so
sub outer {
...
sub inner { ... }
...
}
is basically
BEGIN {
*outer = sub {
...
BEGIN {
*inner = sub { ... };
}
...
};
}
As you can see, inner is visible even outside of outer, so it's not "local" at all.
And as you can see, the assignment to *inner is done at compile-time, which introduces another major problem.
use strict;
use warnings;
use feature qw( say );
sub outer {
my $arg = shift;
sub inner {
say $arg;
}
inner();
}
outer( 123 );
outer( 456 );
Variable "$arg" will not stay shared at a.pl line 9.
123
123
5.18 did introduce lexical ("local") subroutines.
use strict;
use warnings;
use feature qw( say );
use experimental qw( lexical_subs ); # Safe: Accepted in 5.26.
sub outer {
my $arg = shift;
my sub inner {
say $arg;
};
inner();
}
outer( 123 );
outer( 456 );
123
456
If you need to support older versions of Perl, you can use the following:
use strict;
use warnings;
use feature qw( say );
sub outer {
my $arg = shift;
local *inner = sub {
say $arg;
};
inner();
}
outer( 123 );
outer( 456 );
123
456
I found a rather good explanation from man perldiag:
Variable "%s" is not available
(W closure) During compilation, an inner named subroutine or eval
is attempting to capture an outer lexical that is not currently
available. This can happen for one of two reasons. First, the
outer lexical may be declared in an outer anonymous subroutine
that has not yet been created. (Remember that named subs are
created at compile time, while anonymous subs are created at run-
time.) For example,
sub { my $a; sub f { $a } }
At the time that f is created, it can't capture the current value
of $a, since the anonymous subroutine hasn't been created yet.
So this would be a possible fix:
sub outer()
{
my $x = 'x'; # just to make a simple example
eval 'sub inner($)
{
print "${x}$_[0]\n";
}';
inner('foo');;
}
...while this one won't:
sub outer()
{
my $x = 'x'; # just to make a simple example
eval {
sub inner($)
{
print "${x}$_[0]\n";
}
};
inner('foo');;
}
In the following script, I declare and modify #basearray in the main program. Inside the dosomething subroutine, I access #basearray, assign it to an array local to the script, and modify the local copy. Because I have been careful to change the value only of local variables inside the subroutine, #basearray is not changed.
If I had made the mistake of assigning a value to #basearray inside the subroutine, though, it would have been changed and that value would have persisted after the call to the subroutine.
This is demonstrated in the 2nd subroutine, doagain.
Also, doagain receives the reference \#basearray as an argument rather than accessing #basearray directly. But going to that additional trouble provides no additional safety. Why do it that way at all?
Is there a way to guarantee that I cannot inadvertently change #basearray inside any subroutine? Any kind of hard safety device that I can build into my code, analogous to use strict;, some combination perhaps of my and local?
Am I correct in thinking that the answer is No, and that the only solution is to not make careless programmer errors?
#!/usr/bin/perl
use strict; use warnings;
my #basearray = qw / amoeba /;
my $count;
{
print "\#basearray==\n";
$count = 0;
foreach my $el (#basearray) { $count++; print "$count:\t$el\n" };
}
sub dosomething
{
my $sb_name = (caller(0))[3];
print "entered $sb_name\n";
my #sb_array=( #basearray , 'dog' );
{
print "\#sb_array==\n";
$count = 0;
foreach my $el (#sb_array) { $count++; print "$count:\t$el\n" };
}
print "return from $sb_name\n";
}
dosomething();
#basearray = ( #basearray, 'rats' );
{
print "\#basearray==\n";
$count = 0;
foreach my $el (#basearray) { $count++; print "$count:\t$el\n" };
}
sub doagain
{
my $sb_name = (caller(0))[3];
print "entered $sb_name\n";
my $sf_array=$_[0];
my #sb_array=#$sf_array;
#sb_array=( #sb_array, "piglets ... influenza" );
{
print "\#sb_array==\n";
$count = 0;
foreach my $el (#sb_array) { $count++; print "$count:\t$el\n" };
}
print "now we demonstrate that passing an array as an argument to a subroutine does not protect it from being globally changed by programmer error\n";
#basearray = ( #sb_array );
print "return from $sb_name\n";
}
doagain( \#basearray );
{
print "\#basearray==\n";
$count = 0;
foreach my $el (#basearray) { $count++; print "$count:\t$el\n" };
}
There isn't a pragma or a keyword or such, but there are well established "good practices," which in this case completely resolve what you reasonably ponder about.
The first sub, dosomething, commits the sin of using variables visible in its scope but defined in the higher scope. Instead, always pass needed data to a subroutine (exceptions are rare, in crystal clear cases).
Directly using data from "outside" defies the idea of a function as an encapsulated procedure, exchanging data with its users via a well defined and clear interface. It entangles ("couples") sections of code that are in principle completely unrelated. In practice, it can also be outright dangerous.
Also, the fact the #basearray is up for grabs in the sub is best considered an accident -- what when that sub gets moved to a module? Or another sub is introduced to consolidate code where #basearray is defined?
The second sub, doagain, nicely takes a reference to that array. Then, to protect the data in the caller, one can copy the caller's array to another one which is local to the sub
sub doagain {
my ($ref_basearray) = #_;
my #local_ba = #$ref_basearray;
# work with local_ba and the caller's basearray is safe
}
The names of local lexical variables are of course arbitrary, but a convention where they resemble the caller's data names may be useful.
Then you can adopt a general practice, for safety, to always copy input variables to local ones. Work directly with references that are passed in only when you want to change the caller's data (relatively rare in Perl). This may hurt efficiency if it's done a lot with sizeable data, or when really large data structures are involved. So perhaps then make an exception and change data via its reference, and be extra careful.
(Putting my comment as answer)
One way to guarantee not changing a variable inside a subroutine is to not change it. Use only lexically scoped variables inside the subroutine, and pass whatever values you need inside the subroutine as arguments to the subroutine. It is a common enough coding practice, encapsulation.
One idea that you can use -- mainly as practice, I would say -- to force yourself to use encapsulation, is to put a block around your "main" code, and place subroutines outside of it. That way, if you should accidentally refer to a (formerly) global variable, use strict will be able to do it's job and produce a fatal error. Before runtime.
use strict;
use warnings;
main: { # lexical scope reduced to this block
my #basearray = qw / amoeba /;
print foo(#basearray); # works
print bar(); # fatal error
} # END OF MAIN lexical scope of #basearray ends here
sub foo {
my #basearray = #_; # encapsulated
return $basearray[1]++;
}
sub bar {
return $basearray[1]++; # out of scope ERROR
}
This will not compile, and will produce the error:
Global symbol "#basearray" requires explicit package name at foo.pl line 15.
Execution of foo.pl aborted due to compilation errors.
I would consider this a training device to force yourself to using good coding practices, and not something to necessarily use in production code.
There are several solutions with various levels of pithiness from "just don't change it" to "use an object or tied array and lock down the update functions". An intermediate solution, not unlike using an object with a getter method, is to define a function that returns your array but can only operate as an rvalue, and to use that function inside subroutines.
my #basearray = (...);
sub basearray { return #basearray }
sub foo {
foreach my $elem (basearray()) {
...
}
#bar = map { $_ *= 2 } basearray(); # ok
#bar = map { $_ *= 2 } #basearray; # modifies #basearray!
}
TLDR: yes, but.
I'll start with the "but". But it's better to design your code so that the variable simply doesn't exist in the scope where the untrusted function is defined.
sub untrusted_function {
...
}
my #basearray = qw( ... ); # declared after untrusted_function
If untrusted_function needs to be able to access the contents of the array, pass it a copy of the array as a parameter, so it can't modify the original.
Now here's the "yes".
You can mark the array as read-only before calling the untrusted function.
Internals::SvREADONLY($_, 1) for #basearray;
Internals::SvREADONLY(#basearray, 1);
Then mark it read-write again after the function has finished.
Internals::SvREADONLY(#basearray, 0);
Internals::SvREADONLY($_, 0) for #basearray;
Using Internals::SvREADONLY(#basearray, $bool) modifies the read-only state of the array itself, preventing elements from being added or removed from it; Internals::SvREADONLY($_, $bool) for #basearray modifies the read-only state of each element in the array too, which you probably want.
Of course, if your array contains references like blessed objects, you then need to consider whether you need to recurse into the references, marking them read-only too. (But can also be a concern with the shallow copy of the array I mentioned in the preferred solution!)
So yes, it is possible to prevent a sub from accidentally modifying a variable by marking that variable read-only before calling the sub, but it's a better idea to restructure your code so the sub simply doesn't have access to the variable at all.
Yes, but.
Here is a prototype that uses #TLP's answer.
#!/usr/bin/perl
use strict; use warnings;
{ # block_main BEG
my #basearray = qw / amoeba elephants sequoia /;
print join ( ' ', 'in main, #basearray==', join ( ' ', #basearray ), "\n" );
print "Now we call subroutine to print it:\n"; enumerateprintarray ( \#basearray );
my $ref_basearray = changearray ( \#basearray, 'wolves or coyotes . . . ' );
#basearray = #$ref_basearray;
print "Now we call subroutine to print it:\n"; enumerateprintarray ( \#basearray );
} # block_main END
sub enumerateprintarray
{
my $sb_name = (caller(0))[3];
#print join ( '' , #basearray ); # mortal sin! for in the day that thou eatest thereof thou shalt surely die.
my $sb_exact_count_arg = 1;
die "$sb_name must have exactly $sb_exact_count_arg arguments" unless ( ( scalar #_ ) == $sb_exact_count_arg );
my $sf_array = $_[0];
my #sb_array = #$sf_array;
my $sb_count = 0;
foreach (#sb_array)
{
$sb_count++;
print "\t$sb_count:\t$_\n";
}
}
sub changearray
{
my $sb_name = (caller(0))[3];
#print join ( '' , #basearray ); # in the day that thou eatest thereof thou shalt surely die.
my $sb_exact_count_arg = 2;
die "$sb_name must have exactly $sb_exact_count_arg arguments" unless ( ( scalar #_ ) == $sb_exact_count_arg );
my ( $sf_array, $addstring ) = #_;
my #sb_array = #$sf_array;
push #sb_array, $addstring;
return ( \#sb_array );
}
I'm getting this error and cannot understand why this happens. It happens when I jump to another subroutine. Perhaps there is something I need to understand about Mojolicious on why this happens.
Here is the source code of my program:
#!/usr/bin/perl
use Mojolicious::Lite;
get '/' => sub { &start_home; };
app->start;
sub start_home {
my $d = shift;
my $something = $d->param('something');
### Do things with $something.... etc.. etc..
&go_somewhere_else; ### Go somewhere else
}
sub go_somewhere_else {
my $c = shift;
$c->render(text => "Hello World!");
### End of program
}
I am passing a value on to the renderer and there is a value - Why would it say it is undefined? My understanding is that this only happens if you jump to a subroutine and try to render output.
My operating system is Windows and I am using Strawberry Perl.
You need to pass the context object $c/$d to your second function. The undefined value is your $c in go_somewhere_else, because you call it without a parameter.
Initially, to make it work, do this.
sub start_home {
my $d = shift;
my $something = $d->param('something');
go_somewhere_else($d);
}
You are now passing the context, which you named $d (that's not the conventional name), to the other function, and the warning will go away.
That's because the form &subname; without parenthesis () makes #_ (that's the list of arguments to the function) available inside of go_somewhere_else, but because you shifted $d off, #_ is now empty, and hence your $c inside go_somewhere_else is undef.
Alternatively, you could also change the shift to an assignment with #_. But please, don't do that!
sub start_home {
my ( $d ) = #_;
my $something = $d->param('something');
&go_somewhere_else;
}
There are more things odd to the point of almost wrong here.
get '/' => sub { &start_home; };
You are currying the the start_home function, but you are not actually adding another parameter. I explained above why this works. But it's not great. In fact, it's confusing and complicated.
Instead, you should use a code reference for the route.
get '/' => \&start_home;
Inside of start_home, you should call your context $c as is the convention. You should also not use the ampersand & notation for calling functions. That changes the behavior in a way you most certainly do not want.
sub start_home {
my $c = shift;
my $something = $c->param('something');
# ...
go_somewhere_else($c);
}
To learn more about how function calls work in Perl, refer to perlsub.
I'm maintaining old Perl code and need to enable strict pragma in all modules. I have a problem in passing a file handle as a reference between modules and subs. We have a common module responsible for opening the log file which is passed as typeglob reference. In other modules, the run function first calls open_log() from the common module, then it passes this file handle to other subs.
Here I've written a simple test to simulate the situation.
#!/usr/bin/perl -w
use strict;
$::STATUS_OK = 0;
$::STATUS_NOT_OK = 1;
sub print_header {
our $file_handle = #_;
print { $$file_handle } "#### HEADER ####"; # reference passing fails
}
sub print_text {
my ($file_handle, $text)= #_;
print_header(\$file_handle);
print { $$file_handle } $text;
}
sub open_file_handle {
my ($file_handle, $path, $name) = #_;
my $filename = $path."\\".$name;
unless ( open ($$file_handle, ">".$filename)) {
print STDERR "Failed to open file_handle $filename for writing.\n";
return $::STATUS_NOT_OK;
}
print STDERR "File $filename was opened for writing successfully.\n";
return $::STATUS_OK;
}
my $gpath = "C:\\Temp";
my $gname = "mylogfile.log";
my $gfile_handle;
if (open_file_handle(\$gfile_handle, $gpath, $gname) == $::STATUS_OK) {
my $text = "BIG SUCCESS!!!\n";
print_text(\$gfile_handle, $text);
print STDERR $text;
} else {
print STDERR "EPIC FAIL!!!!!!!!\n";
}
The Main function first calls open_file_handle and passes a file handle reference to the print_text function. If I comment out the row:
print_header(\$file_handle);
Everything works fine, but I need to pass the file handle reference to other functions from the print_text function, and this doesn't work.
I'm a Java developer and Perl's reference handling is not familiar to me. I don't want to change the open_log() sub to return a file handle (now it returns only status), since I have lots of modules and hundreds of code lines to go through to make this change in all places.
How can I fix my code to make it work?
There are two types of filehandles in Perl. Lexical and global bareword filehandles:
open my $fh, '>', '/path/to/file' or die $!;
open FILEHANDLE, '>', '/path/to/file' or die $!;
You are dealing with the first, which is good. The second one is global and should not be used.
The file handles you have are lexical, and they are stored in a scalar variable. It's called scalar because it has a dollar sign $. These can be passed as arguments to subs.
foo($fh);
They can also be referenced. In that case, you get a scalar reference.
my $ref = \$fh;
Usually you reference stuff if you hand it over to a function so Perl does not make a copy of the data. Think of a reference like a pointer in C. It's only the memory location of the data (structure). The piece of data itself remains where it is.
Now, in your code you have references to these scalars. You can tell because it is dereferenced in the print statement by saying $$fh.
sub print_text {
my ($file_handle, $text)= #_;
print_header(\$file_handle);
print { $$file_handle } $text;
}
So the $file_handle you get as a parameter (that's what the = #_ does) is actually a reference. You do not need to reference it again when you pass it to a function.
I guess you wrote the print_header yourself:
sub print_header {
our $file_handle = #_;
print { $$file_handle } "#### HEADER ####"; # reference passing fails
}
There are a few things here:
- our is for globals. Do not use that. Use my instead.
- Put parenthesis around the parameter assignment: my ($fh) = #_
- Since you pass over a reference to a reference to a scalar, you need to dereference twice: ${ ${ $file_handle } }
Of course the double-deref is weird. Get rid of it passing the variable $file_hanlde to print_header instead of a refence to it:
sub print_text {
my ($file_handle, $text)= #_;
print_header($file_handle); # <-- NO BACKSLASH HERE
print { $$file_handle } $text;
}
That is all you need to to make it work.
In general, I would get rid of all the references to the $file_handle vars here. You don't need them. The lexical filehandle is already a reference to an IO::Handle object, but don't concern yourself with that right now, it is not important. Just remember:
use filehandles that have a $ up front
pass them without references and you do not need to worry about \ and ${} and stuff like that
For more info, see perlref and perlreftut.
You are having difficulties because you added multiple extra level of references. Objects like lexical filehandles already are references.
If you have difficulties keeping track of what is a reference, you might want to use some kind of hungarian notation, like a _ref suffix.
In print_text, this would be:
sub print_text {
my ($file_handle_ref, $text)= #_;
print_header(\$file_handle_ref);
print { $$file_handle_ref } $text;
}
And in print_header:
sub print_header {
my ($file_handle_ref_ref) = #_; # don't use `our`, and assign to a lvalue list!
print { $$$file_handle_ref_ref } "#### HEADER ####"; # double derefernence … urgh
}
A far superior solution is to pass the filehandle around directly, without references.
sub print_header {
my ($file_handle) = #_;
print {$file_handle} "#### HEADER ####"; # no reference, no cry
}
sub print_text {
my ($file_handle, $text)= #_;
print_header($file_handle);
print {$file_handle} $text;
}
And in the main part:
my $gpath = "C:/Temp"; # forward slashes work too, as long as you are consistent
my $gname = "mylogfile.log";
if (open_file_handle(\my $gfile_handle, $gpath, $gname) == $::STATUS_OK) {
my $text = "BIG SUCCESS!!!\n";
print_text($gfile_handle, $text);
...
} else {
...
}
the reference operator is "\" (backslash)
anything includes arrays, hashes and even sub-routines can be referenced
the 5th line to count backwards
print_text(\$gfile_handle, $text);
you passed a referenced variable \$gfile_handle to the sub-routine print_text
sub print_text {
my ($file_handle, $text)= #_;
print_header(\$file_handle);
print { $$file_handle } $text;
}
and in this sub-routine, $file_handle is already a reference
then your referenced it again and pass it to the sub-routine print_header
so, you can solve this problem by putting off the reference operator the 5th line to count backwards like this:
print_text($gfile_handle, $text);
and try again :-)
I have following code:
my $coderef = ${MyModule::MyTool->new};
but when I try
$coderef->();
i got error:
Not a CODE reference
How can I take reference to constructor (without calling it) and run referenced code late?
The ${...} is the scalar dereference operator, not the anonymous subroutine constructor. You want:
my $coderef = sub {MyModule::MyTool->new};
And if your constructor takes arguments, you could write it this way:
my $coderef = sub {MyModule::MyTool->new(#_)};
The two examples above do not address one issue, and that is preserving the functionality of caller. If your constructor needs this (many do not), you can use Perl's magic goto &sub syntax:
my $coderef = sub {unshift #_, 'MyModule::MyTool'; goto &{$_[0]->can('new')} };
That probably requires a little explanation. First, the module name is placed before any other arguments (which is what the new method will expect). Then I used the UNIVERSAL method ->can to retrieve the coderef for the new method. goto &{...} then jumps to that coderef using the current argument list.
EDIT: The comments below show that there is some confusion as to when you would need to use the longer third technique. Here is a short segment that shows the problem:
package Original;
sub new {say +(caller)[0]} # presumably some better use of caller
# or Carp (which uses caller)
package Encapsulate::Simple;
sub new {
my (undef, $invocant, $method) = #_;
sub {$invocant->$method(#_)}
}
package Encapsulate::Better;
sub new {
my (undef, $invocant, $method) = #_;
sub {unshift #_, $invocant; goto &{$invocant->can($method)}}
}
package main;
my $bad = Encapsulate::Simple->new(qw/Original new/);
$bad->(); # always prints 'Encapsulate::Simple'
my $good = Encapsulate::Better->new(qw/Original new/);
$good->(); # prints 'main' as it should
package another;
$bad->(); # erroneously prints 'Encapsulate::Simple' again
$good->(); # prints 'another' as it should
So in short, Encapsulate::Better's sub preserves the exact functionality of Original->new whereas Encapsulate::Simple permanently binds it to its package, breaking any encapsulated methods that use caller for anything.
Use \& to obtain a reference to a named function:
my $coderef = \&MyModule::MyTool::new;
This should work regardless of which package holds the new:
my $coderef = MyModule::MyTool->UNIVERSAL::can( 'new' );
So that if MyModule::MyTool does not implement their constructor, you can still get it's handle.