Why is my localized redefinition of a package sub not taking effect?

Given the following Perl program:
package Foo;
use strict;
use warnings;
sub new {
my ($class) = @_;
return bless {}, $class;
}
sub c {
print "IN ORIG C\n";
}
sub DESTROY {
print "IN DESTROY\n";
c();
}
1;
package main;
use strict;
use warnings;
no warnings qw/redefine once/;
local *Foo::c = sub { print "IN MY C\n" };
my $f = Foo->new();
undef $f;
I expect output as:
IN DESTROY
IN MY C
But I actually get output as:
IN DESTROY
IN ORIG C
Q: Why is my localized redefinition of Foo::c not taking effect?

When perl code is compiled, globs for package variables/symbols are looked up (and created as necessary) and referenced directly from the compiled code.
So when you (temporarily) replace the symbol table entry for *Foo::c at runtime, all the already compiled code that used *Foo::c still uses the original glob. But do/require'd code or eval STRING or symbolic references won't.
(Very similar to Access package variable after its name is removed from symbol table in Perl?, see the examples there.)

This is a bug in perl which will be fixed in 5.22 (see Leon's comment below).
This happens because undef $f; doesn't actually free up and destroy $f, it just marks it as ready to
be freed by a nextstate op.
nextstate ops exist roughly between each statement, and they are there
to clean up the stack, among other things.
In your example, since undef $f is the last thing in the file, there
is no nextstate after it, so your localized *Foo::c goes out of scope
before $f's destructor is called (or, the global destruction that
happens just isn't aware of your local change.)
When you add a print statement after undef $f, the nextstate op
before the print triggers $f's destructor while your localized sub is still in effect.
You can see the additional nextstate in your example at
https://gist.github.com/calid/aeb939147fdd171cffe3#file-04-diff-concise-out.
You can also see this behaviour by checking caller() in your DESTROY method:
sub DESTROY {
my ($pkg, $file, $line) = caller;
print "Destroyed at $pkg, $file, $line\n";
c();
}
mhorsfall@tworivers:~$ perl foo.pl
Destroyed at main, foo.pl, 0
IN DESTROY
IN ORIG C
mhorsfall@tworivers:~$ echo 'print "hi\n"' >> foo.pl
mhorsfall@tworivers:~$ perl foo.pl
Destroyed at main, foo.pl, 30
IN DESTROY
IN MY C
hi
(Line 30 being the print "hi\n")
Hope that sheds some light on this.
Cheers.
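A minimal sketch of the workaround this implies, assuming it is enough to make sure a statement follows undef $f while the local is still in effect. The @log array and the explicit block are illustrative additions, not part of the original program; the messages are recorded in @log so they can be inspected after the block ends.

```perl
use strict;
use warnings;

package Foo;
our @log;    # hypothetical: records which subs ran, in order

sub new     { my ($class) = @_; return bless {}, $class }
sub c       { push @log, "IN ORIG C" }
sub DESTROY { push @log, "IN DESTROY"; c() }

package main;

{
    no warnings qw(redefine once);
    local *Foo::c = sub { push @Foo::log, "IN MY C" };

    my $f = Foo->new();
    undef $f;                            # drop the only reference
    print "still inside the block\n";    # a statement follows undef, so the
                                         # object is cleaned up before the
                                         # local is undone
}

print "$_\n" for @Foo::log;
```

With this layout the log should read IN DESTROY followed by IN MY C, because the destructor runs before the enclosing block restores the original Foo::c.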

The problem here doesn't have to do with compile time vs runtime but rather with scoping.
The use of local limits the scope of your modified Foo::c to the remainder of the current scope (which in your example is the remainder of your script). But DESTROY doesn't run in that scope, even when you explicitly undef $f (See http://perldoc.perl.org/perlobj.html#Destructors for more discussion of the behavior of DESTROY). It runs at an undetermined time later, specifically AFTER $f has "gone out of scope". Therefore, any localized changes you have made in the scope of $f will not apply whenever DESTROY finally runs.
You can see this yourself by simply removing the local in your example:
With local
IN DESTROY
IN ORIG C
Without local
IN DESTROY
IN MY C
Or by adding a few additional subroutines and calling them in package::main scope:
package Foo;
...
sub d {
c();
}
sub DESTROY {
print "IN DESTROY\n";
c();
}
1;
package main;
...
sub e {
Foo::c();
}
local *Foo::c = sub { print "IN MY C\n" };
my $f = Foo->new();
Foo::c();
Foo::d();
e();
undef $f;
Which prints
IN MY C
IN MY C
IN MY C
IN DESTROY
IN ORIG C
So only in DESTROY is the original c used, further demonstrating that this is a scoping issue.
Also see https://stackoverflow.com/a/19100461/232706 for a great explanation of Perl scoping rules.
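As a small illustration of how far a local reaches, here is a sketch with hypothetical subs that return strings instead of printing. The override is visible to everything called while the enclosing block is running, and it is undone as soon as that block exits.

```perl
use strict;
use warnings;

package Foo;
sub c { return "orig" }
sub d { return c() }        # calls whatever Foo::c is at call time

package main;

my @seen;
{
    no warnings qw(redefine once);
    local *Foo::c = sub { return "mine" };
    push @seen, Foo::d();   # the override is in effect here
}
push @seen, Foo::d();       # the block has exited, the original is back

print "@seen\n";            # mine orig
```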

Related

Perl eval scope

According to perldoc, String Eval should be performed in the current scope. But the following simple test seems to contradict this.
We need the following two simple files to set up the test. Please put them under the same folder.
test_eval_scope.pm
package test_eval_scope;
use strict;
use warnings;
my %h = (a=>'b');
sub f1 {
eval 'print %h, "\n"';
# print %h, "\n"; # this would work
# my $dummy = \%h; # adding this would also work
}
1
test_eval_scope.pl
#!/usr/bin/perl
use File::Basename;
use lib dirname (__FILE__);
use test_eval_scope;
test_eval_scope::f1();
When I run the program, I got the following error
$ test_eval_scope.pl
Variable "%h" is not available at (eval 1) line 1.
My question is why the variable %h is out of scope.
I have done some modification, and found the following:
If I run without eval(), as in the above comment, it will work.
meaning that %h should be in the scope.
If I just add a seemingly useless mentioning in the code, as in the above
comment, eval() will work too.
If I combine pl and pm file into one file, eval() will work too.
If I declare %h with 'our' instead of 'my', eval() will work too.
I encountered this question when I was writing a big program which parsed user-provided code at run time. I don't need solutions, as I have plenty of workarounds above. But I cannot explain why the above code doesn't work. This affects my perl pride.
My perl version is v5.26.1 on linux.
Thank you for your help!
Subs only capture variables they use. Since f1 doesn't use %h, it doesn't capture it, and %h becomes inaccessible to f1 after it goes out of scope when the module finishes executing.
Any reference to the var, including one that's optimized away, causes the sub to capture the variable. As such, the following does work:
sub f1 {
%h if 0;
eval 'print %h, "\n"';
}
Demo:
$ perl -M5.010 -we'
{
my $x = "x";
sub f { eval q{$x} }
sub g { $x if 0; eval q{$x} }
}
say "f: ", f();
say "g: ", g();
'
Variable "$x" is not available at (eval 1) line 1.
Use of uninitialized value in say at -e line 8.
f:
g: x
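The our workaround mentioned in the question can be sketched in the same single-file style. The package name Demo is hypothetical. A package variable lives in the symbol table rather than in a pad, so nothing has to be captured by the sub for the string eval to find it.

```perl
use strict;
use warnings;

{
    package Demo;                 # hypothetical package name
    our %h = (a => 'b');          # package variable, not a lexical

    # The string eval is compiled at run time, but %Demo::h still exists
    # then, and the "our" alias is visible from where the eval appears.
    sub f1 { return eval q{ join '=', %h } }
}

print Demo::f1(), "\n";           # a=b
```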

Can someone explain why Perl behaves this way (variable scoping)?

My test goes like this:
use strict;
use warnings;
func();
my $string = 'string';
func();
sub func {
print $string, "\n";
}
And the result is:
Use of uninitialized value $string in print at test.pl line 10.
string
Perl allows us to call a function before it has been defined. However when the function uses a variable declared only after the function call, the variable appears to be undefined. Is this behavior documented somewhere? Thank you!
The behaviour of my is documented in perlsub - it boils down to this - perl knows $string is in scope - because the my tells it so.
The my operator declares the listed variables to be lexically confined to the enclosing block, conditional (if/unless/elsif/else), loop (for/foreach/while/until/continue), subroutine, eval, or do/require/use'd file.
It means it's 'in scope' from the point at which it's first 'seen' until the closing bracket of the current 'block'. (Or in your example - the end of the code)
However - in your example my also assigns a value.
This scoping process happens at compile time - where perl checks whether it's valid to use $string or not. (Thanks to strict). However - it can't know what the value was, because that might change during code execution. (and is non-trivial to analyze)
So if you do this it might be a little clearer what's going on:
#!/usr/bin/env perl
use strict;
use warnings;
my $string; #undefined
func();
$string = 'string';
func();
sub func {
print $string, "\n";
}
$string is in scope in both cases - because the my happened at compile time - before the subroutine has been called - but it doesn't have a value set beyond the default of undef prior to the first invocation.
Note this contrasts with:
#!/usr/bin/env perl
use strict;
use warnings;
sub func {
print $string, "\n";
}
my $string; #undefined
func();
$string = 'string';
func();
Which errors because when the sub is declared, $string isn't in scope.
First of all, I would consider this undefined behaviour since it skips executing my like my $x if $cond; does.
That said, the behaviour is currently consistent and predictable. And in this instance, it behaves exactly as expected if the optimization that warranted the undefined behaviour notice didn't exist.
At compile-time, my has the effect of declaring and allocating the variable[1]. Scalars are initialized to undef when created. Arrays and hashes are created empty.
my $string was encountered by the compiler, so the variable was created. But since you haven't executed the assignment yet, it still has its default value (undefined) during the first call to func.
This model allows variables to be captured by closures.
Example 1:
{
my $x = "abc";
sub foo { $x } # Named subs capture at compile-time.
}
say foo(); # abc, even though $x fell out of scope before foo was called.
Example 2:
sub make_closure {
my ($x) = @_;
return sub { $x }; # Anon subs capture at run-time.
}
my $foo = make_closure("foo");
my $bar = make_closure("bar");
say $foo->(); # foo
say $bar->(); # bar
[1] The allocation is possibly deferred until the variable is actually used.
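To make the compile-time versus run-time split concrete, here is a hedged sketch (not taken from the answers above) that moves the assignment into a BEGIN block. BEGIN runs during compilation, after the my has declared the variable, so the value is already set when the first run-time statement calls func.

```perl
use strict;
use warnings;

my $string;                       # declared at compile time
BEGIN { $string = 'string' }      # assigned during compilation as well

print func(), "\n";               # string

sub func { return $string }
```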

Issues with function calls for search routine

My aim is to have multiple searches of specific files recursively. So I have these files:
/dir/here/tmp1/recursive/foo2013.log
/dir/here/tmp1/recursive/foo2014.log
/dir/here/tmp2/recursive/foo2013.log
/dir/here/tmp2/recursive/foo2014.log
where 2013 and 2014 indicate the year in which each file was last modified.
I then want to find the most recently modified file (foo2014.log) in each directory tree (tmp1 and tmp2).
Referring to this answer I have the following code in script.pl:
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
func("tmp1");
print "===\n";
func("tmp2");
sub func{
my $varName = shift;
my %times;
find(\&upToDateFiles, "/dir/here");
for my $dir (keys %times) {
if ($times{$dir}{file} =~ m{$varName}){
print $times{$dir}{file}, "\n";
# do stuff here
}
}
sub upToDateFiles {
return unless (-f && /^foo/);
my $mod = -M $_;
if (!defined($times{$File::Find::dir})
or $mod < $times{$File::Find::dir}{mod})
{
$times{$File::Find::dir}{mod} = $mod;
$times{$File::Find::dir}{file} = $File::Find::name;
}
}
}
which will give me this output:
Variable "%times" will not stay shared at ./script.pl line 25.
/dir/here/tmp1/recursive/foo2014.log
===
I have three questions:
Why isn't the second call of the function func working like the first one? Variables are just defined in the scope of the function so why am I getting interferences?
Why do I get the notification for variable %times and how can I get rid of it?
If I define the function upToDateFiles outside of func I am getting this error: Execution of ./script.pl aborted due to compilation errors. I think this is because the variables aren't defined outside of func. Is it possible to change this and still get the desired output?
For starters - embedding a sub within another sub is rather nasty. If you use diagnostics; you'll get:
(W closure) An inner (nested) named subroutine is referencing a
lexical variable defined in an outer named subroutine.
When the inner subroutine is called, it will see the value of
the outer subroutine's variable as it was before and during the *first*
call to the outer subroutine; in this case, after the first call to the
outer subroutine is complete, the inner and outer subroutines will no
longer share a common value for the variable. In other words, the
variable will no longer be shared.
This problem can usually be solved by making the inner subroutine
anonymous, using the sub {} syntax. When inner anonymous subs that
reference variables in outer subroutines are created, they
are automatically rebound to the current values of such variables.
Which is directly relevant to the problem you're having. Try to avoid nesting your subs, and you won't have this problem. It certainly looks like you're trying to be far more complicated than you need to. Have you considered something like:
#!/usr/bin/perl
use strict;
use warnings;
use diagnostics;
use File::Find;
my %filenames;
sub compare_tree {
return unless -f && m/^foo/;
my $mtime = -M $File::Find::name;
if ( !$filenames{$_} || $mtime < $filenames{$_}{mtime} ) {
$filenames{$_} = {
newest => $File::Find::name,
mtime => $mtime,
};
}
}
find( \&compare_tree, "/dir/here" );
foreach my $filename ( keys %filenames ) {
print "$filename has newest version path of:", $filenames{$filename}{newest}, "\n";
print "$filename has newest mtime of:", $filenames{$filename}{mtime}, "\n";
}
I'd also note - you seem to be using $File::Find::dir - this looks wrong to me, based on what you describe you're doing. Likewise - you're running find twice on the same directory structure, which is not a very efficient approach - very big finds are expensive operations, so doubling the work needed isn't good.
Edit: I was caught out by forgetting what -M returns: script start time minus file modification time, in days. So 'newer' files have the lower number, not the higher. (I have amended the code above accordingly.)
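The fix suggested by the quoted diagnostics text (make the inner sub anonymous) could look roughly like this. It is a sketch that keeps the asker's per-directory logic; the name newest_per_dir and its two parameters are hypothetical. Because the anonymous sub is created anew on every call, it always closes over that call's %times.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: returns { dir => { mod => age_in_days, file => path } }
sub newest_per_dir {
    my ($root, $pattern) = @_;
    my %times;                              # fresh hash on every call

    # Anonymous sub: rebound to the current %times each time we get here.
    my $wanted = sub {
        return unless -f $_ && $_ =~ $pattern;
        my $age = -M $_;                    # smaller means more recent
        my $dir = $File::Find::dir;
        if (!exists $times{$dir} || $age < $times{$dir}{mod}) {
            $times{$dir} = { mod => $age, file => $File::Find::name };
        }
    };

    find($wanted, $root);
    return \%times;
}
```

A call such as newest_per_dir("/dir/here", qr/^foo/) would then return one entry per directory, and the caller can filter the paths for tmp1 or tmp2 without running find twice.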

Perl using the special character &

I had a small question. I was reading some code and as my school didn't teach me anything useful about perl programming, I am here to ask you people. I see this line being used a lot in some perl programs:
$variable = &something();
I don't know what the & sign means here as I never saw it in perl. And the something is a subroutine (I am guessing). It usually says a name and it has arguments like a function too sometimes. Can someone tell me what & stands for here and what that something is all the time.
The variable takes in some sort of returned value and is then used to check some conditions, which makes me think it is a subroutine. But still why the &?
Thanks
Virtually every time you see & outside of \&foo and EXPR && EXPR, it's an error.
&foo(...) is the same as foo(...) except foo's prototype will be ignored.
sub foo(&@) { ... } # Cause foo to take a BLOCK as its first arg
foo { ... } ...;
&foo(sub { ... }, ...); # Same thing.
Only subroutines (not operators) will be called by &foo(...).
sub print { ... }
print(...); # Calls the print builtin
&print(...); # Calls the print sub.
You'll probably never need to use this feature in your entire programming career. If you see it used, it's surely someone using & when they shouldn't.
&foo is similar to &foo(@_). The difference is that changes to @_ in foo affect the current sub's @_.
You'll probably never need to use this feature in your entire programming career. If you see it used, it's surely someone using & when they shouldn't or a foolish attempt at optimization. However, the following is pretty elegant:
sub log_info { unshift @_, 'info'; &log }
sub log_warn { unshift @_, 'warn'; &log }
sub log_error { unshift @_, 'error'; &log }
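The @_ sharing described above can be checked with a few lines. The sub names below are hypothetical; the sketch only counts how many arguments are left after the callee shifts one off.

```perl
use strict;
use warnings;

sub drop_first  { shift @_; return }    # modifies whatever @_ it was given

sub with_amp    { &drop_first;    return scalar @_ }   # callee shares our @_
sub with_parens { drop_first(@_); return scalar @_ }   # callee gets its own @_

print with_amp(1, 2, 3), "\n";      # 2
print with_parens(1, 2, 3), "\n";   # 3
```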
goto &foo is similar to &foo, except the current subroutine is removed from the call stack first. This will cause it to not show up in stack traces, for example.
You'll probably never need to use this feature in your entire programming career. If you see it used, it's surely a foolish attempt at optimization.
sub log_info { unshift @_, 'info'; goto &log; } # These are slower than
sub log_warn { unshift @_, 'warn'; goto &log; } # not using goto, but
sub log_error { unshift @_, 'error'; goto &log; } # maybe log uses caller()?
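The effect on the call stack can be seen with caller. In this sketch (hypothetical sub names, Perl 5.10 or later for the // operator), report asks who called its caller. When reached through goto, the intermediate sub has already been removed from the stack.

```perl
use strict;
use warnings;

# Returns the name of the sub one frame above report, if there is one.
sub report {
    my @frame = caller(1);
    return $frame[3] // 'top level';
}

sub via_amp  { &report }        # via_amp stays on the call stack
sub via_goto { goto &report }   # via_goto is replaced by report

print via_amp(), "\n";          # main::via_amp
print via_goto(), "\n";         # top level
```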
$& contains the text matched by the last successful regex match. Before 5.20, using this causes every regex in your entire interpreter to become slower (if they have no captures), so don't use this.
print $& if /fo+/; # Bad before 5.20
print $MATCH if /fo+/; # Bad (Same thing. Requires "use English;")
print ${^MATCH} if /fo+/p; # Ok (Requires Perl 5.10)
print $1 if /(fo+)/; # Ok
defined &foo is a perfectly legitimate way of checking if a subroutine exists, but it's not something you'll likely ever need. There's also exists &foo, which is similar but not as useful.
EXPR & EXPR is the bitwise AND operator. This is used when dealing with low-level systems that store multiple pieces of information in a single word.
system($cmd);
die "Can't execute command: $!\n" if $? == -1;
die "Child killed by signal ".($? & 0x7F)."\n" if $? & 0x7F;
die "Child exited with ".($? >> 8)."\n" if $? >> 8;
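The same bit layout can be wrapped in a small function that is easy to try with made-up status words. describe_status is a hypothetical name; the masks and shifts are the ones used above.

```perl
use strict;
use warnings;

# Takes a wait status as found in $? and describes it.
sub describe_status {
    my ($status) = @_;
    return "failed to execute" if $status == -1;
    return "killed by signal " . ($status & 0x7F) if $status & 0x7F;  # low 7 bits
    return "exited with " . ($status >> 8);                           # high byte
}

print describe_status(0),   "\n";   # exited with 0
print describe_status(256), "\n";   # exited with 1
print describe_status(9),   "\n";   # killed by signal 9
```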
&{ EXPR }() (and &$ref()) is a subroutine call via a reference. This is a perfectly acceptable and somewhat common thing to do, though I prefer the $ref->() syntax. Example in next item.
\&foo takes a reference to subroutine foo. This is a perfectly acceptable and somewhat common thing to do.
my %dispatch = (
foo => \&foo,
bar => \&bar,
);
my $handler = $dispatch{$cmd} or die;
$handler->();
# Same: &{ $handler }();
# Same: &$handler();
EXPR && EXPR is the boolean AND operator. I'm sure you're familiar with this extremely common operator.
if (0 <= $x && $x <= 100) { ... }
In older versions of perl, & was used to call subroutines. This is no longer necessary, and \& is mostly used to take a reference to a subroutine,
my $sub_ref = \&subroutine;
or & is used to ignore a function's prototype (http://perldoc.perl.org/perlsub.html#Prototypes).
Other than in connection with subroutines, & is the bitwise AND operator:
http://perldoc.perl.org/perlop.html#Bitwise-And

Difference between a BLOCK and a function in terms of scoping in Perl

Guys I'm a little bit confused, I was playing with scoping in Perl, when i encountered this one:
#!/usr/bin/perl
use warnings;
use strict;
sub nested {
our $x = "nested!";
}
print $x; # Error "Variable "$x" is not imported at nested line 10."
print our $x; # Doesn't print "nested!"
print our($x) # Doesn't print "nested!"
But when i do this:
{
our $x = "nested";
}
print our($x); # Prints "nested"
print our $x; # Prints "nested"
print $x; # Prints "nested"
So guys, can you explain to me why the second version works and the first doesn't?
To restate DVK's answer, our is just a handy aliasing tool. Every variable you use in these examples is actually named $main::x. Within any lexical scope you can use our to make an alias to that variable, with a shortened name, in that same scope; the variable doesn't reset or get removed outside, only the alias. This is unlike the my keyword which makes a new variable bound to that lexical scope.
To explain why the block example works the way it does, let's look at the explanation of our from the "Modern Perl" book, chapter 5
Our Scope
Within given scope, declare an alias to a package variable with the our builtin.
The fully-qualified name is available everywhere, but the lexical alias is visible only within its scope.
This explains why the first two prints of your second example work (our re-declares the alias in the scope where print runs). A bare print $x on its own would not compile under strict, because the our inside the block only aliases $x within that block; in your listing the third print works only because the preceding print our $x already declared the alias for the rest of the file. Please note that printing $main::x will work correctly - it's only the alias that is scoped to the block, not the package variable itself.
As for the function:
print our $x; and print our($x) "don't work" - namely, correctly claim the value is uninitialized - since you never called the function which would initialize the variable. Observe the difference:
c:\>perl -e "use strict; use warnings; sub x { our $x = 1;} print our $x"
Use of uninitialized value $x in print at -e line 1.
c:\>perl -e "use strict; use warnings; sub x { our $x = 1;} x(); print our $x"
1
print $x; won't work for the same reason as with the block - our only scopes the alias to the block (i.e. in this case body of the sub) therefore you MUST either re-alias it in the main block's scope (as per print our $x example), OR use fully qualified package global outside the sub, in which case it will behave as expected:
c:\>perl -e "use strict; use warnings; sub x { our $x = 1;} print $main::x"
Use of uninitialized value $x in print at -e line 1.
c:\>perl -e "sub x { our $x = 1;} x(); print $main::x"
1
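Put together as a script, the two ways of reaching the variable from outside the sub look like this. It is a sketch based on the asker's nested sub, with the call added so that the assignment actually runs.

```perl
use strict;
use warnings;

sub nested {
    our $x = "nested!";       # alias to $main::x, visible only inside this sub
    return;
}

nested();                     # the assignment happens only when the sub runs

print $main::x, "\n";         # fully qualified name works anywhere: nested!

{
    our $x;                   # re-declare the alias in this scope
    print $x, "\n";           # same package variable: nested!
}
```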