Subroutine giving two different outputs when called twice [duplicate] - perl

I ran the following simple script with a nested subroutine, and its output puzzles me.
#!/usr/bin/perl
use strict;
use warnings;
sub outer {
my $a = "123";
sub inner {
print "$a\n";
}
inner();
$a = "456";
}
outer();
outer();
Output to this is
Variable "$a" will not stay shared at E:\Perl\source\public\sss.pl line 9.
123
456
But how is this possible?
I call the inner subroutine when the value of $a is 123, so why do I get 456 when outer is called the second time?

The diagnostics pragma gives a fairly self-explanatory description of the warning Variable "$a" will not stay shared:
use strict;
use warnings;
use diagnostics;
sub outer {
my $a = "123";
sub inner {
print "$a\n";
}
inner();
$a = "456";
}
outer();
outer();
output
Variable "$a" will not stay shared at -e line 9 (#1)
(W closure) An inner (nested) named subroutine is referencing a
lexical variable defined in an outer named subroutine.
When the inner subroutine is called, it will see the value of
the outer subroutine's variable as it was before and during the *first*
call to the outer subroutine; in this case, after the first call to the
outer subroutine is complete, the inner and outer subroutines will no
longer share a common value for the variable. In other words, the
variable will no longer be shared.
This problem can usually be solved by making the inner subroutine
anonymous, using the sub {} syntax. When inner anonymous subs that
reference variables in outer subroutines are created, they
are automatically rebound to the current values of such variables.
123
456
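The fix suggested by the diagnostic can be sketched as follows. This is only an illustration: the variable is renamed to $value (to stay clear of the special sort variable $a), and the inner sub returns the value instead of printing it so the result is easy to inspect.

```perl
#!/usr/bin/perl
use strict;
use warnings;

sub outer {
    my $value = "123";

    # An anonymous sub is created each time this line runs,
    # so it captures the $value that belongs to the current call.
    my $inner = sub { return $value };

    my $seen = $inner->();
    $value = "456";
    return $seen;
}

print outer(), "\n";
print outer(), "\n";
```

With the anonymous sub, both calls print 123 and no "will not stay shared" warning is issued.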

There is no point in declaring a subroutine within another one. It works as if it were declared at the top level, and won't function properly as a closure.
If you enable lexical subroutines (and disable the corresponding "experimental" warning) and declare inner as my sub inner, then your code will work as you expect:
#!/usr/bin/perl
use strict;
use warnings 'all';
use feature 'lexical_subs';
no warnings 'experimental::lexical_subs';
sub outer {
my $a = "123";
my sub inner {
print "$a\n";
}
inner();
$a = "456";
}
outer();
outer();
output
123
123
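On Perl 5.26 and later, lexical subs are no longer experimental, so the feature and no warnings lines should not be needed. A sketch under that assumption (the variable is renamed to $value and the inner sub returns rather than prints, purely for illustration):

```perl
#!/usr/bin/perl
use v5.26;      # assumption: Perl 5.26 or newer is available
use warnings;

sub outer {
    my $value = "123";

    # A lexical sub is scoped like a "my" variable and closes over
    # the $value of the current call to outer.
    my sub inner { return $value }

    my $seen = inner();
    $value = "456";
    return $seen;
}

print outer(), "\n";
print outer(), "\n";
```

As with the version above, both calls print 123.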

Related

Can someone explain why Perl behaves this way (variable scoping)?

My test goes like this:
use strict;
use warnings;
func();
my $string = 'string';
func();
sub func {
print $string, "\n";
}
And the result is:
Use of uninitialized value $string in print at test.pl line 10.
string
Perl allows us to call a function before it has been defined. However when the function uses a variable declared only after the function call, the variable appears to be undefined. Is this behavior documented somewhere? Thank you!
The behaviour of my is documented in perlsub - it boils down to this - perl knows $string is in scope - because the my tells it so.
The my operator declares the listed variables to be lexically confined to the enclosing block, conditional (if/unless/elsif/else), loop (for/foreach/while/until/continue), subroutine, eval, or do/require/use'd file.
It means it's 'in scope' from the point at which it's first 'seen' until the closing bracket of the current 'block'. (Or in your example - the end of the code)
However - in your example my also assigns a value.
This scoping process happens at compile time, where perl checks whether it's valid to use $string or not (thanks to strict). However, it can't know what the value will be, because that might change during code execution (and is non-trivial to analyze).
So if you do this it might be a little clearer what's going on:
#!/usr/bin/env perl
use strict;
use warnings;
my $string; #undefined
func();
$string = 'string';
func();
sub func {
print $string, "\n";
}
$string is in scope in both cases - because the my happened at compile time - before the subroutine has been called - but it doesn't have a value set beyond the default of undef prior to the first invocation.
Note this contrasts with:
#!/usr/bin/env perl
use strict;
use warnings;
sub func {
print $string, "\n";
}
my $string; #undefined
func();
$string = 'string';
func();
Which errors because when the sub is declared, $string isn't in scope.
First of all, I would consider this undefined behaviour since it skips executing my like my $x if $cond; does.
That said, the behaviour is currently consistent and predictable. And in this instance, it behaves exactly as it would if the optimization that warranted the undefined behaviour notice didn't exist.
At compile-time, my has the effect of declaring and allocating the variable[1]. Scalars are initialized to undef when created. Arrays and hashes are created empty.
my $string was encountered by the compiler, so the variable was created. But since you haven't executed the assignment yet, it still has its default value (undefined) during the first call to func.
This model allows variables to be captured by closures.
Example 1:
{
my $x = "abc";
sub foo { $x } # Named subs capture at compile-time.
}
say foo(); # abc, even though $x fell out of scope before foo was called.
Example 2:
sub make_closure {
my ($x) = @_;
return sub { $x }; # Anon subs capture at run-time.
}
my $foo = make_closure("foo");
my $bar = make_closure("bar");
say $foo->(); # foo
say $bar->(); # bar
[1] The allocation is possibly deferred until the variable is actually used.

Issues with function calls for search routine

My aim is to have multiple searches of specific files recursively. So I have these files:
/dir/here/tmp1/recursive/foo2013.log
/dir/here/tmp1/recursive/foo2014.log
/dir/here/tmp2/recursive/foo2013.log
/dir/here/tmp2/recursive/foo2014.log
where 2013 and 2014 indicate the year in which each file was last modified.
I then want to find the more up to date files (foo2014.log) for each directory tree (tmp1 and tmp2 likewise).
Referring to this answer I have the following code in script.pl:
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
func("tmp1");
print "===\n";
func("tmp2");
sub func{
my $varName = shift;
my %times;
find(\&upToDateFiles, "/dir/here");
for my $dir (keys %times) {
if ($times{$dir}{file} =~ m{$varName}){
print $times{$dir}{file}, "\n";
# do stuff here
}
}
sub upToDateFiles {
return unless (-f && /^foo/);
my $mod = -M $_;
if (!defined($times{$File::Find::dir})
or $mod < $times{$File::Find::dir}{mod})
{
$times{$File::Find::dir}{mod} = $mod;
$times{$File::Find::dir}{file} = $File::Find::name;
}
}
}
which will give me this output:
Variable "%times" will not stay shared at ./script.pl line 25.
/dir/here/tmp1/recursive/foo2014.log
===
I have three questions:
Why isn't the second call of the function func working like the first one? Variables are just defined in the scope of the function so why am I getting interferences?
Why do I get the notification for variable %times and how can I get rid of it?
If I define the function upToDateFiles outside of func I am getting this error: Execution of ./script.pl aborted due to compilation errors. I think this is because the variables aren't defined outside of func. Is it possible to change this and still get the desired output?
For starters - embedding a sub within another sub is rather nasty. If you use diagnostics; you'll get:
(W closure) An inner (nested) named subroutine is referencing a
lexical variable defined in an outer named subroutine.
When the inner subroutine is called, it will see the value of
the outer subroutine's variable as it was before and during the *first*
call to the outer subroutine; in this case, after the first call to the
outer subroutine is complete, the inner and outer subroutines will no
longer share a common value for the variable. In other words, the
variable will no longer be shared.
This problem can usually be solved by making the inner subroutine
anonymous, using the sub {} syntax. When inner anonymous subs that
reference variables in outer subroutines are created, they
are automatically rebound to the current values of such variables.
Which is directly relevant to the problem you're having. Try to avoid nesting your subs, and you won't have this problem. It certainly looks like you're trying to be far more complicated than you need to. Have you considered something like:
#!/usr/bin/perl
use strict;
use warnings;
use diagnostics;
use File::Find;
my %filenames;
sub compare_tree {
return unless -f && m/^foo/;
my $mtime = -M $File::Find::name;
if ( !$filenames{$_} || $mtime < $filenames{$_}{mtime} ) {
$filenames{$_} = {
newest => $File::Find::name,
mtime => $mtime,
};
}
}
find( \&compare_tree, "/dir/here" );
foreach my $filename ( keys %filenames ) {
print "$filename has newest version path of:", $filenames{$filename}{newest}, "\n";
print "$filename has newest mtime of:", $filenames{$filename}{mtime}, "\n";
}
I'd also note - you seem to be using $File::Find::dir - this looks wrong to me, based on what you describe you're doing. Likewise - you're running find twice on the same directory structure, which is not a very efficient approach - very big finds are expensive operations, so doubling the work needed isn't good.
Edit: I was caught out by forgetting what -M returns: script start time minus file modification time, in days. So newer files have the lower number, not the higher one. (I have amended the code above accordingly.)
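If a per-call hash like the one in the question is preferred over a file-level hash, the anonymous-sub approach from the diagnostic also works with File::Find. The following is a rough sketch, not the original poster's code: newest_per_dir is a hypothetical helper name, and the directory tree is recreated in a temporary directory (with artificially set modification times) so the example can run anywhere.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical helper: newest foo* file per directory under $root,
# restricted to paths that contain the directory name $tree_name.
sub newest_per_dir {
    my ($root, $tree_name) = @_;
    my %times;    # a fresh hash on every call

    # Created at run time, so it captures this call's %times.
    my $wanted = sub {
        return unless -f $_ && /^foo/;
        my $age = -M $_;    # days since modification; smaller means newer
        my $dir = $File::Find::dir;
        if (!exists $times{$dir} or $age < $times{$dir}{age}) {
            $times{$dir} = { age => $age, file => $File::Find::name };
        }
    };
    find($wanted, $root);

    my @files = grep { m{/\Q$tree_name\E/} } map { $_->{file} } values %times;
    return sort @files;
}

# Build a throwaway copy of the layout described in the question.
my $root = tempdir(CLEANUP => 1);
my $now  = time;
for my $tree (qw(tmp1 tmp2)) {
    my $dir = "$root/$tree/recursive";
    make_path($dir);
    for my $year (2013, 2014) {
        my $file = "$dir/foo$year.log";
        open my $fh, '>', $file or die "Cannot create $file: $!";
        close $fh;
        # Make the 2013 file one day older than the 2014 file.
        my $mtime = $year == 2013 ? $now - 86_400 : $now;
        utime $mtime, $mtime, $file or die "Cannot set mtime on $file: $!";
    }
}

print "$_\n" for newest_per_dir($root, 'tmp1');
print "===\n";
print "$_\n" for newest_per_dir($root, 'tmp2');
```

Because $wanted is rebuilt on every call, the second call sees its own empty %times and reports the foo2014.log file under tmp2, just as the first call does for tmp1. Note that this still walks the tree once per call, so the efficiency remark above applies.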

Difference between a BLOCK and a function in terms of scoping in Perl

I'm a little bit confused. I was playing with scoping in Perl when I encountered this:
#! usr/bin/perl
use warnings;
use strict;
sub nested {
our $x = "nested!";
}
print $x; # Error "Variable "$x" is not imported at nested line 10."
print our $x; # Doesn't print "nested!"
print our($x) # Doesn't print "nested!"
But when i do this:
{
our $x = "nested";
}
print our($x); # Prints "nested"
print our $x; # Prints "nested"
print $x; # Prints "nested"
Can you explain why the second version works and the first does not?
To restate DVK's answer, our is just a handy aliasing tool. Every variable you use in these examples is actually named $main::x. Within any lexical scope you can use our to make an alias to that variable, with a shortened name, in that same scope; the variable doesn't reset or get removed outside, only the alias. This is unlike the my keyword which makes a new variable bound to that lexical scope.
To explain why the block example works the way it does, let's look at the explanation of our in the "Modern Perl" book, chapter 5:
Our Scope
Within given scope, declare an alias to a package variable with the our builtin.
The fully-qualified name is available everywhere, but the lexical alias is visible only within its scope.
This explains why the first two prints of your second example work (our is re-declared in print's scope), whereas the third one does not (as our only aliases $x to the package variable within the block's scope). Please note that printing $main::x will work correctly - it's only the alias that is scoped to the block, not the package variable itself.
As far as with the function:
print our $x; and print our($x) "don't work" - namely, correctly claim the value is uninitialized - since you never called the function which would initialize the variable. Observe the difference:
c:\>perl -e "use strict; use warnings; sub x { our $x = 1;} print our $x"
Use of uninitialized value $x in print at -e line 1.
c:\>perl -e "use strict; use warnings; sub x { our $x = 1;} x(); print our $x"
1
print $x; won't work for the same reason as with the block - our only scopes the alias to the block (i.e. in this case body of the sub) therefore you MUST either re-alias it in the main block's scope (as per print our $x example), OR use fully qualified package global outside the sub, in which case it will behave as expected:
c:\>perl -e "use strict; use warnings; sub x { our $x = 1;} print $main::x"
Use of uninitialized value $x in print at -e line 1.
c:\>perl -e "sub x { our $x = 1;} x(); print $main::x"
1
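The aliasing described above can be sketched in a small script. The helper names set_x and read_x are made up for this illustration; each sub declares its own our alias, and both aliases refer to the same package variable $main::x.

```perl
#!/usr/bin/perl
use strict;
use warnings;

sub set_x {
    our $x = "nested!";    # alias for $main::x, visible only inside set_x
    return;
}

sub read_x {
    our $x;                # a second alias for the same $main::x
    return $x;
}

set_x();                   # the assignment only happens when the sub runs
print read_x(), "\n";      # read through the alias declared in read_x
print $main::x, "\n";      # read through the fully qualified name
```

Both print statements output nested!, since the alias is lexically scoped but the package variable behind it is global.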

Shared variables in the context of subroutines vs anonymous subroutines

I saw this bit of code in an answer to another post: Why would I use Perl anonymous subroutines instead of a named one?, but couldn't figure out exactly what was going on, so I wanted to run it myself.
sub outer
{
my $a = 123;
sub inner
{
print $a, "\n"; #line 15 (for your reference, all other comments are the OP's)
}
# At this point, $a is 123, so this call should always print 123, right?
inner();
$a = 456;
}
outer(); # prints 123
outer(); # prints 456! Surprise!
In the above example, I received a warning: "Variable $a will not stay shared at line 15."
Obviously, this is why the output is "unexpected," but I still don't really understand what's happening here.
sub outer2
{
my $a = 123;
my $inner = sub
{
print $a, "\n";
};
# At this point, $a is 123, and since the anonymous subroutine
# whose reference is stored in $inner closes over $a in the
# "expected" way...
$inner->();
$a = 456;
}
# ...we see the "expected" results
outer2(); # prints 123
outer2(); # prints 123
In the same vein, I don't understand what's happening in this example either. Could someone please explain?
Thanks in advance.
It has to do with compile-time vs. run-time parsing of subroutines. As the diagnostics message says,
When the inner subroutine is called, it will see the value of
the outer subroutine's variable as it was before and during the first
call to the outer subroutine; in this case, after the first call to the
outer subroutine is complete, the inner and outer subroutines will no
longer share a common value for the variable. In other words, the
variable will no longer be shared.
Annotating your code:
sub outer
{
# 'my' will reallocate memory for the scalar variable $a
# every time the 'outer' function is called. That is, the address of
# '$a' will be different in the second call to 'outer' than the first call.
my $a = 123;
# the construction 'sub NAME BLOCK' defines a subroutine once,
# at compile-time.
sub inner1
{
# since this subroutine is only getting compiled once, the '$a' below
# refers to the '$a' that is allocated the first time 'outer' is called
print "inner1: ",$a, "\t", \$a, "\n";
}
# the construction sub BLOCK defines an anonymous subroutine, at run time
# '$inner2' is redefined in every call to 'outer'
my $inner2 = sub {
# this '$a' now refers to '$a' from the current call to outer
print "inner2: ", $a, "\t", \$a, "\n";
};
# At this point, $a is 123, so this call should always print 123, right?
inner1();
$inner2->();
# if this is the first call to 'outer', the definition of 'inner1' still
# holds a reference to this instance of the variable '$a', and this
# variable's memory will not be freed when the subroutine ends.
$a = 456;
}
outer();
outer();
Typical output:
inner1: 123 SCALAR(0x80071f50)
inner2: 123 SCALAR(0x80071f50)
inner1: 456 SCALAR(0x80071f50)
inner2: 123 SCALAR(0x8002bcc8)
You can print \&inner in the first example (after its definition) and print $inner in the second.
What you see are code references shown with their hexadecimal addresses: the address is the same on every call in the first example, but may differ between calls in the second.
So in the first example inner is created only once, and it always closes over the lexical $a from the first call of outer().
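A small sketch of that comparison, assuming the code references are returned from outer and kept alive so that their addresses cannot be reused between calls (the closure warning is silenced on purpose, since the nested named sub is the point of the demonstration):

```perl
#!/usr/bin/perl
use strict;
use warnings;
no warnings 'closure';    # silence "will not stay shared" for this demo

sub outer {
    my $value = 123;
    sub inner { return $value }          # compiled once
    my $anon = sub { return $value };    # a new closure on every call
    return (\&inner, $anon);
}

my ($named1, $anon1) = outer();
my ($named2, $anon2) = outer();

# Comparing references numerically compares their addresses.
print "named subs identical: ",     ($named1 == $named2 ? "yes" : "no"), "\n";
print "anonymous subs identical: ", ($anon1  == $anon2  ? "yes" : "no"), "\n";
```

This prints yes for the named sub and no for the anonymous ones, because each call to outer produces a distinct closure while inner exists only once.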

Perl - What scopes/closures/environments are producing this behaviour?

Given a root directory, I wish to identify the shallowest parent directory of any .svn directory and of any pom.xml.
To achieve this I defined the following function
use File::Find;
sub firstDirWithFileUnder {
$needle=@_[0];
my $result = 0;
sub wanted {
print "\twanted->result is '$result'\n";
my $dir = "${File::Find::dir}";
if ($_ eq $needle and ((not $result) or length($dir) < length($result))) {
$result=$dir;
print "Setting result: '$result'\n";
}
}
find(\&wanted, @_[1]);
print "Result: '$result'\n";
return $result;
}
..and call it thus:
$svnDir = firstDirWithFileUnder(".svn",$projPath);
print "\tIdentified svn dir:\n\t'$svnDir'\n";
$pomDir = firstDirWithFileUnder("pom.xml",$projPath);
print "\tIdentified pom.xml dir:\n\t'$pomDir'\n";
There are two situations which arise that I cannot explain:
When the search for a .svn is successful, the value of $result perceived inside the nested subroutine wanted persists into the next call of firstDirWithFileUnder. So when the pom search begins, although the line my $result = 0; still exists, the wanted subroutine sees its value as the return value from the last firstDirWithFileUnder call.
If the my $result = 0; line is commented out, then the function still executes properly. This means a) outer scope (firstDirWithFileUnder) can still see the $result variable to be able to return it, and b) print shows that wanted still sees $result value from last time, i.e. it seems to have formed a closure that's persisted beyond the first call of firstDirWithFileUnder.
Can somebody explain what's happening, and suggest how I can properly reset the value of $result to zero upon entering the outer scope?
Using warnings and then diagnostics yields this helpful information, including a solution:
Variable "$needle" will not stay shared at ----- line 12 (#1)
(W closure) An inner (nested) named subroutine is referencing a
lexical variable defined in an outer named subroutine.
When the inner subroutine is called, it will see the value of
the outer subroutine's variable as it was before and during the first
call to the outer subroutine; in this case, after the first call to the
outer subroutine is complete, the inner and outer subroutines will no
longer share a common value for the variable. In other words, the
variable will no longer be shared.
This problem can usually be solved by making the inner subroutine
anonymous, using the sub {} syntax. When inner anonymous subs that
reference variables in outer subroutines are created, they
are automatically rebound to the current values of such variables.
$result is lexically scoped, meaning a brand new variable is allocated every time you call &firstDirWithFileUnder.
sub wanted { ... } is a compile-time subroutine declaration, meaning it is compiled by the Perl interpreter one time and stored in your package's symbol table. Since it contains a reference to the lexically scoped $result variable, the subroutine definition that Perl saves will only refer to the first instance of $result. The second time you call &firstDirWithFileUnder and declare a new $result variable, this will be a completely different variable than the $result inside &wanted.
You'll want to change your sub wanted { ... } declaration to a lexically scoped, anonymous sub:
my $wanted = sub {
print "\twanted->result is '$result'\n";
...
};
and invoke File::Find::find as
find($wanted, $_[1])
Here, $wanted is a run-time declaration for a subroutine, and it gets redefined with the current reference to $result in every separate call to &firstDirWithFileUnder.
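Put together, a complete version might look like the sketch below. It is an illustration rather than the original function: the name first_dir_with_file_under, the use of undef instead of 0 as the "not found" value, and the temporary directory layout are all assumptions made so the example is self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

sub first_dir_with_file_under {
    my ($needle, $root) = @_;
    my $result;    # a new variable on every call

    # Anonymous sub: rebuilt on each call, so it sees this call's $result.
    my $wanted = sub {
        return unless $_ eq $needle;
        my $dir = $File::Find::dir;
        $result = $dir
            if !defined $result or length($dir) < length($result);
    };
    find($wanted, $root);
    return $result;
}

# Throwaway layout: .svn at two depths, pom.xml only in a deeper directory.
my $root = tempdir(CLEANUP => 1);
make_path("$root/proj/.svn", "$root/proj/sub/.svn", "$root/proj/sub/module");
open my $fh, '>', "$root/proj/sub/module/pom.xml" or die "Cannot create pom.xml: $!";
close $fh;

my $svn_dir = first_dir_with_file_under(".svn",    $root);
my $pom_dir = first_dir_with_file_under("pom.xml", $root);
print "svn dir: $svn_dir\n";
print "pom dir: $pom_dir\n";
```

The second search starts with its own undefined $result, so the pom.xml lookup is no longer affected by whatever the .svn lookup found.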
Update: This code snippet may prove instructive:
sub foo {
my $foo = 0; # lexical variable
$bar = 0; # global variable
sub compiletime {
print "compile foo is ", ++$foo, " ", \$foo, "\n";
print "compile bar is ", ++$bar, " ", \$bar, "\n";
}
my $runtime = sub {
print "runtime foo is ", ++$foo, " ", \$foo, "\n";
print "runtime bar is ", ++$bar, " ", \$bar, "\n";
};
&compiletime;
&$runtime;
print "----------------\n";
push @baz, \$foo; # explained below
}
&foo for 1..3;
Typical output:
compile foo is 1 SCALAR(0xac18c0)
compile bar is 1 SCALAR(0xac1938)
runtime foo is 2 SCALAR(0xac18c0)
runtime bar is 2 SCALAR(0xac1938)
----------------
compile foo is 3 SCALAR(0xac18c0)
compile bar is 1 SCALAR(0xac1938)
runtime foo is 1 SCALAR(0xa63d18)
runtime bar is 2 SCALAR(0xac1938)
----------------
compile foo is 4 SCALAR(0xac18c0)
compile bar is 1 SCALAR(0xac1938)
runtime foo is 1 SCALAR(0xac1db8)
runtime bar is 2 SCALAR(0xac1938)
----------------
Note that the compile time $foo always refers to the same variable SCALAR(0xac18c0), and that this is also the run time $foo THE FIRST TIME the function is run.
The last line of &foo, push @baz, \$foo, is included in this example so that $foo doesn't get garbage collected at the end of &foo. Otherwise, the 2nd and 3rd runtime $foo might point to the same address, even though they refer to different variables (the memory is reallocated each time the variable is declared).