Can someone explain why Perl behaves this way (variable scoping)? - perl

My test goes like this:
use strict;
use warnings;
func();
my $string = 'string';
func();
sub func {
print $string, "\n";
}
And the result is:
Use of uninitialized value $string in print at test.pl line 10.
string
Perl allows us to call a function before it has been defined. However when the function uses a variable declared only after the function call, the variable appears to be undefined. Is this behavior documented somewhere? Thank you!

The behaviour of my is documented in perlsub - it boils down to this - perl knows $string is in scope - because the my tells it so.
The my operator declares the listed variables to be lexically confined to the enclosing block, conditional (if/unless/elsif/else), loop (for/foreach/while/until/continue), subroutine, eval, or do/require/use'd file.
It means it's 'in scope' from the point at which it's first 'seen' until the closing bracket of the current 'block'. (Or in your example - the end of the code)
However - in your example my also assigns a value.
This scoping process happens at compile time - where perl checks where it's valid to use $string or not. (Thanks to strict). However - it can't know what the value was, because that might change during code execution. (and is non-trivial to analyze)
So if you do this it might be a little clearer what's going on:
#!/usr/bin/env perl
use strict;
use warnings;
my $string; #undefined
func();
$string = 'string';
func();
sub func {
print $string, "\n";
}
$string is in scope in both cases - because the my happened at compile time - before the subroutine has been called - but it doesn't have a value set beyond the default of undef prior to the first invocation.
Note this contrasts with:
#!/usr/bin/env perl
use strict;
use warnings;
sub func {
print $string, "\n";
}
my $string; #undefined
func();
$string = 'string';
func();
Which errors because when the sub is declared, $string isn't in scope.

First of all, I would consider this undefined behaviour since it skips executing my like my $x if $cond; does.
That said, the behaviour is currently consistent and predictable. And in this instance, it behaves exactly as expected if the optimization that warranted the undefined behaviour notice didn't exit.
At compile-time, my has the effect of declaring and allocating the variable[1]. Scalars are initialized to undef when created. Arrays and hashes are created empty.
my $string was encountered by the compiler, so the variable was created. But since you haven't executed the assignment yet, it still has its default value (undefined) during the first call to func.
This model allows variables to be captured by closures.
Example 1:
{
my $x = "abc";
sub foo { $x } # Named subs capture at compile-time.
}
say foo(); # abc, even though $x fell out of scope before foo was called.
Example 2:
sub make_closure {
my ($x) = #_;
return sub { $x }; # Anon subs capture at run-time.
}
my $foo = make_closure("foo");
my $bar = make_closure("bar");
say $foo->(); # foo
say $bar->(); # bar
The allocation is possibly deferred until the variable is actually used.

Related

Unexpected autovivification of arguments

Apparently my understanding of the no autovivification pragma is imperfect, as the not-dying-on-line-19 behaviour of the following script is extremely surprising to me.
use 5.014;
use strict;
use warnings;
no autovivification qw(fetch exists delete warn);
{
my $foo = undef;
my $thing = $foo->{bar};
# this does not die, as expected
die if defined $foo;
}
{
my $foo = undef;
do_nothing( $foo->{bar} );
# I would expect this to die, but it doesn't
die unless defined $foo;
}
sub do_nothing {
return undef;
}
Running the script produces:
Reference was vivified at test.pl line 8.
The question: why is $foo autovivified when $foo->{bar} is supplied as an argument to a sub, even though no autovivification is in effect?
In a subroutine call the arguments to a function are aliased in #_, so it must be possible to modify them. This provides an lvalue context, what will trigger autovivification.
When we look through descriptions of features you use in autovivification, they cover:
'fetch' -- "rvalue dereferencing expressions"
'exists' -- "dereferencing expressions that are parts of an exists"
'delete' -- "dereferencing expressions that are parts of a delete"
None of these deal with lvalues (neither does warn).
To stop autovivification in subroutine calls as well you need to add store
Turns off autovivification for lvalue dereferencing expressions, such as : [...]
where docs proceed with examples, including subroutine calls.
When I add it to your code,
no autovivification qw(fetch exists delete warn store);
# ...
I get
Reference was vivified at noautoviv.pl line 8.
Reference was vivified at noautoviv.pl line 16.
Died at noautoviv.pl line 19.

How do I pass in a variable from one function into another in perl

I am initializing a variable within one function and would like to pass this variable into another function. This variable holds a char value.
I have tried passing in the referencing and dereferencing, declaring the variables outside of the function, and using local.
I've also looked in perlmonks, perl by example, googled and looked through this site for a solution but to no avail. I'm just starting out with perl programming so any help will be appreciated!
Sounds to me like you need to read through some documentation, not just google around. I would suggest http://www.perl.org/books/beginning-perl/.
use strict;
use warnings;
sub foo {
my $char = 'A';
bar($char);
}
sub bar {
my ($bar_char) = #_;
print "bar got char $bar_char\n";
}
foo();
If you pass a parameter by reference (see below), it can be modified by the first function and you can then pass it to another function:
#!/usr/bin/perl
sub f {
$c = shift;
$$c='m';
}
$c='a';
f(\$c);
print $c;
This will print 'm'
Is there a reason who your first function cannot return this variable?
my $config_variable = function1( $param1 );
function2 ( $config_variable, $param2 );
You can also pass more than one variable back too:
my ( $config_variable, $value ) = function1( $param1 );
my $value2 = function2( $param1, $config_variable );
This would be the best way. However, you can use globally defined variables and they can be used from function to function:
#! /usr/bin/env perl
#
use strict;
use warnings;
my $value;
func1();
func2();
sub func1 {
$value = "foo";
}
sub func2 {
print "Value = $value\n";
}
Note that I declared $value outside of both functions, so it's global in the entire file - even in the subroutines. Now, func1 can set it, and func1 can print it.
The technical term for this is: A terrible, awful, evil idea and you should never, ever1 think of doing it.
This is because a particular variable you think is set to one value suddenly and mysteriously changes values without any reason. Do this for one variable is bad enough, but if you use this as a crutch, you'll end up with dozens of variables that are impossible to track through your program.
If you find yourself doing this quite a bit, you may need to rethink your code logic.

2 Sub references as arguments in perl

I have perl function I dont what does it do?
my what does min in perl?
#ARVG what does mean?
sub getArgs
{
my $argCnt=0;
my %argH;
for my $arg (#ARGV)
{
if ($arg =~ /^-/) # insert this entry and the next in the hash table
{
$argH{$ARGV[$argCnt]} = $ARGV[$argCnt+1];
}
$argCnt++;
}
return %argH;}
Code like that makes David sad...
Here's a reformatted version of the code doing the indentations correctly. That makes it so much easier to read. I can easily tell where my if and loops start and end:
sub getArgs {
my $argCnt = 0;
my %argH;
for my $arg ( #ARGV ) {
if ( $arg =~ /^-/ ) { # insert this entry and the next in the hash table
$argH{ $ARGV[$argCnt] } = $ARGV[$argCnt+1];
}
$argCnt++;
}
return %argH;
}
The #ARGV is what is passed to the program. It is an array of all the arguments passed. For example, I have a program foo.pl, and I call it like this:
foo.pl one two three four five
In this case, $ARGV is set to the list of values ("one", "two", "three", "four", "five"). The name comes from a similar variable found in the C programming language.
The author is attempting to parse these arguments. For example:
foo.pl -this that -the other
would result in:
$arg{"-this"} = "that";
$arg{"-the"} = "other";
I don't see min. Do you mean my?
This is a wee bit of a complex discussion which would normally involve package variables vs. lexically scoped variables, and how Perl stores variables. To make things easier, I'm going to give you a sort-of incorrect, but technically wrong answer: If you use the (strict) pragma, and you should, you have to declare your variables with my before they can be used. For example, here's a simple two line program that's wrong. Can you see the error?
$name = "Bob";
print "Hello $Name, how are you?\n";
Note that when I set $name to "Bob", $name is with a lowercase n. But, I used $Name (upper case N) in my print statement. As it stands, now. Perl will print out "Hello, how are you?" without a care that I've used the wrong variable name. If it's hard to spot an error like this in a two line program, imagine what it would be like in a 1000 line program.
By using strict and forcing me to declare variables with my, Perl can catch that error:
use strict;
use warnings; # Another Pragma that should always be used
my $name = "Bob";
print "Hello $Name, how are you doing\n";
Now, when I run the program, I get the following error:
Global symbol "$Name" requires explicit package name at (line # of print statement)
This means that $Name isn't defined, and Perl points to where that error is.
When you define variables like this, they are in scope with in the block where it's defined. A block could be the code contained in a set of curly braces or a while, if, or for statement. If you define a variable with my outside of these, it's defined to the end of the file.
Thus, by using my, the variables are only defined inside this subroutine. And, the $arg variable is only defined in the for loop.
One more thing:
The person who wrote this should have used the Getopt::Long module. There's a major bug in their code:
For example:
foo.pl -this that -one -two
In this case, my hash looks like this:
$args{'-this'} = "that";
$args{'-one'} = "-two";
$args{'-two'} = undef;
If I did this:
if ( defined $args{'-two'} ) {
...
}
I would not execute the if statement.
Also:
foo.pl -this=that -one -two
would also fail.
#ARGV is a special variable (refer to perldoc perlvar):
#ARGV
The array #ARGV contains the command-line arguments intended for the
script. $#ARGV is generally the number of arguments minus one, because
$ARGV[0] is the first argument, not the program's command name itself.
See $0 for the command name.
Perl documentation is also available from your command line:
perldoc -v #ARGV

Difference between a BLOCK and a function in terms of scoping in Perl

Guys I'm a little bit confused, I was playing with scoping in Perl, when i encountered this one:
#! usr/bin/perl
use warnings;
use strict;
sub nested {
our $x = "nested!";
}
print $x; # Error "Variable "$x" is not imported at nested line 10."
print our $x; # Doesn't print "nested!"
print our($x) # Doesn't print "nested!"
But when i do this:
{
our $x = "nested";
}
print our($x); # Prints "nested"
print our $x; # Prints "nested"
print $x; # Prints "nested"
So guys can you explain to me why those works and not?
To restate DVK's answer, our is just a handy aliasing tool. Every variable you use in these examples is actually named $main::x. Within any lexical scope you can use our to make an alias to that variable, with a shortened name, in that same scope; the variable doesn't reset or get removed outside, only the alias. This is unlike the my keyword which makes a new variable bound to that lexical scope.
To explain why the block example works the way it does, let's look at our explanation from "Modern Perl" book, chapter 5
Our Scope
Within given scope, declare an alias to a package variable with the our builtin.
The fully-qualified name is available everywhere, but the lexical alias is visible only within its scope.
This explains why the first two prints of your second example work (our is re-declared in print's scope), whereas the third one does not (as our only aliases $x to the package variable within the block's scope). Please note that printing $main::x will work correctly - it's only the alias that is scoped to the block, not the package variable itself.
As far as with the function:
print our $x; and print our($x) "don't work" - namely, correctly claim the value is uninitialized - since you never called the function which would initialize the variable. Observe the difference:
c:\>perl -e "use strict; use warnings; sub x { our $x = 1;} print our $x"
Use of uninitialized value $x in print at -e line 1.
c:\>perl -e "use strict; use warnings; sub x { our $x = 1;} x(); print our $x"
1
print $x; won't work for the same reason as with the block - our only scopes the alias to the block (i.e. in this case body of the sub) therefore you MUST either re-alias it in the main block's scope (as per print our $x example), OR use fully qualified package global outside the sub, in which case it will behave as expected:
c:\>perl -e "use strict; use warnings; sub x { our $x = 1;} print $main::x"
Use of uninitialized value $x in print at -e line 1.
c:\>perl -e "sub x { our $x = 1;} x(); print $main::x"
1

Why does this Perl variable keep its value

What is the difference between the following two Perl variable declarations?
my $foo = 'bar' if 0;
my $baz;
$baz = 'qux' if 0;
The difference is significant when these appear at the top of a loop. For example:
use warnings;
use strict;
foreach my $n (0,1){
my $foo = 'bar' if 0;
print defined $foo ? "defined\n" : "undefined\n";
$foo = 'bar';
print defined $foo ? "defined\n" : "undefined\n";
}
print "==\n";
foreach my $m (0,1){
my $baz;
$baz = 'qux' if 0;
print defined $baz ? "defined\n" : "undefined\n";
$baz = 'qux';
print defined $baz ? "defined\n" : "undefined\n";
}
results in
undefined
defined
defined
defined
==
undefined
defined
undefined
defined
It seems that if 0 fails, so foo is never reinitialized to undef. In this case, how does it get declared in the first place?
First, note that my $foo = 'bar' if 0; is documented to be undefined behaviour, meaning it's allowed to do anything including crash. But I'll explain what happens anyway.
my $x has three documented effects:
It declares a symbol at compile-time.
It creates an new variable on execution.
It returns the new variable on execution.
In short, it's suppose to be like Java's Scalar x = new Scalar();, except it returns the variable if used in an expression.
But if it actually worked that way, the following would create 100 variables:
for (1..100) {
my $x = rand();
print "$x\n";
}
This would mean two or three memory allocations per loop iteration for the my alone! A very expensive prospect. Instead, Perl only creates one variable and clears it at the end of the scope. So in reality, my $x actually does the following:
It declares a symbol at compile-time.
It creates the variable at compile-time[1].
It puts a directive on the stack that will clear[2] the variable when the scope is exited.
It returns the new variable on execution.
As such, only one variable is ever created[2]. This is much more CPU-efficient than then creating one every time the scope is entered.
Now consider what happens if you execute a my conditionally, or never at all. By doing so, you are preventing it from placing the directive to clear the variable on the stack, so the variable never loses its value. Obviously, that's not meant to happen, so that's why my ... if ...; isn't allowed.
Some take advantage of the implementation as follows:
sub foo {
my $state if 0;
$state = 5 if !defined($state);
print "$state\n";
++$state;
}
foo(); # 5
foo(); # 6
foo(); # 7
But doing so requires ignoring the documentation forbidding it. The above can be achieved safely using
{
my $state = 5;
sub foo {
print "$state\n";
++$state;
}
}
or
use feature qw( state ); # Or: use 5.010;
sub foo {
state $state = 5;
print "$state\n";
++$state;
}
Notes:
"Variable" can mean a couple of things. I'm not sure which definition is accurate here, but it doesn't matter.
If anything but the sub itself holds a reference to the variable (REFCNT>1) or if variable contains an object, the directive replaces the variable with a new one (on scope exit) instead of clearing the existing one. This allows the following to work as it should:
my #a;
for (...) {
my $x = ...;
push #a, \$x;
}
See ikegami's better answer, probably above.
In the first example, you never define $foo inside the loop because of the conditional, so when you use it, you're referencing and then assigning a value to an implicitly declared global variable. Then, the second time through the loop that outside variable is already defined.
In the second example, $baz is defined inside the block each time the block is executed. So the second time through the loop it is a new, not yet defined, local variable.