When does a global variable get its value assigned? - perl

I encountered some strange behavior which hints that I do not understand some basic things about Perl script execution and initialization order. The following example:
#!/usr/bin/env perl
use strict;
use warnings;
print &get_keys(), "\n";
use vars qw/%hello/; # same effect with 'my %hello = (...);'
%hello = ( a => 1, b => 2 );
sub get_keys
{
return join(', ', sort keys %hello);
}
prints an empty string. Meaning that though variable is already visible, since the state with assignment wasn't yet reached, it has no value. (Using a scalar instead of the hash would trigger a warning about uninitialized variable.)
Is that intended behavior?
I would be also glad for the RTFM pointers.

From perlsub:
A my has both a compile-time and a
run-time effect. At compile time, the
compiler takes notice of it. The
principal usefulness of this is to
quiet use strict 'vars' .... Actual
initialization is delayed until run
time, though, so it gets executed at
the appropriate time.
# At this point, %hello is a lexically scope variable (the my took effect
# at compile time), but it still has no keys.
print get_keys();
my %hello = ( a => 1, b => 2 );
# Now the hash has content.
print get_keys();
sub get_keys { join ' ', keys %hello, "\n" }
Other notes: (1) You should be using my or our rather than use vars. (2) Under normal circumstances, don't call functions with a leading ampersand: use foo() rather than &foo().

Yes, it's intended behaviour. You call get_keys() before you assign to %hello, so %hello is empty in get_keys(). (Scalars are initialised to undef, while arrays and hashes are set to be empty by default.)
If you want %hello to be initialised immediately, use a BEGIN block:
use vars qw/%hello/;
BEGIN {
%hello = ( a => 1, b => 2 );
}
Note that if you were using my (or our, for that matter) then this wouldn't work:
BEGIN {
my %hello = ( a => 1, b => 2 );
}
because you have to declare the variable outside the block, like so:
my %hello;
BEGIN {
%hello = ( a => 1, b => 2 );
}

The use vars pragma predeclares global variable name at compile time. The variable is undefined unless you assign any value to it. Since you print it before the assignment, you rightfully get an empty string.
BTW, this pragma is obsolete. From perldoc vars:
NOTE: For variables in the current package, the functionality provided
by this pragma has been superseded by "our" declarations, available in
Perl v5.6.0 or later. See "our" in perlfunc.

Related

Is it possible to use hash, array, scalar with the same name in Perl?

I'm converting perl script to python.
I haven't used perl language before so many things in perl confused me.
For example, below, opt was declared as a scalar first but declared again as a hash. %opt = ();
Is it possible to declare a scalar and a hash with the same name in perl?
As I know, foreach $opt (#opts) means that scalar opt gets values of array opts one by one. opt is an array at this time???
In addition, what does $opt{$opt} mean?
opt outside $opt{$opt} is a hash and opt inside $opt{$opt} is a scalar?
I'm so confused, please help me ...
sub ParseOptions
{
local (#optval) = #_;
local ($opt, #opts, %valFollows, #newargs);
while (#optval) {
$opt = shift(#optval);
push(#opts,$opt);
$valFollows{$opt} = shift(#optval);
}
#optArgs = ();
%opt = ();
arg: while (defined($arg = shift(#ARGV))) {
foreach $opt (#opts) {
if ($arg eq $opt) {
push(#optArgs, $arg);
if ($valFollows{$opt}) {
if (#ARGV == 0) {
&Usage();
}
$opt{$opt} = shift(#ARGV);
push(#optArgs, $opt{$opt});
} else {
$opt{$opt} = 1;
}
next arg;
}
}
push(#newargs,$arg);
}
#ARGV = #newargs;
}
In Perl types SCALAR, ARRAY, and HASH are distinguished by the leading sigil in the variable name, $, #, and % respectively, but otherwise they may bear the same name. So $var and #var and %var are, other than belonging to the same *var typeglob, completely distinct variables.
A key for a hash must be a scalar, let's call it $key. A value in a hash must also be a scalar, and the one corresponding to $key in the hash %h is $h{$key}, the leading $ indicating that this is now a scalar. Much like an element of an array is a scalar, thus $ary[$index].
In foreach $var (#ary) the scalar $var does not really "get values" of array elements but rather aliases them. So if you change it in the loop you change the element of the array.
(Note, you have the array #opts, not #opt.)
A few comments on the code, in the light of common practices in modern Perl
One must always have use warnings; on top. The other one that is a must is use strict, as it enforces declaration of variables with my (or our), promoting all kinds of good behavior. This restricts the scope of lexical (my) variables to the nearest block enclosing the declaration
A declaration may be done inside loops as well, allowing while (my $var = EXPR) {} and foreach my $var (LIST) {}, and then $var is scoped to the loop's block (and is also usable in the rest of the condition)
The local is a different beast altogether, and in this code those should be my instead
Loop labels like arg: are commonly typed in block capitals
The & in front of a function name has rather particular meanings† and is rarely needed. It is almost certainly not needed in this code
Assignment of empty list like my #optArgs = () when the variable is declared (with my) is unneeded and has no effect (with a global #name it may be needed to clear it from higher scope)
In this code there is no need to introduce variables first since they are global and thus created the first time they are used, much like in Python – Unless they are used outside of this sub, in which case they may need to be cleared. That's the thing with globals, they radiate throughout all code
Except for the last two these are unrelated to your Python translation task, for this script.
† It suppresses prototypes; informs interpreter (at runtime) that it's a subroutine and not a "bareword", if it hasn't been already defined (can do that with () after the name); ensures that the user-defined sub is called, if there is a builtin of the same name; passes the caller's #_ (when without parens, &name); in goto &name and defined ...
See, for example, this post and this post
First, to make it clear, in perl, as in shell, you could surround variable names in curly brackets {} :
${bimbo} and $bimbo are the same scalar variable.
#bimbo is an array pointer ;
%bimbo is a hash pointer ;
$bimbo is a scalar (unique value).
To address an array or hash value, you will have to use the '$' :
$bimbo{'index'} is the 'index' value of hash %bimbo.
If $i contains an int, such as 1 for instance,
$bimbo[$i] is the second value of the array #bimbo.
So, as you can see, # or % ALWAYS refers to the whole array, as $bimbo{} or $bimbo[] refers to any value of the hash or array.
${bimbo[4]} refers to the 5th value of the array #bimbo ; %{bimbo{'index'}} refers to the 'index' value of array %bimbo.
Yes, all those structures could have the same name. This is one of the obvious syntax of perl.
And, euhm… always think in perl as a C++ edulcored syntax, it is simplified, but it is C.

Printing hash value in Perl

When I print a variable, I am getting a HASH(0xd1007d0) value. I need to print the values of all the keys and values. However, I am unable to as the control does not enter the loop.
foreach my $var(keys %{$HashVariable}){
print"In the loop \n";
print"$var and $HashVariable{$var}\n";
}
But the control is not even entering the loop. I am new to perl.
I can't answer completely, because it depends entirely on what's in $HashVariable.
The easiest way to tell what's in there is:
use Data::Dumper;
print Dumper $HashVariable;
Assuming this is a hash reference - which it would be, if print $HashVariable gives HASH(0xdeadbeef) as an output.
So this should work:
#!/usr/bin/env perl
use strict;
use warnings;
my $HashVariable = { somekey => 'somevalue' };
foreach my $key ( keys %$HashVariable ) {
print $key, " => ", $HashVariable->{$key},"\n";
}
The only mistake you're making is that $HashVariable{$key} won't work - you need to dereference, because as it stands it refers to %HashVariable not $HashVariable which are two completely different things.
Otherwise - if it's not entering the loop - it may mean that keys %$HashVariable isn't returning anything. Which is why that Dumper test would be useful - is there any chance you're either not populating it correctly, or you're writing to %HashVariable instead.
E.g.:
my %HashVariable;
$HashVariable{'test'} = "foo";
There's an obvious problem here, but it wouldn't cause the behaviour that you are seeing.
You think that you have a hash reference in $HashVariable and that sounds correct given the HASH(0xd1007d0) output that you see when you print it.
But setting up a hash reference and running your code, gives slightly strange results:
my $HashVariable = {
foo => 1,
bar => 2,
baz => 3,
};
foreach my $var(keys %{$HashVariable}){
print"In the loop \n";
print"$var and $HashVariable{$var}\n";
}
The output I get is:
In the loop
baz and
In the loop
bar and
In the loop
foo and
Notice that the values aren't being printed out. That's because of the problem I mentioned above. Adding use strict to the program (which you should always do) tells us what the problem is.
Global symbol "%HashVariable" requires explicit package name (did you forget to declare "my %HashVariable"?) at hash line 14.
Execution of hash aborted due to compilation errors.
You are using $HashVariable{$var} to look up a key in your hash. That would be correct if you had a hash called %HashVariable, but you don't - you have a hash reference called $HashVariable (note the $ instead of %). To look up a key from a hash reference, you need to use a dereferencing arrow - $HashVariable->{$var}.
Fixing that, your program works as expected.
use strict;
use warnings;
my $HashVariable = {
foo => 1,
bar => 2,
baz => 3,
};
foreach my $var(keys %{$HashVariable}){
print"In the loop \n";
print"$var and $HashVariable->{$var}\n";
}
And I see:
In the loop
bar and 2
In the loop
foo and 1
In the loop
baz and 3
The only way that you could get the results you describe (the HASH(0xd1007d0) output but no iterations of the loop) is if you have a hash reference but the hash has no keys.
So (as I said in a comment) we need to see how your hash reference is created.

making runtime variable names global in perl

OK, I've looked hard and have tried a myriad of variations to get this to work. I need some help from the perl experts.
First of all, I know, I know, don't use dynamic variable names! Well, I think I need to do so in this case. I'm fixing an inherited script. It has a set of variables that are formed with prefix + "_" + suffix. This started small, but prefix is now a list of about 20 and suffix of about 50 -- a thousand different variables.
The script in some cases loops over these variables, using eval to retrieve an individual value, etc.. This all works. The problem is, all uses of the different variables throughout the script, whether inside a loop on the prefix and/or suffix or just a hardcoded use, assume that these variables are globalized by using use var qw($varname). The number of use vars hardcoded statements has grown out of control. Moreover, there are thousands of past input profile files that initialize these variables using global syntax, all of which need to continue to work.
I was hoping I could improve maintainability by doing the globalization step in a loop, dynamically creating the variable names one at a time, then globalizing them with use vars or our. Here is some code to show what I'm after. I've shown several options here, but each is actually a separate trial. I have tried other variations and cannot keep them straight anymore.
my #prefixes = ("one", "two", ..., "twenty");
my #suffixes = ("1", "2", ..., "50");
my $globalvars = "";
for my $suffix (#suffixes) {
for my $prefix (#prefixes) {
# option 1 -- remainder of #1 outside of loop
# NOTE: 1.a (no comma); 1.b (include comma between vars)
$globalvars .= "\$$prefix\_$suffix, ";
# option 2
eval "use vars qw(\$$prefix\_$suffix)";
# option 3
my $g = "$prefix\_$suffix";
eval ("use vars qw(\$$g)");
# option 4
eval ("our \$$g");
}
}
# option 1.a
use vars qw($globalvars);
# or option 1.b
my (eval "$globalvars");
:
:
:
# now use them as global variables
if ($one_1 eq ...) {
if ($one_10) {
It seems that the problem with "use vars" (besides the fact that it is discouraged) is that the string inside qw has to be the actual variable name, not a dynamic variable name. The 'eval' options don't work, I think, because the scope only exists within the eval itself.
Is there a way to do this in perl?
string inside qw has to be the actual variable name, not a dynamic variable name
That's how qw works, it quotes words, it does no interpolation. You don't have to use qw with use vars, though.
#! /usr/bin/perl
use warnings;
use strict;
my #global_vars;
BEGIN { #global_vars = qw($x $y) }
use vars #global_vars;
# strict won't complain:
$x = 1;
$y = 2;
BEGIN is needed here, because use vars runs in compile time, and we need #global_vars to be already populated.
There's no need for eval EXPR since there's no need to use qw. qw is just one way to create a list of strings, and one that's not particularly useful to your needs.
my #prefixes; BEGIN { #prefixes = qw( one two ... twenty ); }
my #suffixes; BEGIN { #suffixes = 1..50; }
use vars map { my $p = $_; map { '$'.$p.'_'.$_ } #suffixes } #prefixes;

Using # and $ with the same variable name in Perl

I am declaring the same variable name with # and $:
#ask=(1..9);
$ask="insanity";
print ("Array #ask\n");
print ("Scalar $ask\n");
Without using use strict I am getting output correctly but when I am using use strict it gives me a compilation error.
Do these two variables refer to two different memory locations or is it the same variable?
You've got two variables:
#ask
$ask
You could have %ask (a hash) too if you wanted. Then you'd write:
print $ask, $ask[0], $ask{0};
to reference the scalar, the array and the hash.
Generally, you should avoid this treatment, but the variables are all quite distinct and Perl won't be confused.
The only reason use strict; is complaining is because you don't prefix your variables with my:
#!/usr/bin/env perl
use strict;
use warnings;
my #ask = (1..9);
my $ask = "insanity";
my %ask = ( 0 => 'infinity', infinity => 0 );
print "Array #ask\n";
print "Scalar $ask\n";
print "Hash $ask{0}\n";
with use strict; you need to declare your variables first before using it.
For example:
use strict;
my #ask=(1..9);
my $ask="insanity";
print ("Array #ask\n");
print ("Scalar $ask\n");
#ask and $ask are different variables — as is %ask — and it is not an error to do this. It is however poor style.
Because the sigil changes when you use them, such as when you use $ask[1] to get the second element of #ask, the code becomes harder to read and use strict will also not be able to tell if you've gotten confused. Thus it's a good idea to use names that differ in more than the sigil unless you know what you're doing. So you could use e.g. #asks and $ask.
The error you are getting with strict is not due to variable names. It is because you are not declaring the variables (using one of my, our, local, or state. Nor are you using the vars pragma.
Short answer: Stick a my in front of each variable, and you'll be strict-compliant.
For package variables, you can examine entries in the symbol table. $ask and #ask are separate entities:
#!/usr/bin/env perl
use Devel::Symdump;
use YAML;
#ask=(1..9);
$ask="insanity";
my $st = Devel::Symdump->new('main');
print Dump [ $st->$_ ] for qw(
scalars
arrays
);
Among other things, this code will output:
--
…
- main::ask
…
---
…
- main::ask
…
Being able to use the same name can help when, say, you have an array of fish and you are doing something with each fish in the array:
for my $fish (#fish) {
go($fish);
}
Normally, it is more expressive to use the plural form for arrays and hashes, the singular form for elements of an array, and something based on the singular form for keys in a hash:
#!/usr/bin/env perl
use strict;
use warnings;
my #ships = ('Titanic', 'Costa Concordia');
my %ships = (
'Titanic' => {
maiden_voyage => '10 April 1912',
capacity => 3_327,
},
'Costa Concordia' => {
maiden_voyage => '14 July 2006',
capacity => 4_880,
},
);
for my $ship (#ships) {
print "$ship\n";
}
while (my ($ship_name, $ship_details) = each %ships) {
print "$ship_name capacity: $ship_details->{capacity}\n";
}

Why does this Perl variable keep its value

What is the difference between the following two Perl variable declarations?
my $foo = 'bar' if 0;
my $baz;
$baz = 'qux' if 0;
The difference is significant when these appear at the top of a loop. For example:
use warnings;
use strict;
foreach my $n (0,1){
my $foo = 'bar' if 0;
print defined $foo ? "defined\n" : "undefined\n";
$foo = 'bar';
print defined $foo ? "defined\n" : "undefined\n";
}
print "==\n";
foreach my $m (0,1){
my $baz;
$baz = 'qux' if 0;
print defined $baz ? "defined\n" : "undefined\n";
$baz = 'qux';
print defined $baz ? "defined\n" : "undefined\n";
}
results in
undefined
defined
defined
defined
==
undefined
defined
undefined
defined
It seems that if 0 fails, so foo is never reinitialized to undef. In this case, how does it get declared in the first place?
First, note that my $foo = 'bar' if 0; is documented to be undefined behaviour, meaning it's allowed to do anything including crash. But I'll explain what happens anyway.
my $x has three documented effects:
It declares a symbol at compile-time.
It creates an new variable on execution.
It returns the new variable on execution.
In short, it's suppose to be like Java's Scalar x = new Scalar();, except it returns the variable if used in an expression.
But if it actually worked that way, the following would create 100 variables:
for (1..100) {
my $x = rand();
print "$x\n";
}
This would mean two or three memory allocations per loop iteration for the my alone! A very expensive prospect. Instead, Perl only creates one variable and clears it at the end of the scope. So in reality, my $x actually does the following:
It declares a symbol at compile-time.
It creates the variable at compile-time[1].
It puts a directive on the stack that will clear[2] the variable when the scope is exited.
It returns the new variable on execution.
As such, only one variable is ever created[2]. This is much more CPU-efficient than then creating one every time the scope is entered.
Now consider what happens if you execute a my conditionally, or never at all. By doing so, you are preventing it from placing the directive to clear the variable on the stack, so the variable never loses its value. Obviously, that's not meant to happen, so that's why my ... if ...; isn't allowed.
Some take advantage of the implementation as follows:
sub foo {
my $state if 0;
$state = 5 if !defined($state);
print "$state\n";
++$state;
}
foo(); # 5
foo(); # 6
foo(); # 7
But doing so requires ignoring the documentation forbidding it. The above can be achieved safely using
{
my $state = 5;
sub foo {
print "$state\n";
++$state;
}
}
or
use feature qw( state ); # Or: use 5.010;
sub foo {
state $state = 5;
print "$state\n";
++$state;
}
Notes:
"Variable" can mean a couple of things. I'm not sure which definition is accurate here, but it doesn't matter.
If anything but the sub itself holds a reference to the variable (REFCNT>1) or if variable contains an object, the directive replaces the variable with a new one (on scope exit) instead of clearing the existing one. This allows the following to work as it should:
my #a;
for (...) {
my $x = ...;
push #a, \$x;
}
See ikegami's better answer, probably above.
In the first example, you never define $foo inside the loop because of the conditional, so when you use it, you're referencing and then assigning a value to an implicitly declared global variable. Then, the second time through the loop that outside variable is already defined.
In the second example, $baz is defined inside the block each time the block is executed. So the second time through the loop it is a new, not yet defined, local variable.