Perl's main package - block syntax - pragmas and BEGIN/END blocks

I saw this question: Is there any difference between "standard" and block package declaration? and started thinking about the main package. When I write a script like:
---- begin of the file ---
#!/usr/bin/perl #probably removed by shell?
my $var; #defined from now up to the end of file
...
---- end of the file ----
this automatically goes into the main package, so, if I understand correctly, the following happens:
---- begin of the file ---
{ #<-- 1st line
package main;
my $var; #variable transformed to block scope - "up to the end of block"
...
} # <-- last line
---- end of the file ----
which is equivalent to
---- begin of the file ---
package main { #1st line
my $var; #variable block scope
...
} #last line
---- end of the file ----
Question 1: Is the above right? Is that what happens with the main package?
Now the BEGIN/END blocks and pragmas. These are handled in the compilation phase, if I understand correctly. So having:
---- begin of the file ---
#!/usr/bin/perl
use strict; #file scope
use warnings; #file scope
my $var; #defined from now up to the end of file
BEGIN {
say $var; #the $var is not known here - but it is declared
}
...
---- end of the file ----
the $var is declared, but here
---- begin of the file ---
#!/usr/bin/perl
use strict; #file scope
use warnings; #file scope
BEGIN {
say $var; #the $var is not known here - but "requires explicit package name" error
}
my $var; #defined from now up to the end of file
...
---- end of the file ----
the $var is not declared.
So how is the above translated to the "default main package"?
Is it always:
---- begin of the file ---
{
package main;
use strict; #block scope ???
use warnings; #block scope ???
my $var; #defined from now up to the end of block
BEGIN { #NESTED???
say $var; #the $var is not known here - but declared
}
...
}
---- end of the file ----
which is equivalent to
---- begin of the file ---
package main {
use strict; #block scope
use warnings; #block scope
my $var; #defined from now up to the end of block
BEGIN { #NESTED block
say $var;
}
...
}
---- end of the file ----
The question is: is there ANY benefit in using something like:
---- begin of the file ---
use strict; #always should be at the START OF THE FILE - NOT IN BLOCKS?
use warnings;
#not NESTED
BEGIN {
}
package main {
my $var;
}
So the question is:
how exactly are pragmas, BEGIN/END/CHECK blocks and the main package handled in the context of the BLOCK syntax?
when does the "file scope" change to the "block scope"? Or, if it does not change, what is the equivalent translation of the "standard main package" to "package main {block}"?
and the last code:
---- begin of the file ---
use strict; #always should be at the START OF THE FILE - NOT IN BLOCKS?
use warnings;
my $var;
#not NESTED
BEGIN {
}
package main {
}
How does the my $var get into the main package? Is this translated to something like:
---- begin of the file ---
use strict; #always should be at the START OF THE FILE - NOT IN BLOCKS?
use warnings;
#not NESTED
BEGIN {
}
package main {
my $var; #### GETS HERE????
}
Sorry for the wall of text...

When you declare a variable with my, it is not in any package at all. Lexical (block) scope is strictly distinct from packages. The variable is visible, without package qualification, only until the closing brace (}) of the innermost enclosing block. If you wrote $main::var or $::var, it would be a different variable.
use warnings;
use strict;
package main {
my $var = 'this';
}
$var; # error, $var was not declared in this scope
say $main::var; # says nothing
There are two more ways to declare variables:
use vars qw($var) makes $var refer to the variable in the current package anywhere inside that package.
our $var makes $var refer to the variable in the package that was current at the time of the our statement, for the rest of the current block.
The block package declaration is a block and puts its content in a package, whereas the block-less package declaration puts the following content in another package while the current block scope continues.
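The difference between my and our described above can be checked with a small script. This is only a sketch: the package name Demo and the helper subs are made up for illustration and are not part of the question.

```perl
use strict;
use warnings;

package Demo;                      # hypothetical package name, for illustration only

my  $lexical = 'lexical value';    # belongs to the file scope, not to package Demo
our $global  = 'package value';    # lexically scoped alias for $Demo::global

package main;                      # the package changes, the file scope does not

sub lexical_value { return $lexical }   # still visible: same file scope
sub aliased_value { return $global }    # the alias still points at $Demo::global

# Report whether a name has an entry in Demo's symbol table (%Demo::).
sub in_demo_stash { my ($name) = @_; return exists($Demo::{$name}) ? 1 : 0 }

print lexical_value(), "\n";                                  # lexical value
print aliased_value(), "\n";                                  # package value
print "global in Demo:  ", in_demo_stash('global'),  "\n";    # 1
print "lexical in Demo: ", in_demo_stash('lexical'), "\n";    # 0
```

The last two lines look into the symbol table of Demo: the our variable has an entry there, the my variable does not.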
The other missing bit is that when you write
use warnings;
use strict;
package main {
# ...
}
you've effectively written
package main {
use warnings;
use strict;
package main {
# ...
}
}
and since the package is the same, that's the same as
package main {
use warnings;
use strict;
{
# ...
}
}
In other words, the package is main at the beginning of the file and an implicit block scope (the file scope) is open. Re-entering the main package has no effect, and if the declaration has a block attached, that block behaves like any other block.
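Since pragmas follow block scope rather than package scope, a use strict inside a block stops applying at the closing brace. One way to observe this, as a rough sketch, is to compile the same snippet with a string eval inside and outside such a block (a string eval inherits the pragmas in effect where it is written). The variable names are invented for the example, and the file deliberately has no file-level use strict.

```perl
# Deliberately no file-level "use strict" here, so the block's effect is visible.
my $snippet = q{ $undeclared_global = 1; 1 };   # uses an undeclared package variable

my ($inside, $outside);
{
    use strict;                                  # lexically scoped to this block
    $inside = eval($snippet) ? 'accepted' : 'rejected';
}
# Past the closing brace, strict is no longer in effect.
$outside = eval($snippet) ? 'accepted' : 'rejected';

print "inside the block:  $inside\n";    # rejected
print "outside the block: $outside\n";   # accepted
```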

Scope and execution order have little to do with each other.
Yes, the default package is main. So it could be said that
---- begin file ----
1: #!/usr/bin/perl
2: my $var;
3: ...;
---- end file ----
is equivalent to
package main {
---- begin file ----
1: #!/usr/bin/perl
2: my $var;
3: ...;
---- end file ----
}
It is simply that the main package is assumed unless another is specified. This does not change line numbers etc.
When a variable declaration is encountered, it is immediately added to the list of known variables. Or more precisely, as soon as the statement where it was declared has ended:
my # $var unknown
$var # $var unknown
= # $var unknown
foo() # $var unknown
; # NOW $var is declared
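A consequence of this rule is that the right-hand side of a my statement still sees whatever the name meant before. A minimal sketch (the variable names are arbitrary):

```perl
use strict;
use warnings;

my $x = 10;
my @seen;
{
    # The new $x only becomes visible after this statement has ended,
    # so the $x on the right-hand side is still the outer one.
    my $x = $x + 1;
    push @seen, $x;    # inner $x
}
push @seen, $x;        # outer $x, untouched

print "@seen\n";       # 11 10
```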
Similarly for pragmas: a use statement is executed as soon as it is fully parsed. In the next statement, all imports are available.
Blocks like BEGIN are executed outside of the normal control flow, but obey scoping rules.
BEGIN blocks are executed as soon as they are fully parsed. The return value is discarded.
END blocks are executed when the interpreter exits by normal means.
When we have
my $var = 1; # $var is now declared, but the assignment is run-time
BEGIN {
# here $var is declared, but was not assigned yet.
$var = 42; # but we can assign something if we like
}
# This is executed run-time: $var == 1
say $var;
BEGIN {
# This is executed immediately. The runtime assignment has not yet happened.
# The previous assignment in BEGIN did happen.
say $var;
}
The result?
42
1
Note that if I do not assign a new value at runtime, this variable keeps its compile time value:
my $var;
...; # rest as before
Then we get
42
42
Blocks can be arbitrarily nested:
my $var;
if (0) {
BEGIN {
say "BEGIN 1: ", ++$var;
BEGIN {
say "BEGIN 2: ", ++$var;
BEGIN { $var = 0 }
}
}
}
Output:
BEGIN 2: 1
BEGIN 1: 2
Here we can see that BEGIN blocks are executed before the if (0) is optimized away, because BEGIN is executed immediately.
We can also ask which package a block is in:
BEGIN { say "BEGIN: ", __PACKAGE__ }
say "before package main: ", __PACKAGE__;
# useless redeclaration, we are already in main
package main {
say "in package main: ", __PACKAGE__;
}
Output:
BEGIN: main
before package main: main
in package main: main
So we are in main before we redeclared it. A package is not a sealed, immutable entity. Rather, it is a namespace we can re-enter at will:
package Foo;
say "We are staring in ", __PACKAGE__;
for (1 .. 6) {
package Bar;
say "Loop $_ in ", __PACKAGE__;
if ($_ % 2) {
package Baz;
say "... and in ", __PACKAGE__;
BEGIN { say "just compiled something in ", __PACKAGE__ }
} else {
package Foo;
say "... again in ", __PACKAGE__;
BEGIN { say "just compiled something in ", __PACKAGE__ }
}
}
Output:
just compiled something in Baz
just compiled something in Foo
We are staring in Foo
Loop 1 in Bar
... and in Baz
Loop 2 in Bar
... again in Foo
Loop 3 in Bar
... and in Baz
Loop 4 in Bar
... again in Foo
Loop 5 in Bar
... and in Baz
Loop 6 in Bar
... again in Foo
So regarding this:
The question is - is here ANY benefit using something like:
---- begin of the file ---
use strict;
use warnings;
package main {
my $var;
}
The answer is no: If we are already in the package main, redeclaring it has no benefit:
say __PACKAGE__;
package main {
my $var;
say __PACKAGE__;
}
say __PACKAGE__;
If we execute that we can see we are in main the whole time.
Pragmas like strict and warnings have lexical scope, so declaring them as early as possible is good.
# no strict yet
use strict;
# strict now activated
BEGIN {
# we are still in scope of strict
$var = 1; # ooh, an undeclared variable. Will it blow up?
say "BEGIN was executed";
}
my $var;
Output:
Global symbol "$var" requires explicit package name at - line 8.
BEGIN not safe after errors--compilation aborted at - line 10.
The variable was not declared inside the BEGIN block, because the block was compiled (and would have been executed) before the declaration was seen. Therefore, strict issues this error. Because the error occurred during compilation of the BEGIN block, the block was never executed.
Because of scoping, you can't always reorder your source code in a way that avoids using BEGIN blocks. Here is something you should never do:
for (1 .. 3) {
my $var;
BEGIN { $var = 42 };
say $var // "undef";
}
Output:
42
undef
undef
because $var is emptied whenever the block is left. (This is probably undefined behaviour and may change. It behaves this way under at least v5.16.3 and v5.14.2.)
When your program is compiled no reordering takes place. Instead, BEGIN blocks are executed as soon as they are compiled.
For the exact times when CHECK and END are run, read through perlmod.
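As a quick illustration of the order documented in perlmod, the following sketch records when each kind of block runs, regardless of where it appears in the file. The array name @order is just an example; it is declared with our so that it already exists as a package variable when the compile-time blocks run.

```perl
use strict;
use warnings;

our @order;                  # package variable, usable from compile-time blocks

push @order, 'runtime';      # ordinary statement: runs in the run phase

# The blocks are written in reverse on purpose; their position does not matter.
END   { push @order, 'END'; print join(' -> ', @order), "\n" }
INIT  { push @order, 'INIT' }     # just before the run phase starts
CHECK { push @order, 'CHECK' }    # right after compilation has finished
BEGIN { push @order, 'BEGIN' }    # as soon as this block is parsed
```

Run as a normal script, this prints BEGIN -> CHECK -> INIT -> runtime -> END.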

Related

How to convince Devel::Trace to print the BEGIN-block statements?

I have a simple script p.pl:
use strict;
use warnings;
our $x;
BEGIN {
$x = 42;
}
print "$x\n";
When I run it as:
perl -d:Trace p.pl
it prints:
>> p.pl:3: our $x;
>> p.pl:7: print "$x\n";
42
How can I get the BEGIN block statements printed too, e.g. the $x = 42;?
Because my intention isn't clear, here is a clarification:
I am looking for ANY way to print statements as the Perl script runs (as Devel::Trace does), but including the statements in the BEGIN block.

It's very possible. Set $DB::single in an early BEGIN block.
use strict;
use warnings;
our $x;
BEGIN { $DB::single = 1 }
BEGIN {
$x = 42;
}
print "$x\n";
$DB::single is a debugger variable used to determine whether the DB::DB function will be invoked at each line. In compilation phase it is usually false but you can set it in compilation phase in a BEGIN block.
This trick is also helpful to set a breakpoint inside a BEGIN block when you want to debug compile-time code in the standard debugger.

Disclaimer: This is just an attempt to explain the behaviour.
Devel::Trace hooks up to the Perl debugging API through the DB model. That is just code. It installs a sub DB::DB.
The big question is, when is that executed. According to perlmod, there are five block types that are executed at specific points during execution. One of them is BEGIN, which is the first.
Consider this program.
use strict;
use warnings;
our ($x, $y);
BEGIN { $x = '42' }
UNITCHECK { 'unitcheck' }
CHECK { 'check' }
INIT { 'init' }
END { 'end' }
print "$x\n";
This will output the following:
>> trace.pl:8: INIT { 'init' }
>> trace.pl:3: our ($x, $y);
>> trace.pl:11: print "$x\n";
42
>> trace.pl:9: END { 'end' }
So Devel::Trace sees the INIT block and the END block. But why the INIT block?
Above mentioned perlmod says:
INIT blocks are run just before the Perl runtime begins execution, in "first in, first out" (FIFO) order.
Apparently at that phase, the DB::DB has already been installed. I could not find any documentation that says when a sub definition is run exactly. However, it seems it's after BEGIN and before INIT. Hence, it does not see whatever goes on in the BEGIN.
Adding a BEGIN { $Devel::Trace::TRACE = 1 } to the beginning of the file also does not help.
I rummaged around in the documentation for perldebug and the like, but could not find an explanation of this behaviour. My guess is that the debugger interface doesn't know about BEGIN at all. They are executed very early, after all (consider e.g. that perl -c -E 'BEGIN{ say "foo" } say "bar"' will print foo).

Access "external" Perl Array from "BEGIN" Block/Package

Below is a simplified version of a perl script I am trying to modify:
use MODULE_1 ;
use MODULE_2 ;
use vars qw(%ARR $VarZZ) ;
sub A {
# Somestuff
# Call to Sub B
B() ;
# Call to Sub C
C() ;
}
BEGIN {
package XYZ ;
use vars qw($VarYY $VarXX) ;
# MISC SUBS HERE
# end of package XYZ
}
sub B {
# Somestuff
}
sub C {
# Somestuff to set %ARR
}
# Call to Sub A
A() ;
My issue is that I would like to access %ARR within the package XYZ BEGIN block, but I keep getting error messages saying I need to define %ARR ("requires explicit package name").
Copying a similar example, I have tried $main::ARR{index} within the block, but without success so far.
I assume it may be because %ARR isn't set when that block is being evaluated, and that I perhaps need to call "sub C", but &main::C(); seems to fail as well.
How can I access this array there?
I have looked at: Perl's main package - block syntax - pragmas and BEGIN/END blocks, which seems to address similar themes, but I am struggling to properly understand the answers.
** EDIT **
Expanded skeleton script showing some attempts at moving forward:
use MODULE_1 ;
use MODULE_2 ;
use vars qw(%ARR $VarZZ) ;
sub A {
# Somestuff
# Call to Sub B
B() ;
# Call to Sub C
C() ;
# Call to Sub E
E() ;
}
sub E {
# Call to Package XYZ subs
}
BEGIN {
package XYZ ;
use vars qw($VarYY $VarXX %ARR) ;
# I tried to Call to Sub C and load a local version of %ARR
#
# This fails with "Undefined subroutine &main::C" error
&main::C() ;
#
# We never get here so not sure if correct
%ARR = &main::ARR ;
# MISC SUBS HERE
sub X {
# Call to Sub D
&main::D() ;
}
# end of package XYZ
}
sub B {
# Somestuff
}
sub C {
# Somestuff to set %ARR
}
sub D {
# Somestuff
}
# Call to Sub A
A() ;
Note that the call to &main::E() works when called within the subs in package XYZ, but both this and &main::C() fail when called free-standing. Perhaps the free-standing call is made at compile time, before the subs are defined.
BTW, I tried the our definition but I am getting a 502 error: Nginx Debug Log
Perhaps this is because the array is not available?

%main::ARR or $main::ARR{index} are correct for the code skeleton you have provided (well, anything is correct because you haven't said use strict, but anyway ...). Is it possible that main is not the correct namespace (i.e., is there some package statement that precedes use vars ...)?
In any case, you can work around this issue with the our keyword. If you declare it at the top level, it will have scope throughout the rest of the file:
package ABC;
our %ARR; # %ABC::ARR
sub foo {
$ARR{"key"} = "value"; # %ABC::ARR
}
{ # "BEGIN" optional
package XYZ;
our %Hash; # %XYZ::Hash
sub bar {
my $key1 = $Hash{"key1"}; # %XYZ::Hash
my $val1 = $ARR{$key1}; # %ABC::ARR
$ARR{$val1} = $key1;
}
}
...
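To tie this back to the skeleton in the question, here is a rough sketch of why the free-standing call inside BEGIN fails while a call from inside a sub works. The names load_arr, lookup and $seen_at_begin are invented for the example; load_arr merely stands in for the question's sub C.

```perl
use strict;
use warnings;

our %ARR;              # %main::ARR; the short name stays valid for the rest of the file
our $seen_at_begin;    # records what the BEGIN block could see

BEGIN {
    package XYZ;       # same package name as in the question's skeleton

    # BEGIN runs as soon as it is parsed, so load_arr (defined further down)
    # has not been compiled yet and cannot be called from here.
    $seen_at_begin = defined(&main::load_arr) ? 'defined' : 'not defined';

    # A sub body only runs when it is called; by then %main::ARR can be filled.
    sub lookup { my ($key) = @_; return $ARR{$key} }   # $ARR{...} is still %main::ARR
}

sub load_arr { %ARR = (colour => 'red') }   # hypothetical stand-in for sub C

load_arr();
print "load_arr while BEGIN ran: $seen_at_begin\n";   # not defined
print "XYZ::lookup: ", XYZ::lookup('colour'), "\n";   # red
```

So code that needs the filled hash belongs inside a sub (or after the BEGIN block at run time), not directly in the body of the BEGIN block.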

Symbolic reference to sub in a package, trying to understand

use strict;
use warnings;
package Foo::Bar;
sub baz { print "$_[0]\n" }
package main;
{ # test 1
my $quux = "Foo::Bar::baz";
no strict 'refs';
&$quux(1);
}
{ # test 2
my $qux = 'Foo::Bar';
my $quux = "$qux\::baz";
no strict 'refs';
&$quux(2);
}
{ # test 3
my $qux = 'Foo::Bar';
my $quux = "$qux::baz";
no strict 'refs';
&$quux(3);
}
Output:
Name "qux::baz" used only once: possible typo at test31.pl line 21.
1
2
Use of uninitialized value $qux::baz in string at test31.pl line 21.
Undefined subroutine &main:: called at test31.pl line 23.
Why does test 2 work, and why does the backslash have to be placed exactly there?
Why does test 3 not work? Is it, in terms of syntax, so far from test 1?
I tried to write that string as "{$qux}::baz", but that doesn't work either.
I came across this looking at source of Image::Info distribution.

$qux::baz refers to scalar baz in package qux.
"$qux::baz" is the stringification of that scalar.
"$qux::baz" is another way of writing $qux::baz."".
Curlies can be used to indicate where the variable ends.
"$foo bar" means $foo." bar"
"${f}oo bar" means $f."oo bar"
As such,
"${qux}::baz" is another way of writing $qux."::baz".
"$qux\::baz" is a cute way of doing $qux."::baz" since \ can't appear in variable names.

A variable name can either be simple like $foo or a fully qualified name (in the case of package variables). Such a fully qualified name looks like $Foo::bar. This is the “global” variable $bar in the package Foo.
If an interpolated variable is to be followed by a double colon :: and the variable should not be interpreted as a fully qualified package variable name, then you can either:
Use string concatenation: $qux . "::baz"
Terminate the variable name with a backslash: "$qux\::baz"
Surround the variable name (not the sigil) with curly braces, as if the name were a symbolic reference: "${qux}::baz".
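The three spellings listed above can be compared side by side. A small sketch using the names from the question:

```perl
use strict;
use warnings;

my $qux = 'Foo::Bar';

my @names = (
    $qux . '::baz',     # explicit concatenation
    "$qux\::baz",       # the backslash ends the variable name
    "${qux}::baz",      # the braces delimit the variable name
);

print "$_\n" for @names;   # prints Foo::Bar::baz three times
```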

How can I use "member" variables of a module inside a function?

I have this code:
#!/usr/bin/perl
package Modules::TextStuff;
use strict;
use warnings;
use Exporter;
our @ISA = qw(Exporter);
our @EXPORT = qw(get_text);
my $author;
my $text_tmp1 =<<'ENG';
This is a template text
by $author.
ENG
sub get_text {
my $tmp = shift @_;
$author = shift @_;
print "In sub author= $author lang = $tmp \n";
my $final_str = eval('$text_'.$tmp);
print "$final_str \n";
return $final_str;
}
1;
Test script:
#!/usr/bin/perl
use strict;
use warnings;
use Modules::TextStuff;
my $str = get_text('tmp1','jim');
print $str;
When I run the test script it does not work. I get:
In sub author=jim lang = eng
Variable "$text_tmp1" is not available at (eval 1) line 2. Use of
uninitialized value $final_str in concatenation (.) or string
How can I fix this?

Combining strings to create variable names is usually a bad idea. You could salvage your current program using our $text_tmp1 = ... instead of my $text_tmp1 = ..., but I think you should consider a different approach, like a hash:
my %templates = (
tmp1 => <<ENG,
This is a template text
by \$author.
ENG
tmp2 => <<ESP,
Esta es templata texta de \$author.
ESP
);
sub get_text {
...
my $final_str = eval( $templates{$tmp} );
...
}

The error you asked about is generated when eval EXPR tries to grab the value of a variable that did exist, but no longer exists.
>perl -wE"{ my $x = 123; sub f { eval '$x' } } say '<'.f().'>';"
Variable "$x" is not available at (eval 1) line 2.
Use of uninitialized value in concatenation (.) or string at -e line 1.
<>
Remember, executing a file (such as a script or a module) is done in its own lexical scope, just like the one the curlies create above.
It can be fixed by keeping the variable alive by not letting it go out of scope
>perl -wE"my $x = 123; sub f { eval '$x' } say '<'.f().'>';"
<123>
But that's not an option for you.
Other options include making the variable a global variable.
>perl -wE"{ our $x = 123; sub f { eval '$x' } } say '<'.f().'>';"
<123>
Or forcing the sub to capture it so it doesn't cease to exist.
>perl -wE"{ my $x = 123; sub f { $x if 0; eval '$x' } } say '<'.f().'>';"
<123>
(The if 0 silences the "void context" warning.)
That said, it looks like you're trying to re-invent the wheel. Don't invent another half-assed templating system.

I'm looking at several things:
First of all, $text_tmp1 is not a package variable. It's lexically scoped since you declared it with my. If you need it as a package variable and want it to be visible in all of your subroutines, you need to declare it with our.
Your module doesn't compile as written. You are trying to source in $author, but it's not defined.
What are you doing with eval? This is wrong on so many levels.
Here's how I would do it:
#! /usr/bin/env perl
package Modules::TextStuff;
use strict;
use warnings;
use Exporter qw(import);
use Carp;
our @EXPORT_OK = qw(get_text);
our %templates; # This is now a package variable
#
# TEMPLATES
#
$templates{tmp1}=<<TEMPLATE; # We'll use `%s` for replacements
This is a template text
by %s.
TEMPLATE
$templates{tmp2}=<<TEMPLATE;
This is another template and we will substitute
in %s in this one too.
TEMPLATE
sub get_text {
my $template = shift;
my $author = shift;
if ( not exists $templates{$template} ) {
croak qq(Invalid template name "$template");
}
return sprintf $templates{$template}, $author;
}
1;
I'll make each of these templates an entry in my %templates hash. No need for eval to calculate out a variable name for the template. Also notice that I can now actually test whether the user passed in a valid template or not with the exists.
Also note that %templates is declared with our and not my. This makes it available in the entire package including any subroutines in my package.
I also use @EXPORT_OK instead of @EXPORT. It's considered more polite. You're requesting permission to pollute the user's namespace. It's like knocking on someone's door and asking if you can have a beer rather than barging in and rummaging through their fridge for a beer.
Note how I use sprintf to handle the replaceable parameters. This again removes the need for eval.
I also prefer to use #! /usr/bin/env perl on my program header since it's more compatible with things like Perlbrew. You're using /usr/bin/env to find the executable Perl program that's in the user's path. This way, you don't have to know whether it's /bin/perl, /usr/bin/perl, /usr/local/bin/perl, or $HOME/perl5/perlbrew/perls/perl-5.18.0/bin/perl
To use your module, I would do this:
#! /usr/bin/env perl
use strict;
use warnings;
use feature qw(say);
use Modules::TextStuff qw(get_text);
say get_text('tmp1','jim');
Pretty much the same call you made. This prints out:
This is a template text
by jim.

What does "1;" mean in Perl?

I have come across a few Perl modules that for example look similar to the following code:
package MyPackage;
use strict;
use warnings;
use constant PERL510 => ( $] >= 5.0100 );
require Exporter;
our @ISA = qw(Exporter);
our @EXPORT = qw( );
{ #What is the significance of this curly brace?
my $somevar;
sub Somesub {
#Some code here
}
}
1;
What is the significance of 1; and of the curly braces that enclose the $somevar and the Sub?

1 at the end of a module means that the module returns true to use/require statements. It can be used to tell if module initialization is successful. Otherwise, use/require will fail.
$somevar is a variable which is accessible only inside the block. It is used to simulate "static" variables. Starting from Perl 5.10 you can use the state keyword to get the same result:
## Starting from Perl 5.10 you can specify "static" variables directly.
use feature 'state'; # enables the state keyword (as does "use v5.10" or later)
sub Somesub {
state $somevar;
}

When you load a module "Foo" with use Foo or require(), perl executes the Foo.pm file like an ordinary script. It expects it to return a true value if the module was loaded correctly. The 1; does that. It could be 2; or "hey there"; just as well.
The block around the declaration of $somevar and the function Somesub limits the scope of the variable. That way, it is only accessible from Somesub and doesn't get cleared on each invocation of Somesub (which would be the case if it was declared inside the function body). This idiom has been superseded in recent versions of perl (5.10 and up) which have the state keyword.
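A minimal sketch comparing the two idioms, assuming Perl 5.10 or later; the sub names are made up for the example:

```perl
use strict;
use warnings;
use feature 'state';

{
    # Older idiom: a private variable shared only with the sub in this block.
    # The initialisation runs when the block is executed at run time, so the
    # sub must not be called before this point in the file has been reached.
    my $calls = 0;
    sub count_with_block { return ++$calls }
}

sub count_with_state {
    state $calls = 0;    # initialised once, on the first call
    return ++$calls;
}

print join(',', map { count_with_block() } 1 .. 3), "\n";   # 1,2,3
print join(',', map { count_with_state() } 1 .. 3), "\n";   # 1,2,3
```

In both versions the counter survives between calls and cannot be reached from outside the sub.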

Modules have to return a true value. 1 is a true value.

Perl modules must return something that evaluates to true. If they don't, Perl reports an error.
C:\temp>cat MyTest.pm
package MyTest;
use strict;
sub test { print "test\n"; }
#1; # commented out to show error
C:\temp>perl -e "use MyTest"
MyTest.pm did not return a true value at -e line 1.
BEGIN failed--compilation aborted at -e line 1.
C:\temp>
Although it's customary to use "1;", anything that evaluates to true will work.
C:\temp>cat MyTest.pm
package MyTest;
use strict;
sub test { print "test\n"; }
"false";
C:\temp>perl -e "use MyTest"
C:\temp> (no error here)
For obvious reasons another popular return value is 42.
There's a list of cool return values maintained at http://returnvalues.useperl.at/values.html.

The curly braces limit the scope of the local variable $somevar:
{
my $somevar;
...
} # $somevar's scope ends here

From the documentation for require:
The file must return true as the last
statement to indicate successful
execution of any initialization code,
so it's customary to end such a file
with 1; unless you're sure it'll
return true otherwise. But it's better
just to put the 1; , in case you add
more statements.

I don't know much about Perl, but usually you create a scope using curly braces. Probably $somevar shouldn't be available globally?
