Can variable declarations be placed in a common script

Can variable declarations be placed in a common script - perl

Before I start, the whole 'concept' may be technically impossible; hopefully someone will have more knowledge about such things, and advise me.
With Perl, you can "declare" global variables at the start of a script via my / our thus:
my ($a,$b,$c ..)
That's fine with a few unique variables. But I am using about 50 of them ... and the same names (not values) are used by five scripts. Rather than having to place huge my( ...) blocks at the start of each file, I'm wondering if there is a way to create them in one script. Note: Declare the namespace, not their values.
I have tried placing them all in a single file, with the shebang at the top, and a 1 at the bottom, and then tried "require", "use" and "do" to load them in. But - at certain times -the script complains it cannot find the global package name. (Maybe the "paths.pl" is setting up the global space relative to itself - which cannot be 'seen' by the other scripts)
Looking on Google, somebody suggested setting variables in the second file, and still setting the my in the calling script ... but that is defeating the object of what I'm trying to do, which is simply declare the name space once, and setting the values in another script
** So far, it seems if I go from a link in an HTML page to a perl script, the above method works. But when I call a script via XHTMLRequest using a similar setup, it cannot find the $a, $b, $c etc within the "paths" script
HTML
<form method="post" action="/cgi-bin/track/script1.pl>
<input type="submit" value="send"></form>
Perl: (script1.pl)
#shebang
require "./paths.pl"
$a=1;
$b="test";
print "content-type: text/html\n\n";
print "$a $b";
Paths.pl
our($a,
$b,
$c ...
)1;
Seems to work OK, with no errors. But ...
# Shebang
require "./paths.pl"
XHTMLREQUEST script1.pl
Now it complains it cannot find $a or $b etc as an "explicit package" for "script1.pl"
Am I moving into the territory of "modules" - of which I know little. Please bear in mind, I am NOT declaring values within the linked file, but rather setting up the 'global space' so that they can be used by all scripts which declare their own values.
(On a tangent, I thought - in the past - a file in the same directory could be accessed as "paths.pl" -but it won't accept that, and it insists on "./" Maybe this is part of the problem. I have tried absolute and relative paths too, from "url/cgi-bin/track/" to "/cgi-bin/track" but can't seem to get that to work either)
I'm fairly certain it's finding the paths file as I placed a "my value" before the require, and set a string within paths, and it was able to print it out.

First, lexical (my) variables only exist in their scope. A file is a scope, so they only exist in their file. You are now trying to work around that, and when you find yourself fighting the language that way, you should realize that you are doing it wrong.
You should move away from declaring all variables in one go at the top of a program. Declare them near the scope you want to use them, and declare them in the smallest scope possible.
You say that you want to "Set up a global space", so I think you might misunderstand something. If you want to declare a lexical variable in some scope, you just do it. You don't have to do anything else to make that possible.
Instead of this:
my( $foo, $bar, $baz );
$foo = 5;
sub do_it { $bar = 9; ... }
while( ... ) { $baz = 6; ... }
Declare the variable just where you want them:
my $foo = 5;
sub do_it { my $bar = 9; ... }
while( ... ) { my $baz = 6; ... }
Every lexical variable should exist in the smallest scope that can tolerate it. That way nothing else can mess with it and it doesn't retain values from previous operations when it shouldn't. That's the point of them, after all.
When you declare them to be file scoped, then don't declare them in the scope that uses them, you might have two unrelated uses of the same name conflicting with each other. One of the main benefits of lexical variables is that you don't have to know the names of any other variables in scope or in the program:
my( $foo, ... );
while( ... ) {
$foo = ...;
do_something();
...
}
sub do_something {
$foo = ...;
}
Are those uses of $foo in the while and the sub the same, or do they accidentally have the same name? That's a cruel question to leave up to the maintenance program.
If they are the same thing, make the subroutine get its value from its argument list instead. You can use the same names, but since each scope has it's own lexical variables, they don't interfere with each other:
while( ... ) {
my $foo = ...;
do_something($foo);
...
}
sub do_something {
my( $foo ) = #_;
}
See also:
How to share/export a global variable between two different perl scripts?
You say you aren't doing what I'm about to explain, but other people may want to do something similar to share values. Since you are sharing the same variable names across programs, I suspect that this is actually what it going on, though.
In that case, there are many modules on CPAN that can do that job. What you choose depends on what sort of stuff you are trying to share between programs. I have a chapter in Mastering Perl all about it.
You might be able to get away with something like this, where one module defines all the values and makes them available for export:
# in Local/Config.pm
package Local::Config;
use Exporter qw(import);
our #EXPORT = qw( $foo $bar );
our $foo = 'Some value';
our $bar = 'Different value';
1;
To use this, merely load it with use. It will automatically import the variables that you put in #EXPORT:
# in some program
use Local::Config;
We cover lots of this sort of stuff in Intermediate Perl.

What you want to do here is a form of boilerplate management. Shoving variable declarations into a module or class file. This is a laudable goal. In fact you should shove as much boilerplate into that other module as possible. It makes it far easier to keep consistent behavior across the many scripts in a project. However shoving variables in there will not be as easy as you think.
First of all, $a and $b are special variables reserved for use in sort blocks so they never have to be declared. So using them here will not validate your test. require always searches for the file in #INC. See perlfunc require.
To declare a variable it has to be done at compile time. our, my, and state all operate at compile time and legalize a symbol in a lexical scope. Since a module is a scope, and require and do both create a scope for that file, there is no way to have our (let alone my and state) reach back to a parent scope to declare a symbol.
This leaves you with two options. Export package globals back to the calling script or munge the script with a source filter. Both of these will give you heartburn. Remember that it has to be done at compile time.
In the interest of computer science, here's how you would do it (but don't do it).
#boilerplate.pm
use strict;
use vars qw/$foo $bar/;
1;
__END__
#script.pl
use strict;
use boilerplate;
$foo = "foo here";
use vars is how you declare package globals when strict is in effect. Package globals are unscoped ("global") so it doesn't matter what scope or file they're declared in. (NB: our does not create a global like my creates a lexical. our creates a lexical alias to a global, thus exposing whatever is there.) Notice that boilerplate.pm has no package declaration. It will inherit whatever called it which is what you want.
The second way using source filters is devious. You create a module that rewrites the source code of your script on the fly. See Filter::Simple and perlfilter for more information. This only works on real scripts, not perl -e ....
#boilerplate.pm
package boilerplate;
use strict; use diagnostics;
use Filter::Simple;
my $injection = '
our ($foo, $bar);
my ($baz);
';
FILTER { s/__FILTER__/$injection/; }
__END__
#script.pl
use strict; use diagnostics;
use boilerplate;
__FILTER__
$foo = "foo here";
You can make any number of filtering tokens or scenarios for code substitution. e.g. use boilerplate qw/D2_loadout/;
These are the only ways to do it with standard Perl. There are modules that let you meddle with calling scopes through various B modules but you're on your own there. Thanks for the question!
HTH

Related

Splitting perl module into multiple files

I'm a perl newbie who is working on a module that is getting quite long and I want to split it into multiple files for easier maintainability and organization. Right now it looks something like this
#ABC.pm
package ABC;
use strict;
use warnings;
my $var1;
my $var2;
sub func1 {
#some operations on a $var
}
sub func2 {
#some operations on a $var
}
return 1;
I'd like it to look something like
#ABC_Part_1.pm
package ABC;
use strict;
use warnings;
my $var1;
my $var2;
sub func1 {
#some operations on a $var
}
return 1;
#ABC_Part_2.pm
package ABC;
use strict;
use warnings;
sub func2 {
#some operations on a $var
}
return 1;
The issue I'm having is getting the variables to be seen across the separate files. I tried to declare them using 'our', but then I have to use the scope resolution operator which I don't want to do. I'd like to treat them as local variables within the module files, but have them hidden to the calling script. I'd also like to only have one include in the calling script, like
#!/usr/bin/env perl
#script.pl
use strict;
use warnings;
use ABC;
func1();
func2();
Thanks

The issue I'm having is getting the variables to be seen across the separate files.
Your best option is to stop wanting that.
The whole point of lexical variables is for them to be accessible in only a small locally visible bit of code. A variable that needs to be accessed from multiple different files is a sign of "code smell".
If you're really sure you want this though…
I tried to declare them using 'our', but then I have to use the scope resolution operator which I don't want to do.
Yes, this should work, but you need to declare the variable at the top of each file you use it in.
# ABC_part_1.pm
package ABC;
our $foo;
# code that accesses $foo goes here
1;
And then:
# ABC_part_2.pm
package ABC;
our $foo;
# code that accesses $foo goes here
1;

You can make your ABC.pm file a collection of require statements.
package ABC;
require ABC1;
require ABC2;
require ABC3;
1;
It's important to work with require instead of use, because use will try to import things automatically. But that will not work since there is no package ABC1 in your ABC1.pm file, so ABC1->import will fail.
Regarding the variables, there's really no way to get lexical variables into different files. You could use do instead of require, which would read and run the files directly in the line with the do. That way, the scope would stay the same, and you could have this.
package ABC;
my $foo;
my $bar;
do 'lib/ABC1.pm';
do 'lib/ABC2.pm';
Please do not do this. It's crazy!
If you feel that your library is getting too big, first add proper documentation to every function, and sort so that things that belong together are together. If that does not help you, split up the file into smaller logical units and make those individual packages that talk to each other through a defined interface, but also are able to stand alone where needed.
If repeating a bunch of use statements feels like too much boiler plate, write your own module collection (like strictures) using Import::Into.
Furthermore, don't use lexical variables in the file scope. If you want to have state for things, create object oriented code and write classes. Then you'll have state and behavior. If you have package/class data, use package variables.
Perl doesn't have the concept of private things for a reason. There are conventions in place to mark things as private, like naming them _stuff with a leading underscore. That's a sign for everyone that this is internal, not a stable API, might change at any moment and shouldn't be messed with. Do that, instead of trying to hide things. It's a strength of Perl to allow you to mess with everything. But that doesn't mean you have to do it. It's an option that you should embrace.

writing reflective method to load variables from conf file, and assigning references?

I'm working with ugly code and trying to do a cleanup by moving values in a module into a configuration file. I want to keep the modules default values if a variable doesn't exist in the conf file, otherwise use the conf file version. There are lots of variables (too many) in the module so I wanted a helper method to support this. This is a first refactoring step, I likely will go further to better handle config variables later, but one step at a time.
I want a method that would take a variable in my module and either load the value from conf or set a default. So something like this (writing this from scratch, so treat it as just pseudocode for now)
Our ($var_a, $var_b ...);
export($var_a, $var_b ...);
my %conf = #load config file
load_var(\$var_a, "foo");
load_var(\$var_b, "$var_abar");
sub load_var($$){
my($variable_ref, $default) = #_
my $variale_name = Dumper($$variable_ref); #get name of variable
my $variable_value = $conf{$variable_name} // $default;
#update original variable by having $variable_ref point to $variable_value
}
So two questions here. First, does anyone know if some functionality like my load_var already exists which I an reuse?
Second, if I have to write it from scratch, can i do it with a perl version older then 5.22? when I read perlref it refers to setting references as being a new feature in 5.22, but it seems odd that such a basic behavior of references wasn't implemented sooner, so I'm wonder if I'm misunderstanding the document. Is there a way to pass a variable to my load_var method and ensure it's actually updated?

For this sort of problem, I would be thinking along the lines of using the AUTOLOAD - I know it's not quite what you asked, but it's sort of doing a similar thing:
If you call a subroutine that is undefined, you would ordinarily get an immediate, fatal error complaining that the subroutine doesn't exist. (Likewise for subroutines being used as methods, when the method doesn't exist in any base class of the class's package.) However, if an AUTOLOAD subroutine is defined in the package or packages used to locate the original subroutine, then that AUTOLOAD subroutine is called with the arguments that would have been passed to the original subroutine.
Something like:
#!/usr/bin/env perl
package Narf;
use Data::Dumper;
use strict;
use warnings;
our $AUTOLOAD;
my %conf = ( fish => 1,
carrot => "banana" );
sub AUTOLOAD {
print "Loading $AUTOLOAD\n";
##read config file
my $target = $AUTOLOAD =~ s/.*:://gr;
print $target;
return $conf{$target} // 0;
}
sub boo {
print "Boo!\n";
}
You can either call it OO style, or just 'normally' - but bear in mind this creates subs, not variables, so you might need to specify the package (or otherwise 'force' an import/export)
#!/usr/bin/env perl
use strict;
use warnings;
use Narf;
print Narf::fish(),"\n";
print Narf::carrot(),"\n";
print Narf::somethingelse(),"\n";
print Narf::boo;
Note - as these are autoloaded, they're not in the local namespace. Related to variables you've got this perlmonks discussion but I'm not convinced that's a good line to take, for all the reasons outlined in Why it's stupid to `use a variable as a variable name'

Why declare Perl variable with "my" at file scope?

I'm learning Perl and trying to understand variable scope. I understand that my $name = 'Bob'; will declare a local variable inside a sub, but why would you use the my keyword at the global scope? Is it just a good habit so you can safely move the code into a sub?
I see lots of example scripts that do this, and I wonder why. Even with use strict, it doesn't complain when I remove the my. I've tried comparing behaviour with and without it, and I can't see any difference.
Here's one example that does this:
#!/usr/bin/perl
use strict;
use warnings;
use DBI;
my $dbfile = "sample.db";
my $dsn = "dbi:SQLite:dbname=$dbfile";
my $user = "";
my $password = "";
my $dbh = DBI->connect($dsn, $user, $password, {
PrintError => 0,
RaiseError => 1,
AutoCommit => 1,
FetchHashKeyName => 'NAME_lc',
});
# ...
$dbh->disconnect;
Update
It seems I was unlucky when I tested this behaviour. Here's the script I tested with:
use strict;
my $a = 5;
$b = 6;
sub print_stuff() {
print $a, $b, "\n"; # prints 56
$a = 55;
$b = 66;
}
print_stuff();
print $a, $b, "\n"; # prints 5566
As I learned from some of the answers here, $a and $b are special variables that are already declared, so the compiler doesn't complain. If I change the $b to $c in that script, then it complains.
As for why to use my $foo at the global scope, it seems like the file scope may not actually be the global scope.

The addition of my was about the best thing that ever happened to Perl and the problem it solved was typos.
Say you have a variable $variable. You do some assignments and comparisons on this variable.
$variable = 5;
# intervening assignments and calculations...
if ( $varable + 20 > 25 ) # don't use magic numbers in real code
{
# do one thing
}
else
{
# do something else
}
Do you see the subtle bug in the above code that happens if you don't use strict; and require variables be declared with my? The # do one thing case will never happen. I encountered this several times in production code I had to maintain.

A few points:
strict demands that all variables be declared with a my (or state) or installed into the package--declared with an our statement or a use vars pragma (archaic), or inserted into the symbol table at compile time.
They are that file's variables. They remain of no concern and no use to any module required during the use of that file.
They can be used across packages (although that's a less good reason.)
Lexical variables don't have any of the magic that the only alternative does. You can't "push" and "pop" a lexical variable as you change scope, as you can with any package variable. No magic means faster and plainer handling.
Laziness. It's just easier to declare a my with no brackets as opposed to concentrating its scope by specific bracketing.
{ my $visible_in_this_scope_only;
...
sub bananas {
...
my $bananas = $visible_in_this_scope_only + 3;
...
}
} # End $visible_in_this_scope_only
(Note on the syntax: in my code, I never use a bare brace. It will always tell you, either before (standard loops) or after what the scope is for, even if it would have been "obvious".

It's just good practice. As a personal rule, I try to keep variables in the smallest scope possible. If a line of code can't see a variable, then it can't mess with it in unexpected ways.
I'm surprised that you found that the script worked under use strict without the my, though. That's generally not allowed:
$ perl -E 'use strict; $db = "foo"; say $db'
Global symbol "$db" requires explicit package name at -e line 1.
Global symbol "$db" requires explicit package name at -e line 1.
Execution of -e aborted due to compilation errors.
$ perl -E 'use strict; my $db = "foo"; say $db'
foo
Variables $a and $b are exempt:
$ perl -E 'use strict; $b = "foo"; say $b'
foo
But I don't know how you would make the code you posted work with strict and a missing my.

A sub controls/limits the scope of variables between the braces {} that define its operations. Of course many variables exist outside of a particular function and using lexical my for "global" variables can give you more control over how "dynamic" their behavior is inside your application. The Private Variables via my() section of perlodocperlsub discusses reasons for doing this pretty thoroughly.
I'm going to quote myself from elsewhere which is not the best thing to do on SO but here goes:
The classic perlmonks node - Variable Scoping in Perl: the
basics - is a frequently
consulted reference :-)
As I noted in a comment, Bruce Gray's talk at YAPC::NA 2012 - The why of my() is a good story about how a pretty expert perl programmer wrapped his head around perl and namespaces.
I've heard people explain my as Perl's equivalent to Javascript's var - it's practically necessary but, Perl being perl, things will work without it if you insist or take pains to make it do that.
ps: Actually with Javascript, I guess functions are used to control "scope" in a way that is analagous to your description of using my in sub's.

Perl lexically scoped variables effiency in declaration

Final Edit: I'm making one last edit to this question for clarity sake. I believe it has been answered in all of the comments, but in the event there is such an option, I think it's best to clean up the question for future readers.
I recently inherited someone's Perl code and there's an interesting setup and I'm wondering if it can be made more efficient.
The program is setup to use strict and warnings (as it should). The program is setup to use global variables; meaning the variables are declared and initialized to undef at the top of the program. Specific values beyond the initial undef are then assigned to the variables throughout various loops within the program. After they are set, there is a separate report section (after all internal loops and subroutines have run) that uses the variables for its content.
There are quite a few variables and it seems repetitive to just make an initial variable declaration at the top of the program so that it is available later for output/report purposes (and to stay compliant with the strict pragma). Based on my understanding and from all of the comments received thus far, it seems this is the only way to do it, as lexically scoped variables by definition only persist in during their declared scope. So, in order to make it global it needs to be declared early (i.e. at the top).
I'm just wondering if there is a shortcut to elevate the scope of a variable to be global regardless of where it's declared, and still stay within strict's parameters of course?
My previous example was confusing, so I'll write a pseudo code example here to convey the concept (I don't have the original source with me):
# Initial declaration to create "Global variables" -- imagine about 30 or so
my ($var1, $var2, $var3); # Global
# Imagine a foreach loop going through a hash and assigning values
# to previously declared variables
for my $k (keys %h){
...
$var = 1; #$var gets set to 1 within the foreach loop
...
}
print "The final calculated report is as follows:\n";
#output $var after loop is done later in program
print "Variable 1 comes to $var1\n";
So the question: is there an acceptable shortcut that declares $var1 within the foreach loop above and escalates its scope beyond the foreach loop, so it is in effect "Global"? This would avoid the need to declare and initialize it to undef at the top of the program and still make it available for use in the program's output.
Based on feedback already received the answer seems to be an emphatic "No" due to the defined scoping constraints.

As per your own question:
So the question: is there an acceptable shortcut that declares $var1 within the foreach loop above and escalates its scope beyond the foreach loop, so it is in effect "Global"? This would avoid the need to declare and initialize it to undef at the top of the program and still make it available for use in the program's output.
You can refer to a variable name without having declared it first by using the whole name including namespace:
use strict;
use warnings;
my %h = (
first => 'value 1',
second => 'value 2',
third => 'value 3',
);
for my $k (keys %h) {
print "Processing key $k...\n";
$main::var = 1;
}
print "Variable var is now $main::var\n";
I'm assuming main as your namespace, which is the default. If your script is declaring a package, then you need to use that name. Also, if you didn't declare the variable first, you will need to use the whole package::name format everytime.
However, and just so it's clear: You don't need to "declare and initialize a variable to undef". If you do this:
my ($var1, $var2, $var3);
Those variables are already initialized to undef.
Now, you also need to understand the difference between a lexical and a global variable. Using the keyword my to declare variable at the top of your script will make them available in all the rest of your script, no matter if you are inside or outside a block. But they are not really global as they are not visible to any external module or script. If you change the keyword my to our, then they are global and visible anywhere outside that script.

I'm just wondering if there is a shortcut to elevate the scope of a variable to be global regardless of where it's declared, and still stay within strict's parameters of course?
Using global variables is dumb, but possible.
use vars qw( $foo );
It still needs to be present before any uses.

Is it a design flaw that Perl subs aren't lexically scoped?

{
sub a {
print 1;
}
}
a;
A bug,is it?
a should not be available from outside.
Does it work in Perl 6*?
* Sorry I don't have installed it yet.

Are you asking why the sub is visible outside the block? If so then its because the compile time sub keyword puts the sub in the main namespace (unless you use the package keyword to create a new namespace). You can try something like
{
my $a = sub {
print 1;
};
$a->(); # works
}
$a->(); # fails
In this case the sub keyword is not creating a sub and putting it in the main namespace, but instead creating an anonymous subroutine and storing it in the lexically scoped variable. When the variable goes out of scope, it is no longer available (usually).
To read more check out perldoc perlsub
Also, did you know that you can inspect the way the Perl parser sees your code? Run perl with the flag -MO=Deparse as in perl -MO=Deparse yourscript.pl. Your original code parses as:
sub a {
print 1;
}
{;};
a ;
The sub is compiled first, then a block is run with no code in it, then a is called.
For my example in Perl 6 see: Success, Failure. Note that in Perl 6, dereference is . not ->.
Edit: I have added another answer about new experimental support for lexical subroutines expected for Perl 5.18.

In Perl 6, subs are indeed lexically scoped, which is why the code throws an error (as several people have pointed out already).
This has several interesting implications:
nested named subs work as proper closures (see also: the "will not stay shared" warning in perl 5)
importing of subs from modules works into lexical scopes
built-in functions are provided in an outer lexical scope (the "setting") around the program, so overriding is as easy as declaring or importing a function of the same name
since lexpads are immutable at run time, the compiler can detect calls to unknown routines at compile time (niecza does that already, Rakudo only in the "optimizer" branch).

Subroutines are package scoped, not block scoped.
#!/usr/bin/perl
use strict;
use warnings;
package A;
sub a {
print 1, "\n";
}
a();
1;
package B;
sub a {
print 2, "\n";
}
a();
1;

Named subroutines in Perl are created as global names. Other answers have shown how to create a lexical subroutines by assigning an anonymous sub to a lexical variable. Another option is to use a local variable to create a dynamically scoped sub.
The primary differences between the two are call style and visibility. The dynamically scoped sub can be called like a named sub, and it will also be globally visible until the block it is defined in is left.
use strict;
use warnings;
sub test_sub {
print "in test_sub\n";
temp_sub();
}
{
local *temp_sub = sub {
print "in temp_sub\n";
};
temp_sub();
test_sub();
}
test_sub();
This should print
in temp_sub
in test_sub
in temp_sub
in test_sub
Undefined subroutine &main::temp_sub called at ...

At the risk of another scolding by #tchrist, I am adding another answer for completeness. The as yet to be released Perl 5.18 is expected to include lexical subroutines as an experimental feature.
Here is a link to the relevant documentation. Again, this is very experimental, it should not be used for production code for two reasons:
It might not be well implemented yet
It might be removed without notice
So play with this new toy if you want, but you have been warned!

If you see the code compile, run and print "1", then you are not experiencing a bug.
You seem to be expecting subroutines to only be callable inside the lexical scope in which they are defined. That would be bad, because that would mean that one wouldn't be able to call subroutines defined in other files. Maybe you didn't realise that each file is evaluated in its own lexical scope? That allows the likes of
my $x = ...;
sub f { $x }

Yes, I think it is a design flaw - more specifically, the initial choice of using dynamic scoping rather than lexical scoping made in Perl, which naturally leads to this behavior. But not all language designers and users would agree. So the question you ask doesn't have a clear answer.
Lexical scoping was added in Perl 5, but as an optional feature, you always need to indicate it specifically. With that design choice I fully agree: backward compatibility is important.