Splitting perl module into multiple files - perl

I'm a perl newbie who is working on a module that is getting quite long and I want to split it into multiple files for easier maintainability and organization. Right now it looks something like this
#ABC.pm
package ABC;
use strict;
use warnings;
my $var1;
my $var2;
sub func1 {
#some operations on a $var
}
sub func2 {
#some operations on a $var
}
return 1;
I'd like it to look something like
#ABC_Part_1.pm
package ABC;
use strict;
use warnings;
my $var1;
my $var2;
sub func1 {
#some operations on a $var
}
return 1;
#ABC_Part_2.pm
package ABC;
use strict;
use warnings;
sub func2 {
#some operations on a $var
}
return 1;
The issue I'm having is getting the variables to be seen across the separate files. I tried to declare them using 'our', but then I have to use the scope resolution operator which I don't want to do. I'd like to treat them as local variables within the module files, but have them hidden to the calling script. I'd also like to only have one include in the calling script, like
#!/usr/bin/env perl
#script.pl
use strict;
use warnings;
use ABC;
func1();
func2();
Thanks

The issue I'm having is getting the variables to be seen across the separate files.
Your best option is to stop wanting that.
The whole point of lexical variables is for them to be accessible in only a small locally visible bit of code. A variable that needs to be accessed from multiple different files is a sign of "code smell".
If you're really sure you want this though…
I tried to declare them using 'our', but then I have to use the scope resolution operator which I don't want to do.
Yes, this should work, but you need to declare the variable at the top of each file you use it in.
# ABC_part_1.pm
package ABC;
our $foo;
# code that accesses $foo goes here
1;
And then:
# ABC_part_2.pm
package ABC;
our $foo;
# code that accesses $foo goes here
1;
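To see that both declarations refer to the same variable, here is a minimal single-file sketch. The two bare blocks only stand in for the two files, and the names set_foo and get_foo are made up for the illustration:

```perl
use strict;
use warnings;

# Each bare block stands in for one file (ABC_part_1.pm / ABC_part_2.pm).
{
    package ABC;
    our $foo;    # lexical alias to the package variable $ABC::foo

    sub set_foo { $foo = shift; return }
}
{
    package ABC;
    our $foo;    # a second alias to the very same $ABC::foo

    sub get_foo { return $foo }
}

ABC::set_foo(42);
print ABC::get_foo(), "\n";
```

Because our only creates an alias to the package variable, the value written through the first alias is what the second one reads. The variable is still reachable from outside as $ABC::foo, so this shares state but does not hide it.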

You can make your ABC.pm file a collection of require statements.
package ABC;
require ABC1;
require ABC2;
require ABC3;
1;
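It's better to work with require instead of use here, because use would also call ABC1->import at compile time. There is no package ABC1 in your ABC1.pm file, so there is nothing to import. (Perl silently skips the call when no import method exists, so use would not actually fail, but require says exactly what you mean.)
Regarding the variables, there's really no way to get lexical variables into different files. You might be tempted to use do instead of require, which reads and runs each file at the point of the do, like this. Note, however, that perlfunc documents that code evaluated with do FILE cannot see lexicals in the enclosing scope, so $foo and $bar below would still not be visible inside the loaded files.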
package ABC;
my $foo;
my $bar;
do 'lib/ABC1.pm';
do 'lib/ABC2.pm';
Please do not do this. It's crazy!
If you feel that your library is getting too big, first add proper documentation to every function, and sort so that things that belong together are together. If that does not help you, split up the file into smaller logical units and make those individual packages that talk to each other through a defined interface, but also are able to stand alone where needed.
If repeating a bunch of use statements feels like too much boilerplate, write your own module collection (like strictures) using Import::Into.
Furthermore, don't use lexical variables in the file scope. If you want to have state for things, create object oriented code and write classes. Then you'll have state and behavior. If you have package/class data, use package variables.
Perl doesn't have the concept of private things for a reason. There are conventions in place to mark things as private, like naming them _stuff with a leading underscore. That's a sign for everyone that this is internal, not a stable API, might change at any moment and shouldn't be messed with. Do that, instead of trying to hide things. It's a strength of Perl to allow you to mess with everything. But that doesn't mean you have to do it. It's an option that you should embrace.
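As a rough sketch of what "state and behavior" plus the underscore convention can look like, assuming a made-up class called Counter (nothing from the question's module):

```perl
use strict;
use warnings;

{
    package Counter;    # hypothetical class, for illustration only

    sub new {
        my ($class, %args) = @_;
        # State lives in the object, not in file-scoped lexicals.
        my $self = { count => $args{start} // 0 };
        return bless $self, $class;
    }

    # Public interface.
    sub increment {
        my ($self) = @_;
        $self->_bump(1);
        return $self->count;
    }

    sub count { return $_[0]{count} }

    # The leading underscore marks this as internal by convention only;
    # Perl does not stop anyone from calling it.
    sub _bump {
        my ($self, $by) = @_;
        $self->{count} += $by;
        return;
    }
}

my $counter = Counter->new(start => 5);
$counter->increment for 1 .. 3;
print $counter->count, "\n";
```

The methods could live in as many files as you like, since they only need the object they are called on, not a shared lexical.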

Related

Can variable declarations be placed in a common script

Before I start, the whole 'concept' may be technically impossible; hopefully someone will have more knowledge about such things, and advise me.
With Perl, you can "declare" global variables at the start of a script via my / our thus:
my ($a,$b,$c ..)
That's fine with a few unique variables. But I am using about 50 of them ... and the same names (not values) are used by five scripts. Rather than having to place huge my( ...) blocks at the start of each file, I'm wondering if there is a way to create them in one script. Note: Declare the namespace, not their values.
I have tried placing them all in a single file, with the shebang at the top, and a 1 at the bottom, and then tried "require", "use" and "do" to load them in. But at certain times the script complains it cannot find the global package name. (Maybe the "paths.pl" is setting up the global space relative to itself, which cannot be 'seen' by the other scripts.)
Looking on Google, somebody suggested setting variables in the second file, and still setting the my in the calling script ... but that is defeating the object of what I'm trying to do, which is simply declare the name space once, and setting the values in another script
** So far, it seems if I go from a link in an HTML page to a perl script, the above method works. But when I call a script via XHTMLRequest using a similar setup, it cannot find the $a, $b, $c etc within the "paths" script
HTML
<form method="post" action="/cgi-bin/track/script1.pl">
<input type="submit" value="send"></form>
Perl: (script1.pl)
#shebang
require "./paths.pl";
$a=1;
$b="test";
print "content-type: text/html\n\n";
print "$a $b";
Paths.pl
our($a,
$b,
$c ...
);
1;
Seems to work OK, with no errors. But ...
# Shebang
require "./paths.pl";
XHTMLREQUEST script1.pl
Now it complains it cannot find $a or $b etc as an "explicit package" for "script1.pl"
Am I moving into the territory of "modules" - of which I know little. Please bear in mind, I am NOT declaring values within the linked file, but rather setting up the 'global space' so that they can be used by all scripts which declare their own values.
(On a tangent, I thought - in the past - a file in the same directory could be accessed as "paths.pl" -but it won't accept that, and it insists on "./" Maybe this is part of the problem. I have tried absolute and relative paths too, from "url/cgi-bin/track/" to "/cgi-bin/track" but can't seem to get that to work either)
I'm fairly certain it's finding the paths file as I placed a "my value" before the require, and set a string within paths, and it was able to print it out.
First, lexical (my) variables only exist in their scope. A file is a scope, so they only exist in their file. You are now trying to work around that, and when you find yourself fighting the language that way, you should realize that you are doing it wrong.
You should move away from declaring all variables in one go at the top of a program. Declare them near the scope you want to use them, and declare them in the smallest scope possible.
You say that you want to "Set up a global space", so I think you might misunderstand something. If you want to declare a lexical variable in some scope, you just do it. You don't have to do anything else to make that possible.
Instead of this:
my( $foo, $bar, $baz );
$foo = 5;
sub do_it { $bar = 9; ... }
while( ... ) { $baz = 6; ... }
Declare the variables just where you want them:
my $foo = 5;
sub do_it { my $bar = 9; ... }
while( ... ) { my $baz = 6; ... }
Every lexical variable should exist in the smallest scope that can tolerate it. That way nothing else can mess with it and it doesn't retain values from previous operations when it shouldn't. That's the point of them, after all.
When you declare them at file scope instead of in the scope that uses them, you might have two unrelated uses of the same name conflicting with each other. One of the main benefits of lexical variables is that you don't have to know the names of any other variables in scope or in the program:
my( $foo, ... );
while( ... ) {
$foo = ...;
do_something();
...
}
sub do_something {
$foo = ...;
}
Are those uses of $foo in the while and the sub the same, or do they accidentally have the same name? That's a cruel question to leave up to the maintenance programmer.
If they are the same thing, make the subroutine get its value from its argument list instead. You can use the same names, but since each scope has its own lexical variables, they don't interfere with each other:
while( ... ) {
my $foo = ...;
do_something($foo);
...
}
sub do_something {
my( $foo ) = @_;
}
See also:
How to share/export a global variable between two different perl scripts?
You say you aren't doing what I'm about to explain, but other people may want to do something similar to share values. Since you are sharing the same variable names across programs, I suspect that this is actually what is going on, though.
In that case, there are many modules on CPAN that can do that job. What you choose depends on what sort of stuff you are trying to share between programs. I have a chapter in Mastering Perl all about it.
You might be able to get away with something like this, where one module defines all the values and makes them available for export:
# in Local/Config.pm
package Local::Config;
use Exporter qw(import);
our @EXPORT = qw( $foo $bar );
our $foo = 'Some value';
our $bar = 'Different value';
1;
To use this, merely load it with use. It will automatically import the variables that you put in @EXPORT:
# in some program
use Local::Config;
We cover lots of this sort of stuff in Intermediate Perl.
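If you want to try the export without creating a separate file, something like this single-file sketch should behave the same way. Inlining the module in a BEGIN block and setting %INC by hand are only tricks to keep the demo self-contained; in real code Local::Config would sit in Local/Config.pm:

```perl
use strict;
use warnings;

BEGIN {
    package Local::Config;
    use Exporter qw(import);

    our @EXPORT = qw( $foo $bar );
    our $foo    = 'Some value';
    our $bar    = 'Different value';

    # Pretend the file was already loaded so that "use" skips the disk lookup.
    $INC{'Local/Config.pm'} = __FILE__;
}

use Local::Config;    # calls Local::Config->import, which exports $foo and $bar

print "$foo / $bar\n";
```

The imported names are aliases to $Local::Config::foo and $Local::Config::bar, so an assignment in the program changes the value every other importer sees.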
What you want to do here is a form of boilerplate management. Shoving variable declarations into a module or class file. This is a laudable goal. In fact you should shove as much boilerplate into that other module as possible. It makes it far easier to keep consistent behavior across the many scripts in a project. However shoving variables in there will not be as easy as you think.
First of all, $a and $b are special variables reserved for use in sort blocks so they never have to be declared. So using them here will not validate your test. Also, require searches for a file in @INC unless the path starts with /, ./ or ../, which is why "./paths.pl" is found while a bare "paths.pl" may not be. See perlfunc require.
To declare a variable it has to be done at compile time. our, my, and state all operate at compile time and legalize a symbol in a lexical scope. Since a module is a scope, and require and do both create a scope for that file, there is no way to have our (let alone my and state) reach back to a parent scope to declare a symbol.
This leaves you with two options. Export package globals back to the calling script or munge the script with a source filter. Both of these will give you heartburn. Remember that it has to be done at compile time.
In the interest of computer science, here's how you would do it (but don't do it).
#boilerplate.pm
use strict;
use vars qw/$foo $bar/;
1;
__END__
#script.pl
use strict;
use boilerplate;
$foo = "foo here";
use vars is how you declare package globals when strict is in effect. Package globals are unscoped ("global") so it doesn't matter what scope or file they're declared in. (NB: our does not create a global like my creates a lexical. our creates a lexical alias to a global, thus exposing whatever is there.) Notice that boilerplate.pm has no package declaration. It will inherit whatever called it which is what you want.
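The note about our being a lexical alias can be checked with a small sketch. The package names Alpha and Beta are invented for the demo:

```perl
use strict;
use warnings;

package Alpha;
our $name = 'alpha';    # lexical alias: $name now means $Alpha::name

package Beta;
# Same file, same lexical scope, so $name is still $Alpha::name here.
print "$name\n";

# Nothing called "name" was ever created in package Beta.
print exists $Beta::{name} ? "Beta has a name\n" : "Beta has no name\n";

package main;
print "$Alpha::name\n";    # the global itself is reachable from anywhere
```

The alias follows the lexical scope (the file or block), while the variable it points to stays in the package where our was written.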
The second way using source filters is devious. You create a module that rewrites the source code of your script on the fly. See Filter::Simple and perlfilter for more information. This only works on real scripts, not perl -e ....
#boilerplate.pm
package boilerplate;
use strict; use diagnostics;
use Filter::Simple;
my $injection = '
our ($foo, $bar);
my ($baz);
';
FILTER { s/__FILTER__/$injection/; }
__END__
#script.pl
use strict; use diagnostics;
use boilerplate;
__FILTER__
$foo = "foo here";
You can make any number of filtering tokens or scenarios for code substitution. e.g. use boilerplate qw/D2_loadout/;
These are the only ways to do it with standard Perl. There are modules that let you meddle with calling scopes through various B modules but you're on your own there. Thanks for the question!
HTH

how to access variables in imported module in local scope in perl?

I am stuck while creating a perl Moose module.
I have a global pm module.
package XYZ;
require Exporter;
our @ISA = qw(Exporter); ## EDIT missed this line
our @EXPORT_OK = qw($VAR);
my $VAR1 = 1;
our $VAR = {'XYZ' => $VAR1};
1;
I want to get $VAR in a Moose module I'm creating
package THIS;
use Moose;
use YAML::XS;
sub get_all_blocks{
my ($self) = @_;
require $self->get_pkg(); # this returns the full path+name of the above package
# i cannot use use lib+use since the get_pkg starts complaining
our $VAR;
print YAML::XS::Dump($XYZ::VAR); # this works
print YAML::XS::Dump($VAR); # this does not work
# i cannot use the scope resolution since XYZ would keep changing.
}
1;
Can someone please help me with accessing variable?
EDIT: Missed one line in the package XYZ code.
I cannot touch the package XYZ since it is owned/used by someone else, I can just use it :(
Exporting variables may easily lead to trouble.
Why not
package XYZ;
use strict;
use warnings;
use Exporter qw(import);
our @EXPORT_OK = qw(get_var);
my $VAR = '...'; # no need for "our" now
sub get_var { return $VAR }
...
1;
and then
package THIS;
use warnings;
use strict;
use XYZ qw(get_var);
my $var = get_var();
...
1;
See Exporter.
As for what you tried to do, there are two direct problems
$VAR from XYZ is never imported into THIS. If you need symbols from other packages you need to import them.† Those packages have to make them available first, so you need to add it to @EXPORT_OK as well.
Like above but with $VAR instead of get_var()
package XYZ;
...
use Exporter qw(import);
our @EXPORT_OK = qw($VAR);
our $VAR = '...'; # need be "our" for this
with
package THIS;
...
use XYZ qw($VAR);
print "$VAR\n";
Now $VAR can be used directly, including being written to (unless declared constant); that can change its value under the feet of yet other code, which may never even know about any of it.
Another way is to use @EXPORT, and then those symbols are introduced into every program that says use Package;. I strongly recommend using only @EXPORT_OK, so that callers need to explicitly list what they want. That also nicely documents what is being used.
Even once you add that, there is still a variable with the same name in THIS, which hides (masks, shadows) the $XYZ::VAR. So remove our $VAR in THIS. This is an excellent example of one problem with globals. Once they're introduced we have to be careful about them always and everywhere.
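The masking can be reproduced in one file. This is only a sketch of the effect, with the two packages inlined as blocks and the helper names local_var_state and xyz_value invented for the demo:

```perl
use strict;
use warnings;

{
    package XYZ;
    our $VAR = { XYZ => 1 };
}
{
    package THIS;
    our $VAR;    # alias to $THIS::VAR, a different and still unset variable

    sub local_var_state { return defined $VAR ? 'defined' : 'undefined' }
    sub xyz_value       { return $XYZ::VAR->{XYZ} }
}

print THIS::local_var_state(), "\n";    # the bare $VAR inside THIS
print THIS::xyz_value(),       "\n";    # the fully qualified $XYZ::VAR
```

The bare $VAR inside THIS never sees the data in XYZ, which matches the "this does not work" line in the question.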
But there are far greater problems with sharing variables across modules.
It makes code components entangled and the code gets harder and harder to work with. It runs contrary to principles of well defined scopes and modular design, it enables action at a distance, etc. Perl provides many good tools for structuring code and we rarely need globals and shared variables. It is telling that the Exporter itself warns against that.
Note how now my $VAR in XYZ is not visible outside XYZ; there is no way for any code outside XYZ to know about it or to access it.‡ When it is our then any code in the interpreter can write it simply as $XYZ::VAR, and without even importing it; that's what we don't want.
Of course, there may be a need for, or a good use of, exporting variables, and this can occasionally be found in modules. That is an exception though, to be used sparingly and carefully.
† Unless they're declared as package globals under a lexical alias via our in their package, in which case they can be used anywhere as $TheirPackageName::varname.
‡ This complete privacy is courtesy of my.
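A quick way to convince yourself of that privacy is to look at the package's symbol table. This is a minimal sketch with XYZ inlined as a block and a placeholder value:

```perl
use strict;
use warnings;

{
    package XYZ;
    my $VAR = 'secret';    # lexical: never entered into the symbol table

    sub get_var { return $VAR }
}

print XYZ::get_var(), "\n";

# %XYZ:: is the package's symbol table (stash).
print exists $XYZ::{get_var} ? "get_var is in the stash\n" : "get_var is missing\n";
print exists $XYZ::{VAR}     ? "VAR is in the stash\n"     : "VAR is not in the stash\n";
```

Only the sub shows up in the stash, so outside code has nothing named $XYZ::VAR to read or overwrite; the getter is the only way in.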
You do not want our $VAR; in THIS's namespace. That creates a lexical reference to $THIS::VAR. Not what you want.
Instead, you need to use properly:
use XYZ qw($VAR);
However, XYZ doesn't have an import to run here, so you need to update that. There are two ways to fix XYZ to do this - one is to import import, e.g., use Exporter qw(import);, the other is to derive off Exporter, e.g., use parent qw(Exporter);. Both of these will get XYZ->import(...) to work properly.
Once XYZ is using Exporter correctly, then the use XYZ qw($VAR); line will cause perl to implicitly load XYZ and call XYZ->import(qw($VAR)), which will import that variable into your namespace.
Now, having answered your question, I will join others in suggesting that exporting variables is a very bad code smell, and probably is not the best / cleanest way to do what you want.

When is our needed in a Perl program?

It seems that our is only needed for exposing a (global) variable in a package. In other contexts, its use only helps readability but is not required.
And if the above observation is right, then by following the practice of encapsulation, it's not even needed in a package, because my would be used, and getter and setter would be provided.
Assuming my application can completely be implemented using OOD, and that within a package, data is strictly passed around using args to subroutines, would I then completely obviate the need for our?
Use our when...
You're required to use a global variable.
You want to use local (which you probably shouldn't).
In general you're correct, anything you might do as a global variable could be done with a class method accessor gaining all the advantages of encapsulation.
For example...
package Foo;
use strict;
use warnings;
our $Thing = 42;
compared to...
package Foo;
use strict;
use warnings;
sub thing { 42 }
What happens if $Foo::Thing is no longer a simple constant? What if it's something that turns out to be expensive to calculate and rarely used? By encapsulating with Foo->thing you can do the calculation only when needed.
It also allows subclasses to override class information.
package Bar;
our @ISA = qw(Foo);
sub thing { 23 }
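Put together in one file, the override can be sketched like this. The describe method is an addition for the demo, not part of the snippets above:

```perl
use strict;
use warnings;

{
    package Foo;
    sub thing { 42 }

    # Calls thing() on whatever class it was invoked on.
    sub describe {
        my ($class) = @_;
        return "$class thing is " . $class->thing;
    }
}
{
    package Bar;
    our @ISA = qw(Foo);    # Bar inherits from Foo
    sub thing { 23 }       # and overrides the class data
}

print Foo->describe, "\n";
print Bar->describe, "\n";
```

With a plain our $Thing there would be no such hook: code reading $Foo::Thing always gets Foo's value, whatever subclass is in play.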
And that brings us to when to use our: when you have to. There are a lot of Perl features and libraries that read global variables either by convention or implementation. The most common examples are @ISA for subclassing, $VERSION, and the salad of Exporter variables like @EXPORT.
There are better ways to do this, and many modules like Exporter have replacements, but many of these conventions were laid down when Perl 5 wasn't comfortable with OO.
There is one final use of our and that's to take advantage of local. It can be used to pass extra data around without changing the function signatures. The original value is automatically restored when the function exits.
our $foo;
sub something {
# ...do something involving $foo and set $stuff...
local $foo = $stuff;
something();
}
Yes, this is a poor example.
The circumstances where this is useful and advisable are, again, indicative of bad design. Usually it's used to pass extra data between functions without changing their signature, often as part of recursion. File::Find is littered with this technique. Run perldoc -m File::Find and poke around.
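For something runnable, here is a small sketch of the same idea: a package variable carries the recursion depth, and local restores it on the way out. The names $depth, descend and current_depth are invented for the demo:

```perl
use strict;
use warnings;

our $depth = 0;    # package variable, so local can be applied to it

sub current_depth { return $depth }

sub descend {
    my ($levels) = @_;
    # Temporarily replace the value; it is restored when this call returns.
    local $depth = $depth + 1;
    return $levels > 1 ? descend($levels - 1) : current_depth();
}

print descend(3), "\n";         # value seen at the innermost call
print current_depth(), "\n";    # back to the original value afterwards
```

Note that current_depth takes no arguments and still sees the temporary value, which is exactly the "extra data without changing signatures" trick, and also why it is hard to follow in larger code.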
It is needed in certain circumstances, such as accessing the class variables from other locations where you don't need/can't use a getter:
package Package;
our $VERSION = '0.01';
1;
Now:
perl -wMstrict -MPackage -E 'say $Package::VERSION'
0.01
If the $VERSION variable was declared with my:
Use of uninitialized value $Package::VERSION in say at -e line 1.
That is, the variable is not visible outside of the file, because a my variable is lexical: it exists only in the enclosing file or block and is never entered into the package's symbol table, so $Package::VERSION refers to a different, empty variable.
You must also use our if you are exporting variables:
package Package;
use Exporter;
our @ISA = 'Exporter';
our @EXPORT = qw($x);
our $x = 10;
1;
This will print 10:
perl -wMstrict -MPackage -E 'say $x'
...but with my, you'll get the same warning as above.

Globally declare a hash in perl

I have a hash which is present in the main perl script (.pl). I want the hash to be visible to the modules (.pm) that are called in the main perl script. How can I declare it globally? Is it possible?
First off - this is a bad idea. Globals or super-globals like you're looking for lead to code with complicated dependencies all over the place - the very thing you're trying to avoid by using modules in the first place.
However - you can declare a variable with our and then access it via package name.
use Data::Dumper;
our %thing = ( key => "value" );
print Dumper \%main::thing;
The hash is then visible from any other package by its fully qualified name, %main::thing.
(If you really must, you can start mucking around with TYPEGLOBs, but trust me when I say this is a bad idea).
This breaks many rules of good software design, but it's possible using package variables.
In hash.pl:
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use MyHashModule;
our %hash = (one => 1, two => 2, three => 3);
say hashkeys();
In MyHashModule.pm
package MyHashModule;
use strict;
use warnings;
use base 'Exporter';
our @EXPORT = qw[hashkeys];
sub hashkeys {
return keys %main::hash;
}
1;
But I can't repeat enough what a terrible idea this is. If you were to explain more about what you are actually trying to do, we could probably come up with a solution that is far saner.
It is strongly recommended you look to declare local variables as opposed to global variables, whenever possible. If you are only needing to write a small script, there may be no problems with declaring a global variable, but as the script gets bigger, or if you or another developer adds more functionality, there may be some hard to find logic errors that creep in.
That being said, if you must declare a global variable, you would simply change the my to our, like this:
our %global_variable = (key => "value");

Name space pollution from indirectly included module

Consider the following script p.pl:
use strict;
use warnings;
use AA;
BB::bfunc();
where the file AA.pm is:
package AA;
use BB;
1;
and the file BB.pm is:
package BB;
sub bfunc {
print "Running bfunc..\n";
}
1;
Running p.pl gives output (with no warnings or errors):
Running bfunc..
Q: Why is it possible to call BB::bfunc() from p.pl even though there is no use BB; in p.pl? Isn't this odd behavior? Or are there situation where this could be useful?
(To me, it seems like this behavior only presents an information leak to another package and violates the data hiding principle.. Leading to programs that are difficult to maintain.. )
You're not polluting a namespace, because the function within BB isn't being 'imported' into your existing namespace.
They are separate, and may be referenced autonomously.
If you're making a module, then usually you'll define via Exporter two lists:
@EXPORT and @EXPORT_OK.
The former is the list of things that should be imported when you use the package. The latter is the things that you can explicitly import via:
use MyPackage qw ( some_func );
You can also define package variables in your local namespace via our and reference them via the main:: prefix.
our $fish = "haddock";
print $main::fish;
When you do this, you're explicitly referencing the main namespace. When you use a module, then you cause perl to go and look for it, and include it in your %INC. It then 'knows about' that namespace, because it must in order for the dependencies to resolve.
But this isn't namespace pollution, because it doesn't include anything in your namespace until you ask.
This might make a bit more sense if you have multiple packages within the same program:
use strict;
use warnings;
package CC;
our $package_var = "Blong";
sub do_something {
print $package_var,"\n";
}
package main;
use Data::Dumper;
our $package_var = "flonk";
print Dumper $package_var;
print Dumper $CC::package_var;
Each package is its own namespace, but you can 'poke' things in another. perl will also let you do this with objects: poking at the innards of instantiated objects or indeed "patching" them.
That's quite powerful, but I'd generally consider it Really Bad Style.
While it's good practice to use or require every dependency that you are planning to access (tried to avoid use here), you don't have to do that.
As long as you use full package names, that is fine. The important part is that Perl knows about the namespaces. If it does not, it will fail.
When you use something, that is equivalent to:
BEGIN {
require Foo::Bar;
Foo::Bar->import();
}
The require will take the Foo::Bar and convert it to a path according to the operating system's conventions. On Linux, it will try to find Foo/Bar.pm somewhere inside @INC. It will then load that file and make a note in %INC that it loaded the file.
Now Perl knows about that namespace. In case of the use it might import something into your own namespace. But it will always be available from everywhere after that as long as you use the full name. Just the same, stuff that you have in your main script.pl would be available inside of packages by saying main::frobnicate(). (Please don't do that!)
It's also not uncommon to bundle several namespaces/packages in one .pm module file. There are quite a few big names on CPAN that do it, like XML::Twig.
If you do that, and don't import anything, the only way to get to the stuff under the different namespaces is by using the full name.
As you can see, this is not polluting at all.
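The difference between "loaded" and "imported" can be seen with a core module. This sketch uses List::Util only because it ships with Perl; any module would do:

```perl
use strict;
use warnings;

# Load the module at compile time, but skip the import step of "use".
BEGIN { require List::Util; }

# The namespace is known, so the fully qualified name works.
print List::Util::sum(1, 2, 3), "\n";

# require recorded the load in %INC ...
print exists $INC{'List/Util.pm'} ? "loaded\n" : "not loaded\n";

# ... but nothing was copied into main.
print defined &main::sum ? "imported\n" : "not imported\n";
```

Once any file in the program has loaded the module, every other file can call it by full name in the same way, which is what p.pl does with BB::bfunc().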