Is it good practice to export variables in Perl? - perl

I'm finding it very convenient to pass configuration and other data that is read or calculated once but then used many times throughout a program by using Perl's use mechanism. I'm doing this by exporting a hash into the caller's namespace. For example:
package MyConfiguration;
my %config;
sub import {
my $callpkg = caller(0);
my $expsym = $_[1];
configure() unless %config;
*{"$callpkg\::$expsym"} = \%config;
}
and then in other modules:
use MyConfiguration qw(loc_config_sym);
if ( $loc_config_sym{parameter} ) {
# ... do stuff ...
}
However, I'm not sure about this as a best practice. Is it better to add a method that returns a hash ref with the data? Something else?

If you only want to read the values of %config, then why not have a routine to do it for you?
my %config;
sub config_value
{
my ($value) = @_;
return $config{$value};
}
You could export this by default if you wanted to:
package Mypackage;
require Exporter;
our @ISA = qw(Exporter);
our @EXPORT = qw/config_value/;
The reason that I would not allow access to the hash all over the place in lots of different modules is that I would have a hard time mentally keeping track of all the places it was being used. I would rather make the above kind of access routine so that, if some bug happened, I could add a print statement to the routine, or something, to find out when the value was being accessed. I don't know if that is related to "best practices" or it is just because I'm stupid, but the kind of confusion created by global variables scares me.
There's no reason you can't have a set routine too:
sub set_value
{
my ($key, $value) = @_;
$config{$key} = $value;
}
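The "add a print statement" idea can be sketched roughly as follows. This is only an illustration: the $TRACE switch and the sample data are made up for the example, and caller is used to report where each read came from.

```perl
use strict;
use warnings;

my %config = ( colour => 'blue' );    # sample data for the sketch
our $TRACE = 0;                       # hypothetical debugging switch

sub config_value {
    my ($key) = @_;
    if ($TRACE) {
        # caller() reports where config_value() was called from.
        my ( $package, $file, $line ) = caller;
        warn "config_value('$key') read from $package at $file line $line\n";
    }
    return $config{$key};
}

$TRACE = 1;
my $colour = config_value('colour');
print "colour is $colour\n";
```

With a shared hash there is no single place to put such a hook, which is the main argument for the accessor.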

I think it's better to work with a copy of the config hash. This way, if you modify some elements, this won't affect the rest of your code.
I usually use a simple object (optionally a singleton) for this, with a single method like get_property().
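A minimal sketch of that kind of object, assuming a hypothetical My::Config class and placeholder values; as_hash is an extra helper added here to show the "work with a copy" part:

```perl
package My::Config;    # hypothetical class name
use strict;
use warnings;

my $instance;          # the single shared object

sub instance {
    my ($class) = @_;
    # Build the object on first use; the values here stand in for
    # whatever reads or computes the real configuration.
    $instance //= bless { data => { host => 'localhost', port => 8080 } }, $class;
    return $instance;
}

sub get_property {
    my ( $self, $name ) = @_;
    return $self->{data}{$name};
}

# Hand out a shallow copy so callers cannot change the shared data.
sub as_hash {
    my ($self) = @_;
    return %{ $self->{data} };
}

package main;

my %copy = My::Config->instance->as_hash;
$copy{port} = 9999;    # only changes the local copy
print My::Config->instance->get_property('port'), "\n";
```

The copy is shallow, so nested structures inside the configuration would still be shared.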

I suggest never exporting variables. Create a class that can return a reference to a private variable instead. People can then store it in a variable with whichever name they like, and only when they decide they want to use it.
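Something along these lines, as a sketch; the class name My::Settings and its contents are invented for the example:

```perl
package My::Settings;    # hypothetical class name
use strict;
use warnings;

my %settings;            # private: not reachable by name from other files

sub settings {
    # Fill the hash once, on first request.
    %settings = ( verbose => 1, retries => 3 ) unless %settings;
    return \%settings;
}

package main;

# The caller picks the variable name and asks only when it needs the data.
my $opts = My::Settings->settings;
print "retries: $opts->{retries}\n";
```

Every caller gets a reference to the same hash, so a change made through one reference is visible to all of them.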

In general, it's best to let the user decide whether or not to import symbols. Exporter makes this easy. Writing a custom import method to let the user decide what to name imported symbols can be useful on rare occasions but I don't think this is one of them.
package MyConfiguration;
require Exporter;
our @ISA = qw(Exporter);
our @EXPORT_OK = qw(%Config);
our %Config;
And then, in your script:
use MyConfiguration;
print $MyConfiguration::Config{key};
or
use MyConfiguration qw(%Config);
print $Config{key};

Related

How to create globally available functions in Perl?

Is it possible to create global functions available across all namespaces like perl built-in functions?
First of all, "function" is the name given to Perl's named list operators, named unary operators and named nullary operators. They are visible everywhere because they are operators, just like ,, && and +. Subs aren't operators.
Second of all, you ask how to create a global sub, but all subs are already global (visible from everywhere) in Perl! You simply need to qualify the name of the sub with the package if it's not in the current package. For example, Foo::my_sub() will call my_sub found in package Foo from anywhere.
But maybe you want to be able to say my_sub() instead of Foo::my_sub() from everywhere, and that's a very bad idea. It violates core principles of good programming. The kinds of problems it can cause are too numerous to list.
There is a middle ground. A better solution is to create a sub that can be imported into the namespaces you want. For example, say you had the module
package Foo;
use Exporter qw( import );
our @EXPORT_OK = qw( my_sub );
our %EXPORT_TAGS = ( ALL => \@EXPORT_OK );
sub my_sub { ... }
1;
Then, you can use
use Foo qw( my_sub );
to load the module (if it hasn't already been loaded) and create my_sub in the current package. This allows it to call the sub as my_sub() from the package into which it was imported.
There is nothing simple that would allow one to somehow "register" user's subs with the interpreter, or some such, so that you could run them as builtins in any part of the program.
One way to get the behavior you ask for is to write directly to the symbol tables of loaded modules. This has to be done after the modules have been loaded, and after the subs that you add to those modules have been defined. I use an INIT block in the example below.
Note that this has a number of weaknesses and just in general the idea itself is suspect to me, akin to extending the interpreter. Altogether I'd much rather write a module with all such subs and use standard approaches for good program design to have that module loaded where it needs to go.
Having said that, here is a basic demo
use warnings;
use strict;
use feature 'say';
use Data::Dump qw(dd pp);
use TestMod qw(modsub);
sub t_main { say "In t_main(), from ", __PACKAGE__ }
modsub("Calling from main::");
INIT {
no strict 'refs';
foreach my $pkg (qw(TestMod)) {
*{ $pkg . '::' . 'sub_from_main' } = \&t_main;
}
dd \%TestMod::;
}
This copies the reference to t_main from the current package (main::) into the symbol table of $pkg, under the name of sub_from_main, which can then be used with that name in that package.
For simplicity the name of the module is hardcoded, but you can use %INC instead, and whatever other clues you have, to figure out what loaded modules' stashes to add to.
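As a rough sketch of the %INC idea: the keys of %INC are file paths such as Some/Module.pm, so they have to be turned back into package names first. The filter on List:: below is an arbitrary stand-in for "whatever other clues you have".

```perl
use strict;
use warnings;
use List::Util ();    # loaded only so that %INC has a known entry

# %INC maps "Some/Module.pm" to the file it was loaded from.
# Turn each key back into a package name.
my @loaded;
for my $file ( grep { /\.pm\z/ } keys %INC ) {
    ( my $pkg = $file ) =~ s{\.pm\z}{};
    $pkg =~ s{/}{::}g;
    push @loaded, $pkg;
}

# Keep only the packages of interest; this pattern is an arbitrary example.
my @targets = grep { /^List::/ } @loaded;
print "$_\n" for sort @targets;
```

Each name in @targets could then be used in place of the hardcoded qw(TestMod) list.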
The beneficiary (or the victim?) module, TestMod.pm:
package TestMod;
use warnings;
use strict;
use feature 'say';
use Exporter qw(import);
our @EXPORT_OK = qw(modsub);
sub modsub {
say "In module ", __PACKAGE__, ", args: @_";
say "Call a sub pushed into this namespace: ";
sub_from_main();
}
1;
The name of the added sub can be passed to modules as they're loaded, instead of being hardcoded, in which case you need to write their import sub instead of borrowing the Exporter's one.
There are also modules that allow one to add keywords, but that's no light alternative.
The answer seems to be no, but you can implement most of the behavior that you want by using the symbol table *main::main:: to define a subroutine in all the namespaces.
use strict;
use warnings;
use Data::Dump qw(dd);
my $xx = *main::main::;
package A;
sub test {
printf "A::%s\n", &the_global;
}
package B;
sub the_global
{
"This is B::the_global";
}
sub test {
printf "B::%s\n", &the_global;
}
package main;
my $global_sub = sub { "The Global thing" };
for my $NS (keys %$xx) {
if ($NS =~ /^[A-Z]::$/) {
my $x = $NS . 'the_global';
if (defined &$x) {
printf "Skipping &%s\n", $x;
} else {
printf "Adding &%s\n", $x;
no strict 'refs';
*$x = $global_sub;
}
}
}
A::test;
This will not work on packages that are not referenced at all before the for loop above is run. But this would only happen if a require, use or package was eval'd after the code started running.
This is also still a compiler issue! You either need to refer to the global function as the_global() or &the_global if you are (as you should be) using use strict.
Sorry for my late response, and thank you all for your detailed answers and explanations.
Well... I understood the right answer is: IT'S NOT POSSIBLE!
I'm maintaining a Perl framework used by some customers, and that framework exports some specialized subs (logging, event handling, controllers for hardware devices, domain-specific subs and so on). That's why I tried to figure out how to spare the developers from importing my subs in all their packages.

Import perl variables into the module

Is there any way in perl to import variables from the main script to the module?
Here is my main.pl:
#!/usr/bin/perl -w
use DBI;
our $db = DBI->connect(...);
__END__
Now I want to use the $db variable in my modules, because I want to avoid duplicate connections and duplicate code. Is it possible?
You can do that by referring to $main::db in other packages. A variable declared with our in the main script, outside any package statement, lives in the main namespace. You should read up on package.
Note that this is not a very good idea as your modules will be dependent on main having the connection. Instead, you should construct your objects in a way that let you pass a database handle in. If you require a db connection at all cost, either let them throw an exception or create their own db handle.
If you are not using OO code, make the database handle an argument of every function call.
Also note that it's best practice to name the database handle $dbh.
Let's look at this for non-OO (Foo) and OO (Bar).
# this is package main (but you don't need to say so)
use strictures;
use DBI;
use Foo;
use Bar;
my $dbh = DBI->connect($dsn);
Foo::frobnicate($dbh, 1, 2);
my $bar = Bar->new(dbh => $dbh);
$bar->frobnicate(23);
package Foo;
use strictures;
sub frobnicate {
my ($dbh, $one, $two) = @_;
die q{No dbh given} unless $dbh; # could check ref($dbh)
$dbh->do( ... );
return;
}
package Bar;
use strictures;
sub new {
my ($class, %args) = @_;
die q{No dbh given} unless $args{dbh};
return bless \%args, $class;
}
sub frobnicate {
my ($self, $stuff) = @_;
$self->{dbh}->do(q{INSERT INTO bar SET baz=?}, undef, $stuff);
return;
}
__END__
You can always pass a db handle into a method. I'm not a fan of this approach, but we have code that functions using this approach.
The problem IMHO is in the debugging. It makes it difficult to know anything about the db handle itself from the code in your module, though that might not be an issue for you. Imagine, however going in to debug code that uses a db handle, but you have no idea where it came from. If you get your db handle from a method in your class, you can trace it to that subroutine and immediately you have some information. This is definitely my preferred way of doing things.
If you do pass in a DB handle, you should do some input validation, such as checking for $dbh->isa('DBI::db') (I think that's the class into which db handles are blessed).
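A sketch of that validation, assuming a hypothetical helper named assert_dbh. The fake handle is only there so the example runs without a database; it is blessed into DBI::db by hand.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

sub assert_dbh {
    my ($dbh) = @_;
    # blessed() guards against calling ->isa on undef or a plain string.
    die "Expected a DBI database handle\n"
        unless blessed($dbh) && $dbh->isa('DBI::db');
    return $dbh;
}

# Stand-in object so the sketch runs without a database; a real handle
# would come from DBI->connect.
my $fake_dbh = bless {}, 'DBI::db';

print "object accepted\n" if eval { assert_dbh($fake_dbh); 1 };
print "string rejected\n" unless eval { assert_dbh('not a handle'); 1 };
```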
My preference, however would be to have a subroutine in your class that gets the db handle, either based on information you pass in, or by information in the sub itself. One thing to consider is that if you're using DBI, the connect_cached() method is very helpful. From the DBI docs:
connect_cached is like "connect", except that the database handle returned is also stored in a hash associated with the given parameters. If another call is made to connect_cached with the same parameter values, then the corresponding cached $dbh will be returned if it is still valid. The cached database handle is replaced with a new connection if it has been disconnected or if the ping method fails.
Using db handle caching of some sort will, regardless of whether you were to have created the db handle in your script or in the class, give you the same connection.
So, I recommend creating a method in your class that takes all of the parameters required to replicate the creation of the db handle as you'd do it in your script, and consider using connect_cached, Apache2::DBI or something that will handle the db connection pooling/abstraction.
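For illustration, such a method might look like the sketch below. The class name My::DB, the DSN and the attributes are all placeholders, not something taken from the question, and the DSN shown would need the matching DBD driver installed.

```perl
package My::DB;    # hypothetical class name
use strict;
use warnings;

# Connection details are placeholders; a real application would read
# them from its configuration.
my %default = (
    dsn  => 'dbi:SQLite:dbname=app.db',
    user => '',
    pass => '',
);

sub dbh {
    my ( $class, %args ) = @_;
    my %p = ( %default, %args );
    require DBI;    # loaded on first use
    # connect_cached returns the same handle for the same parameters
    # while that handle is still valid.
    return DBI->connect_cached( $p{dsn}, $p{user}, $p{pass},
        { RaiseError => 1, AutoCommit => 1 } );
}

1;
```

Both the script and the modules would then call My::DB->dbh and, with identical parameters, share one connection.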

How to test module functions which use hardcoded configuration file?

I want to make some tests on my modules.
Unfortunately, some functions in these modules use hardcoded configuration files.
package My::Module;
use strict;
use warnings;
use Readonly;
Readonly my $CONF_FILE => '/my/conf_file.xml';
=head1 FUNCTIONS
=head2 Info($appli)
Returns Application Information
=cut
sub Info
{
my $appli = shift;
my $conf = MyXML::Read($CONF_FILE);
foreach my $a (ARRAY($conf->{application}))
{
return ($a) if ($a->{name} eq $appli);
}
return (undef);
}
[some others functions that use this config file...]
The solution that came to my mind is to create a new function in each module that will change this default config file when I need it.
Then I will use that function in my tests...
Do you have any other (better?) ideas?
Well, the proper thing for me to tell you would be "don't use hard coded paths". It'll come back and bite you at some point in the future, I promise.
But... assuming you're resolved to using them, there are a number of ways to allow an override. You're right that you could add a function that would let you change it, or you could use an environment variable:
Readonly my $CONF_FILE => $ENV{'MY_CONF_FILE'} || '/foo/bar';
But the right thing to do is still to allow for other items to be passed in properly if you have a choice.
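One way to read "passed in properly", as a sketch only: let each function accept an optional path and fall back to the environment and then the default. The conf_file helper and the test path are invented for this example.

```perl
package My::Module;
use strict;
use warnings;

our $DEFAULT_CONF_FILE = '/my/conf_file.xml';

# Resolve the file to read: explicit argument first, then the
# environment, then the built-in default.
sub conf_file {
    my ($override) = @_;
    return $override // $ENV{MY_CONF_FILE} // $DEFAULT_CONF_FILE;
}

package main;

print My::Module::conf_file(), "\n";                     # environment or default
print My::Module::conf_file('t/data/test.xml'), "\n";    # what a test would pass
```

Info() could then take the path as an optional second argument and hand it to conf_file() before reading the XML.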

Do Perl subclasses inherit imported modules and pragmas?

Let's say you have a parent Perl class in one file:
#!/usr/bin/perl
package Foo;
use strict;
use warnings;
use Data::Dumper;
sub new{
my $class = shift;
my %self = ();
return bless \%self, $class;
}
1;
and a subclass in a different file:
#!/usr/bin/perl
package Bar;
use base "Foo";
1;
Will the subclass inherit the use statements from the parent? I know the method new will be inherited.
Basically I am trying to reduce the amount of boilerplate in my code and I can't find a clear answer to this question.
You asked in a comment about Test::Most and how it reduces boilerplate. Look at its import method. It's loading the modules into its namespace, adding those symbols to @EXPORT, then re-calling another import through a goto to finally get them into the calling namespace. It's some serious black magic that Curtis has going on there, although I wonder why he didn't just use something like export_to_level. Maybe there are some side effects I'm not thinking about.
I talk quite a bit about this sort of thing in Avoid accidently creating methods from module exports in The Effective Perler. It's in a different context but it's some of the same issues.
Here's a different example.
If some other module loads a module, you have access to it. It's not good to depend on that though. Here are three separate files:
Top.pm
use 5.010;
package Top;
use File::Spec;
sub announce { say "Hello from top!" }
1;
Bottom.pm
package Bottom;
use parent qw(Top);
sub catfiles { File::Spec->catfile( @_ ) }
1;
test.pl
use 5.010;
use Bottom;
say Bottom->catfiles( qw(foo bar baz) );
say File::Spec->catfile( qw( one two three ) );
I only load File::Spec in Top.pm. However, once loaded, I can use it anywhere in my Perl program. The output shows that I was able to "use" the module in other files even though I only loaded it in one:
Bottom/foo/bar/baz
one/two/three
For this to work, the part of the code that loads the module has to load before any other part of the code tries to use that module. As I said, it's a bad idea to depend on this: things break if the loading sequence changes or the loading module disappears.
If you want to import symbols, however, you have to explicitly load the module you want while you are in the package you want to import into. That's just so the exporting module defines the symbols in that package. It's not something that depends on scope.
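A small single-file sketch of that difference, using List::Util from the core distribution; the package names Loader and Other are made up:

```perl
use strict;
use warnings;

package Loader;
use List::Util qw(sum);    # imports sum() into Loader only

sub total { return sum(@_) }

package Other;
# List::Util is loaded for the whole program, so the fully qualified
# name works here, but plain sum() was never imported into Other.
sub total { return List::Util::sum(@_) }

package main;

print Loader::total( 1, 2, 3 ), "\n";
print Other::total( 4, 5 ), "\n";
print "Other has its own sum(): ", ( Other->can('sum') ? "yes" : "no" ), "\n";
```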
Ah, good question!
Will the subclass inherit the use statements from the parent?
Well, this depends on what you mean by inherit. I won't make any assumptions until the end, but the answer is maybe. You see, Perl mixes the ideas of classes and namespaces: a package is a term that can describe either of them. Now the issue is that all the use statement does is force a package inclusion and call the target's import() sub. This means it essentially has unlimited control over your package, and by way of that your class.
Now, compound this with all methods in perl being nothing more than subs that take $self as a first argument by convention and you're left with perl5. This has an enormous upside for those that know how to use it. While strict is a lexical pragma, what about Moose?
package BigMooseUser;
use Moose;
package BabyMooseUser;
our @ISA = 'BigMooseUser';
package Foo;
my $b = BabyMooseUser->new;
print $b->meta->name;
Now, where did BabyMooseUser get the constructor (new) from? Where did it get the meta class from? All of this is provided from a single use Moose; in the parent class (namespace). So
Will the subclass inherit the use statements from the parent?
Well, here, in our example, if the effects of the use statement are to add methods, then certainly.
This subject is kind of deep, and it depends on whether you're talking about pragmas, or more obscure object frameworks, or procedural modules. If you want to keep a parent's namespace from affecting your own in the OO paradigm, see namespace::autoclean.
For boilerplate reduction, I have a couple of strategies: Most of my classes are Moose classes, which takes care of OO setup and also gives me strict and warnings. If I want to have functions available in many packages, I'll create a project specific MyProject::Util module that uses Sub-Exporter to provide me with my own functions and my own interface. This makes it more consistent, and if I decide to change the Dumper (for example) later for whatever reason, I don't have to change lots of code. That'll also allow you to group exports. A class then usually looks something like this:
package Foo;
use Moose;
use MyProject::Util qw( :parsing :logging );
use namespace::autoclean;
# class implementation goes here
1;
If there are other things you regard as boilerplate and want to make simpler to include, it of course depends on what those things are.
A pragmatic answer to your problem: Either use, or look at how Modern::Perl does it to enforce strict and warnings.
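The general technique, as far as I understand it, is a module whose import calls the import of strict and warnings, so the pragmas land in whatever scope is doing the use. The sketch below uses a hypothetical MyProject::Boilerplate package defined inline, which is why it is activated with a BEGIN block instead of a use line.

```perl
{
    package MyProject::Boilerplate;    # hypothetical name
    use strict;
    use warnings;

    sub import {
        # Pragmas are lexical: calling their import here applies them to
        # the scope that is currently being compiled, i.e. the caller.
        strict->import;
        warnings->import;
    }
}

our ( $with, $without );

{
    # Stands in for "use MyProject::Boilerplate;", which is what you would
    # write if the package lived in its own .pm file.
    BEGIN { MyProject::Boilerplate->import }
    $with = eval q{ $undeclared = 1; 1 } ? 'compiled' : 'rejected by strict';
}
{
    # No pragmas in this block, so the undeclared variable is accepted.
    $without = eval q{ $undeclared = 1; 1 } ? 'compiled' : 'rejected by strict';
}

print "with boilerplate: $with\n";
print "without boilerplate: $without\n";
```

Because the effect is lexical, each file (parent or subclass) still needs its own use line; nothing is inherited.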
You can get a definitive answer by examining the symbol tables for each package:
# examine-symbol-tables.pl
use Bar;
%parent_names = map{$_ => 1} keys %Foo::;
%child_names = map{$_ => 1} keys %Bar::;
delete $parent_names{$_} && ($common_names{$_} = delete $child_names{$_}) foreach keys %child_names;
print "Common names in symbol tables:\n";
print "@{[keys %common_names]}\n\n";
print "Unique names in Bar symbol table:\n";
print "@{[keys %child_names]}\n\n";
print "Unique names in Foo symbol table:\n";
print "@{[keys %parent_names]}\n\n";
$ perl inherit.pl
Common names in symbol tables:
BEGIN
Unique names in Bar symbol table:
ISA isa import
Unique names in Foo symbol table:
Dumper new VERSION

How can I call a Perl class with a shorter name?

I am writing a Perl module Galaxy::SGE::MakeJobSH with OO.
I want to use MakeJobSH->new() instead of Galaxy::SGE::MakeJobSH->new(),
or some other shortnames. How can I do that?
You can suggest that your users use the aliased module to load yours:
use aliased 'Galaxy::SGE::MakeJobSH';
my $job = MakeJobSH->new();
Or you could export your class name in a variable named $MakeJobSH;
use Galaxy::SGE::MakeJobSH; # Assume this exports $MakeJobSH = 'Galaxy::SGE::MakeJobSH';
my $job = $MakeJobSH->new();
Or you could export a MakeJobSH function that returns your class name:
use Galaxy::SGE::MakeJobSH; # Assume this exports the MakeJobSH function
my $job = MakeJobSH->new();
I'm not sure this is all that great an idea, though. People don't usually have to type the class name all that often.
Here's what you'd do in your class for the last two options:
package Galaxy::SGE::MakeJobSH;
use Exporter 'import';
our @EXPORT = qw(MakeJobSH $MakeJobSH);
our $MakeJobSH = __PACKAGE__;
sub MakeJobSH () { __PACKAGE__ };
Of course, you'd probably want to pick just one of those methods. I've just combined them to avoid duplicating examples.
I don't bother with aliasing. I think it's the wrong way to go. If you're just looking for less to type, it might be the answer (but is a new dependency more benefit than risk?). I don't like the idea of tricking a maintenance programmer by hiding the real name from him since the aliasing happens a long way away from its use and there's no indication that what looks like a class name isn't a real class.
I'm mostly looking for easy subclassing, so I let the class decide for itself which module will implement a part.
For instance, I might start with a class that wants to use Foo to handle part of the job. I know that I might want to subclass Foo later, so I don't hard-code it:
package Foo::Bar;
sub foo_class { 'Foo' }
sub new {
....
my $foo_class = $self->foo_class; # method calls do not interpolate in strings
eval "require $foo_class";
$foo_class->do_something;
}
In the application, I choose to use 'Foo::Bar':
#!perl
use Foo::Bar;
my $obj = Foo::Bar->new();
Later, I need to specialise Foo, so I create a subclass that overrides the parts I need:
package Foo::Bar::Baz;
use parent 'Foo::Bar';
sub foo_class { 'Local::Foo::SomeFeature' }
1;
Another application uses almost all of the same stuff, but with the small tweak:
#!perl
use Foo::Bar::Baz;
my $obj = Foo::Bar::Baz->new();
You can also do a similar thing at the application level if you want to write one program and let users choose the class through configuration.
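A sketch of the configuration-driven version, under the assumption that the class name arrives as a plain string. The %config hash stands in for a real configuration file, load_class is a made-up helper, and File::Spec::Unix is used only because it ships with Perl.

```perl
use strict;
use warnings;

# Stand-in for a value read from a configuration file; the class used
# here ships with Perl, which keeps the sketch runnable.
my %config = ( path_class => 'File::Spec::Unix' );

sub load_class {
    my ($class) = @_;
    # Accept only plain package names before turning one into a file path.
    die "Bad class name '$class'\n"
        unless $class =~ /\A\w+(?:::\w+)*\z/;
    ( my $file = "$class.pm" ) =~ s{::}{/}g;
    require $file;
    return $class;
}

my $path_class = load_class( $config{path_class} );
print $path_class->catfile( 'etc', 'app.conf' ), "\n";
```

Checking the name before calling require matters here, since the string comes from outside the program.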
Thanks cjm.
I just chose to inline what aliased does.
require Exporter;
our @ISA = qw(Exporter);
our @EXPORT = qw(MakeJobSH);
sub MakeJobSH() {return 'Galaxy::SGE::MakeJobSH';}
aliased works well when you want to only affect calls from packages that explicitly request the aliasing. If you want global aliasing of one namespace to another, use Package::Alias instead.
This is almost exactly the same approach as aliased, but using a standard Perl module:
use constant MakeJobSH => 'Galaxy::SGE::MakeJobSH';
my $job = MakeJobSH->new();