Perl file changes package between / inside functions - perl

I'm looking at some of our Perl codebase and am puzzled by the use of package in some files.
We have a file containing some useful functions, functions.pl, which is laid out roughly like this:
package functions;
use strict;
sub function_a {
# code here
}
sub function_b {
# code here
}
package main;
sub function_c {
my ($arguments, $for, $this, $function) = @_;
package functions;
# Actual function code here.
}
(Function and package names changed, obviously.)
Functions in this file are used in other scripts by require 'functions.pl' and then calling &function_c() - since the scripts where function_c is called do not declare a package, presumably they're in the main namespace so don't have to prepend anything to function_c when calling it.
function_a and function_b aren't used outside this file, so presumably keeping the main body of function_c back in the non-main namespace means that code in there doesn't have to prepend functions:: to any calls to them.
Does anyone know why someone might write a script to be require'd in this way, rather than writing it as a module and explicitly importing certain functions?
And I know that there's more than one way to do it in Perl, but is package really supposed to be switched around in one file whenever you feel like it like this?

Technically, there's nothing wrong with the code. The package declaration can indeed be used to "switch around" the current package like that.
That said, it's certainly not the standard or generally recommended way to do this; as you note, that would be to turn the script into a module and (optionally) export the public functions into the namespace where the module is used.
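To make the behaviour concrete, here is a minimal single-file sketch of the layout from the question. The sub bodies are invented for illustration; only the names come from the question. It shows that a package statement inside a sub body lasts only until the end of that block, and that the sub itself is still installed in the package that was current where its name was declared.

```perl
use strict;
use warnings;

package functions;

sub function_a { return "a" }      # placeholder body, assumed for the demo

package main;

sub function_c {                   # installed as main::function_c
    my ($arg) = @_;
    package functions;             # in effect until the end of this block
    # Unqualified calls now resolve in "functions".
    return function_a() . ":$arg:" . __PACKAGE__;
}

# The inner package statement ended with the sub body, so this is main again.
my $result = function_c("x");
print "$result\n";                 # a:x:functions
print __PACKAGE__, "\n";           # main
```

So callers in main can write function_c() unqualified, while the body of function_c reaches the helpers without a functions:: prefix.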
One practical use of multiple package declarations is in OO code, where you may want to define multiple classes in one file, e.g. like this:
package MyClass;
# ... MyClass methods here ...
package MyClass::Helper {
# ... helper class methods here ...
}
# ... more MyClass methods here ...
or, in older Perl versions (< 5.14):
package MyClass;
# ... MyClass methods here ...
{
package MyClass::Helper;
# ... helper class methods here ...
}
# ... more MyClass methods here ...

One reason could be that the script was originally designed to import functions from a different set of files (e.g. function.pl, function1.pl, function2.pl) depending on user input or other conditions.
That means importing the functions at run time, with 'require $function', where $function could be function.pl or function1.pl.
Another reason could be that the person was not aware of modules at the time ;)


Importing subroutines from parent class

I am just a beginner with perl and trying to wrap my head around objects.
I am having no problem creating a object, however, I am having issues when I introduce a child class (sorry if wrong terminology) and exporting everything to the main:: script. When I say everything essentially I mean the subroutines (not methods) I want exported from the parent .pm. See code below.
#main.pl
use A::B;
my $string = format_date();
#A.pm
package A;
use strict;
use Exporter qw(import);
our @EXPORT = qw(format_date);
sub format_date { # do stuff
}
#other subroutines/methods in package A
#B.pm -> located in A/B.pm
package A::B;
use strict;
use base qw(A);
# other code in package B
I was expecting the 'use A::B;' syntax to load the format_date subroutine to the main:: script. When I run it I get - Undefined subroutine &main::format_date called at.....
When I use 'use A;' in main.pl everything runs fine.
What am I missing?
Note - some of our stations use 5.8.8 so some syntax needs to be slightly older, like the use base. Most use 5.34, but not all.
Subclasses inherit methods defined in parent classes, but don't automatically import subs from parent classes. (Inheritance is more a run-time feature, and importing is more a compile-time feature.)
Generally speaking, if you're trying to use object-oriented features to write a module (like using inheritance, defining methods, a constructor, a destructor, etc) and the module is also an exporter, then you probably want to step back to the design phase. Consider splitting it into two separate modules: an object-oriented one and a function exporter.
(It certainly is possible to write modules that do both, but in most cases, it's a sign that you've got a confused design.)
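The difference between inheriting and importing can be seen in a small single-file sketch. It assumes both packages live in one file, so @ISA is set directly instead of via use base, and import is called by hand where a separate file would say use. The body of format_date is a placeholder.

```perl
use strict;
use warnings;

{
    package A;
    use Exporter qw(import);
    our @EXPORT = qw(format_date);
    sub format_date { return "formatted" }   # placeholder body
}
{
    package A::B;
    our @ISA = ('A');    # the relationship that "use base qw(A)" records
}

# Method lookup follows @ISA, so the subclass can find format_date.
my $inherited = A::B->can('format_date') ? 1 : 0;

# "use A::B;" calls A::B->import. That resolves to the inherited
# Exporter::import, which consults @A::B::EXPORT (empty), so nothing arrives.
A::B->import;
my $after_child = defined &main::format_date ? 1 : 0;

# "use A;" calls A->import, which consults @A::EXPORT.
A->import;
my $after_parent = defined &main::format_date ? 1 : 0;

print "inherited=$inherited after_child=$after_child after_parent=$after_parent\n";
# inherited=1 after_child=0 after_parent=1
```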
What am I missing?
In short: main.pl calls a subroutine (format_date) by its unqualified name that's never been imported to it. (Further, the program doesn't even load the package in which that sub is defined.)
The problem is that the question mixes up notions of class and package. While Perl allows us to do so as the distinction is blurry,† we don't have to abuse it: your packages A and A::B aren't much of a class -- they cannot construct an instance (an object) as there is no sub that bless's a referent, and they don't use other object-oriented tools either (the indicated inheritance isn't used).
Then treating them as a class hierarchy bites: that inheritance established in the question doesn't import subs, as explained by tobyink.
The simplest way to fix this is to make your packages into normal classes and use them as such.
The program (main.pl)
use warnings;
use strict;
use feature 'say';
use A::B;
my $obj = A::B->new;
my $string = $obj->format_date();
say $string;
File A.pm
package A;
use warnings;
use strict;
sub new { bless {}, shift } # make it a proper constructor...
sub format_date { return scalar localtime }
# other subroutines/methods in package A
1;
File A/B.pm
package A::B;
use warnings;
use strict;
use parent qw(A); # or, if you must: use base qw(A)
# other code in package B
1;
Now calling perl main.pl prints the (local) date, Wed Jan 18 13:56:47 2023.
Another way would be to have them as mere packages and to write A::B so that it (explicitly) imports needed subs from A, as they are requested. This is much more complicated and has no advantages over having classes.
See this SO page for more.
Alternatively, you could carry on the act and with the code you have actually treat A and A::B as classes, with A::B inheriting from A -- call format_date as a class method.
So instead of my $string = format_date(), use
my $string = A::B->format_date();
Now format_date is taken as a class method and since it is not defined in A::B the inheritance hierarchy is followed and it is found in A. The package name (A::B) is passed as the first argument and the method runs.
However, I wouldn't recommend this. Using packages both as classes and for exporting is complicated to get right, complex in use -- and there is no reason for it.
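For reference, the class-method call described above behaves like this in a minimal single-file sketch. The body of format_date is made up here so that the first argument is visible, and @ISA is set directly because both packages share one file.

```perl
use strict;
use warnings;

{
    package A;
    sub format_date {
        my ($class) = @_;            # receives the name used on the left of ->
        return "called on $class";
    }
}
{
    package A::B;
    our @ISA = ('A');                # stand-in for "use base qw(A)"
}

my $string = A::B->format_date();    # not found in A::B, so A is searched
print "$string\n";                   # called on A::B
```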
† A class is a package, firstly. See the reference perlobj and the tutorial perlootut.
For example, with an ordinary package, that never desired to be a class, we can call its subs as Packname->subname -- and it will behave as a class method, where Packname gets passed to subname as the first argument.
But this is more of a side-effect of Perl's (intentionally) simple object-oriented system, that shouldn't be abused. I strongly recommend to not mix: don't push an exporting package to be used as a class (perhaps by slapping an OO feature to it, like inheritance), and don't EXPORT from a class (a package intended for object-oriented use, that normally has a bless-ing sub(s)).
See the footnote, and the preceding text, in this post, for more elaborate statements on class-vs-package in Perl.

When do Perl package variables fall out of scope?

From my main program I require a file containing a package, and then call a subroutine from that package:
while($somecondition){
require( 'people.pm' );
my $result = PERSON::stuff($args);
}
The PERSON package has multiple subs and some 'our' variables declared:
package PERSON;
our $name;
our ...
sub stuff {
...
}
In my understanding of other more object oriented languages you would need to declare a new object instance, maybe with its own constructor/initialization functions to use "package" variables. That doesn't seem to be the case here with Perl.
I'm dealing with legacy code so I don't want to change much, I just want to understand when the package variables ($name) come into existence, and when are they returned to memory from the perspective of the main program.
Would putting a PERSON::stuff() call after the while loop have new package variables?
After calling a single function inside a package do the package variables live until the end of the program?
The question mixes up some concepts so let's first address what appears to be the main issue: If a package is require'd inside some scope, what of it outside of that scope?
In short, (dynamical global) symbols from the package are accessible everywhere in the unit in which it is require'd, via their fully qualified names.†
Let's try with an example
use warnings;
use strict;
use feature 'say';
TEST_SCOPE: {
say "In scope, in ", __PACKAGE__;
require TestPack;
#hi(); # "Undefined subroutine &main::hi called..."
TestPack::hi(); # ok
#say $global; # $global ... who??
say $TestPack::global; # ok
say "Leaving scope\n";
};
say "--- in ", __PACKAGE__;
TestPack::hi(); # ok
say $TestPack::global; # ok
File TestPack.pm:
package TestPack;
use warnings;
use strict;
use feature 'say';
#use Exporter qw(import); # This is normally done to export symbols
#our @EXPORT_OK = qw(hi); # (unless the package is a class)
our $global = 7;
sub hi { say "hi from ", __PACKAGE__ }
1;
One needs to use fully qualified names for those symbols as they weren't imported. If the package exports symbols and we import some‡ then they go into the calling package's namespace so in the example above they'd be available in main::, so they can be accessed by any code in the interpreter by their exported names (hi, no need for TestPack::hi). One cannot access lexical variables from that package (created with my, our, state)§.
This also works if instead of the mere block (named TEST_SCOPE) we introduce another package, and require our TestPack inside of it.
...
package AnotherPack {
require TestPack;
...
1;
};
...
TestPack::hi(); # ok
...
(That package should really be inside a BEGIN block, which doesn't change the main point here.) Global symbols from TestPack are still accessible in main::, via their fully qualified names. The exported names, which we import along with require, are then available as such in this package, but not in main::.
Comments on the question
Package name (PERSON) and the filename for it (person.pm) have to agree. For example, the namespace (==package) Person is defined in the file Person.pm
This is about basics related to require-ing a package; it has nothing to do with object-oriented notions. (Even though, a class is firstly a package. See perlootut and perlobj.) Also see Packages in perlmod and our.
If you were to use a package that bless-es, the returned object (instance) is (assigned to) a lexical variable. As such, it doesn't exist outside of the scope. The package itself, though, is visible just as shown above but in object-oriented work we don't want to poke at a package's insides, but rather use methods via objects.
So yes, to work with that package outside of the scope in which it is require-ed you'd need to instantiate an object in that other scope as well. That would still work much like the example above -- we can use the package name, outside of scope in which it was required, to instantiate an object (try!), even though I'd raise questions of such design (see next)
This hints at a convoluted design though, bringing in packages inside scopes, at runtime (via require); what is the context? (And I hope it's not really in a while loop.)
† Print out the main's symbol table, %main:: (using Data::Dumper for example) and we find
"TestPack::" => *main::TestPack::
along with all other symbols from TestPack namespace.
‡ If a package exports symbols and we require the package then we can import by
require Pack::Name;
Pack::Name->import( qw(subname anothername ...) );
§ Note that our creates a lexical which is an alias for a package variable, which is accessible.
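As an illustration of the second footnote, here is a sketch of the require-then-import pattern, using the core module List::Util as a stand-in for Pack::Name (that substitution is an assumption made so the example runs anywhere).

```perl
use strict;
use warnings;

require List::Util;                  # run time: loads the file, imports nothing
my $before = defined &main::max ? 1 : 0;

List::Util->import(qw(max));         # run time: request the names explicitly
my $after = defined &main::max ? 1 : 0;

# Parentheses are needed: max was not known when this line was compiled.
my $largest = max(3, 9, 4);

print "before=$before after=$after largest=$largest\n";
# before=0 after=1 largest=9
```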
zdim's answer gives a very good explanation of how package variables work and can be used. I don't think it directly answers the question of when they fall out of scope though.
Succinctly:
Package variables are global static variables, just namespaced so the "global" aspect isn't as terrible.
As with any static variable, they are in scope for the entire execution of the program.
You also asked:
In my understanding of other more object oriented languages you would need to declare a new object instance, maybe with its own constructor/initialization functions to use "package" variables. That doesn't seem to be the case here with Perl.
Package variables are fairly unrelated to object-oriented programming in Perl. They are not used for storing instance data. (Except sometimes in the case of inside-out objects, though that's more of an advanced topic.)
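A short sketch of that lifetime, with the package written inline rather than loaded from people.pm. The variable $count and the body of stuff are assumptions for the demo.

```perl
use strict;
use warnings;

{
    package PERSON;
    our $count = 0;                  # package variable: $PERSON::count
    sub stuff { return ++$count }    # hypothetical body that mutates it
}

for my $round (1 .. 3) {
    PERSON::stuff();                 # the same $PERSON::count every iteration
}

# A call after the loop still sees the accumulated state.
my $after_loop = PERSON::stuff();
print "$after_loop $PERSON::count\n";    # 4 4
```

Nothing is reset when the loop ends; the variable lives until the program exits or something explicitly assigns to it.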

Prevent multiple inclusions in perl

Suppose I have two files: a module file that looks like this:
package myPackage;
use Bio::Seq;
and another file that looks like this:
use lib "path/to/lib";
use myPackage;
use Bio::Seq;
How can I prevent Bio::Seq from being included twice? Thanks.
It won't be included twice. The semantics of use can be described like this:
require the module
call module's import
As the documentation says, it's equivalent to:
BEGIN { require Module; Module->import( LIST ); }
require mechanism, on the other hand, assures modules' code is compiled and executed only once, the first time some require it. This mechanism is based on the special variable %INC. You can find further details in the documentation for use, require, and in the perlmod page.
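The once-only behaviour can be checked with a throwaway module. In this sketch the module name LoadOnce and the counter $load_count are invented for the demo; the file is written to a temporary directory so the example is self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

our $load_count = 0;                 # incremented each time the file body runs

# Write a tiny module into a temporary directory.
my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'LoadOnce.pm');
open my $fh, '>', $file or die "Cannot write $file: $!";
print {$fh} 'package LoadOnce; $main::load_count++; 1;', "\n";
close $fh or die "Cannot close $file: $!";

unshift @INC, $dir;

for my $attempt (1 .. 3) {
    require LoadOnce;                # only the first call compiles and runs the file
}

my $recorded = exists $INC{'LoadOnce.pm'} ? 1 : 0;
print "loads=$load_count recorded=$recorded\n";    # loads=1 recorded=1
```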
use Foo
is mostly equivalent to
# perldoc -f use
BEGIN {
require "Foo.pm";
Foo->import();
}
And require "Foo" is mostly equivalent to
# perldoc -f require
sub require {
my ($filename) = @_;
if (exists $INC{$filename}) {
return 1 if $INC{$filename};
die "Compilation failed in require";
}
# .... find $filename in @INC
# really load
return do $realfilename;
}
So
No, the code won't be "Loaded" more than once, only "imported" more than once.
If you have code such as
package Bio::Seq;
...
sub import {
# fancy stuff
}
And you wanted to make sure a library was loaded, but not call import on it,
#perldoc -f use
use Bio::Seq ();
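The effect of the empty list can be sketched with the core module List::Util standing in for Bio::Seq (an assumption, so that the example runs without BioPerl installed).

```perl
use strict;
use warnings;

use List::Util ();                   # load the module, but skip the import call

my $imported = defined &main::max ? 1 : 0;      # nothing was exported to main
my $largest  = List::Util::max(3, 9, 4);        # fully qualified name still works

print "imported=$imported largest=$largest\n";  # imported=0 largest=9
```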
Modules aren't "included" in Perl like they are in C. They are "loaded", by which I mean "executed".
A module will only be loaded/executed once, no matter how many use statements specify it.
The only thing that happens for every use of a module is the call to the module's import method. That is typically used to export symbols to the using namespace.
I guess you want to optimize the loading (usage) of the module.
For that, dynamic loading may be helpful.
To load a Perl module dynamically, you can use Class::Autouse.
For more details you can visit this link.
I guess the OP may look for a way of avoiding a long list of use statement boilerplate at the beginning of his/her Perl script. In this case, I'd like to point everyone to Import::Into. It works like the keyword import in Java and Python. Also, this blog post provides a wonderful demo of Import::Into.

Perl: Dynamic module loading, object inheritance and "common helper files"

In a nutshell I try to model a network topology using objects for every instance in the network. Additionally I got a top-level manager class responsible for, well, managing these objects and performing integrity checks. The filestructure looks like this (I left out most of the object-files as they are all structured pretty equal):
Manager.pm
Constants.pm
Classes/
+- Machine.pm
+- Node.pm
+- Object.pm
+- Switch.pm
Coming from quite a few years in OOP, I'm a fan of code reuse etc. so I set up inheritance between those objects, the inheritance tree (in this example) looks like this:
Switch -+-> Node -+-> Object
Machine -+
All those objects are structured like this:
package Switch;
use parent qw(Node);
sub buildFromXML {
...
}
sub new {
...
}
# additional methods
Now the interesting part:
Question 1
How can I ensure correct loading of all those objects without typing out the names statically?
The underlying problem is: If I just require "$_" foreach glob("./Classes/*"); I get many "Subroutine new redefined at" errors. I also played around with use parent qw(-norequire Object), Module::Find and some other @INC modifications in various combinations, to make it short: It didn't work. Currently I'm statically importing all used classes, they auto-import their parent classes.
So basically what I'm asking: What is the (perl-)correct way of doing this?
And advanced: It would be very helpful to be able to create a more complex folder structure (as there will be quite a few objects) and still have inheritance + "autoloading"
Question 2 - SOLVED
How can I "share my imports"? I use several libraries (my own, containing some helper functions, LibXML, Scalar::Util, etc.) and I want to share them amongst my objects. (The reasoning behind that is, I may need to add another common library to all objects and chances are high that there will be well above 100 objects - no fun editing all of them manually and doing that with a regex / script would theoretically work but that doesn't seem like the cleanest solution available)
What I tried:
import everything in Manager.pm -> Works inside the Manager package - gives me errors like "undefined subroutine &Switch::trace called"
Create a include.pl file and do/require/use it inside every object - gives me the same errors.
Some more stuff I sadly don't remember
include.pl basically would look like that:
use lib_perl;
use Scalar::Util qw(blessed);
use XML::LibXML;
use Data::Dumper;
use Error::TryCatch;
...
Again I ask: What's the correct way to do it? Am I using the right approach and just failing at the execution or should I change my structure completely?
It doesn't matter that much why my current code doesn't work that well, providing a correct, clean approach for those problems would be enough by far :)
EDIT: Totally forgot perl version -_- Sidenote: I can't upgrade perl, as I need libraries that are stuck with 5.8 :/
C:\> perl -version
This is perl, v5.8.8 built for MSWin32-x86-multi-thread
(with 50 registered patches, see perl -V for more detail)
Copyright 1987-2006, Larry Wall
Binary build 820 [274739] provided by ActiveState http://www.ActiveState.com
Built Jan 23 2007 15:57:46
This is just a partial answer to question 2, sharing imports.
Loading a module (via use) does two things:
Compiling the module and installing the contents in the namespace hierarchy (which is shared). See perldoc -f require.
Calling the import sub on each loaded module. This loads some subs or constants etc. into the namespace of the caller. This is a process that the Exporter class largely hides from view. This part is important to use subs etc. without their full name, e.g. max instead of List::Util::max. See perldoc -f use.
Let's look at the following three modules: A, B and User.
{
package A;
use List::Util qw(max);
# can use List::Util::max
# can use max
}
{
package User;
# can use List::Util::max -> it is already loaded
# cannot use max, this name is not defined in this namespace
}
Package B defines a sub load that loads a predefined list of modules and subs into the caller's namespace:
{
package B;
sub load {
my $package = (caller())[0]; # caller is a built-in, fetches package name
eval qq{package $package;} . <<'FINIS' ;
use List::Util qw(max);
# add further modules here to load
# you can place arbitrarily complex code in this eval string
# to execute it in all modules that call this sub.
# (e.g. testing and registering)
# However, this is orthogonal to OOP.
FINIS
if ($@) {
# Do error handling
}
}
}
Inside the eval'd string, we temporarily switch into the caller's package and then load the specified module. This means that the User package code now looks like this:
{
package User;
B::load();
# can use List::Util::max
# can use max
}
However, you have to make sure the load sub is already loaded itself. use B if in doubt. It might be best to execute B::load() in the BEGIN phase, before the rest of the module is compiled:
{
package User;
BEGIN {use B; B::load()}
# ...
}
is equivalent to
{
package User;
use B;
use List::Util qw(max);
# ...
}
TIMTOWTDI. Although I find evaling code quite messy and dangerous, it is the way I'd pursue in this scenario (rather than do-ing files, which is similar but has different side effects). Manually messing with typeglobs in the package namespace is hell in comparison, and copy-pasting a list of module names is like going back to the days when there wasn't even C's preprocessor.
Edit: Import::Into
… is a CPAN module providing this functionality via an interesting method interface. Using this module, we would redefine our B package the following way:
{
package B;
use List::Util; # you have to 'use' or 'require' this first, before using 'load'.
use Import::Into; # has to be installed from CPAN first
sub load {
my $package = caller;
List::Util->import::into($package, qw(max));
# should work too: strict->import::into($package);
# ...
}
}
This module hides all the dirty work (evaling) from view and does method call resolution gymnastics to allow importing pragmas into other namespaces.
Addendum to Import::Into Solution
I found a scenario that seems to require eval() from within the Import::Into solution. In this scenario, mod User is effectively among the uses from package B. This may be a common scenario for people using Import::Into.
Specifics:
I created module uses_exporter with separate subs for importing
different groups of modules, e.g. load_generic() and
load_list_utils().
The uses in load_list_utils() are to public mods like
List::MoreUtils, AND to a module of my own, list_utils_again. That
local module also calls load_list_utils(). The call fails if
load_list_utils() uses list_utils_again.
My solution was to put the use of list_utils_again into an eval which
does not execute when $target eq 'list_utils_again'.
The correct idiomatic Perl way to do this is not to always load a bunch of modules whether used or not; it is to have every file use those modules it directly (not indirectly) needs.
If it turns out that every file uses the same set of modules, you might make things simpler by having a single dedicated module to use all those in that common set.
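One way such a dedicated module might look, using only core modules, is sketched below. The names My::Common and My::Switch are hypothetical, and the approach (an import sub that forwards to Exporter's export_to_level) is one option among several, not the method used by any of the answers above. It works here because List::Util and Scalar::Util both inherit from Exporter.

```perl
use strict;
use warnings;

{
    package My::Common;              # hypothetical bundle module
    use List::Util ();
    use Scalar::Util ();

    sub import {
        # Level 1 means "the package that called My::Common->import".
        # The second argument is ignored by Exporter.
        List::Util->export_to_level(1, undef, qw(max));
        Scalar::Util->export_to_level(1, undef, qw(blessed));
    }
}
{
    package My::Switch;              # hypothetical class that wants the bundle
    # In a separate file this line would simply be: use My::Common;
    BEGIN { My::Common->import }

    sub biggest   { return max(@_) }
    sub is_object { return blessed($_[0]) ? 1 : 0 }
}

my $biggest   = My::Switch::biggest(2, 7, 5);
my $is_object = My::Switch::is_object(bless {}, 'Thing');
print "biggest=$biggest is_object=$is_object\n";   # biggest=7 is_object=1
```

Adding another shared library then means editing only the import sub of the bundle.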

how to fake a perl module for dependency?

An external Perl library that I am using has a dependency (DBD::mysql) that I will not be using in my application (DBD::SQLite), so I would like the system to just pretend the dependency is there, even if it's a "fake".
Can I just create an empty DBD::mysql.pm module that compiles or is there a more straightforward way of doing this?
So I think there are a few issues here.
When you say dependency, do you mean the external module simply tries to require or use DBD::mysql? If that is the case then you should advise the developer that he shouldn't be explicitly doing that because that defeats the purpose of using DBI. The database driver should be selected on the fly based on the DSN.
Assuming that the author is merely useing the package name because he thought that was a useful or meaningful thing to do, then yes, you may override that package, and there are a few ways to do it.
As you suggested, you can merely create your own module DBD/mysql.pm that would define the DBD::mysql package.
There are some other things you could do if you are interested. Instead of littering your source tree with fake directories and files, you just need to convince Perl that the module was loaded. We can do this by directly manipulating %INC.
package main; # or wherever
BEGIN {
$INC{'DBD/mysql.pm'} = "nothing to see here";
}
Simply by adding this hash key, we preclude a search of the filesystem for the offending module. Observe that this is in a BEGIN block. If the external author did a use then we must populate this value before the use statement is evaluated. The use statements are equivalent to a require and import wrapped in a BEGIN.
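A quick way to convince yourself that this works is the sketch below. It uses the made-up names Fake::Dependency and Fake::NotRegistered rather than DBD::mysql, so that the outcome does not depend on whether the real driver is installed; it assumes no modules with those names exist on the system.

```perl
use strict;
use warnings;

BEGIN {
    # Pretend the module has already been loaded.
    $INC{'Fake/Dependency.pm'} = __FILE__;
}

# No file search happens. The import call that "use" makes is silently
# skipped when the package defines no import method.
use Fake::Dependency;

my $faked   = eval { require Fake::Dependency;    1 } ? 1 : 0;
my $missing = eval { require Fake::NotRegistered; 1 } ? 1 : 0;

print "faked=$faked missing=$missing\n";   # faked=1 missing=0
```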
Now let's further speculate, in the general sense, that the external author was attempting to call methods of the package. You will get run-time errors if those symbols don't exist. You can take advantage of Perl's AUTOLOAD to intercept such calls and do the right thing. What the right thing is can vary a lot, from simply logging a message to something more elaborate. For instance, you could use this facility to examine the depth of the coupling that the author introduced by monitoring all the calls.
package DBD::mysql;
sub AUTOLOAD {
# $AUTOLOAD holds the fully qualified name of the sub that was called
printf(
"I don't wanna '%s' called from '%s'\n", $AUTOLOAD, (caller(0))[0]
);
}
package main; # or wherever
BEGIN {
$INC{'DBD/mysql.pm'} = "nothing to see here";
}
DBD::mysql::blah();
Now let's also cover the case where the offending author also created some object oriented instances of a class, and his code doesn't properly account
for your stub code. We will stub the constructor which we assume is new to just bless an anonymous hash with our package name. That way you won't get
errors when he calls methods on an instance.
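package DBD::mysql;
sub AUTOLOAD {
return if $AUTOLOAD =~ /::DESTROY$/; # don't report object destruction
printf(
"I don't wanna '%s' called from '%s'\n", $AUTOLOAD, (caller(0))[0]
);
}
sub new {
bless({}, __PACKAGE__)
}
package main; # or wherever
BEGIN {
$INC{'DBD/mysql.pm'} = "nothing to see here";
}
my $thing = DBD::mysql->new; # avoids the ambiguous indirect-object form "new DBD::mysql"
$thing->blah();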