How do I load a module with a path not in the #INC in Perl?
I've read the answers on how to add a path to the #INC, but the issue is that #INC has to be changed at the beginning of the code. After it's compiled, all the modules I've come across look to #INC to find the location of the module. My issue is that we're trying to separate out all these environment specific things to a config file. If it's in a config file, the path needs to be read before it can be pushed to #INC, but then the code has already been compiled and it seems #INC can't be modified.
Is there a way? Is there a library that lets me load a module and pass it a custom path?
Is this a terrible bad thing to do? Why?
Perl has an incremental compilation model which means that code can be executed before other code is even parsed. For this, we can use phase blocks (aka. phasers):
BEGIN {
print "This is executed as soon as the block has been parsed\n";
}
Such a phase block could also be used to load a configuration file.
For example, use statements are effectively syntactic sugar for a BEGIN block.
use Foo::Bar qw/baz qux/;
is equivalent to
BEGIN {
require Foo::Bar; # or: require "Foo/Bar.pm";
Foo::Bar->import(qw/baz qux/);
}
We can also load modules at runtime, although that's only sensible for object-oriented modules.
So we have three options:
Load config in the BEGIN phase and add the correct library paths before loading the actual modules
Load the modules manually during BEGIN with their full path (e.g. require "/my/modules/Foo/Bar.pm"
Figure out the configuration at runtime, load modules after that.
Using bare require is fairly uncomfortable, which is why Module::Runtime exists
Use a BEGIN block to load your custom #INC location and then use lib to include it.
# use lib /a special directory/
BEGIN {
my $lib_to_include = ...;
require lib;
lib->import($lib_to_include);
}
use Module_That_Requires_Special_Dir;
The only thing to note is that whatever code you use to load your custom include directory will have to rely on methods already defined before this BEGIN block. Therefore you can't use a subroutine that is later in the file.
Came across only, which seems to let a path be passed to the use argument like so:
use only { versionlib => '/home/ingy/modules' },
MyModule => 0.33;
It has to be used with a version condition, but putting this here anyways since it's relevant to my question 1 and I wasn't able to find any modules first time around that allowed a path outside #INC.
require supposedly is able to take in a full path, according to the perlfaq with
require "$ENV{HOME}/lib/Foo.pm"; # no #INC searching!
Related
I know specific instances of this question have been answered before:
How can I dynamically include Perl modules without using eval?
How do I use a Perl package known only in runtime?
There are also good answers at Perl Monks:
Writing a Perl module that dynamically loads other modules.
Creating subroutines on the fly
But I would like a robust way to add functionallity to a Perl application that will be:
Efficient: if the code is not needed it should not be compiled.
Easy to debug: error reporting if something goes wrong at the dynamic code, should point at the right place at the dynamic code.
Easy to extend: adding new code should be as easy as adding a new file or directory+file.
Easy to invoke: the main application should be able to use an "add on" without much trouble. An efficient mechanism to check if the "add on" has already been loaded and if not load it, would be a plus.
To illustrate the point, here are some examples that would benefit from a good solution:
A set of scripts that move data from different applications. For instance, moving data from OpenCart to Prestashop, where each entity in the data model has a specific "add on" that deals with the input or output; then an intermediate data model takes care of the transformation of the data. This could be used to move data in any direction or even between different versions of the same ecommerce.
A web application that needs to render different types of HTML in different places. Each "module" knows how to handle a certain information and accepts parameters to do it. A module outputs HTML, another a list of documents, another a document, another a banner, and so on.
Here are some examples that I have used and that work.
Load a function at run time and output the possible compile errors:
eval `cat $file_with_function`;
if( $# ) {
print STDERR $#, "\n";
die "Errors at file $file_with_function\n";
}
Or more robust using File::Slurp:
eval read_file("$file_with_function", binmode => ':utf8');
Check that a certain function has been defined:
if( !defined &myfunction ) {
die "myfunction is not defined\n";
}
The function may be called from there on. This is fine with one function, but not for many.
If the function is put in a module:
require $file_with_function; # needs the ".pm" extension, i.e. addon/func.pm
$name_of_module->import(); # need to know the module name, i.e. Addon::Func
$name_of_module->myfunction(...);
Where the require may be protected inside an eval and then use $# as before.
With Module::Load:
load $name_of_module;
Followed by the import and used in the same way. Security should not be a concern as it may be assumed that the dynamic code comes from a trusted place. Are there better ways? Which way would be considered good practice?
In case it helps, I will be using the solution (among other places, but not exclusively) within the Dancer framework.
EDIT: Given the comments, I add some more info. All cases that I have in mind have in common:
There is more than one dynamic piece of code. Probably many to start with.
Each bit of code has the same interface.
Given the comments and the lack of responses, I have done some research to answer my own question. Comments or other answers are welcome!
Dynamic code
By dynamic code I mean code that is evaluated at run-time. In general, I consider better to compile an application so that you have all the error checking the Perl compiler can offer before starting to execute. Added to use strict and use warnings, you can catch many common mistakes that way. So why using dynamic code at all? These are the reasons I consider:
An application performs many different actions that are chosen depending on the context of execution. For instance, an application extracts certain properties from a file. The way to extract them depends on the file type and we want to deal with many file types, but we do not want to change the application for each new file type we add. We also want the application to start quickly.
An application needs to be expanded on the fly in a way that does not require the application to restart.
We have a large application that contains a number of features. When we deploy the application, we do not want to provide all the possible features all the time, maybe because we licence them separately, maybe because not all of them are able to run under all platforms. By throwing in only the files with the features we want, we have a distribution that does not require changing any code or config files.
How do we do it?
Given the possibilities that Perl offers, solutions to adding dynamic code come in two flavors: using eval and using require. Then there are modules that may help do things in an easier or more maintainable way.
The quick and dirty way
The eval way uses the form eval EXPR to compile a piece of Perl code at run-time. The expression could be a string but I suggest putting the code in a file and grouping other similar files in a convenient place. Then, if possible using File::Slurp:
eval read_file("$file_with_code", binmode => ':utf8');
if( $# ) {
die "$file_with_code: error $#\n";
}
if( !defined &myfunction ) {
die "myfunction is not defined at $file_with_code\n";
}
Specifying the character set to read_file makes sure that the file will be interpreted correctly. It is also good to check that the compilation was correct and that the function we expect was defined. So in $file_with_code, we will have:
sub myfunction(...) {
# Do whatever; maybe return something
}
Then you may invoke the function normally. The function will be a different one depending on which file was loaded. Simple and dynamic.
The modular way (recommended)
The way I would do it with maintainability in mind would be using require. Unlike use, that is evaluated at compile-time, require may be used to load a module at run-time. Out of the various ways to invoke require, I would go for:
my $mymodule = 'MyCompany::MyModule'; # The module name ends up in $mymodule
require $mymodule;
Also unlike use, require will load the module but will not execute import. So we may use any functions inside the module and those function names will not polute the calling namespace. To access the function we will need to use:
$mymodule->myfunction($a, $b);
See below as to how the arguments get passed. This way of invoking a function will add an argument before $a and $b that is usually named $self. You may ignore it if you donĀ“t know anything about object orientation.
As require will try to load a module and the module may not exist or it may not compile, to catch the error it will be better to use:
eval "require $mymodule";
Then $# may be used to check for an error in the loading+compiling process. We may also check that the function has been defined with:
if( $mymodule->can('myfunction') ) {
die "myfunction is not defined at module $mymodule\n";
}
In this case we will need to create a directory for the modules and a file with the .pm extension for each one:
MyCompany
MyModule.pm
Inside MyModule.pm we will have:
package MyCompany::MyModule;
sub myfunction {
my ($self, $a, $b);
# Do whatever; maybe return something
# $self will be 'MyCompany::MyModule'
}
1;
The package bit is essential and will make sure that whatever definitions we put inside will be at the MyCompany::MyModule namespace. The 1; at the end will tell require that the module initialization was correct.
In case we wanted to implement the module by using other libraries that could polute the caller namespace, we could use the namespace::clean module. This module will make sure the caller does not get any additions to the namespace coming from the module we are defining. It is used in this way:
package MyCompany::MyModule;
# Definitions by these modules will not be available to the code doing the require
use Library1 qw(def1 def2);
use Library2 qw(def3 def4);
...
# Private functions go here and will not be visible from the code doing the require
sub private_function1 {
...
}
...
use namespace::clean;
# myfunction will be available
sub myfunction {
# Do whatever; maybe return something
}
...
1;
What happens if we include a module more than once?
The short answer is nothing. Perl keeps track of which modules have been loaded and from where using the %INC variable. Both use and require will not load a library twice. use will add any exported names to the callers namespace. require will not do that either. In case you want to check that a module has been loaded already, you could use %INC or better yet, you could use module::loaded which is part of the core in modern Perl versions:
use Module::Loaded;
if( !is_loaded( $mymodule ) {
eval "require $mymodule" );
...
}
How do I make sure Perl finds my module files?
For use and require Perl uses the #INC variable to define the list of directories that will be used to look for libraries. Adding a new directory to it may be achieved (among other ways) by adding it to the PERL5LIB environment variable or by using:
use lib '/the/path/to/my/libs';
Helper libraries
I have found some libraries that may be used to make the code that uses the dynamic mechanism more maintainable. They are:
The if module: will load a module or not depending on a condition: use if CONDITION, MODULE => ARGUMENTS;. May also be used to unload a module.
Module::Load::Conditional: will not die on you while trying to load a module and may also be used to check the module version or its dependencies. It is also able to load a list of modules all at once even checking their versions before doing so.
Taken from the Module::Load::Conditional documentation:
use Module::Load::Conditional qw(can_load);
my $use_list = {
CPANPLUS => 0.05,
LWP => 5.60,
'Test::More' => undef,
};
print can_load( modules => $use_list )
? 'all modules loaded successfully'
: 'failed to load required modules';
I have been USEing .pm files willy-nilly in my programs without really getting into using packages unless really needed. In essence, I would just have common routines in a .pm and they would become part of main when use'd (the .pm would have no package or Exporter or the like... just a bunch of routines).
However, I came across a situation the other day which I think I know what happened and why, but I was hoping to get some coaching from the experts here on best practices and how to resolve this. In essence, should packages be used always? Should I "do" files when I just want common routines absorbed into main (or the parent module/package)? Is Exporter really the way to handle all of this?
Here's example code of what I came across (I won't post the original code as it's thousands of lines... this is just the essence of the problem).
Pgm1.pl:
use PM1;
use PM2;
print "main\n";
&pm2sub1;
&pm1sub1;
PM1.pm:
package PM1;
require Exporter;
#ISA=qw(Exporter);
#EXPORT=qw(pm1sub1);
use Data::Dump 'dump';
use PM2;
&pm2sub1;
sub pm1sub1 {
print "pm1sub1 from caller ",dump(caller()),"\n";
&pm2sub1;
}
1;
PM2.pm:
use Data::Dump 'dump';
sub pm2sub1 {
print "pm2sub1 from caller ",dump(caller()),"\n";
}
1;
In essence, I'd been use'ing PM2.pm for some time with its &pm2sub1 subroutine. Then I wrote PM1.pm at some point and it needed PM2.pm's routines as well. However, in doing it like this, PM2.pm's modules got absorbed into PM2.pm's package and then Pgm1.pl couldn't do the same since PM2.pm had already been use'd.
This code will produce
Undefined subroutine &main::pm2sub1 called at E:\Scripts\PackageFun\Pgm1.pl line 4.
pm2sub1 from caller ("PM1", "PM1.pm", 7)
main
However, when I swap the use statements like so in Pgm1.pl
use PM2;
use PM1;
print "main\n";
&pm2sub1;
&pm1sub1;
... perl will allow PM2.pm's modules into main, but then not into PM1.pm's package:
Undefined subroutine &PM1::pm2sub1 called at PM1.pm line 7.
Compilation failed in require at E:\Scripts\PackageFun\Pgm1.pl line 2.
BEGIN failed--compilation aborted at E:\Scripts\PackageFun\Pgm1.pl line 2.
So, I think I can fix this by getting religious about packages and Exporter in all my modules. Trouble is, PM2.pm is already used in a great number of other programs, so it would be a ton of regression testing to make sure I didn't break anything.
Ideas?
See my answer to What is the difference between library files and modules?.
Only use require (and thus use) for modules (files with package, usually .pm). For "libraries" (files without package, usually .pl), use do.
Better yet, only use modules!
use will not load same file more than once. It will, however, call target package's import sub every time. You should format your PM2 as proper package, so use can find its import and export function to requestor's namespace from there.
(Or you could sneak your import function into proper package by fully qualifying its name, but don't do that.)
You're just asking for trouble arranging your code like this. Give each module a package name (namespace), then fully qualify calls to its functions, e.g. PM2::sub1() to call sub1 in package PM2. You are already naming the functions with the package name on them (pm2sub1); it is two extra characters (::) to do it the right way and then you don't need to bother with Exporter either.
I am totally new to Perl/Fastcgi.
I have some pm-modules to which will have to add a lot of scripts and over time it will grow and grow. Hence, I need a structure which makes the admin easier.
So, I want to create files in some kind of directory structure which I can include. I want the files that I include will be exaclty like if the text were written in the file where I do the include.
I have tried 'do', 'use' and 'require'. The actual file I want to include is in one of the directories Perl is looking in. (verified using perl -V)
I have tried within and outside BEGIN {}.
How do I do this? Is it possible at all including pm files in pm files? Does it have to be pm-files I include or can it be any extension?
I have tried several ways, included below is my last try.
Config.pm
package Kernel::Config;
sub Load {
#other activities
require 'severalnines.pm';
#other activities
}
1;
severalnines.pm
# Filter Queues
$Self->{TicketAcl}->{'ACL-hide-queues'} = {
Properties => {
},
PossibleNot => {Ticket => { Queue =>
['[RegExp]^*'] },
},
};
1;
I'm not getting any errors in the Apache's error_log related to this. Still, the code is not recognized like it would be if I put it in the Config.pm file.
I am not about to start programming a lot, just do some admin in a 3rd party application. Still, I have searched around trying to learn how it works with including files. Is the severalnines.pm considered to be a perl module and do I need to use a program like h2xs, or similar, in order to "create" the module (told you, totally newbie...)?
Thanks in advance!
I usually create my own module prefix -- named after the project or the place I worked. For example, you might put everything under Mu with modules named like Mu::Foo and Mu::Bar. Use multiple modules (don't try to keep everything in one single file) and name your modules with the *.pm suffix.
Then, if the Mu directory is in the same directory as your programs, you only need to do this:
use Mu::Foo;
use Mu::Bar;
If they're in another directory, you can do this:
use lib qw(/path/to/other/directory);
use Mu::Foo;
use Mu::Bar;
Is it possible at all including pm files in pm files?
Why certainly yes.
So, I want to create files in some kind of directory structure which I can include. I want the files that I include will be exaclty like if the text were written in the file where I do the include.
That's a bad, bad idea. You are better off using the package mechanism. That is, declare each of your module as a separate package name. Otherwise, your module will have a variable or function in it that your script will override, and you'll never, ever know it.
In Perl, you can reference variables in your modules by prefixing it with the module name. (Such as File::Find does. For example $File::Find::Name is the found file's name. This doesn't pollute your namespace.
If you really want your module's functions and variables in your namespace, look at the #EXPORT_OK list variable in Exporter. This is a list of all the variables and functions that you'd like to import into your module's namespace. However, it's not automatic, you have to list them next to your use statement. That way, you're more likely to know about them. Using Exporter isn't too difficult. In your module, you'd put:
package Mu::Foo;
use Exporter qw(import);
our EXPORT_OK = qw(convert $fundge #ribitz);
Then, in your program, you'd put:
use Mu::Foo qw(convert $fundge #ribitz);
Now you can access convert, $fundge and #ribitz as if they were part of your main program. However, you now have documented that you're pulling in these subroutines and variables from Mu::Foo.
(If you think this is complex, be glad I didn't tell you that you really should use Object Oriented methods in your Modules. That's really the best way to do it.)
if ( 'I want the files that I include will be exactly like if the text were written in the file where I do the include.'
&& 'have to add a lot of scripts and over time it will grow and grow') {
warn 'This is probably a bad idea because you are not creating any kind of abstraction!';
}
Take a look at Exporter, it will probably give you a good solution!
I have an interesting situation. In some (large) legacy code, there is a namespace that should look like require A::B, but instead the path to A was added so it's possible to just say require B. However, I would like to be able to use both invocations. Is this possible without creating a redirecting package? For instance, is there a way to dual declare a package?
Thanks!
First load the package:
require A::B;
Then alias B to A::B:
*B:: = *A::B::;
Then tell require that it has already loaded B
$INC{'B.pm'}++;
To make sure this all works right, it is best to perform these actions inside a BEGIN block:
BEGIN {
require A::B;
*B:: = *A::B::;
$INC{'B.pm'}++;
}
After this, all require A::B; and require B; lines will become no-ops. You will also be able to refer to variables in that package with either name. \&A::B::foo == \&B::foo
To get this to work transparently, you could add the following code to each file:
A/B.pm
*B:: = *A::B::;
$INC{'B.pm'}++;
B.pm
*A::B:: = *B::;
$INC{'A/B.pm'}++;
Then if a user does require A::B; they can call A::B::foo or B::foo and require B; will become a no-op.
And if a user does require B; they can call A::B::foo or B::foo and require A::B; will become a no-op.
But for maintainability, it is probably best to keep all of the real code in one file (along with the aliasing code above), and setup the other file as a shim that just loads the real file. Assuming A/B.pm contains the real code:
A/B.pm
*B:: = *A::B::; # this gets added to the existing file
$INC{'B.pm'}++;
B.pm
require A::B; # this is the entire file
require Something will search the directories in #INC for a file called Something.pm.
To get some-path/A/B.pm to be loaded with either require B or require A::B, you would need to have both some-path and some-path/A in your #INC directory list.
There are many many ways to add directories or otherwise manipulate your #INC list.
Eric's solution would work, but truthfully, I'd shoot anyone who did this in real production code. You could probably achieve similar results by using methods in Package::Stash, but again, why mess up the symbol table like this? I'd rather fix the legacy code that was calling things the wrong way. Seriously, how hard is it to do a search-and-replace on your code and fix the package names?
Another quick and dirty method to get require B; to actually find package A::B is to simply make a symlink in the lib/ directory pointing to A/B.pm:
cd lib
ln -sf -T A/B.pm B.pm
Note that this will create two packages with identical code, so variables in one will not be the same value as the other: $A::B::foo will be entirely separate from $B::foo even though their declaration comes from the same file.
I am experiencing a problem with using a constant defined in a configuration file.
This is my package:
package myPackage;
require "APIconfig.pl";
APIconfig::import(APIconfig);
use constant SERVICE_URL => APIconfig::SERVICE_URL();
The configuration looks like this:
package APIconfig;
use constant SERVICE_URL => 'http://api.example.org/blah';
1;
When running this code, I get the following error:
Undefined subroutine &APIconfig::SERVICE_URL called at API.pl line 4.
I cannot use 'use' instead of 'require' because this expects the configuration file to be named .pm, and it's called .pl on a lot of servers on our network.
How can I use the package without renaming the file?
There are two differences between 'use' and 'require'. One of them affects your current problem, the other doesn't. Unfortunately you are working around one that has no effect.
The differences are:
1/ 'use' calls the import() function, 'require' doesn't.
2/ 'use' happens at compile time, 'require' happens at runtime.
You're working around the fact that 'require' doesn't call import() by calling it explicitly. This has no effect as your module doesn't export any symbols and doesn't have an import() subroutine.
You're not working around the fact that 'use' statements are executed at runtime. The problem is that "use constant SERVICE_URL => APIconfig::SERVICE_URL();" is executed at compile time and your 'require' hasn't run by then so myPackage knows nothing about APIconfig.
The (nasty, hacky) solution is to put the 'require' statement into a BEGIN block - to force it to be executed at compile time. You'll also want to remove the call to import() as that gives a runtime error (due to the absence of the subroutine).
The test files that I used to work this out are as follows:
$ cat APIconfig.pl
package APIconfig;
use constant SERVICE_URL => 'http://api.example.org/blah';
1;
$ cat api.pl
#!/usr/bin/perl
package myPackage;
BEGIN {
require "APIconfig.pl";
}
# APIconfig::import(APIconfig);
use constant SERVICE_URL => APIconfig::SERVICE_URL();
print SERVICE_URL, "\n";
$ ./api.pl
http://api.example.org/blah
The real solution is to rewrite APIconfig as a real module. You hint that you know that, but that environmental issues prevent you taking this approach. I highly recommend trying to work around those issues and doing things correctly.
That can't be possibly right - there is no subroutine import in package APIconfig. Once you're accessing symbolic names with a full package path, you don't need to export/import anyway.
The solution is to run require at compile time, before use constant. This works:
package myPackage;
BEGIN {
require "APIconfig.pl";
}
use constant SERVICE_URL => APIconfig::SERVICE_URL();
If it's a configuration file, don't make it code. I have a whole chapter in Mastering Perl about that, and there are many modules on CPAN to help you with almost any configuration format.
If it's code, why not just make it a module so you can use use. Modules are so much easier to control and manipulate within another program.
The easiest solutions are the ones where you don't swim against the tide. :)
Beyond that, use is the same as:
BEGIN {
require Module;
Module->import;
}
You just do the same thing with filename and the namespace it defines (as long as the code in the file looks like a module):
BEGIN {
require "file.pl"; # defines SomeNamespace
SomeNamespace->import;
}