How to dynamically set the directory for user-specified libraries - perl

I have some fairly complex libraries that interconnect with each other to do some work, and while they mostly run on our servers, they're connecting to some high-performance servers to crunch numbers.
On our servers, the line is ...
use lib '/home/ourgroup/lib' ;
use HomeGrown::Code ':all' ;
On the high-performance cluster, it's more like ...
use lib '/scratch/ourgroup/lib' ;
use HomeGrown::Code ':all' ;
For the programs that use the modules, that's a reasonably easy thing to set, but I would like to not have to make server-specific changes in the code base. I'd rather be able to copy the directory as-is. So, how do I tell my modules to use my library dir without hard-coding it like this?

Usually you do this by having the PERL5LIB environment variable set differently on different machines. Yes, it's not a pure-Perl solution, but it has to be done only once per server, not once per deployment.
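To make that concrete: if each server sets PERL5LIB once (for example in the group's shell profile), the directories it names are added to @INC ahead of the defaults and the code needs no use lib line at all. The snippet below is only a small sketch for checking what the current environment contributes; it assumes nothing about which paths are actually set.

```perl
use strict;
use warnings;
use Config;

# PERL5LIB is a list of directories joined by the platform's path
# separator (":" on Unix-like systems), available as $Config{path_sep}.
my $sep   = $Config{path_sep};
my @extra = split /\Q$sep\E/, ( $ENV{PERL5LIB} // '' );

print "PERL5LIB names ", scalar(@extra), " director",
      ( @extra == 1 ? "y" : "ies" ), "\n";
print "  $_\n" for @extra;
```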

Here's what we're going with.
use lib '/home/ourgroup/lib' ;
use lib '/scratch/ourgroup/lib' ;
If /home/ourgroup/lib doesn't exist on one machine, so be it. If /scratch/ourgroup/lib doesn't exist on the other, so be it. It doesn't complain, so that's what we're doing.
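A slight variation, offered as a sketch rather than as what the poster actually deployed: filter the candidate directories so that only the ones present on the current machine end up in @INC. The two paths are the ones from the question.

```perl
use strict;
use warnings;

# Keep only the candidate directories that exist on this machine.
# If none exist, use lib simply receives an empty list.
use lib grep { -d $_ } qw(/home/ourgroup/lib /scratch/ourgroup/lib);

print "$_\n" for @INC;
```

This keeps @INC free of dead entries, at the cost of two extra stat calls at compile time.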

You can also have a module that detects the environment and includes the correct directories for you.
To use it, just do:
use VarLogRant::FindLibs;
use Stuff;
And to write the module:
package VarLogRant::FindLibs;
use strict;
use warnings;
# need_home() and need_scratch() are yours to write; define them above this point.
sub select_lib_dirs {
my @libs;
push @libs, '/home/ourgroup/libs' if need_home();
push @libs, '/scratch/ourgroup/libs' if need_scratch();
# Any other magical logic you want.
return @libs;
}
# It is critical that use lib comes AFTER the functions are defined.
use lib select_lib_dirs();
1;
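The answer leaves need_home() and need_scratch() undefined. One possible shape for them, purely as an assumption on my part, is to treat "the directory exists" as "this host needs it":

```perl
use strict;
use warnings;

# Hypothetical helpers: a host "needs" a library root if that root exists.
sub need_home    { return -d '/home/ourgroup/libs' ? 1 : 0 }
sub need_scratch { return -d '/scratch/ourgroup/libs' ? 1 : 0 }

printf "home: %s, scratch: %s\n",
    ( need_home()    ? 'yes' : 'no' ),
    ( need_scratch() ? 'yes' : 'no' );
```

Checking the hostname (for example with the core Sys::Hostname module) would be another reasonable way to decide.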

Related

Why isn't my CPAN distribution indexed by PAUSE?

I've uploaded my stasis distribution to PAUSE, but it isn't in the index.
I thought this was because it didn't contain a package, so I added a package declaration to the stasis script in v0.04 like this:
#!/usr/bin/env perl
package stasis;
package main;
...
but it still wasn't indexed.
Is there any way to get this distribution indexed that doesn't involve creating a boilerplate module file (e.g. adding lib/stasis.pm to the distribution)?
I believe CPAN does not index scripts.
IMO your best option is to make a module that allows doing programmatically what your script does (and make the script use it).
You could put in a fake module or make it think your script is a module (I think listing it in provides works), but I wouldn't if I were you.
Because your package statement was not in a *.pm file.
The PAUSE indexer is open source. It is a little complicated to unpack, but the regex for extracting a package name in a distribution is in PAUSE::pmfile::packages_per_pmfile, which is a method and a package that is meant to process *.pm files only.
The PAUSE::dist::_index_by_meta method provides the alternate method of declaring a package through the provides keyword in the metafile.
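For reference, the provides entry is a map from package name to a file (and optionally a version). The sketch below only prints the shape such an entry would take in META.json; the file path bin/stasis is my assumption about the distribution layout, and how the entry gets into the metafile depends on your build tool.

```perl
use strict;
use warnings;
use JSON::PP;

# "provides" maps each package name to the file that declares it.
# The path below is a guess at the layout, not taken from the distribution.
my %provides = (
    stasis => { file => 'bin/stasis', version => '0.04' },
);

my $json = JSON::PP->new->pretty->canonical->encode( { provides => \%provides } );
print $json;
```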

Something wrong with @INC in perl?

Let's imagine we are new to perl and wrote some great module MyModule.pm. We also wrote some great script myscript.pl that needs to use this module.
use strict;
use warnings;
use MyModule;
etc...
Now we create the dir /home/user/GreatScript and put our files in it. Then we try to run myscript.pl...
cd /home/user/GreatScript
perl myscript.pl
Great! Now moving to another dir...
cd /
perl /home/user/GreatScript/myscript.pl
We get a not very useful error about @INC and a list of paths. What is this? After some googling we know that @INC contains the paths where Perl searches for modules, and this error means that Perl can't find MyModule.pm.
Now we can:
Add the path to the system PERL5LIB variable, which adds our path to the beginning of @INC
Install our module into one of the dirs from @INC
Manually add our path to @INC in a BEGIN block
Add use lib '/home/user/GreatScript'; to our script, but it looks bad. What if we move our script to another dir?
Or use the FindBin module to find the script's dir, and use this path in use lib "$FindBin::Bin";, but it's not 100% effective, for example some bugs with mod_perl or issues...
Or use the __FILE__ (or $0) variable and extract the path from it using the abs_path function from the Cwd module. Looks like reinventing the bicycle?
Or something forgotten or missing (did I miss something?)
Why does this obvious and regular operation need a lot of work? If I have one script, it's easy... But why do I need to write these few additional lines of code in all my scripts? And it will not work 100%! Why doesn't perl add the current script's dir to @INC by default, like it does with "."?
PS: I searched for an answer to my question but found only lists of solutions like the one above and some others. I hope this question is a duplicate...
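For readers who want to see the __FILE__ option from the list spelled out, here is a minimal sketch using only core modules. It is one way of doing it, not a recommendation over FindBin.

```perl
use strict;
use warnings;
use Cwd qw(abs_path);
use File::Basename qw(dirname);

# Both helper modules are loaded at compile time, so they can be called
# inside the use lib line to resolve the directory holding this script.
use lib dirname( abs_path(__FILE__) );

print "First entry in \@INC: $INC[0]\n";
```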
Or something forgotten or missing (did I miss something?)
Another option: Don't keep your module with your script. The module is separate because it is re-usable, so put it in a library folder. This can be anything from a local library for personal projects, included with use lib (and perhaps referencing an environment variable you have set up for the project), to making the module into a CPAN library, and letting cpan manage where it goes. What you decide to do depends on how re-usable your code is, beyond how the script uses it.
Why does this obvious and regular operation need a lot of work?
It is not a lot of work in the scheme of things. Your expectation that files grouped together in the file system should automatically be used by Perl at the language level to resolve require or use statements does not match how Perl works. It's not that it is impossible, or that your expectation is unreasonable, just that Perl is not implemented that way. Changing it after the fact could affect how many existing projects work, and may be contentious, so it is unlikely to change.
If your use case is: "I want to have a script which accesses additional modules or other resources in the same or relative directory, and I don't want to install everything", then FindBin is a perfect solution. The limitation mentioned in the manpage happens only in persistent environments like mod_perl, but here I would propose different solutions (e.g. using FindBin only within httpd.conf, not in the scripts/modules).
(/me is a heavy FindBin user, and I am also the author of half of the KNOWN ISSUES section of the FindBin manpage)
That is needed so that Perl knows which module you want to use. When you move to another directory, as you said, Perl looks in the '.' directory, so if you run:
cd /
perl /home/user/GreatScript/myscript.pl
and there is a MyModule.pm in '/', Perl will find it and use it in myscript.pl. Since Perl searches the directories in @INC in order and loads the first match, you have to keep an eye on @INC.
Summary: this obvious and regular operation is needed to prevent loading a wrong module that happens to have the same name as the one you want.
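If you suspect the wrong copy of a module is being picked up, @INC shows the search order and %INC records which file each loaded module actually came from. A small sketch, using a core module as the example since MyModule.pm is not available here:

```perl
use strict;
use warnings;
use File::Basename ();    # any module will do; this one ships with Perl

print "Search path, in order:\n";
print "  $_\n" for @INC;

# %INC is keyed by the module's relative file name.
print "File::Basename was loaded from: $INC{'File/Basename.pm'}\n";
```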

How to import all "our"-variables from the unnamed Perl module without listing them?

I need to import all our variables from the unnamed Perl module (Module.pm) and use them inside the Perl script (Script.pl).
The following code works well without "use strict", but fails with it. How can I change this code to work with "use strict" without manually listing all imported variables (as described in the answer to another question)?
Thanks a lot for your help!
Script.pl:
use strict;
require Module;
print $Var1;
Module.pm:
our $Var1 = "1\n";
...
our $VarN = "N\n";
return 1;
Run the script:
$> perl Script.pl
Errors:
Global symbol "$Var1" requires explicit package name at Script.pl line 3.
Execution of Script.pl aborted due to compilation errors.
NOTE (1): The module is unnamed, so using a Module:: prefix is not an option.
NOTE (2): Module.pm contains also a set of functions configured by global variables.
NOTE (3): Variables are different and should NOT be stored in one array.
NOTE (4): The design is NOT good, but the question is not about the design. It's about getting the listed code to work with minimal modifications of complexity O(1), i.e. a few lines of code that don't depend on N.
Solution Candidate (ACCEPTED): Add $:: before all imported variables. It's compliant with strict and also distinguishes the imported variables from my own in the code.
Change your script to:
use strict;
require Module;
print $Module::Var1;
The problem is that $Var1 isn't in the main namespace, it's in Module's namespace.
Edit: As is pointed out in comments below, you haven't named your module (i.e. it doesn't say package Module; at the top). Because of this, there is no Module namespace. Changing your script to:
use strict;
require Module;
print $main::Var1;
...allows the script to correctly print out 1\n.
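Here is a single-file sketch of the same effect, with a bare block standing in for the package-less Module.pm. It shows that $::Var1, the form the asker settled on, is just shorthand for $main::Var1.

```perl
use strict;
use warnings;

# Stand-in for Module.pm: there is no package statement, so `our`
# declares the variable in the default package, main.
{
    our $Var1 = "1\n";
}

# Outside that block the short name is no longer declared, so under
# strict the variable has to be package-qualified.
print $main::Var1;    # long form
print $::Var1;        # equivalent shorthand
```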
If you have to import all the our variables in every module, there's something seriously wrong with your design. I suggest that you redesign your program to separate the elements so there is a minimum of cross-talk between them. This is called decoupling.
You want to export all variables from a module, and you want to do it in such a way that you don't even know what you're exporting? Forget about use strict and use warnings because if you put them in your program, they'll just run screaming out, and curl up in a corner weeping hysterically.
I never, and I don't mean hardly ever, never export variables. I always create a method to pull out the required value. It gives me vital control over what I'm exposing to the outside world and it keeps the user's namespace pure.
Let's look at the possible problems with your idea.
You have no idea what is being exported in your module. How is the program that uses that module going to know what to use? Somewhere, you have to document that the variables $foo and @bar are available for use. If you have to do that, why not simply play it safe?
You have the issue of someone changing the module, and suddenly a new variable is being exported into the program using that module. Imagine if that variable was already in use. The program suddenly has a bug, and you'll never be able to figure it out.
You are exporting a variable in your module, and the developer decides to modify that variable, or even removes it from the program. Again, because you have no idea what is being imported or exported, there's no way of knowing why a bug suddenly appeared in the program.
As I mentioned, you have to know somewhere what is being used in your module that the program can use, so you have to document it anyway. If you're going to insist on importing variables, at least use the @EXPORT_OK array and the Exporter module. That will help limit the damage. This way, your program can declare what variables it's depending upon and your module can declare what variables it knows programs might be using. If I am modifying the module, I would be extra careful of any variable I see I am exporting. And, if you must specify in your program what variables you're importing, you know to be cautious about those particular variables.
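A rough sketch of the @EXPORT_OK approach, squeezed into one file so it can be run as-is. The package name My::Settings is made up, and the BEGIN blocks are only needed because both packages share a file; with the module in its own .pm file you would write a plain assignment to @EXPORT_OK and a normal use line.

```perl
use strict;
use warnings;

package My::Settings;    # hypothetical module name
use Exporter 'import';

our ( @EXPORT_OK, $Var1, $Var2 );
BEGIN {
    # Only these names may be imported, and only when asked for.
    @EXPORT_OK = qw($Var1 $Var2);
}
$Var1 = "1\n";
$Var2 = "2\n";

package main;

# With My::Settings in its own file this would read:
#     use My::Settings qw($Var1);
BEGIN { My::Settings->import(qw($Var1)) }

print $Var1;     # accepted under strict because it was explicitly imported
# print $Var2;   # would not compile: it was never requested
```

Both sides now state what crosses the boundary: the module lists what it offers, and the script lists what it takes.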
Otherwise, why bother with modules? Why not simply go back to Perl 3.0, use require instead of use, and forget about using the package statement?
It sounds like you have data in a file and are trying to load that data into your program.
As it is now, the our declarations in the module only declare variables for the scope of that file. Once the file finishes running, to access the variables, you need to use their fully qualified name. If your module has a package xyz; line, then the fully qualified name is $xyz::Var1. If there is no package declaration, then the default package main is used, giving your variables the name $main::Var1.
However, any time that you are making many variables all with numeric name changes, you probably should be using an array.
Change your module to something like:
@My::Module::Data = ("1\n", "2\n" ... )
and then access the items by index:
$My::Module::Data[1]

How does Perl's lib pragma work?

I use use lib "./DIR" to grab a library from a folder elsewhere. However, it doesn't seem to work on my server, but it works fine on my local desktop.
Any particular reasons?
And one more question, does use lib get propagated within several modules?
Two situations:
Say I make a base class that requires a few libraries, but I know that it needs to be extended and the extended class will need to use another library. Can I put the use lib command in the base class? or will I need to put it in every extending class?
Finally, can I just have a use package where package contains a bunch of use lib, will it propagate the use lib statements over to my current module? <-- I don't think so, but asking anyways
The . in your use lib statement means "current working directory" and will only work when your script is run from the right directory. The server's idea of cwd is probably something different (or undefined). Assuming that the library directory is co-located with the script and private to it, you want to do something like this instead:
use FindBin;
use lib "$FindBin::Bin/DIR";
A use lib statement affects @INC -- the list of locations perl searches when you use or require a module. It globally affects the current instance of the interpreter. You should really only put use lib statements in scripts, not in modules.
In principle, you could have a package MyLibs that consisted of a bunch of use lib statements and then use MyLibs before using any of the packages in those locations, but I wouldn't recommend it.
There's no way to know why it isn't working on your server without more information. In particular, check your server's error logs, and dump @INC somewhere if necessary, and compare that to your actual library paths.
use lib modifies @INC, which is global, so as long as you execute your use lib before other packages try to include stuff, it will work and all other packages will see the new include paths.
For more on @INC, see its entry in perlvar.
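A quick way to convince yourself that the change is global is to read @INC from a different package than the one containing the use lib line. This is only an illustrative sketch: Some::Module and the path are made up, and the path does not need to exist.

```perl
use strict;
use warnings;

package Some::Module;    # hypothetical module
# @INC always refers to the one global array, whatever the current package.
sub first_search_dir { return $INC[0] }

package main;
use lib '/tmp/example-lib';    # hypothetical path; it need not exist

print "main sees:         $INC[0]\n";
print "Some::Module sees: ", Some::Module::first_search_dir(), "\n";
```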

What's the best way to have two modules which use functions from one another in Perl?

Unfortunately, I'm a total noob when it comes to creating packages, exporting, etc. in Perl. I tried reading some of the modules and often found myself dozing off from the long chapters. It would be helpful if I could find what I need to understand in just one simple webpage without the need to scroll down. :P
Basically I have two modules, A & B. A will use some functions from B and B will use some functions from A. I get tons of warnings about functions being redefined when I try to compile via perl -c.
Is there a way to do this properly? Or is my design flawed? If so, what would be a better way? The reason I did this is to avoid copying and pasting the other module's functions into this module and renaming them.
It's not really good practice to have circular dependencies. I'd advise factoring the shared code out into a third module so you can have A depend on B, A depend on C, and B depend on C.
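A toy version of that layout, with all three packages in one file so it runs as-is. The names My::A, My::B and My::Common and the function bodies are invented for illustration; in a real project each package would live in its own .pm file and load the others with use.

```perl
use strict;
use warnings;

package My::Common;    # hypothetical third module holding the shared code
sub shared_helper { return 40 }

package My::B;         # depends only on My::Common
sub b_value { return My::Common::shared_helper() + 2 }

package My::A;         # depends on My::B and My::Common, never the reverse
sub a_value { return My::Common::shared_helper() + My::B::b_value() }

package main;
print My::A::a_value(), " ", My::B::b_value(), "\n";
```

Dependencies now point in one direction only, so no module is ever half-compiled when another one needs it.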
So... the suggestion to factor out common code into another module is
a good one. But, you shouldn't name modules *.pl, and you shouldn't
load them by require-ing a certain pathname (as in require
"../lib/foo.pl";). (For one thing, saying '..' makes your script
depend on being executed from the same working directory every time.
So your script may work when you run it as perl foo.pl, but it won't
work when you run it as perl YourApp/foo.pl. That is generally not good.)
Let's say your app is called YourApp. You should build your
application as a set of modules that live in a lib/ directory. For
example, here is a "Foo" module; its filename is lib/YourApp/Foo.pm.
package YourApp::Foo;
use strict;
sub do_something {
# code goes here
}
1; # a module file must end with a true value
Now, let's say you have a module called "Bar" that depends on "Foo".
You just make lib/YourApp/Bar.pm and say:
package YourApp::Bar;
use strict;
use YourApp::Foo;
sub do_something_else {
return YourApp::Foo::do_something() + 1;
}
1; # a module file must end with a true value
(As an advanced exercise, you can use Sub::Exporter or Exporter to
make use YourApp::Foo install subroutines in the consuming package's
namespace, so that you don't have to write YourApp::Foo:: before
everything.)
Anyway, you build your whole app like this. Logical pieces of
functionality should be grouped together in modules (or even better,
classes).
To make all this run, you write a small script that looks like this (I
put these in bin/, so let's call it bin/yourapp.pl):
#!/usr/bin/env perl
use strict;
use warnings;
use feature ':5.10';
use FindBin qw($Bin);
use lib "$Bin/../lib";
use YourApp;
YourApp::run(@ARGV);
The key here is that none of your code is outside of modules, except a
tiny bit of boilerplate to start your app running. This is easy to
maintain, and more importantly, it makes it easy to write automated
tests. Instead of running something from the command-line, you can
just call a function with some values.
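
To illustrate the testing point, here is a sketch of what such a test could look like with the core Test::More module. The two packages are inlined stand-ins for lib/YourApp/Foo.pm and lib/YourApp/Bar.pm, and the return value 41 is an arbitrary choice of mine; a real test file would load the modules with use instead.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Inlined stand-ins for the real modules so the sketch runs on its own.
package YourApp::Foo;
sub do_something { return 41 }

package YourApp::Bar;
sub do_something_else { return YourApp::Foo::do_something() + 1 }

package main;

# No command line involved: call the functions and check the results.
is( YourApp::Foo::do_something(),      41, 'Foo returns its value' );
is( YourApp::Bar::do_something_else(), 42, 'Bar adds one to Foo' );
```
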
Anyway, this is probably off-topic now. But I think it's important
to know.
The simple answer is to not test compile modules with perl -c... use perl -e'use Module'
or perl -e0 -MModule instead.
perl -c is designed for doing a test compile of a script, not a module. When you run it
on one of your
When recursively using modules, the key point is to make sure anything externally referenced is set up early. Usually this means at least making sure @ISA is set in a compile-time construct (in BEGIN{} or via "use parent" or the discouraged "use base") and that @EXPORT and friends are set in BEGIN{}.
The basic problem is that if module Foo uses module Bar (which uses Foo), compilation of Foo stops right at that point until Bar is fully compiled and its mainline code has executed. The answer is to make sure that whatever parts of Foo are needed while Bar compiles and runs its mainline code are already in place at that point.
(In many cases, you can sensibly separate out the functionality into more modules and break the recursion. This is best of all.)