How to best manage multiple versions of Perl in an HPC environment, installed in centralized yet customized locations?

In an HPC/cluster computing environment, most applications are installed as modules in a customized, centralized repository, and many different versions of the same application often need to coexist. Perl is one such commonly used general-purpose programming language. I would like to ask for the best practice for not only installing multiple Perl versions in isolation from each other, but also adding further Perl modules later to each Perl installation. For example, I might need to add bioperl/1.7.2 to Perl/5.28.1 but install bioperl/1.7.8 in Perl/5.36.0. There are many suggestions on the Internet for how to achieve this, but I would like to find a more concise and clear way to do it. Based on my own experience, the best practice is probably to make use of cpan's custom configuration file option, 'cpan -j'. I will elaborate on this after I post this question. Thanks.
I have googled a lot on this and didn't find a good answer to my specific need, so I will write my own answer based on my experience with Perl so far.

Just use the cpan that was installed by the perl for which you want to install a module.
$ head -n 1 /home/ikegami/usr/perlbrew/perls/5.36.0t/bin/cpan
#!/home/ikegami/usr/perlbrew/perls/5.36.0t/bin/perl
$ /home/ikegami/usr/perlbrew/perls/5.36.0t/bin/cpan Text::CSV_XS
...
Installing /home/ikegami/usr/perlbrew/perls/5.36.0t/lib/site_perl/5.36.0/x86_64-linux-thread-multi/auto/Text/CSV_XS/CSV_XS.so
Installing /home/ikegami/usr/perlbrew/perls/5.36.0t/lib/site_perl/5.36.0/x86_64-linux-thread-multi/Text/CSV_XS.pm
Installing /home/ikegami/usr/perlbrew/perls/5.36.0t/man/man3/Text::CSV_XS.3
Appending installation info to /home/ikegami/usr/perlbrew/perls/5.36.0t/lib/5.36.0/x86_64-linux-thread-multi/perllocal.pod
HMBRAND/Text-CSV_XS-1.49.tgz
/usr/bin/make install -- OK
$ head -n 1 /home/ikegami/usr/perlbrew/perls/5.34.0t/bin/cpan
#!/home/ikegami/usr/perlbrew/perls/5.34.0t/bin/perl
$ /home/ikegami/usr/perlbrew/perls/5.34.0t/bin/cpan Text::CSV_XS
...
Installing /home/ikegami/usr/perlbrew/perls/5.34.0t/lib/site_perl/5.34.0/x86_64-linux-thread-multi/auto/Text/CSV_XS/CSV_XS.so
Installing /home/ikegami/usr/perlbrew/perls/5.34.0t/lib/site_perl/5.34.0/x86_64-linux-thread-multi/Text/CSV_XS.pm
Installing /home/ikegami/usr/perlbrew/perls/5.34.0t/man/man3/Text::CSV_XS.3
Appending installation info to /home/ikegami/usr/perlbrew/perls/5.34.0t/lib/5.34.0/x86_64-linux-thread-multi/perllocal.pod
HMBRAND/Text-CSV_XS-1.49.tgz
/usr/bin/make install -- OK
perlbrew can help you install multiple Perl builds, and it can help you control which one is found in the PATH in a shell. While the Perl builds in the example were installed with the help of perlbrew, that's not required.
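For a shared install tree, the same idea can be wrapped in a small shell helper. This is only a sketch: the function name cpan_for and the example prefix are hypothetical and not part of perlbrew or cpan. It runs the cpan that sits next to a given Perl build, and refuses to continue if that cpan script's first line points at a different interpreter.

```shell
#!/bin/sh
# cpan_for PREFIX [cpan args...]
# Run the cpan client that belongs to the Perl installed under PREFIX.
# Hypothetical helper; PREFIX is the directory that Perl build was installed into.
# Note: plain POSIX sh has no local variables, so these names are global.
cpan_for() {
    [ "$#" -ge 1 ] || { echo "usage: cpan_for PREFIX [args...]" >&2; return 2; }
    prefix=${1%/}
    shift
    cpan="$prefix/bin/cpan"
    if [ ! -x "$cpan" ]; then
        echo "no cpan found under $prefix/bin" >&2
        return 1
    fi
    # The first line of the cpan script names the perl it will run under.
    shebang=$(head -n 1 "$cpan")
    case $shebang in
        "#!$prefix/bin/perl"*) ;;
        *)
            echo "$cpan does not use $prefix/bin/perl ($shebang)" >&2
            return 1
            ;;
    esac
    "$cpan" "$@"
}

# Example (the prefix is an assumption, adjust to your site layout):
#   cpan_for /opt/perl/5.36.0 Text::CSV_XS
```

Because each call names the prefix explicitly, the result does not depend on which perl happens to be first in PATH at the time.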

Related

Bundling up a perl script with its dependencies?

I have a perl script that I've put together to do some monitoring and graphing.
It works nicely on my dev system, where I have carte blanche to install my own modules from CPAN.
What I'm looking at doing is bundling it up to deploy onto another system. But here's the catch - this other system is 'standalone' and has no network connection. (And I have change control paperwork to fill in, indicating what I'm installing).
As a result, I'd really like a nice easy way to figure out:
- What modules my scripts are making use of. (Including dependencies)
- how to easily grab them (cpan get probably)
- Is there an easy way to tell what external binaries I'm using? (I'm using for sure ssh and rrdtool - the former is definitely installed, the latter probably not).
I have a few thoughts on how to do this, but it strikes me as something that should be smoother.
I may also need to deploy a new perl, so I'm pondering whether I'm better off 'installing' the modules with system perl (probably 5.8.8 on RHEL5), or just 'packaging' the whole thing in a directory of its own with a standalone perl instance.
Use pp to package your script and all dependent modules and libraries into a standalone executable.
pp -x yourscript.pl -o outputfilename
See the documentation for examples of how to link to external shared objects (etc) if required. With pp you don't need perl on the target system where outputfilename will run.
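For the first item in the question (finding out which modules a script uses), a rough first pass can be done with grep before reaching for a packager. The sketch below is a heuristic only, and the function name is hypothetical: it lists names from plain use/require statements, and it will miss modules that are loaded dynamically or pulled in by other modules.

```shell
#!/bin/sh
# list_used_modules FILE...
# Print the module names that appear in plain `use`/`require` statements.
# Pragmas such as `use strict` and version lines such as `use 5.010`
# are skipped because they do not start with an upper-case letter.
list_used_modules() {
    grep -hE '^[[:space:]]*(use|require)[[:space:]]+[A-Z][A-Za-z0-9_:]*' "$@" |
        sed -E 's/^[[:space:]]*(use|require)[[:space:]]+([A-Za-z0-9_:]+).*/\2/' |
        sort -u
}

# Example (script name is an assumption):
#   list_used_modules yourscript.pl
```

The output is a starting list for the change control paperwork, not a complete dependency tree.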
Revisiting this, as the need hasn't really gone away. I have moved towards using docker - this is an 'image' and 'container' system for app deployment, which amongst other things, allows you to 'package' an application.
You create a Dockerfile - which is analogous to a Makefile - that runs through the steps to install perl + dependencies (either via a package manager, or from CPAN).
Once it has, you have a self contained, runnable 'image' that you can clone and create an instance of (a "container" in docker parlance).
It's also quite useful - even if you don't then deploy via container - to figure out what the dependencies of this application/packages were. The version in the container has everything locally installed that it needed, because it was a clean build.
When you have a system where you can't control the Perl installation (and the install is a really, really old version of Perl like 5.8.8 which is missing many nice improvements like state variables, autodie, say, and switch), you should look into Perlbrew.
Perlbrew allows you to install a user version of Perl. (In fact, it allows you to install multiple versions of Perl), and allows you to switch between the Perlbrew install and the officially installed version. It makes doing everything in Perl much, much easier.
You will have freer access to install new Perl modules, and you can do that task yourself rather than wait for your IT department to do it for you.
I ended up using it on one of our systems where the primitive version of Perl just wasn't doing what my version of Perl did. I originally asked our IT to upgrade, but they really messed up the upgrade. After going back and forth, I simply asked if I could install Perlbrew.
Which is an important point. Always ask permission. A lot of the time, the IT department is more than happy to oblige. They're not Perl people, and CPAN is a world they don't want to deal with. Being able to get out of having to answer your beck and call about installing this or that Perl module is a great relief.

Which cpan installer is the right one? (CPAN.pm/CPANPLUS/cpanminus)

There are multiple installers for CPAN modules available; I know of at least CPAN.pm (comes with perl), CPANPLUS, and cpanminus.
What is the difference between the three?
What situations call for using one over the other?
Are there other module installers I should know about?
CPAN.pm (cpan) is the original client. It comes with Perl, so you already have it. It has the most features. It has a lot of configuration options to customize the way it works, though virtually everyone accepts the default installation. It integrates easily with local::lib.
cpanminus (cpanm) is an attempt to make a zero-configuration client that automatically does the right thing for most users. It's also designed to run well on systems with limited resources (e.g. a VPS). It doesn't come with Perl, but it's easy to install. It integrates easily with local::lib.
Its biggest limitation is its lack of configuration. If you want to do something unusual, it may not support it.
CPANPLUS (cpanp) is an attempt to make a CPAN API that Perl programs can use, instead of an app that you use from the command line. The cpanp shell is more of a proof-of-concept, and I don't know of any real advantages to using it.
In summary, I'd recommend either cpan or cpanm. If you have trouble configuring cpan, try cpanm. If your situation is unusual, try cpan.
It's impossible to answer this question because it is too subjective. :)
From my point of view: cpanm is the simplest way to install perl modules. You can install cpanm with:
curl -L http://cpanmin.us | perl - --sudo App::cpanminus
and after that you can install modules simply with:
cpanm Some::Module
You can use cpanm for mirroring (part of) CPAN to your local machine too, so IMHO cpanm is the best for the most common CPAN needs.
Are there other module installers I should know about?
If you're using a Linux distribution that packages CPAN modules, then it's worth using their package installation program to install modules. For example, Ubuntu/Debian have a huge number of CPAN modules that you can install using 'apt' and Red Hat/Centos/Fedora have a number that you can install using 'yum'.
CPAN is the standard. cpanminus (cpanm) asks fewer questions (best most of the time). I don't know anyone that uses cpanplus.
Since what these tools do is download, compile and install (place files in the correct places), they should all do the same task. Some of the difference has to do with the permission level you have. If you want to install some things locally for your user and others globally, you need finer control. Developers may also need to control or interrupt the process for debugging, etc.
For daily use, use cpanm, unless you are too lazy to install it, then CPAN is fine.
cpanm uses much less memory. This makes it a better choice for environments where RAM is limited, such as shared hosting servers, where regular cpan might die before completing the installation task because it tries to use more memory than is available.
According to cpanm's (1.7044) documentation: "When running, it requires only 10MB of RAM".

How do I use perlbrew to manage perl installations aimed at web applications?

I have been using perlbrew to manage multiple versions of perl on a Linux Fedora notebook. I have used it with great benefit to run command-line scripts mostly using App::cmd.
I now want to move to running web applications written using CGI::Application using different perls installed in my $HOME. I am familiar with running Perl web applications in $HOMEs using Apache's user_dir or creating Virtual Hosts but I am unable to come up with a clean way of integrating this and the perlbrew managed perls. Specifically I need help in understanding and finding answers to these questions:
How do I install mod_perl under perlbrew?
Assuming this is done, how do I configure my VirtualHost so that it picks up the correct perl that is current?
If this is not possible, (which I doubt) can I at least use local installations to run vanilla CGI?
Thank you for your attention.
I don't think this is a good use for perlbrew, which moves around symlinks under its own directory. The trick is switching the mod_perl module around. Remember, mod_perl is going to be binary-incompatible between major versions of perl, and that you will have to compile it against apache for each version of perl (and apache) you want to use.
perlbrew really does two big things for you:
Installs perl, which is trivially easy to do on your own.
Switches around symlinks so perl is whatever version you want.
If you give up on that last one, perlbrew isn't really doing that much for you. I don't think the symlink feature is particularly valuable.
I think perlbrew is fine for what it is, but when you start doing things outside of its limited scope, it's time to not use it. It's supposed to be a tool to save you some time and headache, so if it's not accomplishing that goal, it's not the right tool for your situation.
In this situation, where I'm supporting a single, big web application, I give it its own perl installation that I don't let anything else use.
For your other questions:
You shouldn't have to configure any VirtualHost stuff. If you are using mod_perl, perl is already in there and you don't get to choose a perl. If you're using CGI stuff, you specify the perl on the shebang line. You will have to ensure apache picks up the right library directories, but I think perlbrew handles that. You might have to use SetEnv or something similar in your httpd.conf.
For vanilla CGI, just point to the right (symlink) path for whatever the default perlbrew version is. The CGI program will just use whatever perl that path points to.
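For vanilla CGI, pinning a script to one Perl build comes down to its first line. As a small sketch (the function name and the paths are hypothetical), the helper below rewrites the shebang of an existing script so that it runs under a chosen perl. Any switches on the old shebang line, such as -T, are dropped, so re-add them by hand if the script needs them.

```shell
#!/bin/sh
# set_perl_shebang PERL SCRIPT
# Replace (or add) the first-line shebang of SCRIPT so it runs under PERL.
# The file is rewritten in place, which keeps its permissions.
set_perl_shebang() {
    perl_path=$1
    script=$2
    shebang_tmp=$(mktemp) || return 1
    {
        printf '#!%s\n' "$perl_path"
        # Drop the old shebang if there is one; keep everything else.
        sed '1{/^#!/d;}' "$script"
    } > "$shebang_tmp" && cat "$shebang_tmp" > "$script"
    status=$?
    rm -f "$shebang_tmp"
    return $status
}

# Example (both paths are assumptions):
#   set_perl_shebang "$HOME/perl5/perlbrew/perls/perl-5.16.3/bin/perl" app.cgi
```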
See brian d foy's answer for why not to expect to use perlbrew to switch between versions of mod_perl. I also expect that you will need to run multiple Apache servers, if you need multiple different Perl versions under mod_perl.
However, using perlbrew as an easy way to build Perl is IMHO a valid thing to do, and there are few instructions available for how to run mod_perl under perlbrew.
First ensure perl is built with shared library support, by passing the -Duseshrplib flag (otherwise on 64-bit systems you will get a confusing build failure about -fPIC):
perlbrew install perl-5.16.3 -Duseshrplib
Install the development Apache libraries for your system. On Debian, this differs depending on the Apache MPM that you are using. For the prefork MPM:
sudo apt-get install apache2-prefork-dev
Or for the worker MPM:
sudo apt-get install apache2-threaded-dev
Then you need some options to build and install mod_perl2 into the right place. Note that this means cpanm will fail to build it, but you could use it to get hold of the source:
cpanm mod_perl2 # fails
cd ~/.cpanm/latest-build/mod_perl-2.0.8/ # adjust mod_perl version
Adjust the version of Perl below accordingly. (The MP_APXS option is to give the right path for Debian-based systems, which you might not need.)
perl Makefile.PL MP_APXS=/usr/bin/apxs2 \
MP_AP_DESTDIR=$HOME/perl5/perlbrew/perls/perl-5.16.3/
make
make install
Finally, change the LoadModule line in your Apache configuration file (adjusting paths accordingly):
LoadModule perl_module <your homedir>/perl5/perlbrew/perls/<your perl>/usr/lib/apache2/modules/mod_perl.so
Your mod_perl installation will now be running the version of Perl that you want. Install all your required CPAN modules and get going.

How can I install multiple Perl versions without them tripping over each other's XS modules?

I would like to install several different versions of perl in my home directory. I tried using App::perlbrew, but XS modules from one version were causing segfaults in the other version. Is there any way to install multiple versions of perl and have them automatically keep their XS modules separate?
You can install each perl completely separate from any other perl installation. Its binaries and modules will be completely separate from each other. Essentially, when you install each perl you give it its own prefix:
$ ./Configure -des -Dprefix=/usr/local/perls/perl-5.12.1
Everything is installed under that prefix, and all of the programs in the bin/ will use that particular perl. I go into this in more depth in Effective Perl Programming.
From there, I make symlinks in my ~/bin to each of those programs and attach the version number to it, so I have ~/perl5.12.1, perldoc5.12.1, and so on. I don't ever have to choose to have a version in the way that perlbrew wants you to. I write more about this in "Make links to per-version tools" in the Effective Perler blog.
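The symlink step could be scripted roughly as follows. This is a sketch under assumptions, not the author's own script: the function name, the argument order and the example paths are all hypothetical.

```shell
#!/bin/sh
# link_versioned PREFIX VERSION DEST
# For every program in PREFIX/bin, create DEST/<name><VERSION> as a symlink,
# e.g. DEST/perldoc5.12.1 -> PREFIX/bin/perldoc.
link_versioned() {
    prefix=${1%/}
    version=$2
    dest=$3
    mkdir -p "$dest" || return 1
    for tool in "$prefix"/bin/*; do
        # Only link regular executable files.
        [ -f "$tool" ] && [ -x "$tool" ] || continue
        name=$(basename "$tool")
        case $name in
            *"$version") continue ;;   # already carries the version, e.g. perl5.12.1
        esac
        ln -sf "$tool" "$dest/$name$version"
    done
}

# Example (paths are assumptions):
#   link_versioned /usr/local/perls/perl-5.12.1 5.12.1 "$HOME/bin"
```

Running it once per installed prefix gives every build its own set of versioned command names, with no notion of a single active perl.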
You might be able to use local::lib for this, but it's really designed for you to work with one version of Perl and use one personal library directory. You can tell it to use another directory, but at that point it's really not saving you anything over the traditional way.

How can I install a specific version of a set of Perl modules?

I'm tasked with replicating a production environment to create many test/sit environments.
One of the things I need to do is build up Perl, with all the modules which have been installed (including internal and external modules) over the years. I could just use CPAN.pm autobundle, but this will result in the test environment having much newer versions of the external modules than production has.
What is the easiest/best way to get and install (a lot of) version-specific Perl modules?
bdfoy has the best large scale solution, but if you just want to install a few modules you can ask the CPAN shell to install a specific distribution by referencing a path to a tarball (relative to the top of the CPAN tree).
cpan> install MSCHWERN/Test-Simple-0.62.tar.gz
Throw a URL to BackPAN into your URL list and you can install any older version.
cpan> o conf urllist push http://backpan.perl.org/
This is in the CPAN.pm FAQ under "how do I install a 'DEVELOPER RELEASE' of a module?"
cpan install App::cpanminus
cpanm Your::Module@1.23
(Carton, as referenced in other answers, uses cpanm underneath to resolve explicit version requirements.)
Make your own CPAN mirror with exactly what you want. Stratopan.com, a service, and Pinto, the tool that it's built on top of, can help you do that.
The CPAN tools only install the latest version of any distribution because PAUSE only indexes the latest version. However, you can create your own, private CPAN that has exactly the distributions that you want. Once you have your own CPAN mirror with only what you want, you point your CPAN tools at only that mirror so it only installs those versions. More on that in a minute.
Now, you want to have several versions of that. You can create as many mirrors as you like, and you can also put the mirrors in source control so you can check out any version of the mirror that you like.
Tools such as CPAN::Mini::Inject can help you set up your own CPAN. Check out my talks on Slideshare for the basic examples, and some of my videos on Vimeo for some of the demonstrations. Look at anything that has "CPAN" or "BackPAN" in the title. I think I might have some stuff about it in The Perl Review too, or should by the next issue. :)
Lately, I've been working on a program called dpan (for DarkPAN) that can look at random directories, find Perl distributions in them, and create the structure and index files that you need. You run dpan, you get a URL to point your CPAN client toward, and off you go. It's part of my MyCPAN-Indexer project, which is in Github. It's not quite ready for unsupervised public use because I mostly work with corporate clients to customize their setup. If you're interested in that, feel free to ask me questions though.
Also, I recently released CPAN::PackageDetails that can help you build the right index file. It's still a bit young too, but again, if you need something special, just ask.
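Once such a private mirror exists, pointing a client at only that mirror might look like the sketch below. It assumes cpanm is the client in use (the answer above does not name one), and the function name and the mirror location are hypothetical. The --mirror and --mirror-only options tell cpanm to take both the index and the tarballs from the given mirror.

```shell
#!/bin/sh
# install_from_mirror MIRROR MODULE...
# Install modules with cpanm, consulting only MIRROR for both the package
# index and the distribution tarballs.
install_from_mirror() {
    if [ "$#" -lt 2 ]; then
        echo "usage: install_from_mirror MIRROR MODULE..." >&2
        return 2
    fi
    mirror=$1
    shift
    cpanm --mirror "$mirror" --mirror-only "$@"
}

# Example (the mirror location is an assumption):
#   install_from_mirror file:///srv/minicpan Test::Simple
```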
[It's almost five years on and this is a well-asked and well-answered question that has had a lot of views. Since this page must still come up in Google searches, an update can't hurt.]
Carton is worth mentioning here. Carton is a relatively recent tool in the same style as App::cpanminus, App::cpanoutdated, perlbrew, et al. The author (Miyagawa) calls it "alpha" quality, but even in its current state carton helps simplify the maintenance of multiple environments of version-tuned modules across machines.
Pinto too is another recent tool relevant to some of the responses (in fact one of the respondents is a contributor).
Stratopan.com is another alternative. Stratopan provides private CPANs in the cloud. You can fill your Stratopan repository with specific versions of modules (and their dependencies) and then install them using the standard Perl tool chain. The repository changes only when you decide to change it, so you'll always get the versions of the modules that you want.
Disclaimer: I operate Stratopan.
It seems that creating a cpanfile listing all your modules and desired versions (using the == <version> syntax to lock it to a specific release) could serve well here, too. That would mean using Carton or cpanm for installing the modules.
Doing this would have the benefit of being able to quickly/easily tweak the file to test upgrading specific modules in a dev or staging environment - something that a private CPAN mirror wouldn't let you do (without creating multiple mirrors).
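As a sketch of what that could look like, the helper below writes a cpanfile that pins each module with the == syntax. The function name is hypothetical and the module versions in the example are placeholders taken from elsewhere on this page, not a recommendation.

```shell
#!/bin/sh
# write_cpanfile FILE "Module version"...
# Write a cpanfile that pins each module to exactly the given version.
write_cpanfile() {
    [ "$#" -ge 1 ] || { echo "usage: write_cpanfile FILE 'Module version'..." >&2; return 2; }
    out=$1
    shift
    : > "$out"
    for pin in "$@"; do
        module=${pin%% *}    # text before the first space
        version=${pin##* }   # text after the last space
        printf "requires '%s', '== %s';\n" "$module" "$version" >> "$out"
    done
}

# Example (versions are placeholders):
#   write_cpanfile cpanfile "Test::Simple 0.62" "Text::CSV_XS 1.49"
#   cpanm --installdeps .     # or: carton install
```

Keeping the pins as plain arguments makes it easy to change one version in a dev or staging environment and regenerate the file.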