I'm working on building a docker image to be able to run all of our Perl applications. The applications require hundreds of CPAN modules to be installed. The full build of the docker image takes about an hour to complete.
After doing the initial image, I'm not sure how best to handle ongoing updates.
We could keep a single Dockerfile in git, modify it as required, and push new builds up to Docker Hub. However, if the person doing the build doesn't have all of the intermediate images, then adding a single CPAN module could be an extremely tedious process, and it might take an hour before they even know whether the new module installs correctly. It would also download every CPAN module again, which seems a bit risky, as a newer release of one of them might contain a breaking change.
Alternatively, the person doing the build could pull the latest Docker Hub image, install the CPAN module interactively, commit the result and push the new image to Docker Hub. However, then we only have our Docker Hub images, but no master Dockerfile.
Or, another option would be to create a Dockerfile for each new build which references the previous Docker Hub image. This seems overly complicated, though.
Option 1) seems wrong. I'm fairly sure we don't want to be rebuilding the entire image from the base OS just to install one additional module. However, being dependent on images without Dockerfiles seems risky as well.
You could use the standard module installer for your underlying OS in your Docker image.
For example, if it's Red Hat, then use yum, and only use CPAN when the modules are not available as packages:
FROM centos:centos7
RUN yum -y install gcc make perl perl-App-cpanminus perl-Config-Tiny && yum clean all
# cpanm takes module names directly; clean up its work directory, and let a failed install fail the build
RUN cpanm Some::Module && rm -fr /root/.cpanm
taken from here and modified
I would try to have a base image which the actual applications then build on.
I would also avoid doing things interactively (i.e. script it in a Dockerfile), as you want to be able to repeat the build when upstream dependencies change, which Docker Hub automated builds can do for you.
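As a hedged sketch of that split (the image name, tag and module lists are examples, not anything from your setup), the slow CPAN install lives in the base image:
FROM centos:centos7
RUN yum -y install gcc make perl perl-App-cpanminus && yum clean all
# the hour-long install of hundreds of CPAN modules happens once, here
RUN cpanm Moose DBI Plack && rm -fr /root/.cpanm
and each application image only rebuilds its own small delta on top of it:
FROM myorg/perl-base:2016.01
# adding one more module no longer rebuilds the whole stack
RUN cpanm Some::New::Module && rm -fr /root/.cpanm
COPY . /app
CMD ["plackup", "/app/app.psgi"]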
EDIT
You can convert Perl modules into your own Debian packages using dh-make-perl.
You can load these into your own Ubuntu repo using reprepro, or a paid solution such as Artifactory.
These can then be installed using apt-get when you use your repo as a source from within a Dockerfile.
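A rough sketch of that flow, with example paths, an example codename and an example module (none of these come from the question; repo signing is omitted for brevity):
# on a build host: turn a CPAN module into a .deb
dh-make-perl --build --cpan Config::Tiny
# publish it into a reprepro-managed repository
reprepro -b /srv/apt includedeb trusty libconfig-tiny-perl_*.deb
Inside the Dockerfile, the repo is then added as a source and the module installed like any other package:
RUN echo "deb http://apt.example.com/ trusty main" > /etc/apt/sources.list.d/internal.list && apt-get update && apt-get install -y libconfig-tiny-perl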
When I have tried a similar thing before, there were a few problems:
Your apps don't work with the latest version of modules
There are far more dependencies than you expected
Some modules won't package
The benefits are:
You keep the build tools (gcc, etc) off the app servers
You know much more about your dependencies
Because I have a long series of comments with #ikegami, I am cleaning up the question in the hope that it will be more understandable. Unfortunately, English isn't my "main" language. :(
Let's say we have an environment where:
no development tools are installed (no make, nor gcc or the like)
perl is installed with its core packages, nothing more
no outgoing network access is allowed - e.g. the user can't use curl nor cpan to download/install Perl dependencies
the user doesn't even have admin (root) rights
but we want to install and evaluate some Perl-based web app; let's call it MyApp.
MyApp:
doesn't use any XS-based modules (at least, I hope - in development I am using plenv and cpanm, so I never checked the installed dependencies in depth)
is a pure PSGI app; a simple plackup app.psgi works OK
uses some data files which should be included in the "deployment".
The main question is: how to prepare MyApp, and all the CPAN modules it uses, so that they can be easily installed in such a restricted environment?
The goal is:
I don't need to save my own effort and time,
but I want to save the user's time and minimize the needed actions on their side, so the installation (deployment) should be as simple as possible.
E.g. how to get a running web app onto the user's machine with the minimum possible steps (on their side).
- the simplest thing could be something like:
- copy one file (zip, or tarball)
- unpack it
- from the terminal, execute some run.pl in the unpacked directory.
To get the above simple installation, my idea was the following:
1.) Create a tarball which, after unpacking, will contain 3 folders and 1 Perl script, let's say:
myapp_repo/
myapp_repo/distlib #will contain all MyApp's perl modules and also ALL used CPAN modules and their dependencies
myapp_repo/datafiles #will contain app-specific data files and such
myapp_repo/install.pl
myapp_repo/lib #will contain modules directly used by the `install.pl`
2.) I will develop an install.pl script, and it will be used as the installer tool, like
perl install.pl new /path/to/app_root
and it will (should):
create all the needed directories under /path/to/app_root (especially the lib where it will install the Perl modules)
call a "local" cpanm internally (from the myapp_repo/lib) to install the app's Perl modules and their CPAN dependencies, using only distribution files from the distlib
generate and install the needed runtime script and the app.psgi into /path/to/app_root/bin
install the needed data files for the app.
3.) So, after this, the user should be able to simply run:
/path/to/app_root/bin/plackup /path/to/app_root/bin/app.psgi
In short, the user should use:
the system-wide perl and the system-wide Perl core modules,
and everything else -
the runtime Perl scripts (like plackup)
and the required CPAN modules
- should be installed into a self-contained directory tree, using only local files (no net access).
E.g. the install.pl should internally call cpanm to achieve the equivalent of the following cpanm command:
cpanm --mirror file://path/to/myapp_repo/distlib --mirror-only My::App
which should install My::App and all its dependencies, without network access, using only the files from myapp_repo/distlib.
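A minimal sketch of what that internal call could look like; it assumes a copy of the self-contained cpanm script is shipped inside myapp_repo (that copy is my addition, not part of the layout above):
# run the bundled cpanm against the bundled mirror, installing into a contained tree
perl /path/to/myapp_repo/cpanm \
    --mirror file:///path/to/myapp_repo/distlib --mirror-only \
    --local-lib-contained /path/to/app_root --notest My::App
With --local-lib-contained, the modules end up under /path/to/app_root/lib/perl5 and scripts like plackup under /path/to/app_root/bin, which matches step 3.).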
Some questions:
Is it possible to use cpanm (called as a locally installed module) without make?
For creating the myapp_repo/distlib, I am thinking about using Pinto. Is it the right tool to achieve the above?
Have I forgotten something? Or, in other words:
Is the above a viable (read: working) approach?
Are there any other tools which I could/should use to simplify the creation of such a distribution tarball?
#ikegami suggests the following method:
- "install everything" into one fresh directory on my machine
- transfer this self-contained directory to the target machine
It sounds very good, because this directory could contain all the needed app-specific data files too; unfortunately, I don't understand the details of how his solution should be done.
The FatPacked solution looks interesting too - I need to learn about it.
Don't write your own make or installer. Just copy make from a different machine (which is basically what apt/yum/etc do anyway, and which you'd have to do even if you wrote your own). You'd be able to use cpan in 5 minutes!
Also, that should allow you to install gcc if you need it (e.g. to install an XS module), although it doesn't sound like you do. If you do install gcc, I'd install my own perl to avoid having to deal with PERL5LIB.
Tools such as minicpan will allow you to install any module from CPAN without internet access. Of course, you can keep using the command you are already using, if it mirrors the packages you need.
The above explains how to simply and quickly setup a machine so it can use cpan and thus install any module easily.
If you just want to install a specific module and its dependencies, you can completely avoid using cpan on the target machine. First, you need a fresh install of Perl (preferably of the same version as the one on the target system). Then, simply install the module into a fresh dir on your machine, and transfer that dir to the target machine. That's it; nothing else needs to be done. This even works for XS modules if the two machines are similar enough.
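A minimal sketch of that approach, assuming cpanm is available on your own machine and using example paths (My::App stands for your app's dist, or a local tarball of it; Plack is listed explicitly so plackup gets installed):
# on your machine: install the app and all its dependencies into one fresh, self-contained dir
cpanm --local-lib-contained /tmp/myapp-deps --notest Plack My::App
tar czf myapp-deps.tar.gz -C /tmp myapp-deps
# on the target machine: unpack it and point perl at it
tar xzf myapp-deps.tar.gz -C /opt
PERL5LIB=/opt/myapp-deps/lib/perl5 perl /opt/myapp-deps/bin/plackup app.psgi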
This is what ppm (ActiveState's Perl package manager) does.
Unfortunately, while this solution is almost as simple as the one above, it's not nearly as flexible, it doesn't run the test suite of the modules being installed, etc. It does have the advantage of not requiring the transfer of any binary (if you're not installing any XS modules).
I have a hard time understanding where the right place is to put code that will install the needed packages for a given Docker container managed by dokku.
We have a Scala application and, unfortunately, we need one shell call that is dependent on the environment. I would like to install the given package for the given container using "apt-get install". Right now I am using a custom plugin with a file named "post-release-build". However, I don't have permission to install anything in that phase.
Basically, the script that should be invoked looks like this (based on a Dockerfile that is available online):
apt-get update
apt-get install -y build-essential xorg libssl-dev libxrender-dev wget gdebi
wget http://download.gna.org/wkhtmltopdf/0.12/0.12.2.1/wkhtmltox-0.12.2.1_linux-trusty-amd64.deb
gdebi -n wkhtmltox-0.12.2.1_linux-trusty-amd64.deb
echo "-----> wkhtmltox installed!"
Is there a way how to make it work? I would also prefer to have such a file somewhere in the application so I don't need to setup environment before pushing the app (in the future).
EDIT:
I have found a plugin that should be capable of installing packages using apt-get (https://github.com/F4-Group/dokku-apt); however, I am a little bit unlucky, because the package it downloads does not work properly.
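For reference, if I recall that plugin's convention correctly (the file name and its role are my assumption; verify against the plugin's README), packages are listed one per line in an apt-packages file committed at the root of the app repo, mirroring the script above:
build-essential
xorg
libssl-dev
libxrender-dev
wget
gdebi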
Since just downloading with apt-get fetches a package that fails, I investigated dokku more deeply and came up with a new plugin that should install the package for you.
I have created a script, documented how to use it, and licensed it under the MIT license, so feel free to use it. Hopefully it will save you the time I had to spend figuring out what was going on.
URL: https://github.com/mbriskar/dokku-wkhtmltopdf
I'm working on a Perl app that's intended to be deployed using Module::Build. I've needed to install a number of modules through CPAN because they weren't available through Ubuntu's package manager - or, more correctly, the internal apt-get mirror all our servers use. While this is all well and good on our development server, IT is (understandably) reluctant to run code on production machines that isn't cached or otherwise controlled in-house.
As we don't currently have a CPAN mirror, this basically means that I need to get all of these non-Ubuntu modules into one place so they can be archived and/or committed to version control. The ideal solution would be to check the utility out from source control, change a couple config variables for databases and such, maybe run a build/install command, and be done. Fortunately, the development server is a clone of the production server, so modules using XS or other architecture-specific features shouldn't cause an issue.
I think the cleanest way to handle this would be checking in source tarballs for the modules I need and setting Module::Build to use those to resolve its dependencies instead of looking to CPAN, but I don't see an option for that. Is this something that's doable, or is there another way to round up all the modules I need for an essentially offline deployment?
As mentioned in the comments above, Pinto may suit your needs as it creates your own CPAN repo.
Pinto has two primary goals. First, Pinto seeks to address the problem
of instability in the CPAN mirrors. Distribution archives are
constantly added and removed from the CPAN, so if you use it to build
a system or application, you may not get the same result twice.
Second, Pinto seeks to encourage developers to use the CPAN toolchain
for building, testing, and dependency management of their own local
software, even if they never plan to release it to the CPAN.
Pinto accomplishes these goals by providing tools for creating and
managing your own custom repositories of distribution archives. These
repositories can contain any distribution archives you like, and can
be used with the standard CPAN toolchain. The tools also support
various operations that enable you to deal with common problems that
arise during the development process.
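A rough sketch of getting started with it (the repository location and module name are examples; the exact stack path used for the mirror URL may differ between Pinto versions, so treat it as an assumption):
# create a local Pinto repository and pull the dists you need (plus their dependencies) into it
pinto --root ~/pinto init
pinto --root ~/pinto pull Some::Module
# later, install against that repository instead of the live CPAN
cpanm --mirror file://$HOME/pinto/stacks/master --mirror-only Some::Module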
Second Answer
Alternatively, if you are only going to deploy to Ubuntu, you can turn CPAN modules - and your own - into Debian packages with dh-make-perl. You can then host them in your own repo with reprepro. The beauty of this is that you can update the packages and do a
apt-get update
apt-get upgrade
on the client machines, so long as they have your own repo as a source.
Check out Stratopan and Carton.
Also see:
Deploying Perl Application
How can I manage Perl module dependencies?
I'm not sure how common it is, but I've been using perlbrew and Pinto together to solve some of the issues you are talking about.
With perlbrew, I'm not interacting with the "system" perl. There is an application perl and a system perl, and there is no risk of me installing a later version of a module that somehow interferes with something the system perl was doing.
With Pinto, I have archived versions of the CPAN modules that I know will work.
When I deploy, I build a perlbrew perl (with an alias like "prod" or something) and then I install all the necessary modules into that perlbrew perl using the Pinto repository. I currently facilitate this with a cpan bundle module (that module also goes into the Pinto repo) so you can just install the bundle from the repository and it automatically puts all your dependencies in.
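A hedged sketch of that deploy flow (the Perl version, the "prod" alias, the repository path and the bundle name are all examples):
# build a dedicated application perl under perlbrew and give it an alias
perlbrew install --as prod perl-5.20.1
perlbrew install-cpanm
# install the bundle (and thus every dependency) from the Pinto repository only
perlbrew exec --with prod cpanm \
    --mirror file:///srv/pinto/stacks/master --mirror-only Bundle::MyApp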
I have a Linux server that has no access to the internet (access is prevented by a firewall). I would like to install a new Perl. What are my options and what is the best way to do this? The system Perl (included in OS installation) must remain unchanged.
I have been using perlbrew and I think it is the best way to do an online installation. But all the steps involved in perlbrew seem to require internet access: you download it from the net, it downloads new Perl versions from the net, etc., and I haven't found a clue as to how to make it work offline.
If perlbrew is out of the question, I could build Perl from source into a custom location on the server. I assume that this could end up being complicated, time-consuming and error-prone. And every time I update Perl I would have to make a new build manually.
There may also be other ways to install that I'm not currently aware of. And of course I could stick with the system Perl, but it is an outdated version and I'm already using the new syntax features. Or I could start negotiations to change the firewall policy to allow internet access for perlbrew.
But all the steps involved in perlbrew seem to require internet access
Not if properly configured.
To install perlbrew itself off-line, install the App-perlbrew dist. Following its dependencies manually is a chore, so instead prepare a MiniCPAN mirror (with -p to include Perl dists), take it over to the target machine and configure CPAN to use the local mirror. Run cpan App::perlbrew to install.
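As a rough sketch of that bootstrap (paths are examples; -p is the flag mentioned above for including Perl dists):
# on a machine with internet access: build the mirror, e.g. onto removable media
minicpan -p -l /media/usb/minicpan -r http://www.cpan.org/
# copy the mirror to the target machine (say, to /srv/minicpan), then point CPAN at it:
cpan
cpan> o conf urllist push file:///srv/minicpan
cpan> o conf commit
cpan> install App::perlbrew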
After perlbrew is installed, run its mirror command to configure a CPAN mirror into $PERLBREWROOT/Config.pm. Edit this file to change it to the local MiniCPAN mirror. Drop Perl dist tarballs into $PERLBREWROOT/dists/.
Be aware that compiling Perl requires a working C compiler toolchain, and optionally the development files for libdb (BerkeleyDB) and gdbm. (Read the INSTALL file once over, even though perlbrew's autoconfiguration and Perl's configure.SH defaults hide these details from you.)
The compiler toolchain is probably much more difficult to procure off-line, unless the OS installation has already been used before for compiling other C stuff.
There's nothing that special about perlbrew. If you aren't going to use it to download the Perl sources, it's not saving you that much. Once you have the Perl sources, you just need to configure and install it:
% ./Configure -des -Dprefix=/path/to/installation
% make install
Once done, everything for that Perl is under that installation path.
I dislike perlbrew mostly because it hides from people how amazingly simple this task is so they feel like they can't do it on their own.
Have you considered attacking it from a different direction? Keeping this up-to-date is going to be a pain if you have to request internet access each time. Likewise, if you've missed out/misconfigured any packages in your CPAN mirror it's difficult to correct once you're actually trying to use them.
Perhaps just build a small VM with a cut-down linux + perl + modules. Keep that up-to-date at your end and just take the whole lot in on a USB stick. You'd have a known-working easy-to-setup installation.
What I personally do is use a git checkout when I'm offline (and not on vacation). Once you have the whole git working directory, it's trivial to build any released version by checking out its tag:
git checkout v5.17.4
git clean -f # cleanup previously compiled .o files etc
sh ./Configure ...
Depending on how you can transfer files to your host, this can be handy, since you can also set up a private git repo there so other computers can git push new commits to it.
After testing my Catalyst application and deciding to deploy it, I would like to package it up so I can easily pull it in on the staging and live servers, manage dependencies, and easily roll back via the flexibility of package versioning. As my production OS is Ubuntu, I figured packaging it as a deb package would make the most sense.
I am predicting I will have to create a second package of all my Perl module dependencies, as many are not provided by my distribution - or package them independently, though that may be a lot of work.
Does anyone have any experience of doing this - or a sane, similar alternative?
To build your own Debian packages out of CPAN packages:
Install Debian helper scripts
sudo apt-get install dh-make-perl
Download MODULE from CPAN and build Debian package
cpan2deb MODULE
dh-make-perl is actually the right tool to put CPAN modules into Debian packages. Together with apt-file it can even prepare proper dependencies for you.
Being able to "easily roll back", though, requires special attention to versioning or workflows. There are several approaches that might get the job done here:
If you can force-downgrade packages, you have already won most of the time - unless you have very specific maintainer scripts that do work on package upgrades, in which case you will have to make them able to handle the downgrade too (see the sketch after this list).
If you have to go the regular upgrade path, approaches like versioning the rollback as "<newversion>+rollback<oldversion>" or similar might be something to consider.
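A hedged example of what the force-downgrade route could look like (the package name and version are placeholders; newer apt releases may additionally need --allow-downgrades):
# pin an older version explicitly from the repo
sudo apt-get install myapp=1.2.0-1
# or install the older .deb directly, if you still have it
sudo dpkg -i myapp_1.2.0-1_amd64.deb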
Dependency packages are always a good idea for deployments, to make sure no required package is actually missing. Also, you might want to invest some time in management frameworks like Puppet; they might come in handy here, too.