Meta-configuration tool - configuration-files

We design and deliver bespoke stand-alone systems that consist of a variable number of software modules distributed over a variable number of machines. One example system consists of circa 140 software modules distributed over approximately 60 machines.
Each of the software modules requires some kind of configuration file that would include things like database access URLs, server IP addresses ad so on. Many of these parameters are interdependent; for example, if module "A" is deployed on machine "X" then module "B" needs the IP address for machine "X" in it's config file.
Before we hack something together in Python or some other scripting language, are there any existing tools that do this kind of thing? I'm thinking something like transforming a system description into a bunch of configuration files.

Apparently my google-fu has returned today, and I've found these:
Cfengine
Chef
Puppet
SmartFrog
And a wiki page with similar tools here.
All of these tools do far more than I had originally considered (for example, deployment), but this might be a useful thing for us anyway.

Related

A meta-build solution managing multiple projects

I'm looking for an advice about the software that did not catch my eye during a few days of searches.
Is there any software solution for deployment of multiple projects at once that operates at the level of source-building package manager (like ports, portages, or nix), but could be resided locally?
As for details, we have few loosely connected software projects with the following traits:
Projects are written mostly on C/C ++ and Python, but more languages are represented as a minor mixture (Haskell, Rust, Perl)
Projects are grouped together within git superprojects with certain build/deployment presets carefully tuned for particular environments (and adapted for particular purposes by subset of boolean-like options)
We have already done with well-elaborated CMake build scripts for C++ projects that do support options, build configurations, exporting targets, and so on. It would be expensive to switch from it now.
We are forced to deal with various Linux distributions (from Gentoo and Ubuntu to Debian and CentOS).
We need an unifying build&deployment tool for various environments. The CMake does not integrate well with non-compiling languages (e.g. does not natively support local Python installations via virtualenv).
Instead of changing the things we already developed, I would like to use them in the manner the OS package manager does. For my vision, it should be something pretty similar to the so-called meta-build tool. In fact, the Gentoo Portages are pretty close:
Easy customization with simple boolean options (useflags, profiles)
Delegation of the building procedures to the reliable and customizable tools designed and well-tested especially for this purpose (CMake, autotools, Bazel, etc.)
Offers an ability to change the target compiler installation and specify the building process in a clear declarative way with a standardized instructions set.
Portages can not be ran locally, though (and have other unrelated flaws).
I have to become very confident before switching the whole build system to something like Meson or Bazel or whatever else I could find up to the moment.
Update
To be more specific, I could refer to what we have done up to the moment. The one of the superprojects we're maintaining deals with particular scientific experiment:
All the sub-projects are listed as git submodules
A simple BASH-script maintaining the entire build-install-deploy lifecycle
The presets referring particular settings for a concrete environment are represented as BASH sources and are included by this maintaining shell script.
As few days more had come with no result, I coming up with intention to write my own solution based on experience we already gained due to these shell scripting activity.
For a similar task Bob was developed some time ago. If you still need a build system it might be worth a look. There are some small tutorials available as well as a complex basement example build a linux from scratch.
Basically it is just a environment for a controlled execution of bash scripts with dependency and variant handling, CI integration, IDE support,..
Take a look at Sparrow - Linux scripting platform, it supports projects hierarchy and tasks - configurable scripts, where configuration is made in declarative way in Yaml, Json formats. This hopefully might address many if not all the needs you've mentioned here.
Disclosure - I am the tool author.

How should I improve my perl application deployment process?

I develop and maintain a bioinformatics application suite of 50+ scripts and its deployment process is a mess:
Entire suite is in one big git repository. It has lots of CPAN dependencies, and dozens of internal modules as well.
Development platform is Linux.
Deployment platforms are Windows (20+ users), Mac (10+), Linux (2-3). Most are not 'power users'.
For windows, I have one installer (made with NSIS) for strawberry perl + required modules (ie, I installed strawberry on a windows box, installed all modules and zipped c:\strawberry), and another installer for the suite-- I did this b/c the suite is updated a lot more than the list of required modules.
For Mac, I bundle perl 5.14, all required cpan modules, and the application suite into a double-clickable installer. I don't use the system perl b/c it tends to be out of date. I bundle everything together unlike on windows b/c I suck at mac.
For Linux, I handle their installations manually since there are only a few of them, and they use different distros.
This is obviously a mess that grew organically over several generation of developers. Ideally I would like to create cpan-installable distributions out of the internal libraries and various groups of related scripts, and use module dependencies to let cpan install them for me.
But I'm not sure what the best approach for this is, b/c I'd still need distribute perl itself, would have to write some sort of non-command-line interface to CPAN, control the exact versions of 3rd party CPAN modules, point it by default to my "DarkPan" where I would store our modules, how I would push updates, etc. etc.
I don't think I can use PerlApp or Par since afaik those are for bundling single scripts, not an entire suite of them.
Any advice highly appreciated.
Besides the 3 platforms mentioned (more, if you count the Linux variants), you really have a couple different problems:
Deployment of a standard known-good Perl executable and libraries (CPAN modules).
Deployment of your Perl scripts and modules.
Once upon a time, I supported a large Solaris Perl installation. I tried for a while to stand up a Linux Perl installation "side-by-side", re-using the same CPAN modules. Didn't work. The big problem for me is that a fair number of Perl modules require compilation, which means they target a specific platform. I ended up just with 2 installs, and always remembered to install a new CPAN module in both areas.
We're now 100% Windows, so I don't have the same issue. However, we do run Perl off a shared network drive. All the users map this drive, and run a Registry script that associates .PL files with the network install of Perl. (See my answer to this other Perl question.)
So, besides the mapped drive and the Registry script, users don't need to install anything. Even the CPAN modules are picked up from the network. This solves item #1 (for Windows only users).
Same thing holds true for item #2: the scripts are stored on a network drive (same one) and the users run another Registry script to include the scripts folder in their search PATH. We edit our scripts in one area, and have a "Check-In 'n Release" ("CINR") that we use to, well, check-in and release scripts to the area the users point to. The users can double-click the scripts in Explorer, run them in DOS, or even better yet get them included in the contextual-menu in Explorer, etc. (Actually, we use a .NET application to map the drive and make all these settings for the user, but it can be done much simpler.)
So, how does this help with the other platforms, Linux and Mac? As I ran into with my Solaris/Linux experiment, I think you're stuck with different Perl installation for all 3 platforms, although you should be able to reach the same network drive for your Perl scripts and modules.
The Perl installation might even be OK on a network drive for the Linux users. It's probably easier for them than the Windows users. Mac users are tough. I administer a home Mac network, and I think network drives are very difficult to do in Mac OS X compared to other OSes. It should be as easy as in Linux since so much is the same, but there are very strange problems (for me) mapping NFS and SMB drives. AFP drives are a little easier for the user to map manually, but not so easy to map programmatically.
My Mac recommendation is to try using Platypus. It's definitely great at bundling scripts into a double-clickable application, although your interface options are limited to output only (no user input allowed during execution that I can tell). Not sure if you could put an entire Perl installation into the Platypus app or not, but if you could the paths figured out, you might be able to.
Good luck!
You may wish to check out CAVA packager. It can deal with multiple scripts in a single package.

Automating Solaris custom software deployment and configuration for multiple nodes

Essentially, the question I'd like to ask is related to the automation of software package deployments on Solaris 10.
Specifically, I have a set of software components in tar files that run as daemon processes after being extracted and configured in the host environment. Pretty much like any server side software package out there, I need to ensure that a list of prerequisites are met before extracting and running the software. For example:
Checking that certain users exists, and they are associated with one or many user groups. If not, then create them and their group associations.
Checking that target application folders exist and if not, then create them with pre-configured path values defined when the package was assembled.
Checking that such folders have the appropriate access control level and ownership for a certain user. If not, then set them.
Checking that a set of environment variables are defined in /etc/profile, pointed to predefined path locations, added to the general $PATH environment variable, and finally exported into the user's environment. Other files include /etc/services and /etc/system.
Obviously, doing this for many boxes (the goal in question) by hand will certainly be slow and error prone.
I believe a better alternative is to somehow automate this process. So far I have thought about the following options, and discarded them for one reason or another.
Traditional shell scripts. I've only troubleshooted these before, and I don't really have much experience with them. These would be my last resort.
Python scripts using the pexpect library for analyzing system command output. This was my initial choice since the target Solaris environments have it installed. However, I want to make sure that I'm not reinveting the wheel again :P.
Ant or Gradle scripts. They may be an option since the boxes also have Java 1.5 enabled, and the fileset abstractions can be very useful. However, they may fall short when dealing with user and folder permissions checking/setting.
It seems obvious to me that I'm not the first person in this situation, but I don't seem to find a utility framework geared towards this purpose. Please let me know if there's a better way to accomplish this.
I thank you for your time and help.
Most of those steps sound like things handled by use of a packaging system to install your package. On Solaris 10, that would be the SVR4 packaging system included with the OS.

Best practices for deploying tools & scripts to production?

I've got a number of batch processes that run behind the scenes for a Linux/PHP website. They are starting to grow in number and complexity, so I want to bring a small amount of process to bear on them.
My source tree has a bunch of cpp files and scripts, organized with development but not deployment in mind. After compiling all the executables, I need to put various scripts and binaries on a cluster of machines. Different machines need different executables, scripts, and config files for their batch processes. I also have a few of tools that I've written that belong on every machine. At the moment, this deployment process is manual and error prone.
I'm guessing I'm just going to end up with a script that runs at the root of the source tree and builds a smaller tree of everything necessary for any of the machines. Then, I'll just rsync that to the appropriate machines. But I'm curious how other people are managing this type of problem. Any ideas?
There are a several categories of tool here. Some people use a combination of tools from these categories. I sometimes use, for example, both Puppet and Capistrano. See Puppet or Capistrano - Use the Right Tool for the Job for a discussion.
Scripting Tools aimed at Deploying an Application:
The general pattern with tools in this category is that you create a script and/or config file, often with sets of commands similar to a Makefile, and the tool will ssh over to your production box, do a checkout of your source, and run whatever other steps are necessary.
Tools in this area usually have facilities for rollback to a previous version. So they'll check out your source to releases/ directory, and create a symbolic link from "current" to "releases/" if all goes well. If there's a problem, you can revert to the previous version by running a command that will remove "current" and link it to the previous releases/ directory.
Capistrano comes from the Rails community but is general-purpose. Users of Capistrano may be interested in deprec, a set of deployment recipes for Capistrano.
Vlad the Deployer is an alternative to Capistrano, again from the Rails community.
Write your own shell script or Makefile.
Options for getting the files to the production box:
Direct checkout from source. Not always possible if your production boxes lack development tools, specifically source code management tools.
Checkout source locally, then tar/zip it up. Use scp or rsync to copy the tarball over. This is sometimes preferred for something like an Amazon EC2 deployment, where a compressed tarball can save time/bandwidth.
Checkout source locally, then rsync it over to the production box.
Packaging Tools
Use your OS's packaging system to generate packages containing the files for your app. Create a master package that has as dependencies the other packages you need. The RubyWorks system is an example of this, used to deploy a Rails stack and sample application. Then it's a matter of using apt, yum/rpm, Windows msi, or whatever to deploy a given version. Rollback involves uninstalling and reinstalling an old version.
General Tools Aimed at Installing Apps/Configs and Maintaining a Set of Systems
These tools do not specifically target the problem of deploying a web app, but rather the more general problem of deploying/maintaining Apps/Configs for a set of servers, or an entire company's workstations. They are aimed more at the system administrator than the web developer, though either can find them useful.
Cfengine is a tool in this category.
Puppet aims to improve on Cfengine. It's got a learning curve but many find it worth the time to figure out how to do the configs. Once you've got it going, each box checks the central server periodically and makes sure everything is up to date. If someone edits a file or changes a permission, this is detected and corrected. So, unlike the deployment tools above, Puppet not only puts files in the right place for you, it ensures they stay that way.
Chef is a little younger than Puppet with a similar approach.
Smartfrog is another tool in this category.
Ansible works with plain YAML files and does not require agents running on the servers it manages
For a comparison of these and many more tools in this category, see the Wikipedia article, Comparison of open source configuration management software.
Take a look at the cfengine tutorial to see if cfengine looks like the right tool for your situation. It may be a little too complicated for a small website, but if it is going to involve more computers and more configuration in the future, at some point you will end up using cfengine or something like that.
Create your own packages in the format your distribution uses, e.g. Debian packages (.deb). These can either be copied to each machine and installed manually, or you can set up your own repository, and add it to your list of sources.
Your packages should be set up so that the scripts they contain consult a configuration file, which is different on each host, depending on what scripts need to be run on each.
To tie it all together, you can create a meta package that just depends on each of the other packages you create. That way, when you set up a new server, you install that one meta package, and the other packages are brought in as dependencies.
Although this process sounds a bit complicated, if you have many scripts and many hosts to deploy them to, it can really pay off in the long run.
I have to roll out PHP scripts and Apache configurations to several customers on a frequent basis. Since they all run Debian Linux, I've set up a Debian package repository on my server and the all the customer has to do is type apt-get upgrade and they get the latest version.
The first thing to do is get all these scripts into a source control repository (svn or git are good) so that you can track changes to these scripts over time.
If you are interested in ruby, check out Capistrano, it is well suited deploying things to multiple machines in a cluster, and is fairly easy to set up. It can read files directly from your version control system.
Puppet is another tool that can be used in this situation. It is similar to cfengine - you create a model of the desired deployment and Puppet figures how to get the environment to this state.

Deploying Perl to a share nothing cluster

Anyone have suggestions for deployment methods for Perl modules to a share nothing cluster?
Our current method is very manual.
Take down half the cluster
Copy Perl modules ( CPAN style modules ) to downed cluster members
ssh to each member and run perl Makefile.pl; make ; make install on each module to be installed
Confirm deployment
In service the newly deployed cluster members, out of service the old cluster members and repeat steps 2 -> 4
This is obviously far from optimal, anyone have or know of good tool chains for deploying Perl modules to a shared nothing cluster?
Take one node offline, install Perl, and then use it to reimage the other nodes.
At least, that's how I imagine you'd want to install software in a shared-nothing cluster. Perl is just the application you happen to be installing.
Assuming all the machines are identical, you should be able to keep one canonical installation, and use rsync or something to keep the others in updated.
I have, in the past, developed a Perl program which used the Expect module (from CPAN) to automate basically the process you described, automatically sshing to each host, copying any necessary files, and performing the installations. Unfortunately, this was developed on-site for a client, so I do not have access to the code to share. If you're familiar with Expect, it shouldn't be too difficult to set up, though.
We currently have a clustered Perl application that does data processing. We also have numerous CPAN modules and modules that we've developed that the software depends on. When you say 'shared nothing', I'm assuming you're referring to things like NFS mounts.
If the machines have identical configurations, then you may be able to build your entire application into a single directory structure (eg: /opt/my-app), tar it up and that could be come the only thing you need to push to the boxes.
As far as deploying it to the boxes, you might be able to use Capistrano. We developed a couple of our own cluster utilities that piggybacked off of ssh - I've released one form of that utility: parallel-jobs. Its README shows an example of executing multiple parallel ssh commands. It's a small step to extend that program to be able to know about your cluster and then be able to execute the same command across the cluster (as opposed to a series of different commands).
If you are using Debian or Ubunto OS you could package your Perl modules - I have open sourced some code to help with this: Perl module builder it's still very rough but does work and can be made to work on your own code as well as CPAN modules, this then makes deployment much easier.
There is also project to get RedHat rpms for all of CPAN, Dave Cross gave a talk Perl in RPM-Land which may be of use.
If you are on some other system which doesn't have packaging then the rsync option (install on one machine and then rsync to the others) should work as well, note you can mount a windows share and rsync to it across unix if needed.
Using a central manager like Puppet makes creating and maintaining machines in a cluster a lot easier to manage, from installing code to managing users and email configuration. There is also a Perl project in the pipeline to do something similar but this has not been made public yet.
Capistrano is a tool that allows you to run commands on a group of servers; it is perfectly suited to making your task considerably easier.
Further down the line of automation, but also complexity, is Puppet that allows you do define a group of servers, give them roles and then push out sets of code to every machine subscribing to a certain role.
I am not sure exactly what a share nothing cluster is, but if it uses some base *nix system like Fedora, Mandriva, or Ubuntu. Many of the perl modules are precompiled for specific architectures. You can easily run these.
If these systems are of the same arch you can do as someone else said and just copy the compiled modules from system to system, just make sure you have all of the dependancies as well on the recipient system.