Version Control a file with per machine dependent stuff

Version Control a file with per machine dependent stuff - version-control

I have a config file in my project that includes some info that is per machine dependent (db username, password, path). I understand that in this particular case, I could enforce everybody to use the same username, db path, and password to keep this simple, but there must be another way to deal with this problem.
I use mercurial, if you care, but I am ok with just a theoretical answer if you are unfamiliar with hg specifics.

A common way to handle this is to put a config.example or similar under version control and force the user to copy it and make any necessary changes. That way the user can pull down the overall structure of the file from your repository without overwriting local changes.
Alternatively, you could make your config file provide only defaults, with the option to source a subset of variables from a higher-priority custom config file (in the same format) which the user may or may not provide.

You'll want to use the .hgignore file to not include the config file in the repository.
This will allow everyone to have their own version of the config file.
Basically, you just want to add the relative path to the config file and Mercurial commands will ignore it. So the file would look like this:
config/dbconfig.ext
Edit
I just realized you still want to be able to version control the config file (misunderstood the question). So I suggest moving the parts of the config file that are dependent into their own config file and then applying the fix above. That way, you can still have the regular config information under version control and keep part of it separate for each person's machine.

I have per machine databases for my PHP projects. What I do is check the hostname at runtime. If it is one host, I feed it certain credentials. If another, feed it different credentials.
On some systems I create a list of credentials and then just go down the line trying them until one of the connections works. If the list is exhausted, the connection cannot be made.

I've never found a solid method for handling this type of configuration files. My final solution was to just maintain a version of each file and use symbolic links. That way each server has the same file path, but different root file.

Without knowing exactly what is in your config file, I'm going to assume your file has some stuff that is machine-dependent (e.g., db password, paths) and other stuff that is not (db hostname, maybe some paths relative to a path that is configured on a per-machine basis, etc.)
If that's the case, what you want to do is re-factor your config file so that you have two config files---one for the common stuff, one for the machine-specific stuff. Check the common one in, and add the machine-specific configuration to the ignore file.

Related

How to make Snakemake recognize Globus remote files using Globus CLI?

I am working in a high performance computing grid environment, where large-scale data transfers are done via Globus. I would like to use Snakemake to pull data from a Globus path, process the data, and then push the processed data to a different Globus path. Globus has a command-line interface.
Pulling the data is no problem, for I'd just create a rule that would run globus transfer to create the requisite local file. But for pushing the data back to Globus, I think I'll need a rule that can "see" that the file is missing at the remote location, and then work backwards to determine what needs to happen to create the file.
I could create local "proxy" files that represent the remote files. For example I could make a rule for creating 'processed_data_1234.tar.gz' output files in a directory. These files would just be created using touch (thus empty), and the same rule will run globus transfer to push the files remotely. But then there's the overhead of making sure that the proxy files don't get out of sync with the real Globus-hosted files.
Is there a more elegant way to do this akin to the Remote File capability? Is it difficult to add a Globus CLI support for Snakemake? Thanks in advance for any advice!

Would it help to create a utility function that would generate a list of all desired files and compare it against the list of files available on globus? Something like this (pseudocode):
def return_needed_files():
list_needed_files = [] # either hard-coded or specified with some logic
list_available = [] # as appropriate, e.g. using globus ls
return [i for i in list_needed_files if i not in list_available]
# include all the needed files in the all rule
rule all:
input: return_needed_files

How to read multiple config file from Spring Cloud Config Server

Spring cloud config server supports reading property files with name ${spring.application.name}.properties. However I have 2 properties files in my application.
a.properties
b.properties
Can I get the config server to read both these properties files?

Rename your properties files in git or file system where your config server is looking at.
a.properties -> <your_application_name>.properties
a.properties -> <your_application_name>-<profile-name>.properties
For example, if your application name is test and you are running your application on dev profile, below two properties will be used together.
test.properties
test-dev.properties
Also you can specify additional profiles in bootstrap.properties of your config client to retrieve more properties files like below. For example,
spring:
profiles: dev
cloud:
config:
uri: http://yourconfigserver.com:8888
profile: dev,dev-db,dev-mq
If you specify like above, below all files will be used together.
test.properties
test-dev.properties
test-dev-db.prpoerties
test-dev-mq.properties

Note that the provided answer assumes your property files address different execution profiles. If they dont, i.e., your properties are split into different files for some other reason, e.g., maintenance purposes, divided by business/functional domain, or any other reason that suits your needs, then, by defining a profile for each such file, you are just "abusing" the profile feature, for achieving your goal (multiple property files per app).
You could then ask "OK, so what is the problem with that?". The problem is that you restrain yourself from various possibilities that you would otherwise have. If you actually want to customize your application configuration by profile you will have to create pseudo, sub, profiles for that since the file name is already a profile. Example:
Your application configuration could be customized by different profiles, which you use inside your springboot application (e.g. in #Profile() annotation), let them be dev, uat, prod. You can boot your application setting different profiles as active, e.g. 'dev' vs 'uat', and get the group of properties that you desire. For your a.properties b.properties and c.properties file, if different file names were supported, you would have a-dev.properties b-dev.properties and c-dev.properties files vs a-uat.properties b-uat.properties and c-uat.properties files for 'dev' and 'uat' profile.
Nevertheless, with the provided solution, you already have defined 3 profiles for each file: appname-a.properties appname-b.properties, and appname-c.properties: a, b, and c. Now imagine you have to create a different profile for each... profile(! it already shows something goes wrong here)! you would end up with a lot of profile permutations (which would get worse as files increase): The files would be appname-a-dev.properties, appname-b-dev.properties, app-c-dev.properties vs appname-a-uat.properties, appname-b-uat.properties, app-c-uat.properties, but the profiles would have been increased from ['dev', ' uat'] to ['a-dev', 'b-dev', 'c-dev', 'a-uat', 'b-uat', 'c-uat'] !!!
Even worse, how are you going to cope with all these profiles inside your code and more specifically your #Profile() annotations? Will you clutter the code space with "artificial" profiles just because you want to add one or two more different property files? It should have been sufficient to define your dev or uat profiles, where applicable, and define somewhere else the applicable property file names (which could then be further supported by profile, without any other configuration action), just as it happens in the externalized properties configuration for individual springboot apps
For argument completeness, I will just add here that if you want to switch to .yml property files one day, with the provided profile-based naming solution, you also loose the ability to define different "yaml document sections per profile" inside the same .yml file (Yes, in .yml you can have one property file yet define multiple logical yml documents inside, which its usually done for customizing the properties for different profiles, while having all related properties in one place). You loose the ability because you have already used the profile in the file name (appname-profile.yml)
I have issued a pull request with a minor fix for spring-cloud-config-server 1.4.x, which allows defining additionally supported file names (appart from "application[-profile]" and "{appname}[-profile]", that are currently supported) by providing a spring.cloud.congif.server.searchNames environment property - analogous to spring.config.name for springboot apps. I hope it gets reviewed and accepted.

I came across the same requirement lately with a little more constraint that I am not allowed to play around the environment profiles. So I wasn't allowed to do as the accepted answer. I'm sharing how I did it as an alternative to those who might have same case as me.
In my application, I have properties such as:
appxyz-data-soures.properties
appxyz-data-soures-staging.properties
appxyz-data-soures-production.properties
appxyz-interfaces.properties
appxyz-interfaces-staging.properties
appxyz-interfaces-production.properties
appxyz-feature.properties
appxyz-feature-staging.properties
appxyz-feature-production.properties
application.properties // for my use, contains local properties only
bootstrap.properties // for my use, contains management properties only
In my application, I have these particular properties set that allow me to achieve what I needed. But note I have the rest of needed config as well (enable cloud config, actuator refresh, eureka service discovery and so on) - just highlighting these for emphasis:
spring.application.name=appxyz
spring.cloud.config.name=appxyz-data-soures,appxyz-interfaces,appxyz-feature
You can observe that I didn't want to play around my application name but instead I used it as prefix for my config property files.
In my configuration server I configured in application.yml to capture pattern: 'appxyz-*':
spring:
cloud:
config:
server:
git:
uri: <git repo default>
repos:
appxyz:
pattern: 'appxyz-*'
uri: <another git repo if you have 1 repo per app>
private-key: ${git.appxyz.pk}
strict-host-key-checking: false
ignore-local-ssh-settings: true
private-key: ${git.default.pk}
In my Git repository I have the following. No application.properties and bootstrap because I didn't want those to be published and overridden/refreshed externally but you can do if you want.
appxyz-data-soures.properties
appxyz-data-soures-staging.properties
appxyz-data-soures-production.properties
appxyz-interfaces.properties
appxyz-interfaces-staging.properties
appxyz-interfaces-production.properties
appxyz-feature.properties
appxyz-feature-staging.properties
appxyz-feature-production.properties
It will be the pattern matching pattern: 'appxyz-*' that will capture and return the matching files from my git repository. The profile will also apply and fetch the correct property file accordingly. The prioritization of value is also preserved.
Furthermore, if you wish to add more file in your application (say appxyz-circuit-breaker.properties), we only need to do:
Add the name pattern in the spring.cloud.config.name=...,appxyz-circuit-breaker
The add the copies of the file locally and also externally (in the git repo.
No need to add/modify more or restart your configuration server later on. For new application, it's like a one time registration thing to add an entry under the repos of application.yml.
Hope it helps in one way or another!

In your application bootstrap.properties, you have to specify like below:
spring.application.name=a,b

Configuring evolutions script for different environment

These are my configuration files for both development and testing environments. I'm displaying only the db configuration section.
dev.conf
db.default.driver=org.postgresql.Driver
db.default.url="jdbc:postgresql://localhost/mydb"
db.default.user=admin
db.default.password=admin
applyEvolutions.default=true
evolutionplugin=disabled
test.conf
db.default.driver=org.postgresql.Driver
db.default.url="jdbc:postgresql://localhost/mytestdb"
db.default.user=admin
db.default.password=admin
applyEvolutions.default=true
evolutionplugin=enabled
Basically I'm planning to execute the evolutions db script only to the testing database. So I will clean up the records before triggering the test-script.
Based on the documentation the evolution scripts has to be put in folder with the same name as the datasource, which is default in this case:
~/conf/evolutions/default/
My question:
Is there a way for me to put the scripts in different location and set the configuration file to refer to that one instead? I'd love to put the test scripts in this path:
~/conf/evolutions/test/
It'll be troublesome for me if in one way or another someone accidentally enable the evolutions in the dev.conf file and since both configuration files share the same datasource name(default) then all the clean-up queries in the default folder are executed.
Another workaround that I can think of right now is by using different datasource name for different environments but this will imply code change because then the application doesn't use the default datasource anymore. I'd like to avoid that.

Maybe you could use the evolutions logic directly from a text fixture of some kind?
play.api.db.evolutions.Evolutions.applyFor(dbName, path) seems like it might do the trick.

CherryPy : Accessing Global config

I'm working on a CherryPy application based on what I found on that BitBucket repository.
As in this example, there is two config files, server.cfg (aka "global") and app.cfg.
Both config files are loaded in the serve.py file :
# Update the global settings for the HTTP server and engine
cherrypy.config.update(os.path.join(self.conf_path, "server.cfg"))
# ...
# Our application
from webapp.app import Twiseless
webapp = Twiseless()
# Let's mount the application so that CherryPy can serve it
app = cherrypy.tree.mount(webapp, '/', os.path.join(self.conf_path, "app.cfg"))
Now, I'd like to add the Database configuration.
My first thought was to add it in the server.cfg (is this the best place? or should it be located in app.cfg ?).
But if I add the Database configuration in the server.cfg, I don't know how to access it.
Using :
cherrypy.request.app.config['Database']
Works only if the [Database] parameter is in the app.cfg.
I tried to print cherrypy.request.app.config, and it shows me only the values defined in app.cfg, nothing in server.cfg.
So I have two related question :
Is it best to put the database connection in the server.cfg or app.cfg file
How to access server.cfg configuration (aka global) in my code
Thanks for your help! :)

Put it in the app config. A good question to help you decide where to put such things is, "if I mounted an unrelated blog app at /blogs on the same server, would I want it to share that config?" If so, put it in server config. If not, put it in app config.
Note also that the global config isn't sectioned, so you can't stick a [Database] section in there anyway. Only the app config allows sections. If you wanted to stick database settings in the global config anyway, you'd have to consider config entry names like "database_port" instead. You would then access it directly by that name: cherrypy.config.get("database_port").

What's the best Perl module for hierarchical and inheritable configuration?

If I have a greenfield project, what is the best practice Perl based configuration module to use?
There will be a Catalyst app and some command line scripts. They should share the same configuration.
Some features I think I want ...
Hierarchical Configurations to cleanly maintain different development and live settings.
I'd like to define "global" configurations once (eg, results_per_page => 20), have those inherited but override-able by my dev/live configs.
Global:
results_per_page: 20
db_dsn: DBI:mysql;
db_name: my_app
Dev:
inherit_from: Global
db_user: dev
db_pass: dev
Dev_New_Feature_Branch:
inherit_from: Dev
db_name: my_app_new_feature
Live:
inherit_from: Global
db_user: live
db_pass: secure
When I deploy a project to a new server, or branch/fork/copy it somewhere new (eg, a new development instance), I want to (one time only) set which configuration set/file to use, and then all future updates are automatic.
I'd envisage this could be achieved with a symlink:
git clone example.com:/var/git/my_project . # or any equiv vcs
cd my_project/etc
ln -s live.config to_use.config
Then in the future
git pull # or any equiv vcs
I'd also like something that akin to FindBin, so that my configs can either use absolute paths, or relative to the current deployment. Given
/home/me/development/project/
bin
lib
etc/config
where /home/me/development/project/etc/config contains:
tmpl_dir: templates/
when my perl code looks up the tmpl_dir configuration it'll get:
/home/me/development/project/templates/
But on the live deployment:
/var/www/project/
bin
lib
etc/config
The same code would magically return
/var/www/project/templates/
Absolute values in the config should be honoured, so that:
apache_config: /etc/apache2/httpd.conf
would return "/etc/apache2/httpd.conf" in all cases.
Rather than a FindBin style approach, an alternative might be to allow configuration values to be defined in terms of other configuration values?
tmpl_dir: $base_dir/templates
I'd also like a pony ;)

Catalyst::Plugin::ConfigLoader supports multiple overriding config files. If your Catalyst app is called MyApp, then it has three levels of override: 1) MyApp.pm can have a __PACKAGE__->config(...) directive, 2) it next looks for MyApp.yml in the main directory of the app, 3) it looks for MyApp_local.yml. Each level may override settings in each other level.
In a Catalyst app I built, I put all of my immutable settings in MyApp.pm, my debug settings in MyApp.yml, and my production settings in MyApp_<servertype>.yml and then symlinked MyApp_local.yml to point at MyApp_<servertype>.yml on each deployed server (they were all a little different...).
That way, all of my config was in SVN and I just needed one ln -s step to manually config a server.

Perl Best Practices warns against exactly what you want. It states that config files should be simple and avoid the sort of baroque features you desire. It goes on to recommend three modules (none of which are Core Perl): Config::General, Config::Std, and Config::Tiny.
The general rational behind this is that the editing of config files tends to be done by non-programmers and the more complicated you make your config files, the more likely they will screw them up.
All of that said, you might take a look at YAML. It provides a full featured, human readable*, serialization format. I believe the currently recommend parser in Perl is YAML::XS. If you do go this route I would suggest writing a configuration tool for end users to use instead of having them edit the files directly.
ETA: Based on Chris Dolan's answer it sounds like YAML is the way to go for you since Catalyst is already using it (.yml is the de facto extension for YAML files).
* I have heard complaints that blind people may have difficulty with it

YAML is hateful for config - it's not non-programmer friendly partly because yaml in pod is by definition broken as they're both white-space dependent in different ways. This addresses the main problem with Config::General. I've written some quite complicated config files with C::G in the past and it really keeps out of your way in terms of syntax requirements etc. Other than that, Chris' advice seems on the money.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse