Ansible: Playbook with same defaults/magic as a role's main.yml - deployment

I often need "custom playbooks" that do specific tasks but still live within one role. For example, for a database backup task, I'd want it in roles/databases/backup.yml. A custom playbook like this should enjoy the same "magic" that main.yml enjoys (automatic loading of role variables and so on).
The only workaround I know of is relying on tags inside main.yml, but that's cumbersome: it requires building an "obstacle course" of tags just to ensure certain tasks are run, and specifying the tag on the command line (since a play cannot run a tag-filtered list of other plays).
I end up having to do everything manually and explicitly in a custom file.

Thinking further, the confusion arises because I'm trying to work around two limitations of Ansible: (a) as mentioned, there's no way for a task to run a list of tagged tasks; (b) there's no such thing (yet) as "explicit" tags, i.e. tags that disable a task unless the tag is explicitly invoked. This means there's no simple way to run a particular subset of special/exceptional tasks within a playbook.
I was trying to work around this restriction by making a separate playbook. However, that would end up copying a bunch of logic from the main role playbook anyway.
The best approach for now is the workaround others have mentioned: rely on variables to "whitelist" certain tasks, then make a wrapper script that declares those variables and may also use --skip-tags to skip unnecessary/slow tasks.
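As a rough illustration of that pattern (the variable, task, and tag names below are hypothetical):

# roles/databases/tasks/main.yml
- name: Back up the databases
  command: /usr/local/bin/backup_databases.sh
  # effectively disabled unless the wrapper explicitly whitelists it
  when: run_db_backup | default(false) | bool

# wrapper script: turn the backup task on and skip unrelated slow tasks
ansible-playbook site.yml -e run_db_backup=true --skip-tags slow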

Related

Secrets as environmental variables in vstest

In my tests I'm using environment variables. For security reasons, when setting them in an Azure DevOps pipeline I need to mark them as secret. Is it possible to pass them as arguments within the VSTest task?
Apparently lacking enough reputation to comment on the answer by @daniel-mann, I need to follow up with an answer myself.
Regarding the use of a runsettings file;
Yes, I'm sure you can do it like this, but there's a much simpler way. In the task definition, you have the option to override test run parameters.
For a non-YAML pipeline, you do this in the "Override test run parameters" option (the tooltip says "Override parameters defined in the TestRunParameters section of runsettings file or Properties section of testsettings file. For example: -key1 value1 -key2 value2."), so like this then:
-SomeSecret $(SomeSecret)
For a YAML-pipeline, you can do this:
overrideTestrunParameters: '-SomeSecret $(SomeSecret)'
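In context, that setting sits among the task's inputs, roughly like this (the other inputs shown are placeholders, not a complete task definition):

- task: VSTest@2
  inputs:
    testSelector: 'testAssemblies'
    testAssemblyVer2: '**\*Tests.dll'
    runSettingsFile: 'my.runsettings'   # optional - see the note below about not needing one at all
    overrideTestrunParameters: '-SomeSecret $(SomeSecret)'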
The nice thing is that the test run parameter that you'd like to override DOES NOT need to exist in your runsettings file. In fact, there's not even a need for a runsettings file at all!
The bad thing about this approach in general is that you'll have to access your secrets through the "TestContext.Properties" collection (for NUnit), which really sucks if you want a transparent solution that works equally well both for local development (using "user secrets") and in an Azure pipeline.
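To make that concrete, a minimal sketch of reading such an overridden parameter inside a test (the parameter name is hypothetical; depending on your framework/adapter version, run parameters may surface as TestContext.Parameters in NUnit 3 rather than TestContext.Properties):

using NUnit.Framework;

[TestFixture]
public class SecretTests
{
    [Test]
    public void CanReadOverriddenRunParameter()
    {
        // "SomeSecret" is the run parameter overridden via overrideTestrunParameters;
        // note that it is a test run parameter here, not an environment variable.
        var someSecret = TestContext.Parameters["SomeSecret"];
        Assert.That(someSecret, Is.Not.Null.And.Not.Empty);
    }
}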
Another potentially bad thing about this is that these "overridden" test run parameters are just that - parameters for THAT test run only. If you happen to have a mixture of .NET Framework and .NET Core tests, you would want to execute those in two different test runs (for it to work at all...), and then you'd need to duplicate those overrides (given that some of the same secrets are needed, that is).
Regarding adding an additional task in your pipeline to set the appropriate environment variables;
Absolutely. Using the "Batch script" task is well suited for this, where you just pass your secret-based variables as parameters to the script and pick them up inside the script file.
For this to work as expected, though, you will need to allow the task to modify the environment.
For a non-YAML pipeline, this is done by ticking off the "Modify Environment" checkbox.
For a YAML pipeline, you could do it like this:
- task: BatchScript@1
  displayName: 'Export key vault vars as env. vars (for tests)'
  inputs:
    filename: 'ExportKeyVaultEnvironmentVariables.cmd'
    arguments: '$(SomeSecret)'
    modifyEnvironment: true
Where in the "ExportKeyVaultEnvironmentVariables.cmd" script, you just do this:
set SomeSecret=%1
Note: If your secret by chance has some funny characters, especially having a trailing "=" character, you might experience that what you get when collecting the parameter inside the script is NOT what you sent in.
You can avoid this problem by enclosing the parameters in double quotes, like this:
arguments: '"$(SomeSecret)"'
And then collect the parameter, stripping those surrounding quotes with the "~" parameter modifier, like this:
set SomeSecret=%~1
A nice bonus effect of this approach is that your shiny new full-blown environment variables persist for the remainder of your pipeline. Referencing back to my "bad thing" about having to potentially duplicate the test run parameter overrides, that would not be needed here.
Regarding the additional option mentioned by the OP;
Absolutely, in which case you wouldn't need a "Batch script" task (that needs to call a script file), but just a "Command line" task.
BUT, be aware of a possible gotcha! Yes, this will create an environment variable for your test code to pick up, if you access it through "Environment.GetEnvironmentVariable".
In my case, I was building an "IConfigurationRoot" instance, like this:
/// <summary>
/// Gets a configuration instance, based on user secrets (for local test execution) and environment variables (for Azure pipeline execution).
/// </summary>
/// <typeparam name="T">The type of the class that represents the "runtime" (and thus is able to get hold of any configuration).</typeparam>
/// <returns>An <see cref="IConfigurationRoot"/> instance.</returns>
public static IConfigurationRoot GetConfigurationRoot<T>()
    where T : class =>
    new ConfigurationBuilder()
        //// Note: The "AddUserSecrets" method requires the "Microsoft.Extensions.Configuration.UserSecrets" package
        .AddUserSecrets<T>()
        //// Note: The "AddEnvironmentVariables" method requires the "Microsoft.Extensions.Configuration.EnvironmentVariables" package
        .AddEnvironmentVariables()
        .Build();
This works by adding various "configuration providers", which eventually allows you to access them all seamlessly through "configuration["SomeSecret"]".
What I found, though, if I'm not seriously mistaken, is that "SomeSecret" was still not available in the "EnvironmentVariablesConfigurationProvider" that's added, even though I could perfectly well access it directly with the above-mentioned method. Go figure (but I might be mistaken...).
Possible alternative approach (but for YAML pipelines only?);
It seems that for a YAML pipeline, you can explicitly set environment variables for a task, like this:
- task: VSTest@2
  env:
    SomeSecret: $(SomeSecret)
  [...]
I haven't tested this myself, but I've seen a colleague do it (I currently don't have a YAML pipeline myself). At least this variable can be picked up with "Environment.GetEnvironmentVariable", but I don't know if it can be picked up through an "IConfigurationRoot" instance.
I haven't seen any option to achieve the same in a non-YAML pipeline.
But again, this also suffers from the same possible "having to duplicate the environment variables across several test runs" problem.
See also my solution to my own question over at the Azure DevOps guys
I've been through this. There's no way to pass variables to VSTest on the command line, which means you have to jump through a few hoops.
You have a few options:
Use a runsettings file with a TestRunParameters section, then access it via TestContext.Properties["variableName"] within the tests themselves. You can use standard token replacement patterns to transform the XML file (a minimal example follows this list).
Use an app.config or appsettings.json (depending on your platform). This works pretty much the same as above, except, of course, you use the standard configuration classes to retrieve the values.
Add a step to your pipeline that sets the appropriate environment variables. Secrets don't get automatically mapped to environment variables for security purposes, but there's nothing that's stopping you from doing it yourself.
Move the secret values into a keyvault or some other sort of external secret storage and configure the test to pull the secrets at runtime.
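For reference, a bare-bones runsettings file with a TestRunParameters section might look like this (the parameter name and placeholder value are made up):

<?xml version="1.0" encoding="utf-8"?>
<RunSettings>
  <TestRunParameters>
    <!-- placeholder value, replaced via token replacement or overrideTestrunParameters -->
    <Parameter name="SomeSecret" value="__SomeSecret__" />
  </TestRunParameters>
</RunSettings>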

What is the practical difference between a sub-workflow and the includes directive? [Snakemake]

In the Snakemake documentation, the includes directive can incorporate all of the rules of another workflow into the main workflow and apparently can show up in snakemake --dag -n | dot -Tsvg > dag.svg. Sub-workflows, on the other hand, can be executed prior to the main workflow should you develop rules which depend on their output.
My question is: how are these two really different? Right now, I am working on a workflow, and it seems like I can get by on just using includes and putting the name of the output in rule all of the main workflow. I could probably even place the output in the input of a main-workflow rule, making the includes workflow execute prior to that rule. Additionally, I can't visualize a DAG which includes the sub-workflow, for whatever reason. What do sub-workflows offer that the includes directive can't do?
The include doesn't "incorporate another workflow". It just adds the rules from another file, as if you had added them by copy/paste (with the minor difference that include doesn't affect your target rule). A subworkflow, by contrast, has an isolated set of rules that work together to produce the final target file of that subworkflow, so it is well structured and isolated from both the main workflow and other subworkflows.
Anyway, my personal experience shows that there are some bugs in Snakemake that make using subworkflows quite difficult. Including the file is pretty straightforward and easy.
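For comparison, pulling rules in via include is just this (the file names are made up):

# Snakefile
include: "rules/preprocessing.smk"   # rules are added as if pasted in here

rule all:
    input:
        "results/summary.txt"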
I've never used subworkflows, but here's a case where it may be more convenient to use them rather than the include directive. (In theory you don't need include or subworkflow at all, since you could write everything in one massive Snakefile; the point is more about convenience.)
Imagine you are writing a workflow that depends on result files from a published work (or from a previous project of yours). The authors did not make public the files you need, but they provide a Snakemake workflow to produce them. Their workflow may be quite complex and the files you need may be just intermediate steps. So instead of making sense of the whole workflow and parsing it into your own include directives, you use subworkflow to generate the required file(s). E.g.:
subworkflow jones_etal:
    workdir:
        "./jones_etal"
    snakefile:
        "./jones_etal/Snakefile"

rule all:
    input:
        'my_results.txt',

rule one:
    input:
        jones_etal('from_jones.txt'),
    output:
        'my_results.txt',
    ...

How to replace a shared file when deploying code with Capistrano?

Update: TL;DR there seems to be no built-in way to achieve this, so a custom task is an easy solution.
Capistrano provides facilities to share files and directories across all releases. This is convenient and even provides some safety for files that should not be easily changed (or must remain the same across releases), e.g. a database configuration file.
But when it comes to replacing or just updating one of these shared files, I end up doing it manually, directly on the target machine. I would like to improve on that, for instance by asking Capistrano to overwrite some or all shared files when deploying. A kind of --force flag with some granularity.
I am not aware of any such kind of facility, and failing so far in my search. Any pointer?
Thinking about it
One of the reasons why this facility does not exist (assuming I simply did not find it!) is that it may be harder than it looks. For example, let's assume we have a shared database configuration file, and we exclude it from version control for security reasons (common practice). The current release relies on version 1 of the DB configuration. The next release requires version 2 of the DB configuration. If the deployment goes well, everything's good. It gets harder when rolling back after some error with the new release (e.g. a regression), as version 1 must then be available.
Such automation would be cool and convenient, but dangerous as well. Yet I have practical use cases at hand.
I created a template method to do this. For example, I could have a task like this:
task :create_database_yml do
  on roles(:app, :db) do
    within(shared_path) do
      template "local/path/to/database.yml.erb",
               "config/database.yml",
               :mode => "600"
    end
  end
end
And then I have a database.yml.erb template that uses things like fetch(:database_password) to fill in appropriate values. You can use the ask method in Capistrano to prompt for these values so they are never committed.
The implementation of template can be very simple: you just need to read the file, pass it through ERB, and then use Capistrano's upload! to place the results on the server.
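For the curious, a minimal sketch of such a template helper is shown below; the name and behaviour follow the description above, but this is not the author's exact code:

require "erb"
require "stringio"

# Render a local ERB template and upload the result to the host.
# Intended to be called from inside an `on roles(...) do` block, as in the task above,
# so that helpers like fetch() are available inside the template.
def template(from, to, options = {})
  rendered = ERB.new(File.read(File.expand_path(from))).result(binding)
  upload! StringIO.new(rendered), to
  execute :chmod, options[:mode], to if options[:mode]
end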
My version is a little more complicated than yours probably needs to be, but in case you are curious:
https://github.com/mattbrictson/capistrano-mb/blob/7600440ecd3331945d03e059368b75849857f1fb/lib/capistrano/mb/dsl.rb#L104
One approach is to use a system configuration tool like Chef or Puppet to deploy the configuration files distinctly from Capistrano.
Another approach is to create a custom task to do this: https://coderwall.com/p/wgs6gw/copy-local-files-to-remote-server-using-capistrano-3
I personally don't change on-server configs often enough or on enough servers yet to have tried to automate it. Crafting an scp command which copies the desired config file to all of the required servers has sufficed in the past.
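For the custom-task route mentioned above, a bare-bones version might look like this (the paths and role names are hypothetical):

namespace :deploy do
  desc "Force-overwrite the shared database.yml on all app servers"
  task :push_database_yml do
    on roles(:app) do
      upload! "config/deploy/shared/database.yml",
              "#{shared_path}/config/database.yml"
    end
  end
end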

How to handle environment-specific application configuration organization-wide?

Problem
Your organization has many separate applications, some of which interact with each other (to form "systems"). You need to deploy these applications to separate environments to facilitate staged testing (for example, DEV, QA, UAT, PROD). A given application needs to be configured slightly differently in each environment (each environment has a separate database, for example). You want this re-configuration to be handled by some sort of automated mechanism so that your release managers don't have to manually configure each application every time it is deployed to a different environment.
Desired Features
I would like to design an organization-wide configuration solution with the following properties (ideally):
Supports "one click" deployments (only the environment needs to be specified, and no manual re-configuration during/after deployment should be necessary).
There should be a single "system of record" where a shared environment-dependent property is specified (such as a database connection string that is shared by many applications).
Supports re-configuration of deployed applications (in the event that an environment-specific property needs to change), ideally without requiring a re-deployment of the application.
Allows an application to be run on the same machine, but in different environments (run a PROD instance and a DEV instance simultaneously).
Possible Solutions
I see two basic directions in which a solution could go:
Make all applications "environment aware". You would pass the environment name (DEV, QA, etc) at the command line to the app, and then the app is "smart" enough to figure out the environment-specific configuration values at run-time. The app could fetch the values from flat files deployed along with the app, or from a central configuration service.
Applications are not "smart" as they are in #1, and simply fetch configuration by property name from config files deployed with the app. The values of these properties are injected into the config files at deploy-time by the install program/script. That install script takes the environment name and fetches all relevant configuration values from a central configuration service.
Question
How would/have you achieved a configuration solution that solves these problems and supports these desired features? Am I on target with the two possible solutions? Do you have a preference between those solutions? Also, please feel free to tell me that I'm thinking about the problem all wrong. Any feedback would be greatly appreciated.
We've all run into these kinds of things, particularly in large organizations. I think it's most important to manage your own expectations first, and also ask whether it's really necessary to tell every system and subsystem on a given box to "change to DEV mode" or "change to PROD mode". My personal recommendation is as follows:
Make individual boxes responsible for different stages - i.e. "this is a DEV box" and "this is a PROD box".
Collect as much of the configuration that differs from box to box in one location, even if it requires soft links or scripts that collect the information to then print out.
A. This way, you can easily "dump this box's configuration" in two places and see what differs, for example after a new deployment.
B. You can also make configuration changes separate from software changes, at least to some degree, which is a good way to root out bugs that happen at release time.
Then have everything base its configuration on something/somewhere that is not baked-in or hard-coded - just make sure to collect and document it in that one location. It almost doesn't matter what the mechanism is, which is a good thing, because some systems just don't want to be forced to use some mechanisms or others.
Sorry if this is too general an answer - the question was very general. I've worked in several large software-based organizations before, and this seemed to be the best approach. Using a standalone server as "one unit of deployment" is the most realistic scenario (though sometimes it's expensive), since applications affect each other, and no matter how careful you are, you destabilize a whole system when you move any given gear or cog.
The alternative gets very complex very quickly. You need to start rewriting the applications that you have control over in order to have them accept a "DEV" switch, and you end up adding layers of kludge to the ones you don't have control over. Usually, the ones you don't have control over at least base their properties on something defined on a system-wide level, unless they are "calling the mothership for instructions".
It's easier to redirect people to a remote location and have them "use DEV" vs "use PROD" than it is to "make this machine run like DEV" vs "make this machine run like PROD". And if you're mixing things up, like having a DEV task run together on the same box as a PROD task, then that's not a realistic scenario anyways: I guarantee that eventually you will be granting illegal DEV-only access to somebody on PROD, and you'll have a DEV task wipe out a PROD database.
Hope this helps. Let me know if you'd like to discuss more specifics involved.
I personally prefer solution 2 (the app should know itself, by its configuration, what environment it is running in). With solution 1 (pass the environment name as a startup parameter) the danger of using the wrong environment specifier is much too high. Accessing the TEST database from PROD code and vice versa may cause mayhem, if the two installed code bases are not of the same version, as is often the case.
My current project uses solution 1, but I don't like that. A previous project I worked on used a variation of solution 2: the build process generated one setup file for every environment, making sure that they contained the same code base but appropriate configuration parameters. That worked like a charm, but I know it contradicts the paradigm that the "exact same build files must be deployed everywhere".
I think I asked a related, self-answered question before I read this one: How to organize code so that we can move and update it without having to edit the location of the configuration file? So, on that basis, I provide an answer here. I don't like the idea of a "smart" application (solution 1 here) for such a simple task as finding environment settings; it seems a complicated framework for something that should be simple. The idea of an install script (solution 2 here) is powerful, but while it lets the user change the content of the config file, would it allow changing the location of that config file? And what is this "central configuration service", and where is it located? My answer is that I would go with option 2 if the goal is to set the content of the configuration file, but I feel that the issue of the location of this configuration file remains unanswered here.
If you're using JSON to store/transmit configuration (or can use JSON in your pre-deploy process to output to some other format), you can annotate key/property names for environment/context-specific values with arbitrary or environment-specific suffixes, and then dynamically prefer/discriminate them at build/deploy/run/render time, while leaving un-annotated properties alone.
We have used this to avoid duplicating entire configuration files (with the associated problems well known) AND to reduce repetition. The technique is also perfect for internationalization (i18n) -- even within the same file, if desired.
Example, snippet of pre-processed JSON config:
var config = {
  'ver': '1.0',
  'help': {
    'BLURB': 'This pre-production environment is not supported. Contact Development Team with questions.',
    'PHONE': '808-867-5309',
    'EMAIL': 'coder.jen@lostnumber.com'
  },
  'help#www.productionwebsite.com': {
    'BLURB': 'Please contact Customer Service Center',
    'BLURB#fr': 'S\'il vous plaît communiquer avec notre Centre de service à la clientèle',
    'BLURB#de': 'Bitte kontaktieren Sie unseren Kundendienst!!1!',
    'PHONE': '1-800-CUS-TOMR',
    'EMAIL': 'customer.service@productionwebsite.com'
  },
}
... and post-processed (in this case, at render time), given a dynamic, browser-environment-known location.hostname of 'www.productionwebsite.com' and a navigator.language of 'de':
prefer(config, ['www.productionwebsite.com', 'de']);  // prefer(obj, string|Array<string>)
JSON.stringify(config);
// {
//   'ver': '1.0',
//   'help': {
//     'BLURB': 'Bitte kontaktieren Sie unseren Kundendienst!!1!',
//     'PHONE': '1-800-CUS-TOMR',
//     'EMAIL': 'customer.service@productionwebsite.com'
//   }
// }
If a non-annotated ('base') property has no competing annotated property, it is left alone (presumably global across environments) otherwise its value is replaced by an annotated value, if the suffix matches one of the inputs to the preference/discrimination function. Annotated properties that do not match are dropped entirely.
You can mix and match this behaviour to annotate configuration to achieve distinctions of global, default, specific that are (assuming you're sensible) readable with zero/minimal duplication.
The single, recursive prefer() function (as we're calling it, lacking the need or desire to make an entire project/framework out of it) we've developed so far (see jsFiddle, with inline docs) goes a bit further than this simple example, and (explained in greater detail here) handles deeply-nested configuration objects, as well as preferential ordering and (if you need to stay flat) combination of suffixes.
The function relies on JS's ability to reference object properties as strings, dynamically, and tolerates # and & delimiters in property names, which are not valid in dot-notation syntax but consequently (helpfully) prevent developers from breaking this technique by accidentally referring to pre-processed/annotated attributes in code (unless they unconventionally avoid dot-notation).
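As a rough, simplified sketch of the idea (not the authors' full implementation, which also handles preferential ordering and combined suffixes):

// Walk the object; for every key of the form "base#suffix", either promote
// its value onto "base" (if the suffix is preferred) or drop it entirely.
function prefer(obj, suffixes) {
  suffixes = Array.isArray(suffixes) ? suffixes : [suffixes];
  Object.keys(obj).forEach(function (key) {
    var hash = key.indexOf('#');
    if (hash === -1) return;                 // un-annotated: leave alone
    var base = key.slice(0, hash);
    var suffix = key.slice(hash + 1);
    if (suffixes.indexOf(suffix) !== -1) {
      obj[base] = obj[key];                  // preferred annotation wins over the base value
    }
    delete obj[key];                         // annotated keys never survive post-processing
  });
  Object.keys(obj).forEach(function (key) {  // then recurse into nested objects
    if (obj[key] && typeof obj[key] === 'object') prefer(obj[key], suffixes);
  });
  return obj;
}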
We have yet to have this break anything for us, nor have we been schooled on any fundamental flaws of this technique, beyond irresponsible/unintended usage or investment/fondness for existing frameworks/techniques that pre-exist. We have also not profiled it for performance (we only tend to run this once per build/session, etc.) so in your own usage, YMMV.
Most configurations transmitted client-side of course would not want to contain sensitive pre-production values, so one could (should!) use the same function to generate a production-only version (with no annotations) in pre-deploy, while still enjoying a SINGLE configuration file upstream in your process.
Further, if you're doing this for i18n, you may not want the entire wad going over the wire, so could process it server-side (cached or live, etc.) or pre-process it in build/deploy by splitting into separate files, but STILL enjoying a single source of truth as early in your workflow as possible.
We have not explored implementing the same function in Java (or C#, PERL, etc.) assuming it's even possible (with some exotic reflection maybe?) but a build environment that includes NodeJS could farm that step out easily.
Well, if it suits your needs and you have no problem storing the connection strings in the source control repository, you could create files like:
appsettings.dev.json
appsettings.qa.json
appsettings.staging.json
And choose the right one in the deployment script and rename it to the actual appsettings.json, which is then read by your app.
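As a trivial illustration, the selection step in the deployment script can be a one-liner (the DEPLOY_ENV variable name is made up):

rem promote the environment-specific settings file to the name the app reads
copy /Y appsettings.%DEPLOY_ENV%.json appsettings.json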

Windows Workflow - IfElse branch

I am trying to use Windows Workflow and have a model that looks similar to the image in the link below:
After each of the Send Activities (GetSomthing, GetSomthingElse, GetSomeMoreStuff) the same custom activity is being called (LogSomthingBadHappened).
While it might not look so bad in this picture, in my real model the custom activity is a SequenceActivity with quite a few nodes, and when it's repeated 3 times it starts to make the workflow look very ugly.
I would like to do something like this:
Can the IfElse branches be merged like this?
Should I be using a State Machine work flow instead (haven't figured these out yet)?
Use a FaultHandler on the workflow and throw a specific exception type that the handler will catch. Not the most graceful, but I think it should work.
In sequential workflows all steps must appear in a specific order, and the execution path is regulated exclusively by control structures (IF, WHILE). Altering the execution path in the way you describe would be like using a GOTO statement in imperative code, which we know leads to unnecessary complexity.
If the activities contained in the SequenceActivity that you need to execute at different stages of your workflow are exactly the same, you could embed them in a custom activity. This way it is easier to manage them, since they are contained in a single logical unit. In imperative code, this would be like refactoring a portion of duplicated code out into a method, which is then invoked in multiple places.
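As a rough sketch of that refactoring in WF 3.x terms (the type and property names are made up; the real logging logic would go inside Execute):

using System.Workflow.ComponentModel;

// A single reusable activity that replaces the repeated logging sequence.
public class LogSomethingBadHappenedActivity : Activity
{
    // Any state the activity needs from the workflow can be exposed as a
    // property and bound where the activity is used.
    public string FailedOperationName { get; set; }

    protected override ActivityExecutionStatus Execute(ActivityExecutionContext executionContext)
    {
        // ... do the actual logging here ...
        return ActivityExecutionStatus.Closed;
    }
}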
Another alternative that might work is to put your LogSomthingBadHappened activity into a custom workflow and include that several times. Several things to watch out for: sub-workflows are executed asynchronously, and if the LogSomthingBadHappened activity needs state information from the main workflow, copying it to the sub-workflow might be hard.
I have not tried this, so it might not even work.
I think the answer by gbanfill points to the right direction.
To generalize, I define the problem as:
Is there a way to define a group of activities that will be executed in several places of a workflow?
Further requirements are:
The group of activities should be defined in XAML only, i.e. no code.
The type of input to this group will, of course, be fixed, but the actual values should depend on the call (like calling a function).
Maybe the way to do it is define sub-workflows and build a custom activity that would instantiate the sub-workflow and wait for it to complete before continuing.
This custom activity should have at least two parameters: the sub-workflow id and input parameters.