Create custom Argo artifact type - argo-workflows

Whenever an S3 artifact is used, the following declaration is needed:
s3:
  endpoint: s3.amazonaws.com
  bucket: "{{workflow.parameters.input-s3-bucket}}"
  key: "{{workflow.parameters.input-s3-path}}/scripts/{{inputs.parameters.type}}.xml"
  accessKeySecret:
    name: s3-access-user-creds
    key: accessKeySecret
  secretKeySecret:
    name: s3-access-user-creds
    key: secretKeySecret
It would be helpful if this could be abstracted to something like:
custom-s3:
  bucket: "{{workflow.parameters.input-s3-bucket}}"
  key: "{{workflow.parameters.input-s3-path}}/scripts/{{inputs.parameters.type}}.xml"
Is there a way to make this kind of custom definition in Argo to reduce boilerplate?

For a given Argo installation, you can set a default artifact repository in the workflow controller's configmap. With a default in place, each artifact only needs to specify its key, provided every other field is set in the default config. Any field the default omits must still be specified per artifact.
Unfortunately, that only works if you use a single S3 config. If you need multiple configurations, cutting down on boilerplate will be more difficult.
In response to your specific question: not exactly. You can't define a custom key of your own (like custom-s3) on a member of the artifacts array. The exact format of the YAML is defined in Argo's Workflow Custom Resource Definition. If your Workflow YAML doesn't match that specification, it will be rejected.
However, you can use external templating tools to populate boilerplate before the YAML is installed in your cluster. I've used Helm before to do exactly that with a collection of S3 configs. At the simplest, you could use something like sed.
tl;dr - for one S3 config, use default artifact config; for multiple S3 configs, use a templating tool.
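As a rough illustration of the templating idea, here is a minimal Python sketch that expands the hypothetical custom-s3 shorthand from the question into a full s3 block before the Workflow is submitted. The function name and the S3_DEFAULTS dictionary are assumptions made for this example, not anything Argo provides; the values are copied from the question.

```python
import copy

# Shared settings, copied from the question's example (hypothetical defaults).
S3_DEFAULTS = {
    "endpoint": "s3.amazonaws.com",
    "accessKeySecret": {"name": "s3-access-user-creds", "key": "accessKeySecret"},
    "secretKeySecret": {"name": "s3-access-user-creds", "key": "secretKeySecret"},
}


def expand_custom_s3(artifact):
    """Return a copy of `artifact` with a 'custom-s3' shorthand expanded to 's3'.

    Artifacts without the shorthand are returned unchanged.
    """
    if "custom-s3" not in artifact:
        return artifact
    expanded = {k: v for k, v in artifact.items() if k != "custom-s3"}
    s3_block = copy.deepcopy(S3_DEFAULTS)
    # Fields given in the shorthand (bucket, key, ...) override the defaults.
    s3_block.update(artifact["custom-s3"])
    expanded["s3"] = s3_block
    return expanded
```

You would run something like this over each entry of the artifacts list after loading the YAML and before applying it to the cluster. Helm or sed achieve the same thing with less code if they are already part of your tooling.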

Related

Services section of Azure YAML pipeline

I'm looking at an example of the YAML pipeline with a services section. Here is a sample:
The YAML schema doesn't have services defined.
Where can I get information about the services section of the pipeline?
Update: Per Bowman's answer, the services section is part of the job step. In this scenario, there is only one job so the job step is omitted.
In the simplest case, a pipeline has a single job. In that case, you do not have to explicitly use the job keyword unless you are using a template. You can directly specify the steps in your YAML file.
Here is the reference.
It is in the official documentation:
https://learn.microsoft.com/en-us/azure/devops/pipelines/yaml-schema/jobs-job?view=azure-pipelines
services: # Container resources to run as a service container.
I think you looked for it at the top level, right? In this situation there is a hidden default job, and the definition of that job is hidden as well. The services section belongs to the definition of that hidden job, not to the top level.

Move variable groups to the code repository and reference it from YAML pipelines

We are looking for a way to move the non-secret variables from the variable groups into the code repositories.
We would like to be able to:
track the changes of all the settings in the code repository
version the variable values together with the source code and the pipeline code
Problem:
We have over 100 variable groups defined which are referenced by over 100 YAML pipelines.
They are injected at different pipeline/stage/job levels, depending on the environment/component/stage they operate on.
Example problems:
a variable can be renamed or removed while the pipeline that targets the PROD environment still references it, even though the pipeline that deploys to DEV no longer does
a particular pipeline run used the variable values as they were at some date in the past, and it is useful to know which set of settings it was deployed with
Possible solutions:
It should be possible to use simple YAML variable template files to mimic the variable groups and include those templates in the main YAML files using this approach: Variable reuse.
# File: variable-group-component.yml
variables:
  myComponentVariable: 'SomeVal'
# File: variable-group-environment.yml
variables:
  myEnvVariable: 'DEV'
# File: azure-pipelines.yml
variables:
- template: variable-group-component.yml # Template reference
- template: variable-group-environment.yml # Template reference
#some stages/jobs/steps:
In theory, it should be easy to transform the variable groups to the YAML template files and reference them from YAML instead of using a reference to the variable group.
# Current reference we use
variables:
- group: "Current classical variable group"
However, even without implementing this approach, we hit the following limit in our pipelines: "No more than 100 separate YAML files may be included (directly or indirectly)"
YAML templates limits
Taking into consideration the requirement that we would like to have the variable groups logically granulated and separated and not stored in one big yml file (in order to not hit another limit with the number of variables in a job agent) we cannot go this way.
The second approach would be to add a simple script (PowerShell?) that consumes a key/value metadata file of variables (variableName/variableValue records) and runs a job step that emits, for each record, a command such as
##vso[task.setvariable variable=one]secondValue.
But this could only be done at the job level, as the first step of each job, and it looks like re-engineering the variable group mechanism that Azure DevOps already provides natively.
We are not sure that this approach will work everywhere in the YAML pipelines where the variables are currently used. In some places, for example, they are passed as arguments to tasks.
Move all the variables into Key Vault secrets? We abandoned this option early on, because Key Vault is a place to store sensitive data, not settings that anyone may see. Moreover, storing them as secrets causes the pipeline logs to show * instead of the real configuration values, which obscures the pipeline run log.
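A minimal sketch of the script idea, written in Python rather than PowerShell. It assumes the variables live in a flat JSON object of name/value pairs (the file format and the function names are assumptions for this example) and prints one task.setvariable logging command per entry, in the form quoted above.

```python
import json


def setvariable_commands(variables):
    """Build one Azure DevOps logging command per variable name/value pair."""
    return [
        f"##vso[task.setvariable variable={name}]{value}"
        for name, value in variables.items()
    ]


def commands_from_json(text):
    """Parse a flat JSON object of variables and return the logging commands."""
    return setvariable_commands(json.loads(text))


# Example with the variable names used earlier in the question:
for line in commands_from_json('{"myComponentVariable": "SomeVal", "myEnvVariable": "DEV"}'):
    print(line)
```

Variables set this way become available only to later steps of the same job, which is the job-level limitation described above.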
Questions:
Q1. Do you have any other propositions/alternatives on how the variables versioning/changes tracking could be achieved in Azure DevOps YAML pipelines?
Q2. Do you see any problems with the second possible solution, or do you have better ideas?
You can consider this as an alternative:
Store your non-secret variables in a JSON file in a repository
Create a pipeline to push the variables to App Configuration (instead of a Key Vault)
If your app needs these settings, have the app reference App Configuration directly instead of running a replacement task in Azure DevOps. If the pipelines need the settings directly, pull them from App Configuration.
Drawbacks:
the same as the one you mentioned for the PowerShell approach: it has to be done at the job level
What you get:
change tracking in the repo
change tracking in App Configuration, plus all the other benefits of App Configuration

Advantages of Templates ( ie infrastructure as code) over API calls

I am trying to set up a module to deploy resources in the cloud (it could be any cloud provider). I don't see the advantages of using templates (e.g. Deployment Manager) over direct API calls:
Creation of a VM using a template:
# deployment.yaml
resources:
- type: compute.v1.instance
  name: quickstart-deployment-vm
  properties:
    zone: us-central1-f
    machineType: f1-micro
    ...
# bash command to deploy yaml file
gcloud deployment-manager deployments create vm-deploy --config deployment.yaml
Creation of a VM using an API call:
import json

def addInstance(http, listOfHeaders):
    # Assumes http is an httplib2.Http-style client and listOfHeaders is a dict of request headers
    url = "https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/instances"
    body = {
        "name": "quickstart-deployment-vm",
        "zone": "us-central1-f",
        "machineType": "f1-micro",
        # ...
    }
    # The Compute Engine API expects a JSON request body, not a URL-encoded form
    return http.request(uri=url, method="POST", body=json.dumps(body), headers=listOfHeaders)
Can someone explain to me what benefits I get using templates?
Readability, ease of use, authentication handled for you, no need to be a coder, etc. There can be many advantages; it really depends on how you look at it, on your background, and on the tools you use.
For you specifically, it might be more beneficial to use Python all the way.
It's easier to use templates and you get a lot of builtin functionality such as running a validation on your template to scan for possible security vulnerabilities and similar. You can also easily delete your infra using the same template as you create it. FWIW, I've gone all the way with templates and do as much as I can with templates and in smaller units. It makes it easy to move out a part of the infra or duplicate it to another project, using a pipeline in GitLab to deploy it for example.
The reason to use templates over API calls is that templates can be used in use cases where a deterministic outcome is required.
Templates and API calls each have their own benefits, and there is always a tradeoff between the two options. If you want more flexibility in the deployment, API calls suit you better. On the other hand, if security and a complete revision history are your priority, templates should be your choice. Details can be found in this online documentation.
When using a template, orchestration of the deployment is handled by the platform. When using API calls (or other imperative approaches) you need to handle orchestration.
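To make the orchestration point concrete, here is a toy Python sketch of the bookkeeping a declarative platform does for you: comparing the desired state (the template) with what actually exists and deriving the actions to take. It is only an illustration under simplified assumptions (resources are plain dictionaries keyed by name) and does not describe how Deployment Manager or any other tool is implemented.

```python
def plan(desired, actual):
    """Derive the create/update/delete actions that turn `actual` into `desired`.

    Both arguments map a resource name to its properties.
    """
    to_create = sorted(name for name in desired if name not in actual)
    to_delete = sorted(name for name in actual if name not in desired)
    to_update = sorted(
        name for name in desired
        if name in actual and desired[name] != actual[name]
    )
    return {"create": to_create, "update": to_update, "delete": to_delete}


desired = {
    "vm-a": {"machineType": "f1-micro"},
    "vm-b": {"machineType": "f1-micro"},
}
actual = {
    "vm-b": {"machineType": "g1-small"},
    "vm-c": {"machineType": "f1-micro"},
}
print(plan(desired, actual))
```

With direct API calls, this comparison, the ordering of the resulting calls, and the handling of partial failures are all yours to write and maintain.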

Google Cloud Compute, using environment variables

I have found lots of information on how to use environment variables in Google App Engine projects.
However I am yet to find some best practice on what to do with environment variables on compute engine.
Is it possible to use Google Cloud Deployment Manager to achieve this? My main goal is to simplify deployment between prod/stag/dev.
Right now I am moving towards using dotenv files.
Stack is webpack 4, express, node 10, vuejs 2.
For Compute Engine instances I'd suggest using custom metadata. You can find detailed documentation about this here. From within your instance, you can access your custom metadata by sending a request with an empty body to the instances().get method, for example:
GET https://www.googleapis.com/compute/v1/projects/myproject/zones/us-central1-a/instances/example-instance
Now, to set your custom metadata, you can indeed use the Google Cloud Deployment Manager. As per the doc here, you just need to add the metadata property and the relevant metadata keys and values for your VM resource, for example:
resources:
- name: my-first-vm-template
  type: compute.v1.instance
  properties:
    zone: us-central1-a
    machineType:
    ...[snip]...
    metadata:
      items:
      - key: custom-key
        value: "custom-value"
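Custom metadata set this way can also be read from inside the VM through the Compute Engine metadata server, which avoids authenticating against the Compute API. A minimal Python sketch, assuming the custom-key name from the example above (the function names are made up for this illustration; the request only succeeds when run on a Compute Engine instance):

```python
import urllib.request

# The metadata server is reachable only from inside a Compute Engine VM.
METADATA_BASE = (
    "http://metadata.google.internal/computeMetadata/v1/instance/attributes/"
)


def build_metadata_request(key):
    """Build the request for one custom metadata key.

    The Metadata-Flavor header is required by the metadata server.
    """
    return urllib.request.Request(
        METADATA_BASE + key, headers={"Metadata-Flavor": "Google"}
    )


def read_custom_metadata(key, timeout=2.0):
    """Fetch the value of a custom metadata key as text."""
    with urllib.request.urlopen(build_metadata_request(key), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

On the instance, read_custom_metadata("custom-key") would return the value configured in the template. A Node application can make the same HTTP request at startup and copy the result into its environment.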

Convert Terraform Templates to Cloudformation Templates

I want to convert existing Terraform templates (HCL) to AWS CloudFormation templates (JSON/YAML).
I basically want to find security issues in these templates with CFN_NAG.
One approach I have already tried was converting the HCL to JSON and then passing the result to CFN_NAG, but it failed because the two kinds of template have different structures.
Can anyone please provide any suggestions here?
A rather convoluted way of achieving this is to use Terraform to stand-up actual AWS environments, and then to use AWS’s CloudFormer to extract CloudFormation templates (JSON or YAML) from what Terraform has built. At which point you can use cfn-nag.
CloudFormer has some limitations, in that not all AWS resources are currently supported (RDS Security Groups, for example), but it will get you all the basic AWS resources.
Don't forget to remove all the environments, including CloudFormer's, to minimise the cost.
You want to use static code analysis to find security issues in your Terraform setup.
Converting Terraform to CloudFormation in order to run cfn-nag afterwards is one way. However, there are now tools that operate directly on the Terraform setup.
I would recommend taking a look at terrascan. It is built on terraform_validate.
https://github.com/bridgecrewio/checkov/ runs security scanning for both Terraform and CloudFormation.
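If you want to run such a scan from a script or a CI job, a minimal Python sketch might look like the following. It assumes checkov is installed and on the PATH, and uses its -d (directory) and --framework options; the function names and the directory passed in are placeholders for this example.

```python
import subprocess


def checkov_command(directory, framework="terraform"):
    """Build the checkov command line for scanning one directory."""
    return ["checkov", "-d", directory, "--framework", framework]


def run_checkov(directory, framework="terraform"):
    """Run checkov and return (exit code, report text).

    A non-zero exit code is not treated as an error here, since the scanner
    typically uses it to signal failed checks.
    """
    completed = subprocess.run(
        checkov_command(directory, framework),
        capture_output=True,
        text=True,
    )
    return completed.returncode, completed.stdout
```

Passing framework="cloudformation" would scan CloudFormation templates with the same tool, so no conversion between the two formats is needed.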