Pass artifacts between Concourse jobs without S3 or a similar external resource

I am using Concourse and building binaries that I would like to send off to integration tests. However, they are lightweight, and using an S3 bucket for permanent storage seems like overkill. Additionally, I am versioning with the semver-resource, which also seems to require S3 or similar to back it.
Is there a way to configure a local, on-worker, or similar blobstore? Can I use the Concourse Postgres DB to store my semver? It's small enough that it should fit in a DB table.

Short answer: no.
Concourse is designed so that the Concourse deployment itself is stateless, explicitly not providing artifact persistence and striving to be largely free of configuration.
This forces pipelines to be self-contained, which makes your CI reproducible. If your Concourse server burns down, you haven't lost anything special. You can just spin up another one and send up the original pipeline. Everything will then continue from where it left off: your versions will continue counting from where they were, rather than restarting from 0.0.0, and all of your artifacts are still wherever they are.
All that being said, you're free to deploy your own S3-compatible blob store. The s3 resource should talk to it just fine.
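For example, if you stand up a MinIO instance (or any other store that speaks the S3 API), pointing the s3 resource at it should look roughly like the sketch below; the endpoint, bucket, and credential names are placeholders:
resources:
- name: binaries
  type: s3
  source:
    endpoint: https://minio.internal.example.com   # any S3-compatible API works here
    bucket: build-artifacts
    regexp: my-binary-(.*)\.tar\.gz                # versions derived from the file name
    access_key_id: {{s3-access-key}}
    secret_access_key: {{s3-secret-key}}

One job can then put the built binary into this resource and a downstream integration-test job can get it.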

We use the semver resource with a gist. Just get the clone URL from the gist page, then set your resource:
- name: version
  type: semver
  source:
    driver: git
    branch: master
    uri: {{version-url}}
    file: Version
    private_key: {{github-private-key}}
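A job can then bump and record the version as part of its plan; the job, resource, and task names below are only placeholders:
jobs:
- name: build
  plan:
  - get: source-code
    trigger: true
  - put: version            # bump the patch level and push it back to the git/gist driver
    params: {bump: patch}
  - task: compile
    file: source-code/ci/compile.yml

The put implicitly fetches the bumped number, so later steps in the plan can read it from the version resource's directory.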

Related

AWS Glue - version control and setting up for continuous integration

We are in the process of setting up the CI/CD process for our AWS Glue ETL workloads. The existing ETL process contains the following AWS Glue components: crawlers, registered tables in the catalog, jobs, triggers, and workflows.
Obviously the first step is to set up a code repository and link the existing artifacts from the different components mentioned above to the repository, which should ideally let developers perform check-ins and pull requests from the tool (something similar to ADF and Databricks). However, as far as we have explored, AWS Glue does not have integration with any source code repository that can directly provide this feature, unless we are missing something.
Hence, what is the method to set up the environment for CI (I'm not yet talking about CD)? The link below gives a reference for CI/CD:
https://aws.amazon.com/blogs/big-data/implement-continuous-integration-and-delivery-of-serverless-aws-glue-etl-applications-using-aws-developer-tools/
However, it mentions at the beginning that the ETL job scripts and the AWS CloudFormation template file for deploying them are both committed to version control, so it's not clear how this is done for the ongoing, regular commits from the developers.
However, as far as we have explored, AWS Glue does not have integration with any source code repository that can directly provide this feature, unless we are missing something.
Correct, Glue does not have VC integration.
I develop (Python and CloudFormation) locally in VS Code and use its Git integration. I use a container if I want to test something locally, but Glue also has a Dev Endpoint for similar tasks.
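To make the CloudFormation side concrete, a Glue job can be declared in the template you keep in the repo; this is a minimal sketch, and the job name, role ARN, and S3 script location are placeholders:
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  MyEtlJob:
    Type: AWS::Glue::Job
    Properties:
      Name: my-etl-job                                      # placeholder job name
      Role: arn:aws:iam::123456789012:role/GlueEtlRole      # placeholder IAM role
      GlueVersion: '2.0'
      Command:
        Name: glueetl                                       # Spark ETL job
        ScriptLocation: s3://my-artifact-bucket/glue/my_etl_job.py   # script your CI syncs from git to S3
      DefaultArguments:
        "--job-language": python

Each regular commit then just updates the script (and this template when needed), and the CI job redeploys the stack.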

How do I automate Kubernetes deployment YAML without relying on :latest?

I have a repository with a Kubernetes deployment YAML. Pipelines run on each commit, building and pushing an image into our repository, versioned with the commit (e.g. my_project:bm4a83). Then I update the deployment image:
kubectl set image deployment/my_deployment my_deployment=my_project:bm4a83.
This works, but I also want to keep the rest of the deployment YAML specification in version control.
I thought I could just keep it in the same repository, but that means changes that are purely infrastructure (e.g. changing replicas) trigger new builds without any code changes.
What felt like it made the most sense was keeping the deployment YAML in a totally separate repository. I figured I could manage all the values from there, independently of actual code changes. The only problem with that is the image key would need to be kept up to date. The only way around that is working with some floating :latest-type tag, but I don't really think that's ideal.
What's a sensible workflow for managing this? Am I missing something entirely?
What's a sensible workflow for managing this? Am I missing something entirely?
Some of the answer depends on the kind of risk you're trying to drive down with any process you have in place. If it's "the cluster was wiped out by a hurricane and I need to recover my descriptors," then Heptio Ark is a good solution for that. If the risks are more "human-centric," then IMHO you will have to walk a very careful line between locking down all the things and crippling the very agile, empowering tools that Kubernetes provides to a team. A concrete example of that running up against your model: what happens when a developer edits a Deployment but does not (remember|know) to update the descriptor in the repo? Do you revoke the edit rights? Use some diff-esque logic to detect a changed in-cluster config?
To speak to something you said specifically: it is a highly suboptimal idea to commit a descriptor change just to resize a (Deployment|ReplicationController|StatefulSet). Separately, a well-built CI pipeline would also understand if no buildable artifact changed and bail out (either early, or not even triggering a build, if the CI tool is that smart).
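For illustration, assuming the pipeline is GitLab CI (an assumption; any CI tool with path filtering works similarly), the image build job can be limited to actual code changes so that descriptor-only commits don't trigger a rebuild; the paths here are placeholders:
build-image:
  stage: build
  script:
    - docker build -t example.com/something:$CI_PIPELINE_IID .
    - docker push example.com/something:$CI_PIPELINE_IID
  rules:
    # Run only when application code or the Dockerfile changes,
    # not when only the deployment descriptors are edited.
    - changes:
        - src/**/*
        - Dockerfile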
Finally, if you do want to carry on with the current situation, then the best practice I can think of is textual replacement right before applying a descriptor:
$ grep "image: " the-deployment.yml
image: example.com/something:#CI_PIPELINE_IID#
$ sed -i'' -e "s/#CI_PIPELINE_IID#/${CI_PIPELINE_IID}/" the-deployment.yml
$ kubectl apply -f the-deployment.yml
so that the copy in the repo remains textually pristine, and also can't be inadvertently applied as-is, since the placeholder tag won't result in a runnable Deployment.
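If you'd rather not patch files with sed, a Kustomize image override can accomplish the same thing while the committed descriptor stays untouched; this is just a sketch reusing the names from the example above:
# kustomization.yaml, sitting next to the pristine descriptor
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- the-deployment.yml
images:
- name: example.com/something   # image name as written in the descriptor
  newTag: bm4a83                # set per build, e.g. via `kustomize edit set image`

The CI job can run kustomize edit set image example.com/something:${CI_PIPELINE_IID} in its workspace and then kubectl apply -k ., leaving the committed files unchanged.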
but I also want to keep the rest of the deployment YAML specification in version control.
Yes, you want to do that. Putting everything under version control is a good practice to achieve immutable infrastructure.
If you want the deployment to have a separate piece of metadata (for whatever auditing / change tracking reason), why can't you just leverage the Kubernetes metadata block?
metadata:
  name: my_deployment
  labels:
    commit: bm4a83
Then you inject such information through Jinja, Ruby ERBs, Go Templates, etc.
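For example, with Go templates (the style Helm uses), the commit label can be filled in at render time; .Values.gitCommit is an assumed value name, passed in however your pipeline prefers (e.g. --set gitCommit=bm4a83):
# fragment of a templated deployment manifest
metadata:
  name: my_deployment
  labels:
    commit: "{{ .Values.gitCommit }}"   # assumed value, injected at render time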

How do I version control a kubernetes application?

I've checked out helm.sh of course, but at first glance the entire setup seems a little complicated (helm client & tiller server). It seems to me like I can get away with just the helm client in most cases.
This is what I currently do
Let's say I have a project composed of 3 services viz. postgres, express, nginx.
I create a directory called product-release that is as follows:
product-release/
  .git/
  k8s/
    postgres/
      Deployment.yaml
      Service.yaml
      Secret.mustache.yaml   # Needs to be rendered by the dev before use
    express/
      Deployment.yaml
      Service.yaml
    nginx/
      Deployment.yaml
      Service.yaml
  updates/
    0.1__0.2/
      Job.yaml      # postgres schema migration
      update.sh     # k8s API server scripts that patch/replace existing k8s objects and run the state change job
The usual git stuff can apply now. Every time I make a change, I update the spec files, test them, write the update scripts to help move from the last version to the current one, and then commit and tag it.
Questions:
This works for me so far, but is this "the right way"?
Why does Helm have the Tiller server? Isn't it simpler to do the templating on the client side? Of course, if you want to separate the activity of deployment from knowledge of the application (like secrets), the templating would have to happen on the server, but otherwise, why?
It seems that https://redspread.com/ (open source) addresses this particular issue, but it needs more development before it'll be production ready - at least from my team's quick glance at it.
We'll stick with keeping the YAML files in Git together with the deployed application for now, I guess.
We are using kubernetes/helm (the latest/incubated version) and a central repository for Helm charts (which references the container images built for our component releases).
In other words, the Helm package definitions and their dependencies are separate from the source code and image definitions that make up the various components of our web applications.
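To illustrate the split, the release chart in that central repository might be defined roughly like this (shown in the Helm v3 Chart.yaml layout; the names, versions, and repository URL are placeholders):
# Chart.yaml of the umbrella/release chart
apiVersion: v2
name: my-web-app
version: 1.4.0                      # chart release version, tracked independently of the source repos
dependencies:
- name: frontend
  version: 2.1.0                    # each component chart pins the container image it was built with
  repository: https://charts.example.com
- name: backend
  version: 3.0.2
  repository: https://charts.example.com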
Note: Tiller has been removed in Helm v3. See this page for details on why Tiller was needed in Helm v2 and why it was removed in Helm v3: https://v3.helm.sh/docs/faq/#removal-of-tiller
In terms of GitOps, what you did is a right way to perform a release from a git repo. However, if you want to push it further and make it more standard, you could plan additional goals, including:
Choose a configuration management system that goes beyond plain declarative k8s app definitions, e.g. Helm (as in the answer above, https://stackoverflow.com/a/42053983/914967) or Kustomize. They're purely client-side.
Avoid a custom release process by replacing update.sh with standard tooling like kubectl apply or helm install.
Drive change delivery from git tags/branches with a CI/CD engine such as Argo CD, Travis CI, or GitHub Actions (see the sketch after this list).
Use a branching strategy so that you can try changes in test/staging/production environments before delivering them directly.
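As a sketch of the CI/CD-engine piece with Argo CD (the repo URL, path, and namespaces below are placeholders), an Application resource points the cluster at the release repo and keeps it in sync:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: product-release
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/product-release.git   # the repo holding the k8s specs
    targetRevision: main                                       # or a release tag/branch
    path: k8s
  destination:
    server: https://kubernetes.default.svc
    namespace: default
  syncPolicy:
    automated: {}        # keep the cluster converged on what is committed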

Deploy using artifact from artifactory

Is there a way in Bamboo to deploy artifacts from Artifactory rather than only locally published artifacts? I've found the Artifactory Plugin, but as far as I can see, it only allows for deploying stuff into Artifactory.
I'm using Bamboo 5.4.2
While you can use your build server to deploy from Artifactory to your application server, that's a very roundabout way to go. You already uploaded all the binaries to Artifactory; why would you want to download them to the build server again?
You have a number of ways to get the needed files to your application server straight from Artifactory, without involving the CI server, and the choice depends on how complicated your requirements are. If all you need is to get the latest version of some artifact from Artifactory to the app server, tools like LiveRebel are a great match. If you need to do more, e.g. deploy onto a sophisticated clustered topology with a sharded data-schema upgrade and no downtime, you might need something more free-style like Puppet, Chef, Ansible, or Salt.
Either way, Artifactory properties and the REST API for working with them are your best friends. Using properties in your REST queries for artifacts lets you express things like "give me all the artifacts that were produced by a certain Bamboo build, but only those which were staged, have a QA level of 'production', and match the deployment target".

How to perform automated deployment - with a Pull model

We're currently doing continuous deployment to our dev/qa servers, and manually triggered automated deployment to our production boxes. Currently we're using TeamCity/PowerShell/MsDeploy. We now have a requirement to deploy to a server that sits on an external network and cannot be accessed from the outside. Instead, it will have to "call home" for updates, and presumably then push back the results of whether the deployment succeeded or not.
I'm thinking we could write a service that requests a particular URL on our build server that delivers the artifacts that would have been used for deployment, pulls them down, and then fires off the build script.
However, I'm not entirely sure how we'd deal with updating the updater, and failures when they occur. Does anyone have any recommendations on how to approach this?
Sounds like you need a release repository. The build server pushes files into it and each deploy job pulls from it. This would neatly decouple the two processes.
A release repository could be as simple as a shared NAS, or something more sophisticated such as the Nexus repository manager.