I've been using terraform for a while and I really like it. I also set up Atlantis so that my team could have a "GitOps" flow. This is my current process:
Add or remove resources from Terraform files
Push changes to GitHub and create a pull request
Atlantis picks up changes and creates a terraform plan
When the PR is approved, Atlantis applies the changes
I recently found myself needing to set up a few managed Kubernetes clusters using Amazon EKS. While Terraform is capable of creating most of the basic infrastructure, it falls short when setting up some of the k8s resources (no support for gateways or ingress, no support for alpha/beta features, etc). So instead I've been relying on a manual approach using kubectl:
Add the resource to an existing file or create a new file
Add a line to a makefile that runs the appropriate command (kubectl apply or create) on the new file
If I'm using a helm chart, add a line with helm template and then kubectl apply (I didn't really like using tiller, and helm3 is getting rid of it anyway)
If I want to delete a resource, I do it manually with kubectl delete
This process feels nowhere near as clean as what we're doing in Terraform. There are several key problems:
There's no real dry-run. Using kubectl --dry-run or kubectl diff doesn't really work, it's only a client-side diff. Server-side diff functions are currently in alpha
There's no state file. If I delete stuff from the manifests, I have to remember to also delete it from the cluster manually.
No clear way to achieve gitops. I've looked at Weaveworks Flux but that seems to be geared more towards deploying applications.
The makefile is getting more and more complicated. It doesn't feel like this is scaleable.
I should acknowledge that I'm fairly new to Kubernetes, so might be overlooking something obvious.
Is there a way for me to achieve a process similar to what I have in Terraform, within the Kubernetes universe?
This is more of an opinion question so I'll answer with an opinion. If you like to manage configuration you can try some of these tools:
If you want to use existing YAML files (configurations) and use something at a higher level you can try kustomize.
If you want to manage Kubernetes configurations using Jsonnet you should take a look at Ksonnet. Keep in mind that Ksonnet will not be supported in the future.
If you want to just automatically do a helm update in an automated way, there is not a tool there yet. You will have to build something at this point to orchestrate everything. For example, we ended up creating an in house tool that does this.
Related
I have a lot of cronjobs I need to set on Kubernetes.
I want a file to manage them all and set them to Kubernetes on deployment. I wish that if I remove a cron from that file it will be removed from Kubernetes too.
Basically, I want to handle the corns like I'm handling them today on the machine (from a cron file that I would deploy). Add, remove and change crons.
I couldn't find a way of doing so. Does someone have an idea?
Library or framework I can use like helm? Or any other solution.
I highly recommend using gitops with argocd as a solution for Kubernetes configure management. Run crontab in deployment is a bad ideal because it hard to monitor your job result (cronjob job result can be get by kube-state-metrics exporter).
The ideal is packaging your manifest (it may be kubernetes manifest, kustomize, helm...etc...) -> put them to git -> argocd makes sure your configure deployed correctly
The advantages of gitops are include:
centralize your configuration
versioning your configuration
git authentication & authorization
traceable
multi-cluster deployment with argocd
automation deployment & sync
...
Gitops is not a difficult and is the mordern way for kubernetes configure management. Let's try
I used Helm to do so. I built a template to go over all crons, which I inserted as values to the helm template (Very similar to crontab but more structured) - see in the example.
Then, all I need to do is run a helm upgrade with a new corn (values) file and it updates everything accordingly. If I updated, removed, or added a new corn everything is happening automatically and with versioning. You can also add a namespace to your cronjobs to make it more encapsulated.
Here is a very good and easy-to-understand example I used. And its git repo
Using Kubernetes we make use of Helm and Kustomize to bundle our application. This helps consistently updating something like an application, but gets kind of bloated for a hole “environment” or cluster.
ArgoCD seems like a good solution for updating a hole cluster, as you can “mirror” your git state to the cluster. This works even when dropping resources or updating an existing complex deployment.
Now I want to build branch based ephemeral environments and think ArgoCD seems a bit bloated for this feature as for every branch environment I would have to commit to the git repository and add something.
The idea is, every branch based environment lives in its own namespace. I search for a tool managing this namespace and being able to do updates, and drops of the hole thing.
What is a good solution for this problem?
I have a repository with a Kubernetes deployment YAML. Pipelines run on each commit that builds and pushes an image into our repository, versioned with the commit (eg. my_project:bm4a83). Then I'm updating the deployment image
kubectl set image deployment/my_deployment my_deployment=my_project:bm4a83.
This works, but I also want to keep the rest of the deployment YAML specification in version control.
I thought I could just keep it in the same repository, but that means my changes that may only be infrastructure (eg, changing replicas) triggers new builds, without code changes.
What felt like it made the most sense was keeping the deployment YAML in a totally separate repository. I figured I can manage all the values from there, independently from actual code changes. The only problem with that is the image key would need to be kept up to date. The only way around that, is working with some floating :latest-type version, but I don't really think that's ideal.
What's a sensible workflow for managing this? Am I missing something entirely?
What's a sensible workflow for managing this? Am I missing something entirely?
Some of the answer depends on the kind of risk you're trying to drive down with any process you have in place. If it's "the cluster was wiped out by a hurricane and I need to recover my descriptors," then Heptio Ark is a good solution for that. If the risks are more "human-centric," then IMHO you will have to walk a very careful line between locking down all the things and crippling the very agile, empowering, tools that kubernetes provides to a team. A concrete example of that model running up against your model is: what happens when a developer edits a Deployment but does not (remember|know) to update the descriptor in the repo? So do you revoke the edit rights? Use some diff-esque logic to detect a changed in-cluster config?
To speak to something you said specifically: it is a highly suboptimal idea to commit a descriptor change just to resize a (Deployment|ReplicationController|StatefulSet). Separately, a well-built CI pipeline would also understand if no buildable artifact changed and bail out (either early, or not even triggering a build, if the CI tool is that smart).
Finally, if you do want to carry on with the current situation, then the best practice I can think of is textual replacement right before applying a descriptor:
$ grep "image: " the-deployment.yml
image: example.com/something:#CI_PIPELINE_IID#
$ sed -i'' -e "s/#CI_PIPELINE_IID#/${CI_PIPELINE_IID}/" the-deployment.yml
$ kubectl apply -f the-deployment.yml
so that the copy in the repo remains textually pristine, and also isn't inadvertently actually applied since it won't actually result in a runnable Deployment.
but I also want to keep the rest of the deployment YAML specification in version control.
Yes, you want to do that. Putting everything under version control is a good practice to achieve immutable infrastructure.
If you want the deployment to have a separate piece of metadata (for whatever auditing / change tracking reason), why can't you just leverage the Kubernetes metadata block?
metadata:
name: my_deployment
commit: bm4a83
Then you inject such information through Jinja, Ruby ERBs, Go Templates, etc.
I've checked out helm.sh of course, but at first glance the entire setup seems a little complicated (helm-client & tiller-server). It seems to me like I can get away by just having a helm-client in most cases.
This is what I currently do
Let's say I have a project composed of 3 services viz. postgres, express, nginx.
I create a directory called product-release that is as follows:
product-release/
.git/
k8s/
postgres/
Deployment.yaml
Service.yaml
Secret.mustache.yaml # Needs to be rendered by the dev before use
express/
Deployment.yaml
Service.yaml
nginx/
Deployment.yaml
Service.yaml
updates/
0.1__0.2/
Job.yaml # postgres schema migration
update.sh # k8s API server scritps to patch/replace existing k8s objects, and runs the state change job
The usual git stuff can apply now. Everytime I make a change, I make changes to the spec files, test them, write the update scripts to help move from the last version to this current version and then commit it and tag it.
Questions:
This works for me so far, but is this "the right way"?
Why does helm have the tiller server? Isn't it simpler to do the templating on the client-side? Of course, if you want to separate the activity of the deployment from the knowledge of the application (like secrets) the templating would have to happen on the server, but otherwise why?
Seems that https://redspread.com/ (open source) addresses this particular issue, but needs more development before it'll be production ready - at least from my team quick glance at it.
We'll stick with keeping yaml files in git together with the deployed application for now I guess.
We are using kubernetes/helm (the latest/incubated version) and a central repository for Helm charts (with references container images built for our component releases).
In other words, the Helm package definitions and its dependencies are separate from the source code and image definitions that make up the several components of our web applications.
Notice: Tiller has been removed in Helm v3. Checkout this answer to see details on why it needs tiller in Helm v2 and why it's removed in Helm v3: https://v3.helm.sh/docs/faq/#removal-of-tiller
According to the idea of GitOps, what you did is a right way (to perform release from a git repo). However, if you want to push it further to make it more common, you can plan more goals including:
Choose a configuration management system beyond k8s app declarative definition only. E.g., Helm (like above answer https://stackoverflow.com/a/42053983/914967), Kustomize. They're pure client-side only.
avoid custom release process by altering update.sh with popular tools like kubectl apply or helm install.
drive change delivery from git tags/branches by using a CI/CD engine like argocd, Travis CI or GitHub Actions.
Uses branching strategy so that you can try changes on test/staging/production/ environment before delivering it directly.
I am hoping to find a good way to automate the process of going from code to a deployed application on my kubernetes cluster.
In order to build and deploy my app I need to first build the docker image, tag it, and then push it to ECR. I then need to update my deployment.yaml with the new tag for the docker image and run the deployment with kubectl apply -f deployment.yaml.
This will go and perform a rolling deployment on the kubernetes cluster updating the pods to the new version of the container image, once this deployment has completed I may need to do other application specific things such as running database migrations, or cache clear/warming which may or may not need to run for a given deployment.
I suppose I could just write a shell script that runs all of these commands, and run it whenever I want to start up a new deployment, but I am hoping there is a better/industry standard way to solve these problems that I have missed.
As I was writing this question I noticed stackoverflow recommend this question: Kubernetes Deployments. One of the answers to it seems to imply at least some of what I am looking for is coming soon to kubernetes, but I want to make sure that if there is a better solution I could be using now that I at least know about it.
My colleague has a good blog post about this topic:
http://blog.jonparrott.com/building-a-paas-on-kubernetes/
Basically, Kubernetes is not a Platform-as-a-Service, it's a toolkit on which you can build your own Platform-a-as-Service. It's not very opinionated by design, instead it focuses on solving some tricky problems with scheduling, networking, and coordinating containers, and lets you layer in your opinions on top of it.
One of the simplest ways to automate the workflows you're describing is using a Makefile.
A step up from that, you can design your own miniature PaaS, which the author of the first blog post did here:
https://github.com/jonparrott/noel
Or, you could get involved in more sophisticated efforts to build an open source PaaS on Kubernetes, like OpenShift:
https://www.openshift.com/
or Deis, which is building a Heroku-like platform on Kubernetes:
https://deis.com/
or Redspread, which is building "Git for Kubernetes cluster":
https://redspread.com/
and there are many other examples of people building PaaS on top of Kubernetes. But I think it will be a long time, if ever, that there is an "industry standard" way to deploy to Kubernetes, since half the purpose is to enable multiple deployment workflows for different use cases.
I do want to note that as far as building container images, Google Cloud Container Builder can be a useful tool, since you can do things like use it to automatically build an image any time you push to a repository which could then get deployed. Alternatively, Jenkins is a popular way to automate CI/CD flows with Kubernetes.
I suppose I could just write a shell script that runs all of these commands, and run it whenever I want to start up a new deployment, but I am hoping there is a better/industry standard way to solve these problems that I have missed.
The company I work for (Weaveworks) and other folks in the space had been advocating for an approach that we call GitOps, please take a look at our series of blog posts covering the topic:
GitOps - Operations by Pull Request
The GitOps Pipeline - Part 2
GitOps Part 3 - Observability
Storing Secure Sealed Secrets using GitOps
The gist of it is that you push images from CI, your checked YAML manifests in git (usually different repo from app code). This repo with manifests is then applied to each of your clusters (dev/prod) by a reconciliation operator. You can automate it all yourself quite easily, but also do take a look at what we have built.
Disclaimer: I am a Kubernetes contributor and Weaveworks employee. We build open-source and commercial tools that help people to get to production with Kubernetes sooner.
We're working on an open source project called Jenkins X which is a proposed sub project of the Jenkins foundation aimed at automating CI/CD on Kubernetes using Jenkins and GitOps for promotion.
When you merge a change to the master branch, Jenkins X creates a new semantically versioned distribution of your app (pom.xml, jar, docker image, helm chart). The pipeline then automates the generation of Pull Requests to promote your application through all of the Environments via GitOps.
Here's a demo of how to automate CI/CD with multiple environments on Kubernetes using GitOps for promotion between environments and Preview Environments on Pull Requests - using Spring Boot and nodejs apps (but we support many languages + frameworks).