Helm vs Replace Tokens in VSTS - azure-devops

I have been asked to set up CI/CD for a new app using VSTS and Kubernetes.
It was suggested to me that we could use Helm (but it was made clear it was not mandatory).
The value I am seeing for this tool in our project is to define different values for different environments e.g. database connection string.
But for that we can also use the Replace Tokens VSTS task which is a lot simpler.
A definition explains that Helm is a chart manager and it sort of connections all resources of a system to deploy to Kubernetes.
Our system is just 1 web API (could grow later) so I feel deploying using Helm would be over-engineering the deployment process. Plus, we need this for yesterday.
Question
According to the current context, should I go with Replace Tokens VSTS task or Helm?

Just based on your requirement, for example, which is easier to deploy, which is easier to manage, which you familiar or which is easier for requirement changes.
You also can custom build task to achieve it.

I would go for helm because it gives you more flexibility and it's more cross-platform; moreover, when adding more API's/components or microservices it will be easier to control configuration (a single or multiple values.yaml, using git submodules for helm charts and so on).
Surely it requires a slightly bigger time investment than simple value substitution in your CI/CD tools, but has a potential payback that far outweighs the effort (again, based on my experience and the limited information about your environment).
I'm curious, what did you end up using?

Related

How to manage logical grouping of microservice based application to ensure version compatibility for CI/CD Pipeline?

For the MicroService Architecture based application, I'm trying to understand a standard process about how to logically group and manage correct version compatibility among independently deployable microservices. Let me elaborate with practical scenario :
Say, I am building a software application which is composed of 10 microservices. All the microservices have their independent repositories(branching workflow etc.) and their separate CI/CD Pipeline.
The CI/CD Pipeline gets triggered whenever any change pushed to 'master' branch for respective microservice.
Considering Helm chart and Kubernetes based deployment, all the microservices will get deployed with version 1.0 for the very first deployment and our system would work. For subsequent releases, we might have only couple of services that will get deploy. So after couple of production releases, each microservice will be at different version to constituent an application at that point of time.
My question is :
How to logically group independently deployable microservices in order to deploy or rollback to earlier release i.e. how to determine what was the version of different microservices for earlier releases?
Is there any existing tool or standard practice to track versions of each microservice for given release to seamlessly rollback to expected release?
If not automated solution, what would be the right approach to address such requirement?
Appreciate your thoughts and suggestion on this.
With consideration kuberenets:
1. Helm is nice tool to deploy and track.
2. Native k8s deployment works nice, you need to use deployment properly especially look --record flag in k8s commands eg check this link
With AWS ECS clusters:
1. they have task definations and tasks. I think that works for you.
Not have pointers for docker-compose, swarm, and other tools. But you can always use the power of git and some scripting.
the idea is make a file that lists all versions of services/containers/code . and commit that file in git with code. Make tag out of it for simplicity. your script should compare this state file and current state and apply specific changes only. Look at git submodules also. it is nothing but a group of many git projects and it tracks status of each project with help of commit id of each project. This helped us in the situation you mention.
This is a fairly new problem, we just launched a new tool Reliza Hub to solve that. Also here is my post on the subject: Microservices – Combinatorial Explosion of Versions. Currently, we are at the MVP stage and a lot of work is going on - see this video tutorial if our direction makes sense for you https://www.youtube.com/watch?v=yDlf5fMBGuI
If you decide to implement and have any questions or need help with integration, just tag me on SO and I'd be very much willing to make it work for you.
To sum up few things that we are doing - we denote developer facing projects (those that map to source code) as Projects and customer facing projects (bundles that customer sees) as Products.
And we say that Products are essentially composition of Projects and provide tooling how you can compile different versions of Projects into what's called a Product bundle. You can then integrate this into any CI or CD tool out there or start manually if you haven't configured CICD yet.
Other than that, yes - I highly recommend helm and kubernetes - this is what we use on newer projects. (And I can also add ArgoCD and Spinnaker to the existing tooling). But it is not enough to track permutations of different versions of microservices and establishing which configurations are good and which are not between different environments.

Structuring kubernetes configuration files

Say that I have 5 apis that i want to deploy in a Kubernetes cluster, my question is simply what is the best practice to store the yaml files related to Kubernetes.
In projects I've seen online, Kubernetes yaml files are just added to the the api project itself. I wonder if it makes sense to decouple all files related to Kubernetes in an entirely separate "project", and which is managed by VCS as a completely separated entity from the api projects themselves.
This question arises since I'm currently reading a book about Kubernetes, on the topic namespaces, and considered it might be a good idea to have separate namespaces per environment (DEV / UAT / PROD), and it may make sense to have these files in a centralized "Kubernetes" project (unless it might be better to have a separate cluster per environment (?)).
Whether to put the yaml in the same repo as the app is a question that projects answer in different ways. You might want to put them together if you find that you often change both at the same time or you just find it clearer to see everything in one place. You might separate if you mostly work on the yaml separately or if you find it less clutttered or want different visibility for it (e.g. different teams to look at it). If things get more sophisticated then you'll actually want to generate the yaml from templates and inject environment-specific configuration into it at deploy time (whether those environments are namespaces or clusters) - see Best practices for storing kubernetes configuration in source control for more discussion on this.
From Production k8s experience for CI/CD:
One cluster per environment such as dev , stage , prod ( optionally per data centre )
One namespace per project
One git deployment repo per project
One branch in git deployment repo per environment
Use configmaps for configuration aspects
Use secret management solution to store and use secrets

VSTS + Octopus Deploy? Why do I see a lot of CI/CD setups with both?

I'm a developer whose transitioning into Devops. By observation, I've noticed that a lot of dev shops have started using Octopus Deploy and Azure Devops Services (AzDo, formerly VSTS), or they are starting new projects to setup devops ci/cd pipelines AND they spec to use both tools.
I've been through some quick training for both tools and though they aren't perfectly the same, AzDo seems to offer all of the same features as Octopus Deploy.
So, my question is if a company is already using AzDo for much of their version control, or anything CI/CD pipeline-related, why would you use Octopus? What benefit does it offer to use Octopus for your build and deploys to AzDo?
Note, I am very, very new to Devops. I'm just asking because at the "10,000 feet view" there doesn't seem to be any reason for Octopus if you're already using AzDo. I mention Octopus Deploy by name because I see it come up frequently. However, I assume there could be other tools that serve the same purpose of automatic build and deploying that might also integrate with AzDo. However, AzDo offers build and deploy built in one. Why split out the work?
Let me preface why I like to both build and deploy with VSTS:
Same permissioning end to end
Line of sight from end to end build and deployment
Reasons I favor Octopus Deploy over VSTS Release:
Ability to upload packages/artifacts
External ones that are maybe one off packages to get deployed for a specific release
Target Definition
When you create Targets or servers you are deploying to, you are able to add a target to one or multiple environments and assign tags/roles to a target. What does this mean? More flexible server definition rather than defining strict Agents to a pool or servers to a Deployment Group, you can allow a target to span multiple (ie: a testing server that spans your Dev and Test environments and only gets triggered on steps that are defined for that role). I realize you can accomplish similar things to this in VSTS but in my opinion it's far more cumbersome.
Variable Definition
Variables can be grouped at a global level and grouped by a specific pipeline/process (that part is similar to VSTS). Variables can also be grouped or scoped by environments or roles (above) so you are able to have different variable values per role per environment; both super granular and flexible. Places this comes in handy is if you have a backend server with a connection string and maybe 2 content delivery nodes (role - content delivery) that get slightly different values than the backend server. At the moment, I do not know (other than creating new environments) how one would accomplish the same in VSTS.
Process Definition
All of the above comes together in the process definition of Octopus Deploy. The super flexible and granular variables and target definition allows you to focus on the actual deployment process rather than getting hung up on the nuances of the UI and its limitations. One example would be defining a process where the first step would be taking something out of a load balancer from a central server, step two deploy code to delivery server one, step three put back in lb, step 4 take out node two from lb called from a central server, step 5 deploy code to node two, and last step, back into load balancer. I realize it's a very simple hypothetical, but within Octopus Deploy, it's one steady process filtered to execute on specific roles, within VSTS you would have to break that down into different agent phases and probably pipelines.
Above are really the biggest points I see to use Octopus Deploy over VSTS Release. Now why would someone use VSTS to build and OD to release/deploy? There are a lot of different factors that go into it, some are corporate drivers like having an enterprise git client that has permissions handled thru MSDN. Sometimes it's a project management driver of having work items tied tightly to commit and builds, but with the added flexibility that OD brings to the table for free/minimal cost.
Hoping this help shine a little light into maybe why some people are crossing streams and using both VSTS and OD.
A lot of good points have been made already, but it really comes down to what you need. I would venture a lot of us started using Octopus before Release Management was really a thing.
We use VSTS for all our source control and builds and then all our deployments are handled through Octopus.
When we started evaluating tools, VSTS had nothing for deployments. Even now, they are still playing catch up to Octopus in feature set.
If you are doing true multi-tenanted and multi-environment deployments, I don't think VSTS really compares. We are using Octopus with around 30 tenants, some on Azure, some on premise. We deploy a mix of web and desktop apps. We are even using Octopus to deploy some legacy VB6 and winforms applications.
Multi-Tenancy (critical for us)
VSTS added Deployment Groups a while ago which sound pretty similar to Octopus Environments before multi-tenancy was implemented. Before Octopus had true multi-tenancy (it's been around a while now), people would work around it by creating different environments per tenant, like "CustomerA - Dev", "CustomerA - Prod", etc. Now you just have your Dev/Test/Prod environments and each tenant can have variables scoped to those individual environments.
Support
Documentation is excellent and it's really easy to get up and running.
The few times I've needed to contact someone at Octopus, they've answered very quickly and knowledgeably.
Usability
Having the Octopus dashboard giving us an overview of all our projects is amazing. I don't know of anyway to do this in VSTS, without going into each individual project.
Octopus works great on a mobile device for checking deployment status and even starting new deployments.
Community
Octopus works with their customers to understand what they want and they often release draft RFCs and have several times completely changed course based on customer feedback.
If we know what sort of applications you are deploying, and to what kinds of environments, we would be able to better tailor our responses.
The features you see today in VSTS weren't there a few years ago, so there might be an historical reason.
But I want to state here some non-opinionated reasons that may suggest an organization to opt for different tools instead of one.
Separate responsibility and access levels
Multiple CI tools in dev teams (orgs that are using also Jenkins or TeamCity or else) and need to standardize and control deployments
An org needs a feature available only in Octopus (maybe Multi-tenancy)
Octopus does a great job of focusing on deployments. Features reach octopus before vsts, support is local and responsive. That, and you never run out of build/release minutes!
Seriously though, I just like to support smaller companies where possible and if all features were equal, I'd still pick them.
The big reason in the past was that TFS On prem and early VSTS did NOT support non-Microsoft (.Net) code very well if at all. You could utilize the source control and work features of TFS and then use octopus/Jenkins etc... as the build release parts to cover code that TFS didn't really know what to do with.
Also the release pipelines used to be very simplistic and not that useful where the other products were all plugin based and could do (almost) anything you needed them to. Most of that has changed so that VSTS is much better at working with Non-Microsoft code bases then it used to be. Over time integrations get created inside a companies walls and undoing those decisions can be more painful then just having "too many" tools. Also I feel like there is just more people out there familiar with those tools since they have been mature longer and cover a larger part of the development world then VSTS has in the past.
To fully implement CD you need both. VSTS runs tests and is a build server. OD isn’t. VSTS is light on sophisticated application installations. And if you are provisioning environments, IaC style, you need Terraform in addition. Don’t try to shoehorn everything into a single tool. DevOps requires a whole ecosystem. The reasons are not historical.

How do I automate Kubernetes deployment YAML without relying on :latest?

I have a repository with a Kubernetes deployment YAML. Pipelines run on each commit that builds and pushes an image into our repository, versioned with the commit (eg. my_project:bm4a83). Then I'm updating the deployment image
kubectl set image deployment/my_deployment my_deployment=my_project:bm4a83.
This works, but I also want to keep the rest of the deployment YAML specification in version control.
I thought I could just keep it in the same repository, but that means my changes that may only be infrastructure (eg, changing replicas) triggers new builds, without code changes.
What felt like it made the most sense was keeping the deployment YAML in a totally separate repository. I figured I can manage all the values from there, independently from actual code changes. The only problem with that is the image key would need to be kept up to date. The only way around that, is working with some floating :latest-type version, but I don't really think that's ideal.
What's a sensible workflow for managing this? Am I missing something entirely?
What's a sensible workflow for managing this? Am I missing something entirely?
Some of the answer depends on the kind of risk you're trying to drive down with any process you have in place. If it's "the cluster was wiped out by a hurricane and I need to recover my descriptors," then Heptio Ark is a good solution for that. If the risks are more "human-centric," then IMHO you will have to walk a very careful line between locking down all the things and crippling the very agile, empowering, tools that kubernetes provides to a team. A concrete example of that model running up against your model is: what happens when a developer edits a Deployment but does not (remember|know) to update the descriptor in the repo? So do you revoke the edit rights? Use some diff-esque logic to detect a changed in-cluster config?
To speak to something you said specifically: it is a highly suboptimal idea to commit a descriptor change just to resize a (Deployment|ReplicationController|StatefulSet). Separately, a well-built CI pipeline would also understand if no buildable artifact changed and bail out (either early, or not even triggering a build, if the CI tool is that smart).
Finally, if you do want to carry on with the current situation, then the best practice I can think of is textual replacement right before applying a descriptor:
$ grep "image: " the-deployment.yml
image: example.com/something:#CI_PIPELINE_IID#
$ sed -i'' -e "s/#CI_PIPELINE_IID#/${CI_PIPELINE_IID}/" the-deployment.yml
$ kubectl apply -f the-deployment.yml
so that the copy in the repo remains textually pristine, and also isn't inadvertently actually applied since it won't actually result in a runnable Deployment.
but I also want to keep the rest of the deployment YAML specification in version control.
Yes, you want to do that. Putting everything under version control is a good practice to achieve immutable infrastructure.
If you want the deployment to have a separate piece of metadata (for whatever auditing / change tracking reason), why can't you just leverage the Kubernetes metadata block?
metadata:
name: my_deployment
commit: bm4a83
Then you inject such information through Jinja, Ruby ERBs, Go Templates, etc.

How should I manage deployments with kubernetes

I am hoping to find a good way to automate the process of going from code to a deployed application on my kubernetes cluster.
In order to build and deploy my app I need to first build the docker image, tag it, and then push it to ECR. I then need to update my deployment.yaml with the new tag for the docker image and run the deployment with kubectl apply -f deployment.yaml.
This will go and perform a rolling deployment on the kubernetes cluster updating the pods to the new version of the container image, once this deployment has completed I may need to do other application specific things such as running database migrations, or cache clear/warming which may or may not need to run for a given deployment.
I suppose I could just write a shell script that runs all of these commands, and run it whenever I want to start up a new deployment, but I am hoping there is a better/industry standard way to solve these problems that I have missed.
As I was writing this question I noticed stackoverflow recommend this question: Kubernetes Deployments. One of the answers to it seems to imply at least some of what I am looking for is coming soon to kubernetes, but I want to make sure that if there is a better solution I could be using now that I at least know about it.
My colleague has a good blog post about this topic:
http://blog.jonparrott.com/building-a-paas-on-kubernetes/
Basically, Kubernetes is not a Platform-as-a-Service, it's a toolkit on which you can build your own Platform-a-as-Service. It's not very opinionated by design, instead it focuses on solving some tricky problems with scheduling, networking, and coordinating containers, and lets you layer in your opinions on top of it.
One of the simplest ways to automate the workflows you're describing is using a Makefile.
A step up from that, you can design your own miniature PaaS, which the author of the first blog post did here:
https://github.com/jonparrott/noel
Or, you could get involved in more sophisticated efforts to build an open source PaaS on Kubernetes, like OpenShift:
https://www.openshift.com/
or Deis, which is building a Heroku-like platform on Kubernetes:
https://deis.com/
or Redspread, which is building "Git for Kubernetes cluster":
https://redspread.com/
and there are many other examples of people building PaaS on top of Kubernetes. But I think it will be a long time, if ever, that there is an "industry standard" way to deploy to Kubernetes, since half the purpose is to enable multiple deployment workflows for different use cases.
I do want to note that as far as building container images, Google Cloud Container Builder can be a useful tool, since you can do things like use it to automatically build an image any time you push to a repository which could then get deployed. Alternatively, Jenkins is a popular way to automate CI/CD flows with Kubernetes.
I suppose I could just write a shell script that runs all of these commands, and run it whenever I want to start up a new deployment, but I am hoping there is a better/industry standard way to solve these problems that I have missed.
The company I work for (Weaveworks) and other folks in the space had been advocating for an approach that we call GitOps, please take a look at our series of blog posts covering the topic:
GitOps - Operations by Pull Request
The GitOps Pipeline - Part 2
GitOps Part 3 - Observability
Storing Secure Sealed Secrets using GitOps
The gist of it is that you push images from CI, your checked YAML manifests in git (usually different repo from app code). This repo with manifests is then applied to each of your clusters (dev/prod) by a reconciliation operator. You can automate it all yourself quite easily, but also do take a look at what we have built.
Disclaimer: I am a Kubernetes contributor and Weaveworks employee. We build open-source and commercial tools that help people to get to production with Kubernetes sooner.
We're working on an open source project called Jenkins X which is a proposed sub project of the Jenkins foundation aimed at automating CI/CD on Kubernetes using Jenkins and GitOps for promotion.
When you merge a change to the master branch, Jenkins X creates a new semantically versioned distribution of your app (pom.xml, jar, docker image, helm chart). The pipeline then automates the generation of Pull Requests to promote your application through all of the Environments via GitOps.
Here's a demo of how to automate CI/CD with multiple environments on Kubernetes using GitOps for promotion between environments and Preview Environments on Pull Requests - using Spring Boot and nodejs apps (but we support many languages + frameworks).