Azure DevOps > Helm > Azure Kubernetes Deployment - Deletes Azure File share when deployment is deleted - kubernetes

TL;DR
My pods mounted Azure file shares are (inconsistently) being deleted by either Kubernetes / Helm when deleting a deployment.
Explanation
I've recently transitioned to using Helm for deploying Kubernetes objects on my Azure Kubernetes Cluster via the DevOps release pipeline.
I've started to see some unexpected behaviour in relation to the Azure File Shares that I mount to my Pods (as Persistent Volumes with associated Persistent Volume Claims and a Storage Class) as part of the deployment.
Whilst I've been finalising my deployment, I've been pushing out the deployment via the Azure Devops release pipeline using the built in Helm tasks, which have been working fine. When I've wanted to fix / improve the process I've then either manually deleted the objects on the Kubernetes Dashboard (UI), or used Powershell (command line) to delete the deployment.
For example:
helm delete myapp-prod-73
helm del --purge myapp-prod-73
Not every time, but more frequently, I'm seeing the underlying Azure File Shares also being deleted as I'm working through this process. There's very little around the web on this, but I've also seen an article outlining similar issues over at: https://winterdom.com/2018/07/26/kubernetes-azureFile-dynamic-volumes-deleting.
Has anyone in the community come across this issue?

Credit goes to https://twitter.com/tomasrestrepo here on pointing me in the right direction (the author of the article I mentioned above).
The behaviour here was a consequence of having the Reclaim Policy on the Storage Class & Persistent Volume set to "Delete". When switching over to Helm, I began following their commands to Delete / Purge the releases as I was testing. What I didn't realise, was that deleting the release would also mean that Helm / K8s would also reach out and delete the underlying Volume (in this case an Azure Fileshare). This is documented over at: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#delete
I'll leave this Q & A here for anyone else that misses this subtly with the way in which the Storage Classes, Persistent Volumes (PVs) & underlying storage operates under K8s / Helm.
Note: I think this issue was made slightly more obscure by the fact I was manually creating the Azure Fileshare (through the Azure Portal) and trying to mount that as a static volume (as per https://learn.microsoft.com/en-us/azure/aks/azure-files-volume) within my Helm Chart, but that the underlying volume wasn't immediately being deleted when the release was deleted (sometimes an hour later?).

Related

Migrating resourses from an openshift cluster to another

I have an Openshift cluster and I want to move its resources to another cluster,
e.g. I have 40 Secrets, and 20 ConfigMaps, and some other resources such as deployment configs and more.
Moving these secrets and config maps manually is mind-blowing.
What is the best approach?
I would recommend trying out Monokle's Compare & Sync feature.
It allows you to visually compare the resources of two clusters and deploy resources from one to the other.
Here's a screenshot of the UI:
You can read more about how this works in the docs.
OpenShift has an "official" process for this called "Migration Toolkit for Containers (MTC)":
https://docs.openshift.com/container-platform/4.12/migration_toolkit_for_containers/about-mtc.html
Velero is also a great tool for your scenario. You can backup your namespaces with the granularity of the objects included, and restore them elsewhere with or without making changes:
https://velero.io/docs/v1.10/migration-case/
Follow these steps:
move secrets and config maps
move deployments
move services
move routes
As an example of how I'll do each step mentioned above, follow these steps for each of them:
1 - Login to the first cluster:
oc login --token="your-token-for-first-server" --server="your-first-server"
2 - Export your resources:
oc get -o yaml cm > configmaps.yaml
oc get -o yaml secrets > secrets.yaml
...
There are also some default ConfigMaps and Secrets which you don't need to copy, you can erase them after making the files.
3 - Login to the second cluster:
oc login --token="your-token-for-second-server" --server="your-second-server"
If you forget this step, you may get an error that says resource already exists, but be careful not to forget this step.
4 - Load resources to the second cluster
oc create -f configmaps.yaml
oc create -f secrets.yaml
...
There might be easier ways too, and there are a lot of information about this which is out of my knowledge.
There are also some considerations you need to aware of:
You may not need to move pods, usually they are made and controlled by other resources such as deployment configs.
In some companies, databases are managed completely separately by DBA teams, you may not need to change anything, but if your database is within your cluster, you should consider moving it's PV.
Using Helm chart or Openshift templates can help you make this kind of task so easier.
You can include templates in your GitLab CI/CD pipelines and just change your cluster URL and everything is up and running and redeploy.
In the end, if you are migrating from version 3 to 4, this article might be helpful.

Observing weird kubernetes behavior while deleting using yaml

When I run kubectl delete deployment.yaml, It is displayed on cli that the deployment is deleted. The pod also gets into terminating state. But a new pod is again created with the same deployment and replica-set.
On further digging in I found out that deployment and RS are not being removed. Any reason why deployment and RS wouldn't be removed? Why would the be terminated if deployment isn't removed?
Any leads are appreciated.
As OP confirmed in the comments that they are running argocd then the recreation of the resources is expected behaviour if argocd is running in auto sync mode for the impacted namespace.
Here is a short snippet from the document
Argo CD has the ability to automatically sync an application when it detects differences between the desired manifests in Git, and the live state in the cluster. A benefit of automatic sync is that CI/CD pipelines no longer need direct access to the Argo CD API server to perform the deployment. Instead, the pipeline makes a commit and push to the Git repository with the changes to the manifests in the tracking Git repo.
Solution: you can disable autosync and monitor the delta and approve sync manually. This is something decided at project level. you can read about it here.

How to backup PVC regularly

What can be done to backup kubernetes PVC regularly for GCP and AWS?
GCP has VolumeSnapshot but I'm not sure how to schedule it, like every hour or every day.
I also tried Gemini/fairwinds but I get the following error when for GCP. I installed the charts as mentioned in README.MD and I can't find anyone else encountering the same error.
error: unable to recognize "backup-test.yml": no matches for kind "SnapshotGroup" in version "gemini.fairwinds.com/v1beta1"
You can implement Velero, which gives you tools to back up and restore your Kubernetes cluster resources and persistent volumes.
Unfortunately, Velero only allows you to backup & restore PV, not PVCs.
Velero’s restic integration backs up data from volumes by accessing the node’s filesystem, on which the pod is running. For this reason, restic integration can only backup volumes that are mounted by a pod and not directly from the PVC.
Might wanna look into stash.run
Agree with #hdhruna - Velero is really the most popular tool for doing that task.
However, you can also try miracle2k/k8s-snapshots
Automatic Volume Snapshots on Kubernetes
How is it useful? Simply add
an annotation to your PersistentVolume or PersistentVolumeClaim
resources, and let this tool create and expire snapshots according to
your specifications.
Supported Environments:
Google Compute Engine disks,
AWS EBS disks.
I evaluated multiple solutions including k8s CSI VolumeSnapshots, https://stash.run/, https://github.com/miracle2k/k8s-snapshots and CGP disks snapshots.
The best one in my opinion, is using k8s native implementation of snapshots via CSI driver, that is if you have a cluster version > = 1.17. This allows snapshoting volumes while in use, doesn't require having a read many or write many volume like stash.
I chose gemini by fairwinds also to automate backup creation and deletion and restoration and it works like a charm.
I believe your problem is caused by that missing CRD from gemini in your cluster. Verify that the CRD is installed correctly and also that the version installed is indeed the version you are trying to use.
My installation went flawlessly using their install guide with Helm.

How to include AWS EKS with CI/CD?

I am studying about CI/CD on AWS (CodePipeline/CodeBuild/CodeDeploy) and found it to be a very good tool for managing a pipeline on the cloud with everything managed (don't even need to install Jenkins on EC2).
I am now reading about container building and deployment. For the build phase, CodeBuild supports building container images. For the deploy phase, while I could find a CodeDeploy solution to ECS cluster, it seems there is no direct CodeDeploy solution for EKS (kindly correct if I am wrong).
May I know if there is a solution to integrate EKS cluster (i.e. the deploy phase can fetch the docker image from ECR or dockerhub and deploy to EKS)? I have come across some ideas using lamda functions to trigger the cluster to perform rolling update of the container image, but I could not find a step-by-step guide on this.
=========================
(Update 17 Sep 2020)
Somehow managed to create a lambda function to trigger an update to EKS to perform rolling update of the k8s deployment. Thanks Prashanna for the source base.
Just want to share the key setups in the process.
(1) Update the lambda execution role to include permission to describe EKS clusters
Create a policy with describe EKS cluster access, and attach to the role:
Policy snippet:
...
......
"Action": "eks:Describe*"
...
......
Or you can create a "EKSFullAccess" policy, and attach to the lambda execution role
(2) Update the k8s ConfigMap, and supplement the lambda execution role ARN to the mapRole section. The corresponding k8s role should be a role that has permission to update container images (say system:masters) used for the k8s deployment
You can edit the map with command like below:
kubectl edit -n kube-system configmap/aws-auth
You don't have to add/update another ConfigMap even if your deployment is in another namespace. It will take effect as well.
Sample lambda function call request and response:
Gitab provides the inbuilt integration of EKS and deployment with the help of Helm charts. If you plan to use other tools Using AWS lambda to update the image is the best bet!
I've added my github project.
Setup a lambda with below code and give RBAC access to this lambda in your EKS. Try invoking the lambda by passing the required information like namespace, deployment, image etc
Lambda for Kubernetes image update
The lambda must require EKS:describecluster policy.
The Lambda role must be provided atleast update image RBAC role in EKS cluster RBAC role setup
Since there's no built-in CI/CD for EKS at the moment, this is going to be a showcase of success/failure stories of a 3rd-party CI/CDs in EKS :) My take: https://github.com/fluxcd/flux
Pros:
Quick to set up initially (until you get into multiple teams/environments)
Tracks and deploys image releases out of box
Possibility to split what to auto-deploy in dev/prod using regex. E.g. all versions to dev, only minor to prod. Or separate tag prefixes for dev/prod.
All state is in git - a good practice to start with
Cons:
Getting complex for further pipeline expansion, e.g. blue-green, canary, auto-rollbacks, etc.
The dashboard is proprietary (weave works product)
Not for on-demand parametrized job runs like traditional CIs.
Setup:
Setup an automated image build (looks like you've already figured out)
Setup flux and helm-operator into the cluster, point them to your "gitops repo"
For each app, create a HelmRelease object that describes a regex of image tag to track
Done. A newly published image tag that falls into regex will be auto-deployed to the cluster and the new version is committed to a gitops repo.

Deleting kubernetes yaml: how to prevent old objects from floating around?

i'm working on a continuous deployment routine for a kubernetes application: everytime i push a git tag, a github action is activated which calls kubectl apply -f kubernetes to apply a bunch of yaml kubernetes definitions
let's say i add yaml for a new service, and deploy it -- kubectl will add it
but then later on, i simply delete the yaml for that service, and redeploy -- kubectl will NOT delete it
is there any way that kubectl can recognize that the service yaml is missing, and respond by deleting the service automatically during continuous deployment? in my local test, the service remains floating around
does the developer have to know to connect kubectl to the production cluster and delete the service manually, in addition to deleting the yaml definition?
is there a mechanism for kubernetes to "know what's missing"?
You need to use a CI/CD tool for Kubernetes to achieve what you need. As mentioned by Sithroo Helm is a very good option.
Helm lets you fetch, deploy and manage the lifecycle of applications,
both 3rd party products and your own.
No more maintaining random groups of YAML files (or very long ones)
describing pods, replica sets, services, RBAC settings, etc. With
helm, there is a structure and a convention for a software package
that defines a layer of YAML templates and another layer that
changes the templates called values. Values are injected into
templates, thus allowing a separation of configuration, and defines
where changes are allowed. This whole package is called a Helm
Chart.
Essentially you create structured application packages that contain
everything they need to run on a Kubernetes cluster; including
dependencies the application requires. Source
Before you start, I recommend you these articles explaining it's quirks and features.
The missing CI/CD Kubernetes component: Helm package manager
Continuous Integration & Delivery (CI/CD) for Kubernetes Using CircleCI & Helm
There's no such way. You can deploy resources from yaml file from anywhere if you can reach the node and configure kube config. So kubernetes will not know how to respond on a file deletion. If you still want to do this, you can write a program (a go code) which checks the availability of files in one place and deletes the corresponding resource whenever the file gets deleted.
There's one way via kubernetes is by using kubernetes operator, and whenever there is any change in your files you can update the crd used to deploy resources via operator.
Before deleting the yaml file, you can run kubectl delete -f file.yaml, this way all the resources created by this file will be deleted.
However, what you are looking for, is achieving the desired state using k8s. You can do this by using tools like Helmfile.
Helmfile, allow you to specify the resources you want to have all in one file, and it will achieve the desired state every time you run helmfile apply