Best practice for sanity-testing a K8s cluster? (ideally all from the command line)

I am new here and I tried to search for this topic before posting; it may have been discussed before, so please let me know before being too harsh on me :)
In my project, after performing changes to either the DevOps tool sets or the infrastructure, we always do some manual sanity tests, which normally include:
Building a new image and updating the Helm chart
Pushing the image to Artifactory, performing a "helm upgrade", and seeing if it runs.
I want to automate the whole thing and would like to get advice from the community. Here are some requirements:
Validate that the Jenkins agent is able to talk to the cluster (I can do this with kubectl get all -n <some_namespace_jenkins_user_has_access_to>)
Validate that the cluster has access to GitHub (let's say I am using Argo CD to sync YAMLs)
Validate that the cluster has access to Artifactory and is able to pull images (I don't want to build a new image with a new tag and update the Helm chart just to force the cluster to pull a new image)
All of the above should be doable from the command line (so that I can implement it in Jenkins Groovy)
Any suggestion is welcome.
Thanks guys

Your best bet is probably a combination of custom Jenkins scripts (i.e. running kubectl in Jenkins) and some in-cluster checks (e.g. using kuberhealthy).
So, when your Jenkins pipeline is triggered, it could do the following:
Check connectivity to the cluster
Build and push an image, etc.
Trigger in-cluster checks for testing if the cluster has access to GitHub and Artifactory, e.g. by launching a custom Job in the cluster, or creating a KuberhealthyCheck custom resource if you use kuberhealthy
During all this, the Jenkins pipeline writes the results of its tests as metrics to a Pushgateway, which is scraped by your Prometheus. The in-cluster checks also push their results as metrics to the Pushgateway, or expose them via Kuberhealthy if you decide to use it. In the end, you have the results of all checks in the same Prometheus instance, where you can react to them, e.g. by creating Prometheus alerts or Grafana dashboards.
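For illustration, a minimal sketch of what such checks could look like from a Jenkins shell step; the namespace, image names and Pushgateway URL below are placeholders, not values from the original setup:

# 1) Connectivity check from the Jenkins agent
kubectl get all -n my-namespace
# 2) In-cluster check: can the cluster pull from Artifactory?
#    A throwaway Job that pulls an image you already publish there.
kubectl create job artifactory-pull-check -n my-namespace \
  --image=myartifactory.example.com/myrepo/busybox:latest -- /bin/true
kubectl wait --for=condition=complete --timeout=120s job/artifactory-pull-check -n my-namespace
# 3) In-cluster check: can the cluster reach GitHub?
kubectl run github-check -n my-namespace --rm -i --restart=Never \
  --image=curlimages/curl -- -sSf https://github.com
# 4) Report the result to a Pushgateway so Prometheus can scrape it
echo "cluster_sanity_check_success 1" | curl --data-binary @- \
  http://pushgateway.example.com:9091/metrics/job/cluster_sanity_check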

Related

How to include AWS EKS with CI/CD?

I am studying about CI/CD on AWS (CodePipeline/CodeBuild/CodeDeploy) and found it to be a very good tool for managing a pipeline on the cloud with everything managed (don't even need to install Jenkins on EC2).
I am now reading about container building and deployment. For the build phase, CodeBuild supports building container images. For the deploy phase, while I could find a CodeDeploy solution for an ECS cluster, it seems there is no direct CodeDeploy solution for EKS (kindly correct me if I am wrong).
May I know if there is a solution to integrate an EKS cluster (i.e. the deploy phase can fetch the Docker image from ECR or Docker Hub and deploy to EKS)? I have come across some ideas using Lambda functions to trigger the cluster to perform a rolling update of the container image, but I could not find a step-by-step guide on this.
=========================
(Update 17 Sep 2020)
I somehow managed to create a Lambda function that triggers EKS to perform a rolling update of the k8s deployment. Thanks Prashanna for the source base.
Just want to share the key setups in the process.
(1) Update the Lambda execution role to include permission to describe EKS clusters.
Create a policy with EKS describe-cluster access and attach it to the role:
Policy snippet (a minimal policy looks something like this):
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": "eks:Describe*",
    "Resource": "*"
  }]
}
Or you can create an "EKSFullAccess" policy and attach it to the Lambda execution role.
(2) Update the k8s aws-auth ConfigMap and add the Lambda execution role ARN to the mapRoles section. The corresponding k8s role/group should be one that has permission to update the container images used by the k8s deployment (say system:masters).
You can edit the ConfigMap with a command like the one below:
kubectl edit -n kube-system configmap/aws-auth
You don't have to add or update another ConfigMap even if your deployment is in another namespace; the change takes effect there as well.
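For illustration only (the account ID, role name and username are placeholders, and the group is just an example), the entry added under mapRoles could look something like this:

mapRoles: |
  - rolearn: arn:aws:iam::<account-id>:role/<lambda-execution-role>
    username: lambda-deployer
    groups:
      - system:masters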
Sample lambda function call request and response:
GitLab provides built-in integration of EKS and deployment with the help of Helm charts. If you plan to use other tools, using an AWS Lambda to update the image is your best bet!
I've added my GitHub project.
Set up a Lambda with the code below and give this Lambda RBAC access in your EKS cluster. Try invoking the Lambda by passing the required information like namespace, deployment, image, etc.
Lambda for Kubernetes image update
The Lambda requires the eks:DescribeCluster permission.
The Lambda role must be granted at least an update-image RBAC role in the EKS cluster's RBAC setup.
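For illustration, invoking such a Lambda from the command line could look something like this; the function name and payload fields are placeholders, and the exact payload depends on what the Lambda code expects:

aws lambda invoke \
  --cli-binary-format raw-in-base64-out \
  --function-name eks-image-updater \
  --payload '{"namespace": "default", "deployment": "myapp", "image": "<account>.dkr.ecr.<region>.amazonaws.com/myapp:v2"}' \
  response.json
cat response.json

(The --cli-binary-format flag is only needed with AWS CLI v2.)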
Since there's no built-in CI/CD for EKS at the moment, this is going to be a showcase of success/failure stories of third-party CI/CDs in EKS :) My take: https://github.com/fluxcd/flux
Pros:
Quick to set up initially (until you get into multiple teams/environments)
Tracks and deploys image releases out of the box
Possibility to split what to auto-deploy in dev/prod using regex. E.g. all versions to dev, only minor to prod. Or separate tag prefixes for dev/prod.
All state is in git - a good practice to start with
Cons:
Gets complex as the pipeline expands further, e.g. blue-green, canary, auto-rollbacks, etc.
The dashboard is proprietary (a Weaveworks product)
Not suited for on-demand parametrized job runs like traditional CIs.
Setup:
Set up an automated image build (looks like you've already figured that out)
Install flux and helm-operator in the cluster and point them at your "gitops repo"
For each app, create a HelmRelease object that describes a regex of the image tags to track (see the sketch after this list)
Done. A newly published image tag that matches the regex will be auto-deployed to the cluster, and the new version is committed back to the gitops repo.
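A minimal sketch of such a HelmRelease, assuming Flux v1 with helm-operator; the chart location, image name and tag filter are illustrative placeholders:

apiVersion: helm.fluxcd.io/v1
kind: HelmRelease
metadata:
  name: myapp
  namespace: default
  annotations:
    fluxcd.io/automated: "true"
    filter.fluxcd.io/chart-image: glob:v1.*
spec:
  releaseName: myapp
  chart:
    git: git@github.com:myorg/gitops-repo
    ref: master
    path: charts/myapp
  values:
    image:
      repository: registry.example.com/myapp
      tag: v1.0.0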

How can I create a new Kubernetes pod from another existing pod?

I have a Kubernetes pod which downloads several types of files (let's say X, Y and Z), and I have some processing scripts (each one in a Docker image) which are interested in one or more file types (let's say processor_X_and_Y, processor_X_and_Z and processor_Z).
The first pod is always running, and I need to create a processor pod after downloading a file according to the file type, for example if the downloader downloads a file of type Z, I need to create a new instance of processor_X_and_Z and a new instance of processor_Z.
My current idea is to use Argo Workflows, creating a simple one-step workflow for each processor and then starting the suitable workflows by calling the Argo REST API from the downloader pod. This achieves my goal and the auto-scaling of my system.
My question is: is there a simpler engine or service in Kubernetes which I can use to create a new pod from another pod, without using this workflow engine?
You simply have to give your pod access to the API server running on the control plane. That'll enable it to create/edit/delete pods using kubectl or any other k8s client library. You may want to use RBAC to restrict its permissions to the minimum required for the task at hand.
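A minimal sketch of that RBAC setup; the ServiceAccount and Role names are placeholders, and it assumes the downloader runs in the default namespace:

kubectl create serviceaccount downloader
kubectl create role pod-launcher --verb=create,get,list,watch --resource=pods
kubectl create rolebinding downloader-pod-launcher \
  --role=pod-launcher --serviceaccount=default:downloader
# Then set spec.serviceAccountName: downloader on the downloader pod;
# it can now create processor pods via kubectl or a client library.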
As mentioned in another answer, you can give your pod access to the Kubernetes API and then apply a Pod resource via kubectl.
If you want to start an Argo Workflow, you could use kubectl to apply a Workflow resource, or you could use the Argo CLI.
But if you're using Argo anyway, you might find it easier to use Argo Events to kick off a Workflow. You would have to choose an event source based on how/from where you're downloading the source files. If, for example, the files are on S3, you could use the SNS event source.
If you just need to periodically check for new files, you could use a CronWorkflow to perform the check and conditionally perform the rest of the workflow based on whether there's anything to download.
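For example, a sketch of kicking off a pre-defined workflow from the downloader via the Argo CLI; the template name, parameter and namespace are placeholders:

argo submit --from workflowtemplate/process-z \
  -p file=/data/incoming/sample.z -n argo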

Deleting kubernetes yaml: how to prevent old objects from floating around?

I'm working on a continuous deployment routine for a Kubernetes application: every time I push a git tag, a GitHub Action is activated which calls kubectl apply -f kubernetes to apply a bunch of YAML Kubernetes definitions.
Let's say I add YAML for a new service and deploy it -- kubectl will add it.
But then later on, I simply delete the YAML for that service and redeploy -- kubectl will NOT delete it.
Is there any way that kubectl can recognize that the service YAML is missing and respond by deleting the service automatically during continuous deployment? In my local test, the service remains floating around.
Does the developer have to know to connect kubectl to the production cluster and delete the service manually, in addition to deleting the YAML definition?
Is there a mechanism for Kubernetes to "know what's missing"?
You need to use a CI/CD tool for Kubernetes to achieve what you need. As mentioned by Sithroo, Helm is a very good option.
Helm lets you fetch, deploy and manage the lifecycle of applications, both 3rd party products and your own.
No more maintaining random groups of YAML files (or very long ones) describing pods, replica sets, services, RBAC settings, etc. With helm, there is a structure and a convention for a software package that defines a layer of YAML templates and another layer that changes the templates called values. Values are injected into templates, thus allowing a separation of configuration, and defines where changes are allowed. This whole package is called a Helm Chart.
Essentially you create structured application packages that contain everything they need to run on a Kubernetes cluster; including dependencies the application requires. Source
Before you start, I recommend these articles explaining its quirks and features:
The missing CI/CD Kubernetes component: Helm package manager
Continuous Integration & Delivery (CI/CD) for Kubernetes Using CircleCI & Helm
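As a quick, illustrative taste of the workflow (the chart and release names are placeholders):

# scaffold a chart with templates/ and a values.yaml
helm create myapp
# install or upgrade the release, overriding a value at deploy time
helm upgrade --install myapp ./myapp --set image.tag=v1.2.3
# remove everything the chart created
helm uninstall myapp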
There's no such way built in. You can apply resources from a YAML file from anywhere, as long as you can reach the cluster and have a kubeconfig configured, so Kubernetes will not know how to respond to a file deletion. If you still want to do this, you can write a program (e.g. in Go) which checks the availability of the files in one place and deletes the corresponding resource whenever a file gets deleted.
One Kubernetes-native way is to use an operator: whenever there is any change in your files, you update the CRD that the operator uses to deploy the resources.
Before deleting the YAML file, you can run kubectl delete -f file.yaml; this way all the resources created by this file will be deleted.
However, what you are looking for is achieving a desired state using k8s. You can do this with tools like Helmfile.
Helmfile allows you to specify all the resources you want to have in one file, and it will converge to that desired state every time you run helmfile apply.
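A minimal sketch of what that could look like; the release names, chart paths and values files are placeholders:

# helmfile.yaml
releases:
  - name: myapp
    namespace: default
    chart: ./charts/myapp
    values:
      - values/myapp.yaml
  - name: my-other-service
    namespace: default
    chart: ./charts/my-other-service
    installed: true   # flip to false and re-apply to have Helmfile uninstall it

# converge the cluster to what helmfile.yaml describes
helmfile apply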

Openshift deployment validation - QA

I wanted to know if there's any tool that can validate an OpenShift deployment. Let's say you have a deploy configuration file with different features (secrets, routes, services, environment variables, etc.), and I want to validate, after the deployment has finished and the pod(s) are created in OpenShift, that all those things are there as requested in the file. Like a tool for QA.
thanks
Readiness probes are there, which can execute HTTP requests against the pod to confirm its availability. They can also execute commands to confirm that the desired resources are available within the container.
Readiness probe
There is a particular flag, --dry-run, in Kubernetes for resource creation which performs basic syntax verification and template object schema validation without actually creating the objects, so you can run this test for all the underlying objects defined in the deployment manifest file.
I think it is also feasible to achieve this through the OpenShift client:
$ oc create -f deployment-app.yaml --dry-run
or
$ oc apply -f deployment-app.yaml --dry-run
You can find some useful OpenShift client commands in Developer CLI Operations documentation page.
For one-time validation, you can create a Job (OpenShift) with an init container (OpenShift) that ensures the whole deployment process is done, and then run a test/shell script with a sequence of kubectl/curl/other commands to ensure that every piece of the deployment is in place and in the desired state.
For continuous validation, you can create a CronJob (OpenShift) that will periodically create a test Job and report the result somewhere.
This answer can help you to create all that stuff.
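A sketch of what such a test script could check; the resource names are placeholders and it assumes the oc client is available in the Job's image:

#!/bin/sh
set -e
# wait until the rollout of the deployment config has finished
oc rollout status dc/myapp --timeout=300s
# the secret and route requested in the deploy config exist
oc get secret myapp-secret
oc get route myapp
# the expected environment variable is set on the deployment config
oc set env dc/myapp --list | grep MY_REQUIRED_VAR
# the pod actually answers on its route
curl -sSf "https://$(oc get route myapp -o jsonpath='{.spec.host}')/healthz" > /dev/null
echo "deployment validation passed"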

Auto update pod on every image push to GCR

I have a docker image pushed to Container Registry with docker push gcr.io/go-demo/servertime and a pod created with kubectl run servertime --image=gcr.io/go-demo-144214/servertime --port=8080.
How can I enable automatic update of the pod every time I push a new version of the image?
I would suggest switching to some kind of CI to manage the process and, instead of triggering on docker push, triggering the process on pushing the commit to the git repository. Also, if you switch to using a higher-level Kubernetes construct such as a Deployment, you will be able to run a rolling update of your pods to the new image version. Our process is roughly as follows:
git commit #triggers CI build
docker build yourimage:gitsha1
docker push yourimage:gitsha1
sed -i 's/{{TAG}}/gitsha1/g' deployment.yml
kubectl apply -f deployment.yml
Where deployment.yml is a template for our deployment that gets updated to the new tag version.
If you do it manually, it might be easier to simply update the image in an existing deployment by running kubectl set image deployment/yourdeployment <containernameinpod>=yourimage:gitsha1
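For illustration, the deployment.yml template mentioned above could look something like this minimal sketch; the names and port are placeholders, and only the {{TAG}} placeholder matters for the sed step:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: yourdeployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: yourapp
  template:
    metadata:
      labels:
        app: yourapp
    spec:
      containers:
        - name: yourapp
          image: yourimage:{{TAG}}   # replaced by sed with the git SHA before kubectl apply
          ports:
            - containerPort: 8080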
I'm on the Spinnaker team.
It might be a bit heavy, but without knowing your other areas of consideration, Spinnaker is a CD platform that can trigger k8s deployments from registry updates.
Here's a codelab to get you started.
If you'd rather shortcut the setup process, you can get a starter Spinnaker instance with k8s and GCR integration pre-setup via the Cloud Launcher.
You can find further support on our slack channel (I'm #stevenkim).
It would need some glue, but you could use Docker Hub, which lets you define a webhook for each repository when a new image is pushed or a new tag created.
This would mean you'd have to build your own web API server to handle the incoming notifications and use them to update the pod. And you'd have to use Docker Hub, not Google Container Registry, which doesn't allow webhooks.
So, probably too many changes for the problem you're trying to solve.