Openshift deployment validation - QA - deployment

I wanted to know whether there is any tool that can validate an OpenShift deployment. Say you have a deployment configuration file that declares various things (secrets, routes, services, environment variables, etc.). After the deployment has finished and the pods have been created in OpenShift, I want to verify that everything is there as requested in the file. Something like a tool for QA.
thanks

Readiness probes exist for this: a readiness probe can send HTTP requests to the pod to confirm that it is available, or execute commands inside the container to confirm that the desired resources are present.

Kubernetes has a --dry-run flag for resource creation that performs basic syntax verification and validates the objects against their schema without actually creating anything, so you can use it to test every object defined in the deployment manifest file.
I think this is also possible through the OpenShift client (recent clients expect an explicit value, --dry-run=client or --dry-run=server; older ones accept only the bare --dry-run flag):
$ oc create -f deployment-app.yaml --dry-run=client
or
$ oc apply -f deployment-app.yaml --dry-run=client
You can find some useful OpenShift client commands on the Developer CLI Operations documentation page.

For one-time validation, you can create a Job (OpenShift) with an Init Container (OpenShift) that makes sure the whole deployment process has finished, and then run a test/shell script with a sequence of kubectl/curl/other commands to check that every piece of the deployment is in place and in the desired state.
For continuous validation, you can create a CronJob (OpenShift) that periodically creates a test Job and reports the result somewhere.
This answer can help you create all of that.
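A minimal sketch of the kind of test script such a Job could run. It assumes the oc client is available in the container and that the Job's service account is allowed to read the resources; the function name and all namespace/resource names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: verify that every resource the deployment file promises actually exists.
# The function name and the resource names passed to it are illustrative only.

# check_resources NAMESPACE KIND/NAME...
# Prints one line per resource and returns non-zero if any resource is missing.
check_resources() {
  local ns="$1"; shift
  local missing=0 res
  for res in "$@"; do
    if oc get "$res" -n "$ns" >/dev/null 2>&1; then
      echo "OK       $res"
    else
      echo "MISSING  $res"
      missing=$((missing + 1))
    fi
  done
  [ "$missing" -eq 0 ]
}
```

It could be called as check_resources my-project secret/db-credentials service/my-app route/my-app deployment/my-app. Environment variables are not covered by this check; listing them with oc set env deployment/my-app --list and comparing against the expected values would be one way to extend it.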

Related

GCP Alerting Policy for failed GKE CronJob

What would be the best way to set up a GCP monitoring alert policy for a Kubernetes CronJob failing? I haven't been able to find any good examples out there.
Right now, I have an OK solution based on monitoring logs in the Pod with ERROR severity. I've found this to be quite flaky, however. Sometimes a job will fail for some ephemeral reason outside my control (e.g., an external server returning a temporary 500) and on the next retry, the job runs successfully.
What I really need is an alert that is only triggered when a CronJob is in a persistent failed state. That is, Kubernetes has tried rerunning the whole thing, multiple times, and it's still failing. Ideally, it could also handle situations where the Pod wasn't able to come up either (e.g., downloading the image failed).
Any ideas here?
Thanks.
First of all, confirm which GKE version you are running. The following commands show the default GKE version and the available versions for a release channel (RAPID in this example):
Default version.
gcloud container get-server-config --flatten="channels" --filter="channels.channel=RAPID" \
--format="yaml(channels.channel,channels.defaultVersion)"
Available versions.
gcloud container get-server-config --flatten="channels" --filter="channels.channel=RAPID" \
--format="yaml(channels.channel,channels.validVersions)"
Now that you know your GKE version: since what you want is an alert that only fires when a CronJob is in a persistent failed state, note that GKE Workload Metrics used to be GCP's fully managed and highly configurable solution for sending all Prometheus-compatible metrics emitted by GKE workloads (such as a CronJob or a Deployment for an application) to Cloud Monitoring. It is deprecated as of GKE 1.24, however, and has been replaced by Google Cloud Managed Service for Prometheus. That service is now the best option inside GCP, as it lets you monitor and alert on your workloads using Prometheus without having to manually manage and operate Prometheus at scale.
Plus, you have two options outside of GCP: Prometheus itself and Ranch’s Prometheus Push Gateway.
Finally, and just FYI, it can be done manually with bash by querying the job, reading its start time, and comparing that to the current time:
START_TIME=$(kubectl -n=your-namespace get job your-job-name -o json | jq -r '.status.startTime')
echo "$START_TIME"
Or, you are able to get the job’s current status as a JSON blob, as follows:
kubectl -n=your-namespace get job your-job-name -o json | jq '.status'
You can see the following thread for more reference too.
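The start-time comparison described above could be sketched as follows. This assumes GNU date (for the -d option) and uses kubectl's jsonpath output instead of jq; the function name and the optional NOW_EPOCH argument are illustrative, not part of any tool.

```shell
#!/usr/bin/env bash
# Sketch: print how many seconds ago a Job started.
# Assumes GNU date; BSD/macOS date does not understand -d.

# job_age_seconds NAMESPACE JOB [NOW_EPOCH]
job_age_seconds() {
  local ns="$1" job="$2" now="${3:-$(date +%s)}"
  local start
  start="$(kubectl -n "$ns" get job "$job" -o jsonpath='{.status.startTime}')" || return 1
  # An empty value means the Job has not started yet.
  [ -n "$start" ] || return 1
  echo $(( now - $(date -u -d "$start" +%s) ))
}
```

For example, job_age_seconds your-namespace your-job-name prints the age in seconds, which a wrapper script can compare against a threshold of its choosing.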
Taking the “Failed” state as the central point of your requirement, a bash script that uses kubectl and sends an email when it sees a job in the “Failed” state can be useful. Here are some examples:
while true; do if kubectl get jobs myjob -o jsonpath='{.status.conditions[?(@.type=="Failed")].status}' | grep -q True; then echo "Job myjob failed" | mail -s jobfailed email@address; break; else sleep 1; fi; done
For newer K8s:
while true; do kubectl wait --for=condition=failed job/myjob && { echo "Job myjob failed" | mail -s jobfailed email@address; break; }; done
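Since the question asks for an alert only on persistent failure, one possible extension is to count how many of the most recent Jobs in a row have failed. The sketch below is only an illustration: both function names are hypothetical, and it assumes the namespace contains only the Jobs of the CronJob in question (otherwise a -l selector matching labels from the job template would be needed).

```shell
#!/usr/bin/env bash
# Sketch: detect a run of consecutive failed Jobs.

# list_job_failures NAMESPACE
# Prints "NAME<TAB>STATUS" per Job, oldest first. STATUS is "True" when the
# Job has a Failed condition and empty otherwise.
list_job_failures() {
  kubectl -n "$1" get jobs --sort-by=.metadata.creationTimestamp \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Failed")].status}{"\n"}{end}'
}

# count_trailing_failures
# Reads the lines above on stdin and prints the length of the failure streak
# at the end of the list. Any Job that is not failed (including one that is
# still running) resets the streak to zero.
count_trailing_failures() {
  local streak=0 name status
  while IFS=$'\t' read -r name status; do
    if [ "$status" = "True" ]; then
      streak=$((streak + 1))
    else
      streak=0
    fi
  done
  echo "$streak"
}
```

A check such as [ "$(list_job_failures your-namespace | count_trailing_failures)" -ge 3 ] could then gate the mail command, so that a single transient failure followed by a successful run does not trigger an alert.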

Is there a Kubernetes rolling upgrade / downgrade finish hook

When you edit a deployment to update the docker image, I need to run a one-time script which changes parts of my application database and sends an email that the rolling upgrade process is complete and the result is passed / failed.
Is there a hook where I can attach this script to?
No, there is no such thing in Kubernetes. Usually this should be done by CI/CD pipeline.
Kubernetes doesn't implement such a thing. This can be done by a CI/CD pipeline or by manually checking the rolling update status. As you said, you can write a simple script that checks the status of the rolling update and sends it via e-mail, and attach it to a pipeline created in Jenkins.
To manually check the status of a rolling update, execute:
$ kubectl rollout status deploy/your-deployment -n your-namespace
If, for example, you are passing variables using a ConfigMap, you can use Reloader to perform rolling updates automatically when a ConfigMap/Secret changes.
As far as I know, Kubernetes does not provide anything to support such functionality out of the box, but you can modify the script to check the status of the rollout using the following command, with some sleep between checks:
kubectl rollout status deployment/<deployment-name>
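A pipeline step built on that command might look like the sketch below. The function name is hypothetical, and the hook is whatever script performs the database change and sends the email; kubectl rollout status blocks until the rollout completes or the timeout expires.

```shell
#!/usr/bin/env bash
# Sketch: emulate a "rollout finished" hook from a CI/CD step.

# run_after_rollout NAMESPACE DEPLOYMENT TIMEOUT HOOK [HOOK_ARGS...]
# Waits for the rollout, then runs HOOK with "passed" or "failed" appended
# as its last argument.
run_after_rollout() {
  local ns="$1" deploy="$2" timeout="$3"; shift 3
  local result=failed
  if kubectl rollout status "deployment/$deploy" -n "$ns" --timeout="$timeout"; then
    result=passed
  fi
  "$@" "$result"
}
```

For example, run_after_rollout your-namespace your-deployment 300s ./post-rollout.sh would call the (hypothetical) ./post-rollout.sh script with passed or failed once the rollout has settled.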

Deleting kubernetes yaml: how to prevent old objects from floating around?

i'm working on a continuous deployment routine for a kubernetes application: everytime i push a git tag, a github action is activated which calls kubectl apply -f kubernetes to apply a bunch of yaml kubernetes definitions
let's say i add yaml for a new service, and deploy it -- kubectl will add it
but then later on, i simply delete the yaml for that service, and redeploy -- kubectl will NOT delete it
is there any way that kubectl can recognize that the service yaml is missing, and respond by deleting the service automatically during continuous deployment? in my local test, the service remains floating around
does the developer have to know to connect kubectl to the production cluster and delete the service manually, in addition to deleting the yaml definition?
is there a mechanism for kubernetes to "know what's missing"?
You need to use a CI/CD tool for Kubernetes to achieve what you need. As mentioned by Sithroo, Helm is a very good option.
Helm lets you fetch, deploy and manage the lifecycle of applications,
both 3rd party products and your own.
No more maintaining random groups of YAML files (or very long ones)
describing pods, replica sets, services, RBAC settings, etc. With
helm, there is a structure and a convention for a software package
that defines a layer of YAML templates and another layer that
changes the templates called values. Values are injected into
templates, thus allowing a separation of configuration, and defines
where changes are allowed. This whole package is called a Helm
Chart.
Essentially you create structured application packages that contain
everything they need to run on a Kubernetes cluster; including
dependencies the application requires. Source
Before you start, I recommend these articles, which explain its quirks and features.
The missing CI/CD Kubernetes component: Helm package manager
Continuous Integration & Delivery (CI/CD) for Kubernetes Using CircleCI & Helm
There is no such mechanism. You can deploy resources from a YAML file from anywhere, as long as you can reach the cluster and have a kubeconfig set up, so Kubernetes has no way to react to a file being deleted. If you still want this, you can write a program (in Go, for example) that watches the files in one place and deletes the corresponding resource whenever a file is deleted.
One way to do it within Kubernetes is to use a Kubernetes operator: whenever any of your files changes, you update the custom resource that the operator uses to deploy resources.
Before deleting the YAML file, you can run kubectl delete -f file.yaml. This way, all the resources created by this file will be deleted.
However, what you are really looking for is reaching a desired state with k8s. You can do this with tools like Helmfile.
Helmfile allows you to specify all the resources you want in one file, and it brings the cluster to that desired state every time you run helmfile apply.
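If you stay with plain kubectl apply, the kubectl delete -f suggestion above can be automated in the pipeline by asking git which manifest files were removed since the previous release. This is only a sketch: the function name is hypothetical, and it assumes the previous tag is known and that each removed file still describes the objects that were applied from it.

```shell
#!/usr/bin/env bash
# Sketch: delete the resources of manifests that were removed from the repository.

# delete_removed_manifests PREV_REF DIR
# For every YAML file under DIR that exists in PREV_REF but not in HEAD,
# feed its old content to kubectl delete.
delete_removed_manifests() {
  local prev="$1" dir="$2" file
  git diff --name-only --diff-filter=D "$prev" HEAD -- "$dir" |
  while IFS= read -r file; do
    case "$file" in
      *.yaml|*.yml)
        echo "Deleting resources from removed file: $file"
        # The file is gone from the working tree, so read it from the old ref.
        git show "$prev:$file" | kubectl delete --ignore-not-found -f -
        ;;
    esac
  done
}
```

A deploy step could run delete_removed_manifests v1.2.0 kubernetes before kubectl apply -f kubernetes, where v1.2.0 stands for the previously deployed tag.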

Kubernetes - validate deployments

I have a namespace that contains ~10-15 deployments.
I create one big YAML file and apply it on a "deploy".
How do I validate, wait, watch, or block until all deployments have been rolled out?
currently i am thinking of:
get list of deployments
foreach deployment - make api call to get status
once all deployments are "green" - end process, signaling deployment/ship is done.
What are the statuses of deployments, and is there already a similar tool that can do this? https://github.com/Shopify/kubernetes-deploy is kind of what I am searching for, but it forces a yml structure and so on.
what would be the best approach?
Set a readiness probe and use kubectl rollout status deployment <deployment_name> to see the deployment's rollout status.
You'd better use Helm for managing deployments. Helm allows you to create reusable templates that can be applied to more than one environment. Read more here: https://helm.sh/docs/chart_template_guide/#getting-started-with-a-chart-template
You can create one big chart for all your services, or you can create a separate Helm chart for each of your services.
Helm also allows you to run tests after deployment is done. Read more here: https://helm.sh/docs/developing_charts/#a-breakdown-of-the-helm-test-hooks
You probably want to use kubectl wait https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#wait
It lets you wait for a specific condition of a specific object
In your case:
kubectl -n namespace \
wait --for=condition=Available --timeout=32s \
deployment/name
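To cover all 10-15 deployments, the same idea can be applied in a loop. The sketch below uses kubectl rollout status for each deployment found in the namespace; the function name is hypothetical and the timeout is per deployment.

```shell
#!/usr/bin/env bash
# Sketch: block until every deployment in a namespace has rolled out.

# wait_for_deployments NAMESPACE TIMEOUT
# Returns 0 when all deployments rolled out, 1 if any of them did not.
wait_for_deployments() {
  local ns="$1" timeout="$2" failed=0 d
  # "-o name" prints entries such as deployment.apps/my-app, one per line.
  for d in $(kubectl -n "$ns" get deployments -o name); do
    if ! kubectl -n "$ns" rollout status "$d" --timeout="$timeout"; then
      echo "NOT READY: $d" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

A single kubectl wait --for=condition=Available deployment --all -n namespace --timeout=120s should achieve something similar in one call; the loop is mainly useful when you want to report which deployment is stuck.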
Use the --dry-run option with the apply/create command to check the syntax.

How to use helm chart test to do integration tests?

I am trying to use it to run some integration tests, so to verify the service code I am deploying is actually doing the right thing.
Basically, my setup (as described here: https://docs.helm.sh/developing_charts/#chart-tests) is a templates/tests/integration-test.yaml chart test file that runs a container. The container is a customized maven image with the test code added in; it is started with the command “mvn test”, which does some simple curl checks on the kube service that this whole helm release deploys.
In this way, the helm test does work.
However, the issue is that while the helm test is running, the new version of the service code is already online and exposed to the outside world/users. I can of course immediately do a rollback if the helm test fails, but this will not stop me from hosting the problematic version of the service code to the outside world for a while.
Is there a way, where one can run a service/integration test on a pod, after the pod is started but before it is exposed to the Kubernetes service?
Ideally you'll install and test on a test environment first, either a dedicated test cluster or a namespace. For an additional check, you could first install the chart into a new namespace, let the tests run there, and then delete that namespace once everything has passed. This does require writing the tests in a way that lets them hit URLs that are specific to that namespace. Cluster-internal URLs based on service names will be namespace-relative anyway, but if you use external URLs in the tests then you'd either need to switch them to internal ones or use prefixing.
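The install, test, and clean-up sequence in a temporary namespace could be scripted roughly as follows. This sketch assumes Helm 3 (for --create-namespace); the function name, release name, chart path, and namespace are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: run the chart tests in a throwaway namespace before the real upgrade.

# test_in_scratch_namespace RELEASE CHART NAMESPACE
# Returns the exit status of the first failing helm step, or 0 if both pass.
test_in_scratch_namespace() {
  local release="$1" chart="$2" ns="$3" rc=0
  helm install "$release" "$chart" --namespace "$ns" --create-namespace --wait \
    && helm test "$release" --namespace "$ns" \
    || rc=$?
  # Clean up regardless of the outcome; do not block on namespace deletion.
  kubectl delete namespace "$ns" --wait=false
  return "$rc"
}
```

A pipeline could then run test_in_scratch_namespace my-release ./chart my-release-test and only proceed to the production helm upgrade when it returns 0.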
Use the readiness and liveness probes in the pod spec to ensure that the deployment won't even roll out if there are probe failures.