Find out what OPA is getting as a request from the api-server? - kubernetes

We have OPA installed in our Kubernetes cluster. Not Gatekeeper. The "original" OPA...
I don't understand how I can look at what OPA is receiving as an input request from the API server.
=> If I knew exactly what the payload looked like, then writing the Rego would be simple.
I tried using the -v=8 option in kubectl to see the request and response from the api-server, like so:
$ kubectl get pod -v=8
...
GET https://xxxx.k8s.ovh.net/api/v1/namespaces/default/pods?limit=500
...
Request Headers: Accept: application/json;as=Table;v=v1;g=meta.k8s.io,application/json;as=Table;v=v1beta1;g=meta.k8s.io,application/json
...
Response Body: {"kind":"Table","apiVersion":"meta.k8s.io/v1","metadata":{"resourceVersion":"37801112226"},"columnDefinitions":[{"name":"Name","type":"string","format":"name","description":"Name must be unique within a namespace. Is required when creating resources, although some resources may allow a client to request the generation of an appropriate name automatically. Name is primarily intended for creation idempotence and configuration definition. Cannot be updated. More info: http://kubernetes.io/docs/user-guide/identifiers#names","priority":0},{"name":"Ready","type":"string","format":"","description":"The aggregate readiness state of this pod for accepting traffic.","priority":0},{"name":"Status","type":"string","format":"","description":"The aggregate status of the containers in this pod.","priority":0},{"name":"Restarts","type":"string","format":"","description":"The number of times the containers in this pod have been restarted and when the last container in this pod has restarted.","priority":0},{"name":"Age","type [truncated 4024 chars]
Unfortunately, the above JSON payload doesn't match what I see in the different tutorials.
How is anybody able to write OPA rules for Kubernetes???
Thx

You have two options:
Run the OPA server with debug level logging:
opa run --server --log-level debug ...
This is obviously very noisy, so beware.
Run the server with decision logging enabled. This is almost always preferable, and allows you to either dump decisions (including the input data) to the console or, for production deployments, ship them to a remote server that aggregates the logs. The decision logging system is the native way of doing this and comes with a bunch of features, like masking of sensitive data. If you just want something printed to the console, you can run OPA like:
opa run --server --set decision_logs.console=true ...
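For example, here is a minimal sketch of pulling the inputs out of the console decision logs, assuming OPA was deployed in a namespace called opa with the label app=opa (both names are assumptions, adjust them to your deployment):
# Each console decision log entry is a single JSON line; for the standard admission
# webhook setup, .input is the AdmissionReview payload the API server sent.
kubectl -n opa logs -l app=opa -f \
  | jq -R 'fromjson? | select(.decision_id != null) | .input'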

Related

GCP Alerting Policy for failed GKE CronJob

What would be the best way to set up a GCP monitoring alert policy for a Kubernetes CronJob failing? I haven't been able to find any good examples out there.
Right now, I have an OK solution based on monitoring logs in the Pod with ERROR severity. I've found this to be quite flaky, however. Sometimes a job will fail for some ephemeral reason outside my control (e.g., an external server returning a temporary 500) and on the next retry, the job runs successfully.
What I really need is an alert that is only triggered when a CronJob is in a persistent failed state. That is, Kubernetes has tried rerunning the whole thing, multiple times, and it's still failing. Ideally, it could also handle situations where the Pod wasn't able to come up either (e.g., downloading the image failed).
Any ideas here?
Thanks.
First of all, confirm the GKE version that you are running. The following commands will help you identify the default GKE version and the available versions too:
Default version.
gcloud container get-server-config --flatten="channels" --filter="channels.channel=RAPID" \
--format="yaml(channels.channel,channels.defaultVersion)"
Available versions.
gcloud container get-server-config --flatten="channels" --filter="channels.channel=RAPID" \
--format="yaml(channels.channel,channels.validVersions)"
Now that you know your GKE version: given that what you want is an alert that is triggered only when a CronJob is in a persistent failed state, GKE Workload Metrics used to be the GCP solution that provided a fully managed and highly configurable way of sending to Cloud Monitoring all Prometheus-compatible metrics emitted by GKE workloads (such as a CronJob or a Deployment for an application). However, it is deprecated as of GKE 1.24 and has been replaced by Google Cloud Managed Service for Prometheus, so that is now the best option you've got inside of GCP: it lets you monitor and alert on your workloads, using Prometheus, without having to manually manage and operate Prometheus at scale.
Plus, you have two options outside of GCP: Prometheus itself and Rancher's Prometheus Push Gateway.
Finally, and just FYI, it can also be done manually by querying for the job, checking its start time, and comparing that to the current time. Like this, with bash:
START_TIME=$(kubectl -n=your-namespace get job your-job-name -o json | jq '.status.startTime')
echo $START_TIME
Or you can get the job's current status as a JSON blob, as follows:
kubectl -n=your-namespace get job your-job-name -o json | jq '.status'
You can see the following thread for more reference too.
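Building on that, here is a rough sketch (the namespace and job name are placeholders) that only reports a job once Kubernetes has exhausted its retries, i.e. status.failed has reached spec.backoffLimit (default 6) and a Failed condition is present:
kubectl -n your-namespace get job your-job-name -o json | jq -e '
  (.status.failed // 0) >= (.spec.backoffLimit // 6)
  and any(.status.conditions[]?; .type == "Failed" and .status == "True")
' > /dev/null && echo "your-job-name is in a persistent failed state"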
Taking the “Failed” state as the core of your requirement, setting up a bash script with kubectl that sends an email when it sees a job in the “Failed” state can be useful. Here are some examples:
while true; do if kubectl get jobs myjob -o jsonpath='{.status.conditions[?(@.type=="Failed")].status}' | grep -q True; then echo "myjob failed" | mail -s jobfailed email@address; fi; sleep 1; done
For newer K8s:
while true; do kubectl wait --for=condition=failed job/myjob && echo "myjob failed" | mail -s jobfailed email@address; done
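If you prefer a one-shot check over an endless loop, a small sketch along these lines (the job name, timeout, and mail setup are all assumptions) alerts only on a persistent failure:
# kubectl wait exits 0 only when the Failed condition appears within the timeout,
# so the mail is sent just when the Job has genuinely given up.
if kubectl wait --for=condition=failed --timeout=30m job/myjob; then
  echo "myjob exhausted its retries" | mail -s jobfailed email@address
fi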

GKE Metadata server errors

I have a GKE with Workload identity enabled.
Most of our workloads use the Cloud Storage or Cloud Logging GCP packages, which means actually using Workload Identity for GCP access.
Recently we've started adding Secret Manager to the stack and started encountering random Metadata Server errors on workload startup. It happens across different frameworks.
Python:
File "/venv/lib/python3.8/site-packages/google/auth/compute_engine/credentials.py", line 117, in refresh six.raise_from(new_exc, caught_exc) File "<string>", line 3, in raise_from google.auth.exceptions.RefreshError: ("Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Enginemetadata service. Status: 404 Response:\nb'Not Found\\n'", <google.auth.transport.requests._Response object at 0x7f3a3084dd60>)
NodeJS:
failed to initialize. exiting. Error: 16 UNAUTHENTICATED: Failed to retrieve auth metadata with error: Could not refresh access token: network timeout at: http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/token?scopes=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform at Object
I’m trying to understand why it's happening.
First, 404 Not Found means we are trying to get metadata that does not exist or has been deleted. The thing is, it recovers a few seconds later, so I'm not sure how exactly.
Based on the documentation, it sometimes takes some time for the metadata server to become available, hence the errors that 'recover' afterwards. So the recommendation is to add delays in the app code or use init containers until the metadata server is up.
I wonder if that's really the best approach, to add an init container to all of our workloads, and if it really matches our case, as the error code is a bit misleading. Also, I'm not quite sure why it only started when we added Secret Manager.
This sometimes happens due to OOM issues on the metadata server. You can check the status of the pod running the metadata server using:
kubectl -n kube-system describe pods <pod_name>
You can get the pod_name using:
kubectl get pods --namespace kube-system
The pod name will start with the prefix gke-metadata-server-.
If you see something like the following in the output when you describe the pod:
Last State: Terminated
Reason: OOMKilled
then that would indicate an OOM issue.
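As a shortcut, a sketch like this (the k8s-app=gke-metadata-server label is an assumption about how the DaemonSet pods are labelled) flags any metadata server container whose last termination was an OOM kill:
kubectl -n kube-system get pods -l k8s-app=gke-metadata-server -o json | jq -r '
  .items[] | .metadata.name as $pod
  | .status.containerStatuses[]?
  | select(.lastState.terminated.reason == "OOMKilled")
  | "\($pod)/\(.name) was OOMKilled"
'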
Some mitigations that you can try:
Check whether you have unused ServiceAccounts in your cluster and whether you can remove them (see the sketch after this list).
Check whether you are creating too many clients (a new one for every API request). Sharing clients where possible will reduce token refresh calls to the metadata server and thus save memory.
Check whether you can find the metadata server's definition under /etc/kubernetes/addons/. If you can, increase the memory and apply the updated config.
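For the first point, a quick sketch to get an overview of the ServiceAccounts in the cluster so stale ones can be spotted and deleted:
# List every ServiceAccount, oldest first, plus an overall count.
kubectl get serviceaccounts --all-namespaces --sort-by=.metadata.creationTimestamp
kubectl get serviceaccounts --all-namespaces --no-headers | wc -l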

Openshift deployment validation - QA

I wanted to know if there's any tool that can validate an OpenShift deployment. Let's say you have a deploy configuration file with different features (secrets, routes, services, environment variables, etc.), and I want to validate, after the deployment has finished and the Pod(s) are created in OpenShift, that all those things are there as requested in the file. Like a tool for QA.
thanks
Readiness probes are there for this: they can execute HTTP requests against the pod to confirm its availability, and they can also execute commands to confirm that the desired resources are available within the container, as in the sketch below.
Readiness probe
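As a sketch of the exec variant, the command a readiness probe runs just has to exit non-zero until the things you expect inside the container are present (the path and variable name below are placeholders):
# Keeps the pod NotReady until the mounted secret file exists and the required
# environment variable is set.
test -s /etc/secrets/api-key && test -n "${DATABASE_URL:-}"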
There is a particular flag, --dry-run, in Kubernetes for resource creation which performs basic syntax verification and template object schema validation without actually creating the objects, so you can run the test for all the underlying objects defined in the deployment manifest file.
I think it is also feasible to achieve this through the OpenShift client:
$ oc create -f deployment-app.yaml --dry-run
or
$ oc apply -f deployment-app.yaml --dry-run
You can find some useful OpenShift client commands in Developer CLI Operations documentation page.
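On newer clients the bare --dry-run flag has been split in two, which is worth knowing if the plain form is rejected (same file name as above):
oc apply -f deployment-app.yaml --dry-run=client   # local syntax/schema check only
oc apply -f deployment-app.yaml --dry-run=server   # validated by the API server and admission, but never persisted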
For one-time validation, you can create a Job (OpenShift) with an Init Container (OpenShift) that ensures the whole deployment process is done, and then run a test/shell script with a sequence of kubectl/curl/other commands to ensure that every piece of the deployment is in place and in the desired state.
For continuous validation, you can create a CronJob (OpenShift) that will periodically create a test Job and report the result somewhere.
This answer can help you to create all that stuff.
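As an illustration of what such a test Job could run, here is a short sketch (all resource names, the replica count, and the /healthz path are placeholders for whatever your deployment file declares):
set -euo pipefail
oc get secret my-app-secret > /dev/null                                        # secret exists
oc get route my-app -o jsonpath='{.spec.host}' | grep -q .                     # route exists and has a host
[ "$(oc get deployment my-app -o jsonpath='{.status.readyReplicas}')" = "3" ]  # all replicas ready
curl -fsS "https://$(oc get route my-app -o jsonpath='{.spec.host}')/healthz" > /dev/null  # app answers through the route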

echo container image:tag (URI) in kubernetes readinessProbe or livenessProbe

I have many versions and tags of containers used by Deployment in k8s (and hence many log groups).
It would be nice if I could display the container URI and tag in a readinessProbe or livenessProbe, which then flows to persisted logging.
Basically so that I know my Pod whose logs I am viewing is running on the correct image.
I thought of simply echoing it, so I tried setting the container image URI as an environment variable in the Pod manifest.
The k8s EnvVarSource docs say it only supports certain fields for fieldRef; importantly, it doesn't support grabbing the spec.containers image field.
Does anyone have any smart ideas how I might achieve this in other ways?
Or when/if will the Kubernetes team support this?
UPDATE:
I found that doing echo under readinessProbe.exec.command works (the Pod reaches Ready status), but the echo output does not flow to the logs.
Only the application (server) output appears in the logs in my logging backend (CloudWatch).
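For what it's worth, the image a running Pod actually uses can at least be read back with kubectl (the pod name below is a placeholder), which covers the "am I looking at the right image" check even though it doesn't flow into the persisted logs:
kubectl get pod my-pod -o jsonpath='{range .spec.containers[*]}{.name}{" -> "}{.image}{"\n"}{end}'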

Using kubectl with Kubernetes authorization mode ABAC

I set up a 4-node cluster (1 master, 3 workers) running Kubernetes on Ubuntu. I turned on --authorization-mode=ABAC and set up a policy file with an entry like the following:
{"user":"bob", "readonly": true, "namespace": "projectgino"}
I want user bob to only be able to look at resources in projectgino. I'm having problems using the kubectl command line as user bob. When I run the following command:
kubectl get pods --token=xxx --namespace=projectgino --server=https://xxx.xxx.xxx.xx:6443
I get the following error
error: couldn't read version from server: the server does not allow access to the requested resource
I traced the kubectl command-line code and the problem seems to be caused by kubectl calling the function NegotiateVersion in pkg/client/helper.go. This makes a call to /api on the server to get the version of Kubernetes. This call fails because the REST path doesn't contain the namespace projectgino. I added trace code to pkg/auth/authorizer/abac/abac.go and it fails on the namespace check.
I haven't moved up to the latest 1.1.1 version of Kubernetes yet, but looking at the code I didn't see anything that has changed in this area.
Does anybody know how to configure Kubernetes to get around the problem?
This is missing functionality in the ABAC authorizer. The fix is in progress: #16148.
As for a workaround, from the authorization doc:
For miscellaneous endpoints, like /version, the resource is the empty string.
So you may be able to solve this by defining a policy:
{"user":"bob", "readonly": true, "resource": ""}
(note the empty string for resource) to grant access to unversioned endpoints. If that doesn't work, I don't think there's a clean workaround that will let you use kubectl with --authorization-mode=ABAC.
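Putting the two entries together, a sketch of the resulting policy file, one JSON object per line, referenced by the API server's --authorization-policy-file flag (the path is just an example):
cat > /etc/kubernetes/abac-policy.jsonl <<'EOF'
{"user":"bob", "readonly": true, "namespace": "projectgino"}
{"user":"bob", "readonly": true, "resource": ""}
EOF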