I'm currently trying to change the audit policy for the openshift-kube-apiserver pod to output more information that isn't set up by default, primarily getting the requestBody of all incoming requests. There is an option in the kube-apiserver to change the audit policy here: https://kubernetes.io/docs/tasks/debug-application-cluster/audit/. However, I can't seem to find that option on OpenShift. I suspect it might be within the openshift-kube-apiserver-operator, but have hit a dead end. Does anyone else have experience with this problem and can provide some guidance? Thank you in advance.
Unfortunately, OpenShift v4 does not currently allow a custom audit policy, although OpenShift v3 did. As an alternative, starting with OCP v4.6 you can select one of several predefined policy profiles instead of writing your own.
Refer to Configuring the node audit log policy for more details.
OpenShift Container Platform provides the following predefined audit policy profiles:
Default
Logs only metadata for read and write requests; does not log request bodies. This is the default policy.
WriteRequestBodies
In addition to logging metadata for all requests, logs request bodies for every write request to the API servers (create, update, patch).
AllRequestBodies
In addition to logging metadata for all requests, logs request bodies for every read and write request to the API servers (get, list, create, update, patch).
You can change the audit policy as follows:
$ oc edit apiserver cluster
apiVersion: config.openshift.io/v1
kind: APIServer
metadata:
...
spec:
  audit:
    profile: WriteRequestBodies
After this change, all kube-apiserver pods restart through a rolling update so that the new profile takes effect.
According to GKE, I can enable cluster audit logs from the k8s.io API, which will forward cluster events to Cloud Logging. However, I'm unable to find RBAC logs for read requests on custom resources.
Specifically, if I have a CR foo, I seem to only be able to view create and delete events on foo. get and list are separate permissions as well (in both IAM and cluster RBAC), but those calls don't seem to be audited.
Is there a way to see those requests, and their responses, or is that not possible?
It's weird because the cluster's own kube-apiserver.log seems to log those requests:
... httplog.go:109] "HTTP" verb="GET" URI="/apis/foo.io/v1/namespaces/foo-ns/custom-resource/foo" latency="26.286746ms" userAgent="kubectl/v1.xx.x (linux/amd64)" audit-ID="baz" srcIP="1.2.3.4:55555" resp=200
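If the control-plane log is the only place these read requests show up, lines like the one above can at least be filtered with a short script. This is a minimal sketch in Python, assuming the key="value" layout of the sample line; the function name parse_httplog_line is hypothetical, and the log layout is not a stable interface, so it may differ between versions.

```python
import re

# Matches key="quoted value" pairs and bare key=value pairs (e.g. resp=200).
_PAIR = re.compile(r'(\w[\w-]*)=(?:"([^"]*)"|(\S+))')


def parse_httplog_line(line: str) -> dict:
    """Return the key/value fields found in one httplog line."""
    fields = {}
    for key, quoted, bare in _PAIR.findall(line):
        # Exactly one of the two groups is filled for any given match.
        fields[key] = quoted if quoted else bare
    return fields


# The sample line from the question above.
sample = (
    '... httplog.go:109] "HTTP" verb="GET" '
    'URI="/apis/foo.io/v1/namespaces/foo-ns/custom-resource/foo" '
    'latency="26.286746ms" userAgent="kubectl/v1.xx.x (linux/amd64)" '
    'audit-ID="baz" srcIP="1.2.3.4:55555" resp=200'
)
fields = parse_httplog_line(sample)
if fields.get("verb") == "GET":
    print(fields["verb"], fields["URI"], fields["resp"])
```

This only recovers what the log line already contains (verb, URI, status, source IP); it does not give you request or response bodies.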
I am pretty new to Kubernetes and have a specific interest in Kubernetes eventing. So far I have created a K8s cluster, deployed Pods to it, and run kubectl get events to see the corresponding events. For my work, I now need to explore how to create a K8s Event resource using the eventing API provided here, so that the event is stored in etcd and shows up in kubectl get events. As I mentioned, I don't know K8s that deeply, and I'm struggling to make this API work.
N.B. I had a look at Knative Eventing, but the eventing feature Knative provides seems to be different from K8s eventing, as I can't see the Knative events in the output of kubectl get events (please correct me if I am wrong).
Thanks in advance.
Usually events are created in Kubernetes as a way of informing of relevant status changes of an object.
Creating an event is as simple as:
kubectl apply -f - <<EOF
apiVersion: v1
kind: Event
metadata:
  name: myevent
  namespace: default
type: Warning
message: 'not sure if this what you want to do'
involvedObject:
  kind: someObject
EOF
But usually this is done programmatically, using the Kubernetes client library of your choice and the core (v1) API group.
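As a rough illustration of the programmatic route, the sketch below builds the same kind of core/v1 Event manifest in Python and prints it as JSON. It deliberately does not use a client library; it only assembles the manifest. The function name build_event, the reason "Example" and the Pod name my-pod are made up for illustration.

```python
import json
from datetime import datetime, timezone


def build_event(name, namespace, message, involved_kind, involved_name,
                event_type="Warning", reason="Example"):
    """Build a core/v1 Event manifest as a plain dict."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "apiVersion": "v1",
        "kind": "Event",
        "metadata": {"name": name, "namespace": namespace},
        "type": event_type,
        "reason": reason,
        "message": message,
        "firstTimestamp": now,
        "lastTimestamp": now,
        # The object the event is about; kept in the same namespace
        # as the event itself.
        "involvedObject": {
            "kind": involved_kind,
            "name": involved_name,
            "namespace": namespace,
        },
    }


# "my-pod" is a hypothetical object name used only for this example.
event = build_event("myevent", "default", "created from a script", "Pod", "my-pod")
print(json.dumps(event, indent=2))
```

Saved as, say, make_event.py (a hypothetical file name), the output can be piped into the same command as above: python make_event.py | kubectl apply -f -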
Knative Eventing is a platform that enables event-driven architectures, which does not seem related to your question, but if it were, you can find the getting-started docs here:
Installation: https://knative.dev/docs/install/yaml-install/eventing/install-eventing-with-yaml/
Getting started: https://knative.dev/docs/eventing/getting-started/
Maybe this could be helpful too. Knative has the concept of "Event Sources", which already exist and link events from certain producers to sinks.
In the case of the K8's events API there is one called the APIServer Source: https://knative.dev/docs/eventing/sources/apiserversource/
A bit more about Event Sources here: https://knative.dev/docs/eventing/sources/
We constantly have issues with our OpenShift Deployments. Credentials suddenly go missing (or we suddenly have the wrong credentials configured), deployments are suddenly scaled up and down, etc.
Nobody on the team is aware of having done anything. However, from my recent experience, I am quite sure that this happens unknowingly.
Is there any way to check the history of modifications to a resource? E.g. the last "oc/kubectl apply -f" - optimally with the contents that were modified and the user?
For a one off issue, you can also look at the replicaSets present in that namespace and examine them for differences. Depending on how much history you keep it may have already been lost, if it was present to begin with.
Try:
kubectl get rs -n my-namespace
Or, if you are dealing with DeploymentConfigs, look at the ReplicationControllers:
oc get rc -n my-namespace
For credentials, assuming those are in a secret and not the deployment itself, you wouldn't have that history without going to audit logs.
You need to configure and enable the audit log; check out the oc manual here.
In addition to logging metadata for all requests, logs request bodies
for every read and write request to the API servers...
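Once audit logging is on, answering "who changed this object?" is mostly a filtering job. The following is a minimal sketch in Python, assuming the audit log is available as one JSON audit event per line; the function name changes_to and the sample events (users alice and bob, Deployment web) are invented for illustration.

```python
import json

WRITE_VERBS = {"create", "update", "patch", "delete"}


def changes_to(lines, resource, namespace, name):
    """Yield (timestamp, user, verb) for completed write requests to one object."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON audit events
        # A request can be logged at several stages; keep the final one only.
        if event.get("stage") != "ResponseComplete":
            continue
        ref = event.get("objectRef") or {}
        if (event.get("verb") in WRITE_VERBS
                and ref.get("resource") == resource
                and ref.get("namespace") == namespace
                and ref.get("name") == name):
            yield (event.get("requestReceivedTimestamp"),
                   (event.get("user") or {}).get("username"),
                   event["verb"])


# Invented sample events, shaped like audit.k8s.io/v1 Event records.
sample = [
    '{"stage":"ResponseComplete","verb":"patch","user":{"username":"alice"},'
    '"objectRef":{"resource":"deployments","namespace":"my-namespace","name":"web"},'
    '"requestReceivedTimestamp":"2021-01-01T10:00:00.000000Z"}',
    '{"stage":"ResponseComplete","verb":"get","user":{"username":"bob"},'
    '"objectRef":{"resource":"deployments","namespace":"my-namespace","name":"web"},'
    '"requestReceivedTimestamp":"2021-01-01T10:05:00.000000Z"}',
]
found = list(changes_to(sample, "deployments", "my-namespace", "web"))
print(found)
```

Note that create requests may not carry the object name in objectRef, so matching on name can miss them. With a profile that logs request bodies, the requestObject field of each matching event holds the content that was submitted.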
K8s offers only scant functionality for tracking changes. Most prominently, I would look at kubectl rollout history for Deployments, DaemonSets and StatefulSets. Still, this will only tell you when something was changed and what was changed, but not who did it.
OpenShift does not seem to offer much on top, since audit logging is cumbersome to configure and analyze.
With a problem like yours, the best remedy I see would be to revoke direct production access to K8s by the team and mandate changes to be rolled out via pipeline. That way you can use Git to track who did what.
The operator is https://operatorhub.io/operator/keycloak-operator version 11.0.0.
The cluster is Kubernetes version 1.18.12.
I was able to follow the steps from OperatorHub.io to install the Operator Lifecycle Manager and the Keycloak "OperatorGroup" and "Subscription".
It took much longer than I was expecting (maybe 20 minutes?), but eventually the corresponding "ClusterServiceVersion" was created.
However, now when I try to use it by creating the following resource, it doesn't seem to be doing anything at all:
apiVersion: keycloak.org/v1alpha1
kind: Keycloak
metadata:
  name: example-keycloak
  namespace: keycloak
  labels:
    app: sso
spec:
  instances: 1
  externalAccess:
    enabled: true
  extensions:
  - https://github.com/aerogear/keycloak-metrics-spi/releases/download/1.0.4/keycloak-metrics-spi-1.0.4.jar
It accepts the new resource, so I know the CRD is in place. The documentation states that it should create a stateful set, an ingress, and more, but it just doesn't seem to create anything.
I checked the cluster logs and this is the error that is jumping out to me:
olm-operator ERROR controllers.operator Could not update Operator status {"request": "/keycloak-operator.my-keycloak-operator", "error": "Operation cannot be fulfilled on operators.operators.coreos.com \"keycloak-operator.my-keycloak-operator\": the object has been modified; please apply your changes to the latest version and try again"}
I have quite a bit of experience with plain kubernetes, but I'm brand new to "operators" and so I'm really not sure where to look next wrt what might be going wrong.
Any hints/suggestions/explanations?
UPDATE: I was creating the keycloak resource in a namespace OTHER than the one I installed the operator into. Since it allowed me to create the custom resource (Kind: Keycloak) in this namespace, I thought this was supported. However, when I created the keycloak resource in the same namespace where the operator was installed (my-keycloak-operator), it actually tried to do something. It's still failing to bring up the pod, mind you, but at least it's trying to do something.
Will leave this question open for a bit to see if the "Could not update Operator status" is something I should be concerned about or not...
It looks like the operator and/or the components that it wants to bring up cannot do a write (POST/PUT) to the kube-apiserver.
From what you describe, it appears that the first time, when you created the resource in a different namespace, the operator just didn't have permissions to bring up anything at all. The second time, when you created it in the right namespace, it looks like the operator was able to talk to the kube-apiserver, but the components that it's bringing up (Keycloak, etc.) are not able to.
I would check the logs on the kube-apiserver (control plane) to see if you have some unauthorized requests, also check the log files of the components (pods, deployments, etc) that the operator is trying to bring up.
If you have unauthorized requests, you may have to manually update the RBAC rules. Finally, I would check with IBM Cloud to see what specific permissions its K8s control plane could have that are preventing applications from talking to it (the kube-apiserver).
✌️
I have a simple Spring Boot application deployed on Kubernetes on GCP. The service is exposed on an external IP address. I am load testing this application using JMeter. It is just an HTTP GET request which returns True or False.
I want to get the latency metrics with time to feed it to HorizontalPodAutoscaler to implement custom auto-scaler. How do I implement this?
Since you mentioned a custom autoscaler, I would suggest this simple solution, which makes use of tools you might already have.
First part: create a service, cron job, or any other time-based trigger that makes requests to your deployed application at a regular interval and stores the resulting metrics in persistent storage (a file, a database, etc.).
For example, if you use the simple Apache Benchmark CLI tool (you can also use JMeter or any other load testing tool that generates structured output), you will get a detailed result for a single query. Use this link as a reference for reading the result.
Second part: the same script can also trigger another event that checks the latency or response time against the limit configured for your requirement. If the response time is above the configured value, scale up; if it is below, scale down.
The logic for scaling down can be more trivial, but I will leave that to you.
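The decision step described above can be sketched in a few lines of Python. This is only an illustration under assumed values: the function name decide_replicas, the two thresholds and the replica bounds are all hypothetical, and stepping by one replica at a time is just one possible policy.

```python
def decide_replicas(current, latency_ms, scale_up_above_ms, scale_down_below_ms,
                    min_replicas=1, max_replicas=10):
    """Return the replica count to request for the measured latency."""
    if latency_ms > scale_up_above_ms:
        target = current + 1      # too slow: add one replica
    elif latency_ms < scale_down_below_ms:
        target = current - 1      # comfortably fast: remove one replica
    else:
        target = current          # inside the band: leave as is
    # Never go outside the configured bounds.
    return max(min_replicas, min(max_replicas, target))


# Hypothetical thresholds: scale up above 300 ms, scale down below 100 ms.
print(decide_replicas(2, 500, 300, 100))
print(decide_replicas(2, 50, 300, 100))
```

Using two separate thresholds leaves a band in which nothing happens, which helps avoid flapping between scale-up and scale-down on every measurement. The resulting number can then be applied with kubectl scale deployment <name> --replicas=<n> or through the API.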
Now for actually scaling the deployment, you can use the Kubernetes API. You can refer to the official doc or this answer for details. Here's a simple flow diagram.
There are two ways to auto scale with custom metrics:
1. You can export a custom metric from every Pod in the Deployment and target the average value per Pod.
2. You can export a custom metric from a single Pod outside of the Deployment and target the total value.
Follow these steps:
1. To grant GKE objects access to metrics stored in Stackdriver, you need to deploy the Custom Metrics Stackdriver Adapter. To run Custom Metrics Adapter, you must grant your user the ability to create required authorization roles by running the following command:
kubectl create clusterrolebinding cluster-admin-binding \
    --clusterrole cluster-admin --user "$(gcloud config get-value account)"
To deploy the adapter:
kubectl create -f https://raw.githubusercontent.com/GoogleCloudPlatform/k8s-stackdriver/master/custom-metrics-stackdriver-adapter/deploy/production/adapter.yaml
2. You can export your metrics to Stackdriver either directly from your application, or by exposing them in Prometheus format and adding the Prometheus-to-Stackdriver adapter to your Pod's containers.
You can view the exported metrics from the Metrics Explorer by searching for custom/[METRIC_NAME]
Your metric needs to meet the following requirements:
Metric kind must be GAUGE
Metric type can be either DOUBLE or INT64
Metric name must start with custom.googleapis.com/ prefix, followed by a simple name
Resource type must be "gke_container"
Resource labels must include:
pod_id set to Pod UID, which can be obtained via the Downward API
container_name = ""
project_id, zone, cluster_name, which can be obtained by your application from the metadata server. To get values, you can use Google Cloud's compute metadata client.
namespace_id, instance_id, which can be set to any value.
3. Once you have exported metrics to Stackdriver, you can deploy an HPA to scale your Deployment based on the metrics.
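To get a feel for how the HPA reacts to a metric such as latency, the sketch below reproduces the scaling rule from the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), including the default 10% tolerance. The function name desired_replicas and the example numbers are illustrative only, and the real controller additionally applies min/max bounds and stabilization.

```python
import math


def desired_replicas(current_replicas, current_value, target_value, tolerance=0.1):
    """Replica count the HPA rule would ask for, ignoring min/max bounds."""
    ratio = current_value / target_value
    # Within the tolerance band around 1.0, no scaling happens.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Example: 4 Pods averaging 200 ms against a 100 ms target.
print(desired_replicas(4, 200, 100))   # 8
print(desired_replicas(4, 105, 100))   # 4 (within tolerance)
```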
View this on GitHub for additional code.