When a Kubernetes object has parent objects, they are listed under "ownerReferences". For example, when I printed a Pod spec in YAML format, I saw ownerReferences listed as follows:
ownerReferences:
- apiVersion: apps/v1
  blockOwnerDeletion: true
  controller: true
  kind: StatefulSet
  name: statefulset-name
  uid: <uuid>
....
I see that ownerReferences is a list. Does anyone know when ownerReferences will have more than one entry? I am not able to imagine an object having more than one owner.
If I understand you correctly, it is possible in some circumstances.
In this blog post you can see an example of multiple ownerReferences. The post explains garbage collection in K8s and shows that multiple ownerReferences are indeed possible:
Yes, you heard that right, now postgres-namespace can be owned by more
than one database object.
I hope it helps.
You can have your own use case with its own CRDs, and there can be a requirement to associate an object with multiple owners.
Taking a very basic example: consider a School with multiple Teachers and multiple Students. If all three are different CRDs, then a Student may have an ownerReference of kind School with the school name, and another ownerReference of kind Teacher with the teacher name.
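For illustration, a Student object in that hypothetical setup might carry two ownerReferences (the group/version, names, and uid values below are all made up):
apiVersion: school.example.com/v1
kind: Student
metadata:
  name: student-1
  ownerReferences:
  - apiVersion: school.example.com/v1
    kind: School
    name: my-school
    uid: aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa
  - apiVersion: school.example.com/v1
    kind: Teacher
    name: my-teacher
    uid: bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb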
By the way, cluster-api uses multiple ownerReferences in a few of its CRDs.
I am trying to understand the difference between labels and annotations.
The standard documentation says that annotations capture "non-identifying information".
Labels, on the other hand, can have selectors applied to them, and are used to organise objects in a Kubernetes cluster.
If this is the case, then why does Istio use pod annotations instead of labels for its various settings? https://istio.io/latest/docs/reference/config/annotations/
Isn't a label the better approach?
I am just trying to understand what advantages annotations provide, given that the Istio developers chose to use them.
Extending Burak's answer:
Kubernetes labels and annotations are both ways of adding metadata to
Kubernetes objects. The similarities end there, however. Kubernetes
labels allow you to identify, select and operate on Kubernetes
objects. Annotations are non-identifying metadata and do none of these
things.
Labels are mostly attached to resources like Pods, ReplicaSets, etc. They are also used to route traffic, for example from a Service to a Deployment.
Labels are stored in the etcd database, so you can search for objects by them.
Annotations mostly store metadata and configuration, if any.
Metadata like: owner details, the last Helm release if using Helm, sidecar injection settings.
You could store owner details in labels, but K8s uses labels for traffic routing from a Service to a Deployment, and the labels must be the same on both resources (Deployment & Service) for traffic to be routed.
What would you do in that case to keep the labels matching across resources? Use the same service-owner name inside every Deployment & Service, when you are running multiple distributed services managed by different teams and service owners?
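As a minimal sketch of that label-matching requirement (all names here are made up), the Service's selector must carry the same label as the Deployment's pod template:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app            # must match the pod template labels below
  template:
    metadata:
      labels:
        app: my-app          # the Service selects pods by this label
    spec:
      containers:
      - name: my-app
        image: nginx:1.25    # placeholder image
---
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app              # same label, so traffic is routed to the pods above
  ports:
  - port: 80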
If you look closely, some of Istio's annotations exist purely to store metadata, for example install.operator.istio.io/chart-owner and install.operator.istio.io/owner-generation.
Read more at: https://istio.io/latest/docs/reference/config/annotations/
You should also compare the syntax of labels and annotations.
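For reference, the two look like this side by side in a manifest (the values are placeholders; sidecar.istio.io/inject is a real Istio annotation):
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  labels:                              # identifying; usable in selectors; restricted key/value syntax
    app: my-app
    environment: production
  annotations:                         # non-identifying; values can be long or structured
    owner: "team-payments"
    sidecar.istio.io/inject: "false"
spec:
  containers:
  - name: my-app
    image: nginx:1.25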
In Kubernetes, the correct syntax for specifying the storage requirement of a PersistentVolume is:
spec:
  capacity:
    storage: 10Gi
When inspecting the documentation reference or using kubectl explain, I would expect to see the storage key noted in the docs; however, it's absent. It's not clear from the documentation which mappings are allowed inside the capacity section of a PersistentVolume spec, nor how one is supposed to know that storage is the correct key to denote the disk requirement of the persistent volume.
~ kubectl explain --recursive persistentvolume.spec.capacity
FIELDS:
  .
  .
  capacity  <map[string]string>
  .
  .
You can look at examples, of course, but those only show the keys used in that particular example, not what other keys are possible.
In a similar vein, I can see from this table tucked away in the docs that the allowed keys inside resources.requests are [cpu, memory, hugepages-<size>], but if I call kubectl explain --recursive pod.spec.resources I would expect to see:
FIELDS:
  limits    <map[string]string>
    cpu
    memory
    hugepages-<size>
  requests  <map[string]string>
    cpu
    memory
    hugepages-<size>
but instead you see just:
FIELDS:
  limits    <map[string]string>
  requests  <map[string]string>
Whilst the datatype makes it clear that limits and requests are both mappings, the docs don't make it easy to find out what the valid keys are.
For these mappings, other than hunting for examples, how do I know what all the possible valid keys are, and why don't they appear in the kubectl explain output?
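For concreteness, a container spec using the keys from that table would look like this (the hugepages size and the values are just examples):
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: app
    image: nginx:1.25
    resources:
      requests:
        cpu: 500m
        memory: 128Mi
        hugepages-2Mi: 64Mi    # hugepages-<size>; requires hugepages configured on the node
      limits:
        cpu: "1"
        memory: 256Mi
        hugepages-2Mi: 64Mi    # hugepages limits must equal requests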
There are many possible explanations, but the most probable are:
it just wasn't updated, or
it was a design choice.
Remember, docs are written by people, not generated. When things get updated, certain changes may not be properly described.
While I can create custom objects just fine, I am wondering how one is supposed to handle large payloads (Gigabytes) for an object.
CRs are mostly used to interface with garbage collection/reference counting in Kubernetes.
Adding the payload via YAML does not work, though (out of memory for large payloads):
apiVersion: "data.foo.bar/v1"
kind: Dump
metadata:
  name: my-data
  ownerReferences:
  - apiVersion: apps/v1
    kind: Deployment
    name: my-deploy
    uid: d9607a69-f88f-11e7-a518-42010a800195
spec:
  payload: dfewfawfjr345434hdg4rh4ut34gfgr_and_so_on_...
One could perhaps write the payload to a PV and just reference that path in the CR.
Then I have the problem that it seems I cannot clean up the payload file when the CR gets finalized (I could not find any info about custom finalizers).
I have no clear idea how to integrate such a concept into the Kubernetes object lifecycle.
In general, the size limit for any Kube API object is ~1MB due to etcd restrictions, but putting more than 20-30KB in an object is a bad idea and will be expensive to access (and garbage collection will be expensive as well).
I would recommend storing the data in an object storage bucket and using an RBAC proxy like https://github.com/brancz/kube-rbac-proxy to gate access to the bucket contents (use a URL to the proxy as a reference from your object). That gives you all the benefits of tracking the data in the API, but keeps the object size small. If you want a more complex integration, you could implement an aggregated API and reuse the core Kubernetes libraries to handle your API, storing the data in the object store.
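As a sketch of that idea (the field names and the proxy URL are made up, not an existing API), the custom resource then holds only a small pointer instead of the payload itself:
apiVersion: "data.foo.bar/v1"
kind: Dump
metadata:
  name: my-data
  ownerReferences:
  - apiVersion: apps/v1
    kind: Deployment
    name: my-deploy
    uid: d9607a69-f88f-11e7-a518-42010a800195
spec:
  # hypothetical fields: a URL served via kube-rbac-proxy, plus a checksum for integrity
  payloadURL: https://dump-proxy.example.svc/buckets/dumps/my-data
  sha256: <checksum of the payload>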
We still went with using the CR. Alongside it, we created a Kubernetes controller which handles the lifetime of the payload on the PV. For us this works fine, since the controller can be the single writer to the PV, while the actual services only need read access.
Combined with ownerReferences, this makes for a good integration into the Kubernetes object lifecycle.
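A sketch of how such a controller can hook into deletion (the finalizer name and path are made up): the controller adds a finalizer to the CR; when the CR is marked for deletion, it first removes the payload file from the PV and only then removes the finalizer, which lets the API server actually delete the object:
apiVersion: "data.foo.bar/v1"
kind: Dump
metadata:
  name: my-data
  finalizers:
  - data.foo.bar/cleanup-payload         # removed by the controller after the PV file is deleted
spec:
  payloadPath: /mnt/dumps/my-data.bin    # hypothetical path on the shared PV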
I am writing a Kubernetes controller.
Someone creates a custom resource via kubectl apply -f custom-resource.yaml. My controller notices the creation, and then creates a Deployment that pertains to the custom resource in some way.
I am looking for the proper way to set up the Deployment's ownerReferences field such that a deletion of the custom resource will result in a deletion of the Deployment. I understand I can do this like so:
ownerReferences:
- kind: <kind from custom resource>
  apiVersion: <apiVersion from custom resource>
  uid: <uid from custom resource>
  controller: <???>
I'm unclear on whether this is a case where I would set controller to true.
The Kubernetes reference documentation says (in its entirety):
If true, this reference points to the managing controller.
Given that a controller is running code, and an owner reference actually references another Kubernetes resource via matching uid, name, kind and apiVersion fields, this statement is nonsensical: a Kubernetes object reference can't "point to" code.
I have a sense that the documentation author is trying to indicate that—using my example—because the user didn't directly create the Deployment herself, it should be marked with some kind of flag indicating that a controller created it instead.
Is that correct?
The follow-on question here is of course: OK, what behavior changes if controller is set to false here, but the other ownerReference fields are set as above?
ownerReferences has two purposes:
Garbage collection: Refer to the answer of ymmt2005. Essentially, all owners are considered for GC. Contrary to the accepted answer, the controller field has no impact on GC.
Adoption: The controller field prevents fights over resources that are up for adoption. Consider a ReplicaSet: usually, the ReplicaSet controller creates its pods, but if there is an existing pod matching the label selector, it will be adopted by the ReplicaSet. To prevent two ReplicaSets from fighting over the same pod, the pod is given a unique controller by setting controller to true in the corresponding owner reference. If a resource already has a controller, it will not be adopted by another controller. Details are in the design proposal.
TLDR: The field controller is only used for adoption and not GC.
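For example, a Pod created by a ReplicaSet carries an ownerReference like the one below (name and uid are placeholders); the controller: true entry is what stops another ReplicaSet from adopting the same Pod:
ownerReferences:
- apiVersion: apps/v1
  kind: ReplicaSet
  name: my-replicaset
  uid: cccccccc-cccc-cccc-cccc-cccccccccccc
  controller: true             # at most one entry in the list may set this
  blockOwnerDeletion: true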
According to the source code of Kubernetes, an object will be garbage collected only after all objects in its ownerReferences field have been deleted.
https://github.com/kubernetes/apimachinery/blob/15d95c0b2af3f4fcf46dce24105e5fbb9379af5a/pkg/apis/meta/v1/types.go#L240-L247
// List of objects depended by this object. If ALL objects in the list have
// been deleted, this object will be garbage collected. If this object is managed by a controller,
// then an entry in this list will point to this controller, with the controller field set to true.
// There cannot be more than one managing controller.
According to the documentation:
Sometimes, Kubernetes sets the value of ownerReference automatically. For example, when you create a ReplicaSet, Kubernetes automatically sets the ownerReference field of each Pod in the ReplicaSet. In 1.6, Kubernetes automatically sets the value of ownerReference for objects created or adopted by ReplicationController, ReplicaSet, StatefulSet, DaemonSet, and Deployment.
You can also specify relationships between owners and dependents by manually setting the ownerReference field.
Basically, a Deployment sits at the top of the ownership hierarchy, and ownerReference is not set on it automatically. Therefore, you can manually add an ownerReference to your Deployment to create the reference to your Foo resource.
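A sketch of that, assuming your custom resource is kind Foo in the group samplecontroller.example.com (substitute your own group/version, name, and the live object's uid):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
  ownerReferences:
  - apiVersion: samplecontroller.example.com/v1
    kind: Foo
    name: my-foo
    uid: <uid of the running Foo object>    # must be read from the cluster, not hard-coded
    controller: true                        # this Deployment is managed by the Foo controller
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-foo
  template:
    metadata:
      labels:
        app: my-foo
    spec:
      containers:
      - name: app
        image: nginx:1.25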
You asked:
The follow-on question here is of course: OK, what behavior changes if controller is set to false here, but the other ownerReference fields are set as above?
ownerReferences are used by the Garbage Collector. The role of the Kubernetes Garbage Collector is to delete certain objects that once had an owner but no longer have one.
Here is the link to the description of the OwnerReference structure on GitHub. As you mentioned, if controller: true, the reference points to the managing controller, in other words, the owner. It is also an instruction for the Garbage Collector's behavior with respect to the object and its owner. If controller: false, the Garbage Collector treats the object as one without a managing controller and, for example, allows it to be deleted freely.
For more information, you can visit the following links:
- Garbage Collection
- Deletion and Garbage Collection of Kubernetes Objects
I annotate my Kubernetes objects with things like version and whom to contact when there are failures. How would I relay this information to Prometheus, knowing that these annotation values will frequently change? I can't capture this information in Prometheus labels, as they serve as the primary key for a target (e.g. if the version changes, it's a new target altogether, which I don't want). Thanks!
I just wrote a blog post about this exact topic! https://www.weave.works/aggregating-pod-resource-cpu-memory-usage-arbitrary-labels-prometheus/
The trick is that Kubelet/cAdvisor doesn't expose them directly, so I run a little exporter which does, and join this with the pod name in PromQL. The exporter is: https://github.com/tomwilkie/kube-api-exporter
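The exported series look roughly like this (my reconstruction of the shape, inferred from the join below: one series per pod, with the pod's Kubernetes labels attached as Prometheus labels):
k8s_pod_labels{job="monitoring/kube-api-exporter", namespace="default", pod_name="myapp-1234567890-abcde", name="myapp"} 1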
You can do a join in Prometheus like this:
sum by (namespace, name) (
    sum(rate(container_cpu_usage_seconds_total{image!=""}[5m])) by (pod_name, namespace)
  * on (pod_name) group_left(name)
    k8s_pod_labels{job="monitoring/kube-api-exporter"}
)
Here I'm using a label called "name", but it could be any label.
We use the same trick to get metrics (such as error rate) by version, which we then use to drive our continuous deployment system. kube-api-exporter exports a bunch of useful meta-information about Kubernetes objects to Prometheus.
Hope this helps!