Understanding pod labels vs annotations - kubernetes

I am trying to understand the difference between labels and annotations.
The standard documentation says that annotations capture "non-identifying information".
Labels, on the other hand, can be matched by selectors and are used to organise objects in a Kubernetes cluster.
If this is the case, why does Istio use pod annotations instead of labels for so many of its settings: https://istio.io/latest/docs/reference/config/annotations/
Wouldn't labels be a good approach?
I am just trying to understand what advantages annotations provide, given that the Istio developers chose to use them.

Extending Burak's answer:
Kubernetes labels and annotations are both ways of adding metadata to
Kubernetes objects. The similarities end there, however. Kubernetes
labels allow you to identify, select and operate on Kubernetes
objects. Annotations are non-identifying metadata and do none of these
things.
Labels are mostly attached to resources such as Pods and ReplicaSets. They are also used to route traffic, for example from a Service to the Pods created by a Deployment.
Labels are stored in etcd along with the rest of the object's metadata, and the API server lets you filter on them with label selectors, so you can search by label. Annotations cannot be used in selectors.
Annotations are mostly used to store metadata and configuration, if any.
Examples of such metadata: owner details, the last Helm release (if you use Helm), sidecar injection settings.
You could store owner details in labels, but Kubernetes uses labels to route traffic from a Service to a Deployment's Pods, and the Service selector has to match the labels on those Pods for traffic to be routed.
What would you do in that case to keep the labels matching? Use the same service owner name in every Deployment and Service, when you are running multiple distributed services managed by different teams and service owners?
If you look closely, some of the Istio annotations just store metadata, for example install.operator.istio.io/chart-owner and install.operator.istio.io/owner-generation.
Read more at : https://istio.io/latest/docs/reference/config/annotations/
You should also check the syntax rules for both labels and annotations.
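One practical difference is that label values have a restricted syntax, while annotation values are arbitrary strings. Below is a minimal Python sketch of the label value rule (at most 63 characters, alphanumeric at both ends, with '-', '_' and '.' allowed in between). The sample keys and values are made up for illustration and are not taken from Istio.

```python
import re

# Label values: at most 63 characters, may be empty; otherwise they must start
# and end with an alphanumeric character, with '-', '_', '.' allowed between.
_LABEL_VALUE_RE = re.compile(r"(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?")


def is_valid_label_value(value):
    """Return True if `value` satisfies the label value syntax."""
    return len(value) <= 63 and _LABEL_VALUE_RE.fullmatch(value) is not None


# Hypothetical metadata values, for illustration only.
candidates = {
    "version": "v1.2.3",
    "owner": "team-payments",
    "cidr-list": "10.0.0.0/8,192.168.0.0/16",
    "config": '{"timeout": "5s"}',
}

for key, value in candidates.items():
    kind = "label or annotation" if is_valid_label_value(value) else "annotation only"
    print(f"{key}: {kind}")
```

Values such as CIDR lists or JSON snippets contain characters like '/', ',' or '{' that are not allowed in label values, so settings of that shape can only live in annotations.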

Related

Manually spawn stateful pod instances

I'm working on a project where I need to spawn 1 instance per user (customer).
I figured it makes sense to create some sort of manager to handle that and host it somewhere. Kubernetes seems like a good choice since it can be hosted virtually anywhere and it will automate a lot of things (e.g. ensuring instances keep running on failure).
All entities are in Python and have a corresponding Flask API.
                            InstanceManager           Instance (user1)
                             .-----------.               .--------.
POST /instances/user3 -->    |           | ----------    |        |---vol1
                             |           |               '--------'
                             |           | -----.
                             '...........'       \    Instance (user2)
                                                  \      .--------.
                                                   '-    |        |---vol2
                                                         '--------'
Now I can't seem to figure out how to translate this into Kubernetes
My thinking:
Instance is a StatefulSet since I want the data to be maintained through restarts.
InstanceManager is a Service with a database attached to track user to instance IP (for health checks, etc).
I'm pretty lost on how to make InstanceManager spawn a new instance on an incoming POST request. I did a lot of digging (Operators, namespaces, etc.) but nothing seems straightforward. Namely I don't seem to even be able to do that via kubectl. Am I thinking totally wrong on how Kubernetes works?
I've made some progress and thought I'd share it.
Essentially you need to interact with the Kubernetes REST API directly instead of applying a static YAML file or using kubectl, ideally through one of the many client libraries out there.
In our case there are two options:
Create a namespace per user and then a service in that namespace
Create a new service with a unique name for each user
The first approach seems more sensible since using namespaces gives a lot of other benefits (network control, resource allocation, etc.).
The service itself can point to a StatefulSet or a pod, depending on the situation.
There's another gotcha (and possibly more). Namespaces, pod names, etc. all need to conform to RFC 1123. So for namespaces, you can't simply use email addresses or even base64. You'll need to use something like user-100 and keep a mapping table to map back to the actual user.
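The naming constraint and the mapping table can be sketched roughly like this in Python. This is only an illustration: NamespaceRegistry is a hypothetical in-memory stand-in for a real database table, and the dictionary returned by namespace_manifest is just the object body you would hand to whichever Kubernetes client you use.

```python
import re

# RFC 1123 DNS label: lowercase alphanumerics or '-', starting and ending
# with an alphanumeric character, at most 63 characters.
_DNS_LABEL_RE = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def is_dns_label(name):
    """Return True if `name` is usable as a namespace name."""
    return len(name) <= 63 and _DNS_LABEL_RE.fullmatch(name) is not None


class NamespaceRegistry:
    """In-memory stand-in for the mapping table (hypothetical helper)."""

    def __init__(self, first_id=100):
        self._by_user = {}
        self._by_namespace = {}
        self._next_id = first_id

    def namespace_for(self, user):
        """Return the namespace for `user`, allocating one on first use."""
        if user not in self._by_user:
            name = f"user-{self._next_id}"
            self._next_id += 1
            self._by_user[user] = name
            self._by_namespace[name] = user
        return self._by_user[user]

    def user_for(self, namespace):
        """Map a namespace name back to the actual user."""
        return self._by_namespace[namespace]


def namespace_manifest(name):
    """Build a Namespace object body to send through an API client."""
    if not is_dns_label(name):
        raise ValueError(f"not a valid RFC 1123 label: {name!r}")
    return {"apiVersion": "v1", "kind": "Namespace", "metadata": {"name": name}}


registry = NamespaceRegistry()
ns = registry.namespace_for("alice@example.com")
print(ns, namespace_manifest(ns))
```

The user identifier never appears in the namespace name, so it can be an email address or anything else, as long as the table can map it back.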

How can I enable a Kubernetes networkpolicy for a url?

In my kubernetes cluster all network traffic crossing the namespace border is blocked and I have to enable it manually with a network policy.
The official Kubernetes documentation describes network policies in terms of pod labels or IP ranges, but I need to connect to a specific URL.
Of course, I can look up the IP of this URL and allow it, but if the IP changes I will get into trouble.
Is there any recommended way to allow communication with only a specific url?
TL;DR: Not possible.
According to Kubernetes API Reference Docs - NetworkPolicyPeer v1 networking.k8s.io, fields you can specify in egress.to are:
ipBlock
IPBlock describes a particular CIDR (Ex. "192.168.1.1/24","2001:db9::/64") that is allowed to the pods matched by a NetworkPolicySpec's podSelector.
namespaceSelector
Selects Namespaces using cluster-scoped labels. This field follows standard label selector semantics; if present but empty, it selects all namespaces. If PodSelector is also set, then the NetworkPolicyPeer as a whole selects the Pods matching PodSelector in the Namespaces selected by NamespaceSelector. Otherwise it selects all Pods in the Namespaces selected by NamespaceSelector.
podSelector
This is a label selector which selects Pods. This field follows standard label selector semantics; if present but empty, it selects all pods. If NamespaceSelector is also set, then the NetworkPolicyPeer as a whole selects the Pods matching PodSelector in the Namespaces selected by NamespaceSelector. Otherwise it selects the Pods matching PodSelector in the policy's own Namespace.
Or, in blunter terms: a NetworkPolicy can be applied to a specific IP range, specific namespace(s), or specific pod(s). URLs are not supported.
Since you are already using Calico, you may want to have a look at Advanced egress access controls, which gives you exactly what you are looking for.
It is, however, behind a paywall, being a part of Calico Enterprise.
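If that is not an option, the workaround from the question (look up the IPs and allow them) can at least be automated. The following Python sketch builds a NetworkPolicy body from a list of IPs. The function names and the default port are assumptions for illustration, and the result is only valid until the DNS records change, so it would have to be regenerated periodically.

```python
import ipaddress
import socket


def resolve_ips(hostname):
    """Resolve a hostname to its current, sorted set of IP addresses."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def egress_policy(name, namespace, ips, port=443):
    """Build a NetworkPolicy that allows egress to exactly the given IPs."""
    if not ips:
        # An empty "to" list would match all destinations, so refuse it.
        raise ValueError("at least one IP address is required")
    peers = []
    for ip in ips:
        addr = ipaddress.ip_address(ip)  # raises ValueError on bad input
        prefix = 32 if addr.version == 4 else 128
        peers.append({"ipBlock": {"cidr": f"{addr}/{prefix}"}})
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {},  # empty selector: all pods in the namespace
            "policyTypes": ["Egress"],
            "egress": [{"to": peers, "ports": [{"protocol": "TCP", "port": port}]}],
        },
    }
```

A periodic job could call resolve_ips for the host behind the URL and re-apply the output of egress_policy, but there is always a window in which the policy lags behind DNS.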

Kubernetes: How do we List all objects modified in N days in a specific namespace?

We have multiple people (with admin access) doing deployments in our Kubernetes cluster. We are finding it difficult to keep track of who has modified which object.
We can control access and privileges using RBAC with roles and role bindings. We are planning to implement well-defined roles and role bindings for the different groups.
We would also like to list all objects modified in the last N days in a specific namespace. Is there a way to display these objects using kubectl?
This probably can't be done easily just with kubectl.
But you might look into Kubernetes auditing. It causes the API server to record requests to the API (which ones depends on the audit policy), and you can query them in different ways. For example, it should be possible to query the audit logs for all the objects that have been modified in the last N days.
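As a rough sketch of such a query, assume the audit log backend writes one JSON event per line and that the audit policy records the relevant requests. The field names follow the audit.k8s.io/v1 Event type; the function name and the choice of verbs are my own.

```python
import json
from datetime import datetime, timedelta, timezone

# Verbs that change objects; read-only verbs such as get/list/watch are skipped.
MUTATING_VERBS = {"create", "update", "patch", "delete", "deletecollection"}


def modified_objects(lines, namespace, days, now=None):
    """Return sorted (resource, name, username) tuples changed in the last `days` days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    seen = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        ref = event.get("objectRef") or {}
        # Each request can be logged at several stages; count it only once.
        if event.get("stage") != "ResponseComplete":
            continue
        if event.get("verb") not in MUTATING_VERBS:
            continue
        if ref.get("namespace") != namespace:
            continue
        stamp = event["requestReceivedTimestamp"].replace("Z", "+00:00")
        if datetime.fromisoformat(stamp) < cutoff:
            continue
        user = (event.get("user") or {}).get("username", "unknown")
        seen.add((ref.get("resource", ""), ref.get("name", ""), user))
    return sorted(seen)
```

You would feed it the lines of the audit log file, for example modified_objects(open(path), "my-namespace", 7), where path is wherever your API server is configured to write the log.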

Ad Hoc Kubernetes Queries

Is there a way to easily query Kubernetes resources in an intuitive way? Basically I want to run queries to extract info about objects which match my criteria. Currently I face an issue where my match labels aren't quite working, and I would like to run the match labels query manually to try to debug my issue.
Basically in a pseudo code way:
Select * from pv where labels in [red,blue,green]
Are there any third-party tools that do something like this? Currently all I have to work with is the search box on the dashboard, which isn't quite robust enough.
You could use kubectl with JSONPath (https://kubernetes.io/docs/reference/kubectl/jsonpath/). More information on JSONPath: https://github.com/json-path/JsonPath
It allows you to query any resource property, for example:
kubectl get pods -o=jsonpath='{$.items[?(@.metadata.namespace=="default")].metadata.name}'
This would list all pod names in namespace "default". kubectl's JSONPath implementation does not support an in operator, so your pseudo code is better expressed with a set-based label selector (assuming the label key is called color):
kubectl get pv -l 'color in (red,blue,green)'
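To debug a matchLabels block by hand, it can help to turn it into a selector string for kubectl's -l flag and to check the matching rule locally. Here is a small Python sketch; the helper names and the label key color are hypothetical.

```python
def equality_selector(match_labels):
    """Turn a matchLabels mapping into an equality-based selector string."""
    return ",".join(f"{k}={v}" for k, v in sorted(match_labels.items()))


def in_selector(key, values):
    """Build a set-based selector such as 'color in (blue,green,red)'."""
    return f"{key} in ({','.join(sorted(values))})"


def matches(labels, match_labels):
    """matchLabels semantics: every listed key/value pair must be present."""
    return all(labels.get(k) == v for k, v in match_labels.items())


# Print the kubectl commands to run for a given selector.
print(f"kubectl get pv -l '{equality_selector({'color': 'red', 'tier': 'fast'})}'")
print(f"kubectl get pv -l '{in_selector('color', ['red', 'blue', 'green'])}'")
```

If an object you expect to match does not show up, comparing its metadata.labels against the selector with matches usually reveals a typo or a missing key.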

Varying labels in Prometheus

I annotate my Kubernetes objects with things like version and whom to contact when there are failures. How would I relay this information to Prometheus, knowing that these annotation values will frequently change? I can't capture this information in Prometheus labels, as they serve as the primary key for a target (e.g. if the version changes, it's a new target altogether, which I don't want). Thanks!
I just wrote a blog post about this exact topic! https://www.weave.works/aggregating-pod-resource-cpu-memory-usage-arbitrary-labels-prometheus/
The trick is Kubelet/cAdvisor doesn't expose them directly, so I run a little exporter which does, and join this with the pod name in PromQL. The exporter is: https://github.com/tomwilkie/kube-api-exporter
You can do a join in Prometheus like this:
sum by (namespace, name) (
  sum(rate(container_cpu_usage_seconds_total{image!=""}[5m])) by (pod_name, namespace)
  * on (pod_name) group_left(name)
  k8s_pod_labels{job="monitoring/kube-api-exporter"}
)
Here I'm using a label called "name", but it could be any label.
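If the join is hard to picture, this Python sketch mimics what the query does with a few made-up samples. It assumes the exporter's series carry the value 1, as is usual for info-style metrics, so the multiplication only copies the label across. It illustrates the semantics only; it is not how Prometheus evaluates the query.

```python
from collections import defaultdict


def join_and_sum(cpu_by_pod, name_by_pod):
    """Emulate the join on pod_name followed by sum by (namespace, name)."""
    totals = defaultdict(float)
    for (pod_name, namespace), rate in cpu_by_pod.items():
        if pod_name not in name_by_pod:
            continue  # no matching series on the right-hand side: dropped
        # The right-hand series has value 1, so rate * 1 == rate.
        totals[(namespace, name_by_pod[pod_name])] += rate * 1
    return dict(totals)


# Made-up samples: CPU rate per (pod_name, namespace) and the "name" label per pod.
cpu = {
    ("web-1", "prod"): 0.5,
    ("web-2", "prod"): 0.25,
    ("db-1", "prod"): 1.0,
    ("orphan", "prod"): 2.0,
}
names = {"web-1": "web", "web-2": "web", "db-1": "db"}

print(join_and_sum(cpu, names))
```

Because the label comes from a separate series, changing its value does not change the identity of the CPU series themselves.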
We use the same trick to get metrics (such as error rate) by version, which we then use to drive our continuous deployment system. kube-api-exporter exports a bunch of useful meta-information about Kubernetes objects to Prometheus.
Hope this helps!