Large payload for Custom Objects - kubernetes

While I can create custom objects just fine, I am wondering how one is supposed to handle large payloads (Gigabytes) for an object.
The CRs are mostly used to interface with garbage collection/reference counting in Kubernetes.
Adding the payload via YAML does not work, though (out of memory for large payloads):
apiVersion: "data.foo.bar/v1"
kind: Dump
metadata:
  name: my-data
  ownerReferences:
  - apiVersion: apps/v1
    kind: Deployment
    name: my-deploy
    uid: d9607a69-f88f-11e7-a518-42010a800195
spec:
  payload: dfewfawfjr345434hdg4rh4ut34gfgr_and_so_on_...
One could perhaps add the payload to a PV and just reference that path in the CR.
But then I have the problem that I apparently cannot clean up the payload file when the CR is finalized (I could not find any info about custom finalizers), and I have no clear idea how to integrate such a concept into Kubernetes object lifetimes.

In general, the limit on the size of any Kube API object is ~1 MB due to etcd restrictions, but putting more than 20-30 KB in an object is a bad idea and will be expensive to access (and garbage collection will be expensive as well).
I would recommend storing the data in an object storage bucket and using an RBAC proxy like https://github.com/brancz/kube-rbac-proxy to gate access to the bucket contents (use a URL to the proxy as a reference from your object). That gives you all the benefits of tracking the data in the API, but keeps the object size small. If you want a more complex integration, you could implement an aggregated API and reuse the core Kubernetes libraries to handle your API, storing the data in the object store.
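A rough sketch of that approach, reusing the Dump kind from the question (the spec field names and the proxy URL below are assumptions for illustration, not an established schema): the CR only carries a reference to the bucket object, served through the RBAC proxy.
apiVersion: "data.foo.bar/v1"
kind: Dump
metadata:
  name: my-data
  ownerReferences:
  - apiVersion: apps/v1
    kind: Deployment
    name: my-deploy
    uid: d9607a69-f88f-11e7-a518-42010a800195
spec:
  # Pointer to the payload instead of the payload itself.
  # Reads go through kube-rbac-proxy, which authorizes the request
  # before the bucket contents are served (hypothetical URL).
  payloadURL: https://dump-proxy.example.svc/buckets/dumps/my-data
  sizeBytes: 4294967296   # optional bookkeeping, made up for illustration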

We still went with using the custom object. Alongside it, we created a Kubernetes controller which handles the lifetime of the payload in the PV. For us this works fine, since the controller can be the single writer to the PV, while the actual services only need read access to it.
Combined with ownerReferences, this makes for a good integration into the Kubernetes object lifetime.
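As a sketch of how that can hook into deletion (the finalizer name and payload path are hypothetical): our controller adds a finalizer to the CR; when the owning Deployment is deleted and garbage collection marks the Dump for deletion, the object stays in Terminating until the controller has removed the payload file from the PV and cleared the finalizer.
apiVersion: "data.foo.bar/v1"
kind: Dump
metadata:
  name: my-data
  finalizers:
  - data.foo.bar/cleanup-payload   # hypothetical finalizer handled by our controller
  ownerReferences:
  - apiVersion: apps/v1
    kind: Deployment
    name: my-deploy
    uid: d9607a69-f88f-11e7-a518-42010a800195
spec:
  payloadPath: /dumps/my-data.bin  # location on the shared PV (illustrative)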

Related

Manually spawn stateful pod instances

I'm working on a project where I need to spawn 1 instance per user (customer).
I figured it makes sense to create some sort of manager to handle that and host it somewhere. Kubernetes seems like a good choice since it can be hosted virtually anywhere and it will automate a lot of things (e.g. ensuring instances keep running on failure).
All entities are in Python and have a corresponding Flask API.
                          InstanceManager          Instance (user1)
                           .-----------.            .--------.
POST /instances/user3 -->  |           | ---------- |        |---vol1
                           |           |            '--------'
                           |           | -----.
                           '...........'       \    Instance (user2)
                                                 \   .--------.
                                                  '- |        |---vol2
                                                     '--------'
Now I can't seem to figure out how to translate this into Kubernetes.
My thinking:
Instance is a StatefulSet since I want the data to be maintained through restarts.
InstanceManager is a Service with a database attached to track user to instance IP (for health checks, etc).
I'm pretty lost on how to make the InstanceManager spawn a new instance on an incoming POST request. I did a lot of digging (Operators, namespaces, etc.) but nothing seems straightforward. Namely, I don't even seem to be able to do that via kubectl. Am I thinking about how Kubernetes works completely wrong?
I've made some progress and thought to share.
Essentially you need to interact with the Kubernetes REST API directly instead of applying a static YAML or using kubectl, ideally with one of the numerous client libraries out there.
In our case there are two options:
Create a namespace per user and then a service in that namespace
Create a new service with a unique name for each user
The first approach seems more sensible since using namespaces gives a lot of other benefits (network control, resource allocation, etc.).
The service itself can point to a StatefulSet or a Pod, depending on the situation.
There's another gotcha (and possibly more): namespaces, pod names, etc. all need to conform to RFC 1123. So for namespaces, you can't simply use email addresses or even base64. You'll need to use something like user-100 and keep a mapping table to map back to the actual user.
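A minimal sketch of the namespace-per-user approach (the labels, port, and names below are assumptions for illustration; the StatefulSet providing the pods is omitted):
# Namespace created by the InstanceManager for one user,
# using an RFC 1123-compliant name mapped back to the real user.
apiVersion: v1
kind: Namespace
metadata:
  name: user-100
---
# Service in that namespace pointing at the user's instance pod(s).
apiVersion: v1
kind: Service
metadata:
  name: instance
  namespace: user-100
spec:
  selector:
    app: instance        # must match the labels on the StatefulSet's pod template
  ports:
  - port: 80
    targetPort: 5000     # e.g. the Flask port, assumed for illustration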

Understanding pod labels vs annotations

I am trying to understand the difference between labels and annotations.
The standard documentation says that annotations capture "non-identifying information".
Labels, on the other hand, can have selectors applied to them and are used to organise objects in a Kubernetes cluster.
If this is the case, then why does Istio use pod annotations instead of labels for various settings: https://istio.io/latest/docs/reference/config/annotations/
Aren't labels a good approach?
I am just trying to understand what advantages annotations provide, given that the Istio developers chose to use them.
Extending Burak's answer:
Kubernetes labels and annotations are both ways of adding metadata to
Kubernetes objects. The similarities end there, however. Kubernetes
labels allow you to identify, select and operate on Kubernetes
objects. Annotations are non-identifying metadata and do none of these
things.
Labels are mostly attached to resources like Pods, ReplicaSets, etc. They are also used to route traffic, for example wiring a Deployment's pods to a Service.
Labels are stored in the etcd database, so you can search by them.
Annotations are mostly for storing metadata and configuration, if any.
Metadata like: owner details, the last Helm release (if using Helm), sidecar injection settings.
You could store owner details in labels, but Kubernetes uses labels for traffic routing from a Service to a Deployment, and the labels have to be the same on both resources (Deployment & Service) for traffic to be routed.
What would you do in that case to keep the labels matching across resources? Use the same service-owner name inside every Deployment & Service, when you are running multiple distributed services managed by different teams and service owners?
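As a rough sketch of that label matching (the names and labels are made up for illustration): the Service's selector has to match the pod labels stamped on by the Deployment, while owner details can live in an annotation without affecting selection.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payments
spec:
  replicas: 1
  selector:
    matchLabels:
      app: payments              # identifying label used for selection
  template:
    metadata:
      labels:
        app: payments            # must match the Service selector below
      annotations:
        owner: team-payments     # non-identifying metadata, e.g. owner details
    spec:
      containers:
      - name: app
        image: nginx
---
apiVersion: v1
kind: Service
metadata:
  name: payments
spec:
  selector:
    app: payments                # routes traffic to pods carrying this label
  ports:
  - port: 80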
If you look at Istio's annotations, some of them exist just to store metadata, e.g. install.operator.istio.io/chart-owner and install.operator.istio.io/owner-generation.
Read more at: https://istio.io/latest/docs/reference/config/annotations/
You should also compare the syntax rules for labels and annotations.

Why doesn't the kubernetes documentation show the valid mappings for requests or capacity?

In kubernetes the correct syntax for specifying the storage requirement of a persistentVolume is:
spec:
  capacity:
    storage: 10Gi
When inspecting the documentation reference or using kubectl explain, I would expect to see the storage key noted in the docs, but it is absent. It's not clear from the documentation which mappings are allowed inside the capacity section of a persistentVolume spec, nor how to know that "storage" is the correct key to denote the disk requirement of the persistent volume.
~ kubectl explain --recursive persistentvolume.spec.capacity
FIELDS:
   .
   .
   capacity   <map[string]string>
   .
   .
You can see examples, of course, but that doesn't tell me what other possible keys there are, just what is in the example.
In a similar vein, I can see from this table tucked away in the docs that the allowed keys inside resources.requests are [cpu, memory, hugepages-<size>], but if I call kubectl explain --recursive pod.spec.resources I would expect to see:
FIELDS:
   limits   <map[string]string>
      cpu
      memory
      hugepages-<size>
   requests   <map[string]string>
      cpu
      memory
      hugepages-<size>
but instead you see just:
FIELDS:
   limits   <map[string]string>
   requests   <map[string]string>
Whilst the data type makes it clear that limits and requests are both mappings, the docs don't make it easy to find out what the valid mappings are.
For these mappings, other than finding examples, how do I know what all the possible valid keys are, and why don't they appear in the kubectl explain output?
There are many possible explanations, but the most probable are:
it just wasn't updated,
it was a design choice.
Remember, docs are written by people, not generated. When things get updated, certain changes may not be properly described.
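For completeness, a minimal sketch of those documented keys in use on a container (the pod and container names and the values are made up; hugepages-<size> becomes a concrete key such as hugepages-2Mi):
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo            # hypothetical name
spec:
  containers:
  - name: app                    # hypothetical container
    image: nginx
    resources:
      requests:
        cpu: 250m
        memory: 128Mi
        hugepages-2Mi: 64Mi      # hugepages-<size> key from the docs table
      limits:
        cpu: 500m
        memory: 256Mi
        hugepages-2Mi: 64Mi      # hugepages limits must equal requests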

Kubernetes resource versioning

Using kubectl get with -o yaml on a resource, I see that every resource is versioned:
kind: ConfigMap
metadata:
  creationTimestamp: 2018-10-16T21:44:10Z
  name: my-config
  namespace: default
  resourceVersion: "163"
I wonder what the significance of this versioning is, and for what purposes it is used (use cases)?
A more detailed explanation that helped me understand exactly how this works:
All the objects you’ve created throughout this book—Pods,
ReplicationControllers, Services, Secrets and so on—need to be
stored somewhere in a persistent manner so their manifests survive API
server restarts and failures. For this, Kubernetes uses etcd, which
is a fast, distributed, and consistent key-value store. The only
component that talks to etcd directly is the Kubernetes API server.
All other components read and write data to etcd indirectly through
the API server.
This brings a few benefits, among them a more robust optimistic
locking system as well as validation; and, by abstracting away the
actual storage mechanism from all the other components, it’s much
simpler to replace it in the future. It’s worth emphasizing that etcd
is the only place Kubernetes stores cluster state and metadata.
Optimistic concurrency control (sometimes referred to as optimistic
locking) is a method where instead of locking a piece of data and
preventing it from being read or updated while the lock is in place,
the piece of data includes a version number. Every time the data is
updated, the version number increases. When updating the data, the
version number is checked to see if it has increased between the time
the client read the data and the time it submits the update. If this
happens, the update is rejected and the client must re-read the new
data and try to update it again. The result is that when two clients
try to update the same data entry, only the first one succeeds.
Marko Luksa, "Kubernetes in Action"
So, all Kubernetes resources include a metadata.resourceVersion field, which clients need to pass back to the API server when updating an object. If the version doesn't match the one stored in etcd, the API server rejects the update.
The main purpose of the resourceVersion on individual resources is optimistic locking. You can fetch a resource, make a change, and submit it as an update, and the server will reject the update with a conflict error if another client has updated it in the meantime (their update would have bumped the resourceVersion, and the value you submit tells the server which version you think you are updating).
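A small sketch of what this looks like in practice (the data key and value are made up): if you read the ConfigMap at resourceVersion "163", modify it, and submit it with kubectl replace after someone else has already updated it, the API server answers with a 409 Conflict because the stored resourceVersion no longer matches.
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-config
  namespace: default
  resourceVersion: "163"   # the version this client read; checked by the API server on update
data:
  greeting: hello-v2       # hypothetical change being submitted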

What is the difference between secrets and configmaps? [duplicate]

We have been using Kubernetes Secrets to date.
Now we have ConfigMaps as well.
What is the preferred way forward - secrets or config maps?
P.S. After a few iterations we have stabilised at the following rule:
configMaps are per solution domain (can be shared across microservices within the domain, but ultimately are single purpose config entries)
secrets are shared across solution domains, usually represent third party systems or databases
I'm the author of both of these features. The idea is that you should:
Use Secrets for things which are actually secret, like API keys, credentials, etc.
Use ConfigMaps for not-secret configuration data
In the future, there will likely be some differentiators for secrets like rotation or support for backing the secret API w/ HSMs, etc. In general, we like intent-based APIs, and the intent is definitely different for secret data vs. plain old configs.
One notable difference in the implementation is the behaviour of kubectl apply -f:
ConfigMaps are reported as "unchanged" if the data hasn't changed.
Secrets are always reported as "configured", even if the file hasn't changed.
Both ConfigMaps and Secrets store data as key-value pairs. The major difference is that Secrets store data base64-encoded, while ConfigMaps store data as plain text.
If you have critical data like keys, passwords, service account credentials, or DB connection strings, then you should always go for Secrets rather than ConfigMaps.
And if you want to do some application configuration using environment variables which you don't need to keep secret/hidden, like the app theme or base platform URL, then you can go for ConfigMaps.
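For illustration, a minimal sketch of the same kind of data stored both ways (the names, keys, and values are made up); note that base64 is only an encoding, not encryption.
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config                # hypothetical name
data:
  APP_THEME: dark                 # plain-text value
  BASE_PLATFORM_URL: https://platform.example.com
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secret                # hypothetical name
type: Opaque
data:
  DB_PASSWORD: c3VwZXItc2VjcmV0   # base64 of "super-secret"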