kubectl imperative object configuration use case - kubernetes

According to the Kubernetes docs, kubectl supports "three kinds of object management":
imperative commands
imperative object configuration
declarative object configuration
While the use cases for the first and the last options are more or less clear, the second one really confuses me.
Moreover in the concepts section there is a clear distinction of use cases:
use imperative commands for quick creation of (simple) single-container resources
use declarative commands for managing (more complex) sets of resources
Also, the imperative style is recommended for the CKA certification, so it seems to be preferred for day-to-day cluster management activities.
But once again: what is the best use case / practice for the "Imperative object configuration" option, and what is the root idea behind it?

There are two basic ways to deploy to Kubernetes: imperatively, with kubectl commands, or declaratively, by writing manifests and using kubectl apply. Imperative object configuration sits between the two: you still name the operation yourself, but the object is defined in a file. A Kubernetes object should be managed using only one technique; mixing techniques for the same object results in undefined behavior.
Imperative commands operate on live objects
Imperative object configuration operates on individual files
Declarative object configuration operates on directories of files
Imperative object configuration creates, updates and deletes objects using configuration files, which contain fully-defined object definitions. You can store object configuration files in source control systems and audit changes more easily than with imperative commands.
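With imperative object configuration you run kubectl create, replace, and delete against configuration files (-f <file>). kubectl apply belongs to the declarative technique and works on configuration files or directories containing configuration files.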
Please refer to official documentation, where everything is fully described with examples. I hope it is helpful.
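To make the difference between the three techniques concrete, here is a minimal sketch in Python that only builds kubectl argument lists (it does not run kubectl). The function name kubectl_args and the file names nginx.yaml and configs/ are hypothetical, chosen for illustration.

```python
# Sketch only: builds kubectl argument lists for each management technique.
# The function name and the file/directory names are hypothetical.

def kubectl_args(technique, target):
    """Return example kubectl invocations (as argv lists) for a technique."""
    if technique == "imperative-command":
        # Operates on live objects; no configuration file is involved.
        return [["kubectl", "create", "deployment", target, "--image", target]]
    if technique == "imperative-object-config":
        # You name the operation yourself and point it at one file.
        return [
            ["kubectl", "create", "-f", target],
            ["kubectl", "replace", "-f", target],
            ["kubectl", "delete", "-f", target],
        ]
    if technique == "declarative-object-config":
        # kubectl compares the files with the live objects and decides
        # what to create or patch.
        return [["kubectl", "apply", "-f", target]]
    raise ValueError(f"unknown technique: {technique}")


for argv in kubectl_args("imperative-object-config", "nginx.yaml"):
    print(" ".join(argv))
```

The point of the middle technique is visible in the argument lists: the object definition lives in a file you can keep in source control, but you still decide whether the operation is a create, a replace, or a delete.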

Related

Filtering which items are stored in the Kubernetes shared informer cache

I have a Kubernetes controller written using the client-go informers package. It maintains a watch on all Pods in the cluster; there are about 15k of these objects, and their YAML representation takes around 600 MB to print (I assume their in-memory representation is not that different).
As a result, this (otherwise really small) controller watching Pods ends up with a huge memory footprint (north of 1 GiB). Even methods that you'd think offer a way of filtering, such as NewFilteredSharedInformerFactory, don't really give you a way to specify a predicate function that chooses which objects are stored in the in-memory cache.
Instead, that method in client-go offers a TweakListOptionsFunc. It helps you control ListOptions, but my predicate unfortunately cannot be satisfied with a labelSelector or fieldSelector. I need to drop the objects when they arrive at the controller, using a predicate function.
Note: the predicate I have is something like "Pods that have an ownerReference by a DaemonSet" (which is not possible with fieldSelectors –also another question of mine) and there's no labelSelector that can work in my scenario.
How would I go about configuring an informer on Pods that only have DaemonSet owner references to reduce the memory footprint of my controller?
Here's an idea: get a list of all the DaemonSets in your cluster, then read the spec.selector.matchLabels field to retrieve the labels that the DaemonSet Pods are bound to have. Use those labels as part of your TweakListOptionsFunc with logic like:
Pods with label1 OR label2 OR label3 ...
I know it's a bit of toil, but seems to be a working approach. I believe there isn't a way to specify fields in client-go.
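As a rough illustration of the idea above, the following Python sketch turns each DaemonSet's matchLabels into a label selector string. It works on plain dicts shaped like the API objects; the helper names selector_for and selectors_for are hypothetical. One caveat: requirements inside a single label selector are ANDed, and there is no OR across different keys, so the "label1 OR label2" logic probably means one list/watch per selector rather than a single combined selector.

```python
# Sketch only: DaemonSets are plain dicts shaped like the API objects.

def selector_for(match_labels):
    """Render one matchLabels map as an equality-based label selector."""
    return ",".join(f"{key}={value}" for key, value in sorted(match_labels.items()))


def selectors_for(daemonsets):
    """Collect one selector string per distinct DaemonSet selector."""
    selectors = set()
    for ds in daemonsets:
        match_labels = ds.get("spec", {}).get("selector", {}).get("matchLabels") or {}
        if match_labels:  # DaemonSets using only matchExpressions are skipped here
            selectors.add(selector_for(match_labels))
    return sorted(selectors)


daemonsets = [
    {"spec": {"selector": {"matchLabels": {"app": "fluentd"}}}},
    {"spec": {"selector": {"matchLabels": {"tier": "node", "k8s-app": "kube-proxy"}}}},
]
for selector in selectors_for(daemonsets):
    print(selector)
```

Each resulting string could then be set as the label selector for its own list/watch, which is where the toil mentioned above comes from.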
It appears that today, if you use SharedInformers, there's no way to filter which objects to keep in the shared cache and which ones to discard.
I found an interesting code snippet in the kube-state-metrics project that drops down to a lower layer of abstraction: it initiates Watch calls directly (which would normally be considered an anti-pattern) and uses watch.Filter to decide whether an object returned by a Watch() call is passed on to the cache/reflector.
That said, many controller authors might choose to not go down this path as it requires you to specify your own cache/reflector/indexer around the Watch() call. Furthermore, projects like controller-runtime don't even let you get access to this low-level machinery, as far as I know.
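The filtering idea itself is simple, even if wiring it into client-go is not. Below is a language-neutral sketch in Python, not the kube-state-metrics code: watch events are plain dicts, and the names owned_by_daemonset and filter_events are hypothetical. It only loosely mirrors what a filter placed in front of a cache would do.

```python
# Sketch only: events and Pods are plain dicts, not client-go types.

def owned_by_daemonset(pod):
    """Predicate: True if any ownerReference of the Pod is a DaemonSet."""
    refs = pod.get("metadata", {}).get("ownerReferences") or []
    return any(ref.get("kind") == "DaemonSet" for ref in refs)


def filter_events(events, keep):
    """Yield only the watch events whose object satisfies the predicate."""
    for event in events:
        if keep(event["object"]):
            yield event


events = [
    {"type": "ADDED", "object": {"metadata": {
        "name": "fluentd-x", "ownerReferences": [{"kind": "DaemonSet", "name": "fluentd"}]}}},
    {"type": "ADDED", "object": {"metadata": {
        "name": "web-y", "ownerReferences": [{"kind": "ReplicaSet", "name": "web"}]}}},
    {"type": "MODIFIED", "object": {"metadata": {"name": "bare-pod"}}},
]
for event in filter_events(events, owned_by_daemonset):
    print(event["type"], event["object"]["metadata"]["name"])
```

Only events that pass the predicate would reach the store, so discarded Pods never occupy cache memory.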
Another way to reduce a controller's memory footprint is field/data erasure on structs (instead of discarding objects altogether). This is possible in newer versions of client-go through cache.TransformFunc, which lets you delete fields you don't need before objects are stored (though these objects would still consume some memory). This one is more of a band-aid that can make your situation better.
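The field-erasure approach can be pictured with a small Python sketch on plain dicts. This is not client-go code; the function name slim_pod and the choice of fields to drop (managedFields, annotations, status) are assumptions made for illustration, and a real controller would keep whatever it actually reads.

```python
import copy

# Sketch only: which fields are safe to drop depends on the controller.
DROPPED_METADATA = ("managedFields", "annotations")


def slim_pod(pod):
    """Return a copy of the Pod with bulky fields removed before caching."""
    slim = copy.deepcopy(pod)  # leave the caller's object untouched
    metadata = slim.get("metadata", {})
    for field in DROPPED_METADATA:
        metadata.pop(field, None)
    slim.pop("status", None)
    return slim


pod = {
    "metadata": {
        "name": "fluentd-x",
        "ownerReferences": [{"kind": "DaemonSet", "name": "fluentd"}],
        "managedFields": [{"manager": "kubelet"}],
        "annotations": {"example.com/note": "large blob"},
    },
    "status": {"phase": "Running"},
}
print(sorted(slim_pod(pod)["metadata"]))
```

The object still exists in the cache, just smaller, which is why this helps less than dropping irrelevant objects entirely.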
In my case, I mostly needed to watch for DaemonSet Pods in certain namespaces, so I refactored the code from using 1 informer (watching all namespaces) to N namespace-scoped informers running concurrently.

In multi-stage compilation, should we use a standard serialisation method to ship objects through stages?

This question is formulated in Scala 3/Dotty but should be generalised to any language NOT in MetaML family.
The Scala 3 macro tutorial:
https://docs.scala-lang.org/scala3/reference/metaprogramming/macros.html
It starts with the Phase Consistency Principle, which explicitly states that free variables defined in one compilation stage CANNOT be used by the next stage, because the objects they are bound to cannot be persisted to a different compiler process:
... Hence, the result of the program will need to persist the program state itself as one of its parts. We don’t want to do this, hence this situation should be made illegal
This should be considered a solved problem, given that many distributed computing frameworks demand a similar capability to persist objects across multiple computers. The most common kind of solution (as observed in Apache Spark) uses standard serialisation/pickling to create snapshots of the bound objects (Java standard serialization, Twitter Kryo/Chill), which can be saved on disk or in off-heap memory, or sent over the network.
The tutorial itself also suggested the possibility twice:
One difference is that MetaML does not have an equivalent of the PCP - quoted code in MetaML can access variables in its immediately enclosing environment, with some restrictions and caveats since such accesses involve serialization. However, this does not constitute a fundamental gain in expressiveness.
In the end, ToExpr resembles very much a serialization framework
Instead, both Scala 2 and Scala 3 (and their respective ecosystems) largely ignore these out-of-the-box solutions and only provide default methods for primitive types (Liftable in Scala 2, ToExpr in Scala 3). In addition, existing libraries that use macros rely heavily on manual definition of quasiquotes/quotes for this trivial task, making the source much longer and harder to maintain, while not making anything faster (as JVM object serialisation is a highly-optimised language component).
What's the cause of this status quo? How do we improve it?

Updating Metadata Annotations

I am using kubebuilder to create a Kubernetes operator. When an object of my kind is created, I have to parse the spec and update the objects based on a few calculations.
From what I can tell I can either update the status of the object, the metadata, or a managed field (I may be wrong?). It appears that the sigs.k8s.io/controller-runtime/pkg/client library is responsible for how to update these fields (I'm not completely sure). I am having trouble understanding the docs.
I have the following questions:
Is there a guide to best practices about where to store configuration on the object between status, metadata (labels or annotations), and managed fields?
How do I update/patch the annotations of an object similar to how I would use r.Status().Update(ctx, &thing); to update the status?
The Kubebuilder docs are a bit raw but are nonetheless a handy guide when building CRDs and controllers with Kubebuilder. They walk you through a fairly detailed example, which is great to study and refer back to when you want to see how to do certain things.
The answer to your question generally is, "it depends." What values are you calculating, and why? Why do you need to store them on the object? Is the lifecycle of this data coupled to the lifecycle of this object, or might this computed data need to live on and be used by other controllers even when the object is deleted? In general, is anything going to interact with those values? What is it going to do with them?
If nothing else aside from the reconciliation controller for the CRD is going to interact with the data you're putting, consider putting it within the object's Status.
Doing r.Status().Update(ctx, &thing) will avoid triggering any side-effects as it will only persist changes you've made to the object's Status subresource, rather than its spec or metadata.
A common thing to do with custom resources is to set and remove finalizers, which live in the object's metadata.

Distribute public key to all pods in all namespaces automatically

I have a public key that all my pods need to have.
My initial thought was to create a ConfigMap or Secret to hold it, but as far as I can tell neither of those can be used across namespaces. Apart from that, it's really boilerplate to paste the same volume into all my Deployments.
So now I'm left with only, in my opinion, bad alternatives, such as creating the same ConfigMap/Secret in every Namespace and copy-pasting the volume into each Deployment.
Any other alternatives?
Extra information after questions.
The key doesn't need to be kept secret, it's a public key, but it needs to be distributed in a trusted way.
It won't rotate often, but when it does, rebuilding all the images is not an option.
Almost all images/pods need this key, and there will be hundreds of images/pods.
You can use Kubernetes initializers to intercept object creation and mutate as you want. This can solve copy-paste in all your deployments and you can manage it from a central location.
https://medium.com/google-cloud/how-kubernetes-initializers-work-22f6586e1589
You will still need to create configmaps/secrets per namespace though.
While I don't really like the idea, one of the ways to solve it could be an init container that populates a volume with key(s) you need and then these volumes can be mounted in your containers as you see fit. That way it becomes independent of how you namespace stuff and relies only on how pods are defined/created.
That said, the Kubed mentioned by Ryan above sounds like a more reasonable approach to a case like this one. Last but not least, something creates your namespaces after all, so having that same process create the required elements of a namespace sounds legit as well.

Namespaces in Redis?

Is it possible to create namespaces in Redis?
From what I found, all the global commands (count, delete all) work on all the objects. Is there a way to create sub-spaces such that these commands will be limited in context?
I don't want to set up different Redis servers for this purpose.
I assume the answer is "No", and wonder why this wasn't implemented, as it seems to be a useful feature without too much overhead.
A Redis server can handle multiple databases, which are numbered. It provides 16 of them by default; you can access them using the -n option of the redis-cli command, via similar options in a client library's connection arguments, or by calling the "select()" method on its connection objects. (In this case .select() is the method name in the Python Redis module; I presume it's named similarly in other libraries and interfaces.)
There's an option in the configuration file for the Redis server daemon to control how many separate databases you want as well. I don't know what the upper limit would be, and there doesn't seem to be a way to change it dynamically (in other words, it seems that you'd have to shut down and restart the server to add additional DBs). Also, there doesn't seem to be a way to associate these DB numbers with any sort of name, nor to impose separate ACLs, or even different passwords, on them. Redis, of course, is schema-less as well.
If you are using Node, ioredis has transparent key prefixing, which works by having the client prepend a given string to each key in a command. It works in the same way that Ruby's redis-namespace does. This client-side approach still puts all your keys into the same database, but at least you add some structure, and you don't have to use multiple databases or servers.
const Redis = require('ioredis');
var fooRedis = new Redis({ keyPrefix: 'foo:' });
fooRedis.set('bar', 'baz'); // Actually sends SET foo:bar baz
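The client-side prefixing technique is not specific to Node. As a minimal sketch of what such a wrapper does, here is a Python version that uses a plain dict as a stand-in for the Redis server; the class name PrefixedStore and its methods are hypothetical, not part of any Redis client library.

```python
# Sketch only: a dict stands in for the Redis server, and PrefixedStore
# is a hypothetical wrapper, not a real client API.

class PrefixedStore:
    def __init__(self, backend, prefix):
        self._backend = backend
        self._prefix = prefix

    def _key(self, key):
        # Every key is transparently prepended with the namespace prefix.
        return self._prefix + key

    def set(self, key, value):
        self._backend[self._key(key)] = value

    def get(self, key):
        return self._backend.get(self._key(key))

    def count(self):
        """Count only the keys that belong to this prefix."""
        return sum(1 for k in self._backend if k.startswith(self._prefix))

    def delete_all(self):
        """Delete only the keys that belong to this prefix."""
        doomed = [k for k in self._backend if k.startswith(self._prefix)]
        for k in doomed:
            del self._backend[k]
        return len(doomed)


server = {}
foo = PrefixedStore(server, "foo:")
other = PrefixedStore(server, "other:")
foo.set("bar", "baz")      # stored under "foo:bar"
other.set("bar", "qux")    # stored under "other:bar"
foo.delete_all()           # leaves "other:bar" untouched
print(server)
```

This gives the "count" and "delete all" operations from the question a limited context, at the cost of scanning keys by prefix instead of relying on a server-side namespace.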
If you use Ruby you can look at these gems:
https://github.com/resque/redis-namespace
https://github.com/jodosha/redis-store