I am new to Kubernetes Operators. I have a general question about how to conceive of cleanup at the point of deletion.
Let's say the Controller is managing a resource which consists of a Deployment among other things. This Deployment writes to some external database. I'd like the items from the Database to be deleted when the resource is deleted (but not when its Pod is simply restarted - thus it can't happen as part of the application's shut down logic).
It seems like the database purging would have to happen in the Controller then? But this makes me a bit uneasy since it seems like this knowledge of how values are stored is the knowledge of the resource being managed, not the Controller. Is the only other good option to have the Controller send a message to the underlying application to perform its own cleanup?
What is the general way to handle this type of thing?
Have you heard about Finalizers and Owner References in Kubernetes? Finalizers let a controller block the actual removal of a resource until it has finished its own cleanup (such as purging your external database), while owner references describe how groups of objects are related. They are properties on resources that specify their relationship to one another, so entire trees of resources can be deleted together.
To avoid further copy-pasting, I will just leave the links here: Understanding Finalizers
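Just as a rough sketch of how the two pieces fit together (the group example.com/v1, the kind MyApp, the finalizer name, and the image are made up for illustration): the controller adds a finalizer to your custom resource; when you delete the resource, Kubernetes only sets deletionTimestamp and leaves the object in place; the controller notices that, purges the database, and then removes the finalizer so the object can actually disappear. The Deployment carries an ownerReference back to the custom resource, so the garbage collector removes it automatically:

apiVersion: example.com/v1            # hypothetical CRD group/version
kind: MyApp                           # hypothetical custom resource managed by your operator
metadata:
  name: myapp-sample
  finalizers:
    - example.com/db-cleanup          # the controller removes this only after purging the database
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-sample-workers
  ownerReferences:                    # normally set by the controller, not by hand
    - apiVersion: example.com/v1
      kind: MyApp
      name: myapp-sample
      uid: REPLACE-WITH-OWNER-UID     # must be the live UID of the owning MyApp object
      controller: true
      blockOwnerDeletion: true
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp-sample
  template:
    metadata:
      labels:
        app: myapp-sample
    spec:
      containers:
        - name: worker
          image: example/worker:1.0   # hypothetical image that writes to the external database

With this split, the knowledge of how the database is purged stays in the controller's deletion path, and the Deployment itself never needs to know the resource is being torn down.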
I have my application deployed on a Kubernetes cluster, managed by operator (A). I am mounting secrets with SSL key materials into the deployment, so the application can access their content.
I have a separate operator (B) deployment, which is responsible for creating those secrets with the SSL key materials. Now I have a use case where my secrets are recreated by operator (B), and it deletes/restarts the pods, which are managed by operator (A).
I am trying to understand: is it common practice to allow a separately deployed operator to delete pods?
My perception was that an operator should work only with the resources it manages, nothing more.
Community wiki to summarise the topic.
If it is as you say and both operators are proprietary, it is impossible to give a definite yes or no answer. Everything depends on what is really going on there, and we are not able to check and evaluate it.
Look at the helpful comments provided by David Maze:
That sort of seems like a bug...but also, Pods are pretty disposable and the usual expectation is that a ReplicaSet or another controller will recreate them...?
Note that the essence of the Kubernetes controller model is that the controller looks at the current state of the Kubernetes configuration store (not changes or events, just which objects exist and which don't) and tries to make the things it manages match that, so if the controller believes it should manage some external resource and there's not a matching Kubernetes object, it could delete it.
I have an app I'm building on Kubernetes which needs to dynamically add and remove worker pods (which can't be known at initial deployment time). These pods are not interchangeable (so increasing the replica count wouldn't make sense). My question is: what is the right way to do this?
One possible solution would be to call the Kubernetes API to dynamically start and stop these worker pods as needed. However, I've heard that this might be a bad way to go since, if those dynamically-created pods are not in a replica set or deployment, then if they die, nothing is around to restart them (I have not yet verified for certain if this is true or not).
Alternatively, I could use the Kubernetes API to dynamically spin up a higher-level abstraction (like a replica set or deployment). Is this a better solution? Or is there some other more preferable alternative?
If I understand you correctly, you need ConfigMaps.
From the official documentation:
The ConfigMap API resource stores configuration data as key-value pairs. The data can be consumed in pods or provide the configurations for system components such as controllers. ConfigMap is similar to Secrets, but provides a means of working with strings that don’t contain sensitive information. Users and system components alike can store configuration data in ConfigMap.
Here you can find some examples of how to set it up.
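For instance, a minimal sketch could look like this (the name worker-config and the image example/worker:1.0 are purely illustrative):

apiVersion: v1
kind: ConfigMap
metadata:
  name: worker-config            # illustrative name
data:                            # plain key-value pairs, nothing sensitive
  WORKER_MODE: "batch"
  MAX_RETRIES: "3"
---
apiVersion: v1
kind: Pod
metadata:
  name: worker
spec:
  containers:
    - name: worker
      image: example/worker:1.0  # hypothetical image
      envFrom:
        - configMapRef:
            name: worker-config  # exposes the keys above as environment variables

A ConfigMap can also be mounted as a volume if the application expects configuration files rather than environment variables.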
Please try it and let me know if that helped.
Is it possible to prevent project deletion in OpenShift?
I have some projects that must not be deleted. Sure, I can recreate any projects that were accidentally deleted, but there would still be an outage.
I've read through a lot of docs but haven't come across anything yet. Haven't found anything on preventing namespace deletion in Kubernetes either. I'm hoping I missed something.
You can prevent deletion by certain users, but not deletion in general. A good starting point for authorization in OpenShift might be:
https://docs.okd.io/3.9/architecture/additional_concepts/authorization.html
This means you should have a user or group of users whom you trust not to delete a project by accident, and ordinary users who can create and destroy objects inside the project but cannot destroy the project itself.
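As an illustrative sketch in plain Kubernetes RBAC terms (OpenShift project roles build on the same mechanism; the names important-project, project-user and dev-team are made up), the trusted admins keep a role that can delete projects, while ordinary users are bound to a namespaced Role that never grants delete on the project/namespace object itself:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: project-user                  # illustrative name
  namespace: important-project        # the project you want to protect
rules:
  - apiGroups: ["", "apps"]
    resources: ["pods", "services", "configmaps", "deployments"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
# no rule here grants delete on the project/namespace object itself,
# so users bound to this Role can work inside the project but cannot remove it
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: project-user-binding
  namespace: important-project
subjects:
  - kind: Group
    name: dev-team                    # illustrative group of ordinary users
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: project-user
  apiGroup: rbac.authorization.k8s.io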
Hope this helps.
One of the documented best practices for Kubernetes is to store the configuration in version control. It is mentioned in the official best practices and also summed up in this Stack Overflow question. The reason is that this is supposed to speed-up rollbacks if necessary.
My question is, why do we need to store this configuration if this is already stored by Kubernetes and there are ways with which we can easily go back to a previous version of the configuration using for example kubectl? An example is a command like:
kubectl rollout history deployment/nginx-deployment
Isn't storing the configuration an unnecessary duplication of a piece of information that we will then have to keep synchronized?
The reason I am asking this is that we are building a configuration service on top of Kubernetes. The user will interact with it to configure multiple deployments, and I was wondering whether we should keep a history of the Kubernetes configuration and the contents of ConfigMaps in a database for possible rollbacks, or just rely on Kubernetes to retrieve the current configuration and roll back to previous versions of it.
To your point, you can use Kubernetes as your store of configuration; it's just that you probably shouldn't want to. By storing configuration as code, you get several benefits:
Configuration changes get regular code reviews.
They get versioned, are diffable, etc.
They can be tested, linted, and whatever else you desire.
They can be refactored, share code, and be documented.
And all this happens before actually being pushed to Kubernetes.
That may seem bad ("but then my configuration is out of date!"), but keep in mind that configuration is actually never in date - just because you told Kubernetes you want 3 replicas running doesn't mean there are, or if there were that 1 isn't temporarily down right now, and so on.
Configuration expresses intent. It takes a different process to actually notice when your intent changes or doesn't match reality, and make it so. For Kubernetes, that storage is etcd and it's up to the master to, in a loop forever, ensure the stored intent matches reality. For you, the storage is source control and whatever process you want, automated or not, can, in a loop forever, ensure your code eventually becomes reflected in Kubernetes.
The rollback command, then, is just a very fast shortcut to "please do this right now!". It's for when your configuration intent was wrong and you don't have time to fix it. As soon as you roll back, you should chase your configuration and update it there as well. In a sense, this is indeed duplication, but it's a rare event compared to the normal flow, and the overall benefits outweigh this downside.
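As a tiny illustration of "intent kept in source control" (the file path and image tag are made up; the Deployment name matches the rollout example above), the artifact you review, diff and roll back in git can be as small as this:

# deploy/nginx-deployment.yaml -- lives in git and is reviewed like any other code change
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3                   # the declared intent; Kubernetes keeps working to make reality match it
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.25     # illustrative tag; changing it here and re-applying is the git-driven "rollback"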
A Kubernetes cluster doesn't store your configuration, it runs it, just as your server runs your application code.
First of all, this is a question regarding my thesis for school. I have done some research about this, it seems like a problem that hasn't been tackled yet (might not be that common).
Before jumping right into the problem, I'll give a brief example of my use case.
I have multiple namespaces containing microservices that depend on a state X. To manage this, the microservices are put in a namespace named after the state (so namespaces state_A, state_B, ...).
It is important to know that each microservice needs this state at startup. It will download necessary files, ... according to the state. When launching it with state A version 1, it is very likely that the state gets updated every month. When this happens, it is important to let all the microservices that depend on state A upgrade whatever is necessary (databases, in-memory state, ...).
My current approach for this problem is simply using events: the microservices that need updates when the state changes can subscribe to the event and migrate/upgrade accordingly. The only problem I'm facing is that while the service is upgrading, it should still work. So somehow I should duplicate the service first, let the duplicate upgrade, and when the upgrade is successful, shut down the original. Because of this, the orchestration service used would have to be able to create duplicates (including duplicating the state).
My question is, are there already solutions for my problem (and if yes, which ones)? I have looked into Netflix Conductor (which seemed promising with its workflows and events), Amazon SWF, Marathon and Kubernetes, but none of them covers my problem.
Ideally, the solution should not be bound to a specific platform (Azure, GCE, ...).
For an uninterrupted upgrade you should use a cluster of nodes providing your service and perform a rolling update, which takes out a single node at a time, upgrades it, and leaves the rest of the nodes to keep serving. I recommend looking at the concept of virtual services (e.g. in Kubernetes) and rolling updates.
For inducing state I would recommend looking into container initialization mechanisms. For example, in Docker you can use entrypoint scripts, and in Kubernetes there is the concept of init containers. You should note, though, that today there is a trend to decouple services and state, meaning the state is kept in a DB that is separate from the service deployment, allowing the service to be viewed as a stateless component that can be replaced without losing state (given that the interfacing between the service and the required state did not change). This is good in scenarios where the service changes more frequently and the DB design less frequently.
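As a hedged sketch combining both ideas (all names and images are made up), a Deployment can declare a rolling update strategy so pods are replaced one at a time, and use an init container to fetch the required state before the main container starts:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: state-a-service                    # illustrative name
  namespace: state-a                       # illustrative; namespace names cannot contain underscores
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1                    # replace one pod at a time, the rest keep serving traffic
      maxSurge: 1
  selector:
    matchLabels:
      app: state-a-service
  template:
    metadata:
      labels:
        app: state-a-service
    spec:
      initContainers:
        - name: fetch-state                # runs to completion before the service container starts
          image: example/state-fetcher:1.0 # hypothetical image that downloads the state files
          args: ["--state=A", "--version=1"]
          volumeMounts:
            - name: state
              mountPath: /state
      containers:
        - name: service
          image: example/microservice:1.0  # hypothetical service image
          volumeMounts:
            - name: state
              mountPath: /state
      volumes:
        - name: state
          emptyDir: {}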
Another note - I am not sure that representing state in a namespace is a good idea. Typically a namespace is a static construct for organization (of code, services, etc.) that aims for stability.