Can I use different versions of Cassandra in a cluster? - kubernetes

Can I use different versions of Cassandra in a single cluster? My goal is to transfer data from one DC (A) to a new DC (B) and then decommission DC (A), but DC (A) is on version 3.11.3 and DC (B) is going to be on 3.11.7+.
I want to use a K8ssandra deployment with metrics and the other extras, but the K8ssandra project cannot deploy Cassandra versions older than 3.11.7.
Thank you!

K8ssandra itself is purposefully an "opinionated" stack, which is why it only supports certain more recent Cassandra versions that are not known to contain major issues.
But if you already have the existing cluster, that doesn't mean you can't migrate between the two. Check out this blog post for an example of doing exactly that: https://k8ssandra.io/blog/tutorials/cassandra-database-migration-to-kubernetes-zero-downtime/
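That post covers the Kubernetes side; underneath, moving from DC (A) to DC (B) is usually the standard Cassandra add-DC / rebuild / decommission sequence. A rough sketch (the keyspace name, DC names and replication factors below are placeholders; follow the official datacenter-migration docs for your exact version):

    # 1. In cqlsh: replicate every application keyspace (and system_auth, etc.) to the new DC
    ALTER KEYSPACE my_keyspace
      WITH replication = {'class': 'NetworkTopologyStrategy', 'DC-A': 3, 'DC-B': 3};

    # 2. On every node in DC (B): stream the existing data over from DC (A)
    nodetool rebuild -- DC-A

    # 3. Switch clients to DC (B), then, in cqlsh, stop replicating to the old DC
    ALTER KEYSPACE my_keyspace
      WITH replication = {'class': 'NetworkTopologyStrategy', 'DC-B': 3};

    # 4. On each node in DC (A), one at a time
    nodetool decommission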

Related

How to approach updating Kubernetes API versions when upgrading a kops-based cluster?

Currently, we run a kops-based cluster at version 15. We are planning to upgrade it to version 16 first and then further. However, the API versions for various Kubernetes resources in our YAMLs will also need to change. How would you address this before the cluster upgrade? Is there any way to enumerate all objects in the cluster with incompatible API versions, or what would be the best approach? I suspect the objects created by kops, e.g. the kube-system objects, will be upgraded automatically.
When you upgrade the cluster, the API server will take care of upgrading all existing resources in the cluster. The problem arises when you deploy more resources after the upgrade and they still use the old API versions: in that case your deployment (say, kubectl apply) will fail.
In other words, nothing already running in the cluster will break, but future deployments will if they still use the old versions.
The resources managed by kOps already use new API versions.
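As a concrete (hypothetical) illustration: a Deployment manifest written against extensions/v1beta1 will be rejected on a 1.16+ cluster with an error along the lines of "no matches for kind Deployment in version extensions/v1beta1", and the fix is simply to move it to the stable API group, which also makes the selector mandatory:

    apiVersion: apps/v1            # was: extensions/v1beta1
    kind: Deployment
    metadata:
      name: example-app            # hypothetical workload
    spec:
      replicas: 2
      selector:                    # required by apps/v1
        matchLabels:
          app: example-app
      template:
        metadata:
          labels:
            app: example-app
        spec:
          containers:
            - name: example-app
              image: nginx:1.21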

Updating StatefulSets in Kubernetes with a proprietary vendor?

I may not be understanding Kubernetes correctly, but our application relies on a proprietary, closed-source vendor product that in turn relies on Solr. I've read articles on rolling updates with StatefulSets, but they seem to depend on the application being aware of and accounting for new schema versions, which we have no ability to do without decompiling and jumping through a lot of hoops. Let me describe what we're trying to do:
WebService Version 1 needs to be upgraded to WebService Version 2. This upgrade involves none of our code, just the vendor code our code relies on. Think of it like updating the OS.
However, WebService Version 1 relies on Solr Version 1. The managed schema is different and there are breaking changes between Solr Version 1 and 2; both the Solr version and the schemas differ. If WebService Version 1 hits Solr Version 2 it won't work, or worse, it will break Solr Version 2. The same is true in reverse: if we update to WebService Version 2 and it gets Solr Version 1, it will break that.
The only thing I can think of is to get Kubernetes to spin up a pod for each version and not bring down version 1 until version 2 is up, for both WebService and Solr.
This doesn't seem right; am I understanding this correctly?
This is not really a problem Kubernetes can solve. First work out how you would do it by hand; then you can start working out how to automate it. If zero downtime is a requirement, the best approach I can imagine is launching the new Solr cluster separately rather than doing an in-place upgrade, then launching the new app separately, pointing at the new Solr. You will still need to work out how to sync data between the two Solr clusters in real time during the upgrade. Again, Kubernetes neither helps nor hinders here: the problem is not launching or managing the containers, it is a logistical issue in your architecture.
What the canary release strategy with Solr suggests is simply having a new StatefulSet with the same labels as the one running the previous version.
Since labels can be assigned to many objects, and Services route requests at the network level based on them, requests will be distributed across both StatefulSets, emulating the canary release model.
Following this logic, you can have a v1 StatefulSet with, say, 8 replicas and a v2 StatefulSet with 2. Roughly 80% of requests should then hit v1 and ~20% v2 (not exactly, but it illustrates the idea).
From there, you can play with the number of replicas of each StatefulSet until you have "rolled out" 100% of the replicas of v2, with no downtime.
Now, this can work in your scenario if you label each duo (application + Solr version) in the way described above, as in the sketch below.
Each duo would then receive roughly N% of requests, depending on the number of replicas it has. You can slowly decrease the number of replicas of duo v1 and increase those of the next, updated version.
This approach has the downside of using more resources, since you will be running two versions of your full application stack. However, there is no downtime when upgrading the whole stack, and you control the percentage of the rollout.
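A minimal sketch of that label arrangement (names, labels, images and replica counts are made up): one Service selects only the shared label, each StatefulSet adds its own version label, and traffic is split roughly in proportion to the replica counts. In practice you may also want a headless Service as the StatefulSets' governing service.

    apiVersion: v1
    kind: Service
    metadata:
      name: webservice
    spec:
      selector:
        app: webservice            # shared by both versions
      ports:
        - port: 80
          targetPort: 8080
    ---
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: webservice-v1
    spec:
      serviceName: webservice
      replicas: 8                  # ~80% of requests
      selector:
        matchLabels:
          app: webservice
          version: v1
      template:
        metadata:
          labels:
            app: webservice
            version: v1
        spec:
          containers:
            - name: webservice
              image: vendor/webservice:1   # placeholder image
    ---
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: webservice-v2
    spec:
      serviceName: webservice
      replicas: 2                  # ~20% of requests
      selector:
        matchLabels:
          app: webservice
          version: v2
      template:
        metadata:
          labels:
            app: webservice
            version: v2
        spec:
          containers:
            - name: webservice
              image: vendor/webservice:2   # placeholder image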

Can someone explain some use cases of Helm to me?

I'm currently using Kubernetes and I came across Helm.
Let's say I don't like the idea of "infecting" my Kubernetes cluster with a process that is not related to my applications, but I would gladly accept it if it were beneficial.
So I did some research, but I still can't find anything I can't easily do with my YAML descriptors and kubectl, so for now I can't find a use for it except, maybe, for parameterizing environments.
For example (taking these from guides I read):
you can easily install an application, e.g. helm install nginx -> I add an nginx image to my deployment descriptor, done
repositories -> I have Docker ones (where I pull my images from)
you can easily helm rollback in case of a release failure -> I just change the image version back to the previous one in my Kubernetes descriptor, easy
What bothers me is that, at the level of commands, I do pretty much the same amount of work (helm upgrade vs. kubectl apply).
In exchange, I get a lot of boilerplate from the directory structure Helm wants, and I feel like I'm missing the control I have with plain deployment descriptors... what am I missing?
Your question is totally understandable. For small and simple deployments the benefits are not actually that great, but when the deployment of something is very complex, Helm helps a lot.
Say you have a couple of squads that develop microservices for some company. If you can make a chart that works for most of them, the deployment of each microservice would differ only by the image and the resources required. That way you get a standardized deployment that is easier for all developers.
Another use case is deploying applications that require a lot of moving parts. For example, if you want to deploy a Grafana server on Kubernetes, you're probably going to need at least a Deployment and a ConfigMap, then a Service that matches the Deployment, and if you want to expose it to the internet, an Ingress too.
That one relatively simple application would require four different YAMLs that you would have to configure by hand and verify. Instead, you could do a simple helm install (see the sketch below) and reuse the configuration that someone has already made, sometimes even the company that created the application.
There are a lot of other use cases, but these two are the ones I would say are the most common.
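To make the Grafana example concrete (the release name and values file are made up; the chart repository is the upstream one), the Deployment/ConfigMap/Service/Ingress bundle collapses to something like:

    # Add the upstream chart repository and install Grafana with its packaged defaults
    helm repo add grafana https://grafana.github.io/helm-charts
    helm repo update
    helm install my-grafana grafana/grafana

    # Customizations (persistence, ingress, datasources, ...) go through chart values
    helm upgrade my-grafana grafana/grafana -f my-values.yaml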
Here are three suggestions of ways Helm can be useful:
Your continuous deployment system somewhat routinely produces new builds and wants to send them to the Kubernetes cluster. You can use templating to specify the image name and tag in a Deployment, and then run helm upgrade ... --set tag=201907211931 to request a specific tag (see the sketch at the end of this answer).
You might have various service-specific controls like the log level or external database hostnames. The Helm values mechanism gives a uniform way to specify them, without having to know the details of the Kubernetes YAML files.
There is a repository of pre-packaged application charts, so if you want replicated PostgreSQL with in-cluster persistent storage, that's already built for you and you can just depend on it, rather than figuring out the right combination of StatefulSets and PersistentVolumeClaims yourself.
You can combine these in interesting (and potentially complex) ways: use an in-cluster database for developer testing but a cloud-hosted, backed-up database for production, for example, and compute the database host name based on which combination of settings is provided.
There are, of course, alternative ways to do all of these things. Kustomize in particular can change the image value fairly straightforwardly, and is notable for having been included in the kubectl tool since Kubernetes 1.14 (see also Declarative Management of Kubernetes Objects Using Kustomize in the Kubernetes documentation). The "operator" pattern gives an alternative path to installing software in your cluster, but even more so than with Helm, you are trusting an arbitrary program with API access.
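For the first suggestion, the template/values wiring is roughly this (a sketch with made-up names, not any particular chart):

    # templates/deployment.yaml -- fragment of the container spec
          containers:
            - name: my-service
              image: "registry.example.com/my-service:{{ .Values.tag }}"

    # values.yaml -- the default tag
    tag: latest

    # What the CD pipeline runs for a specific build
    helm upgrade my-release ./my-chart --set tag=201907211931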

Packaging a Kubernetes-based application

We have multiple (20+) services running inside Docker containers, managed with Kubernetes. These services include databases, streaming pipelines and custom applications. We want to make this product available as an on-premises solution so that it can be installed easily, like a one-click installation, hiding all the complexity of the infrastructure.
What would be the best way of doing this? Currently we have scripts managing it, but as we move into production there will be frequent upgrades, and it will become more and more complex to manage all the dependencies.
I am currently looking into Helm and am wondering if I am exploring in the right direction. Any guidance would be really helpful. Thanks.
Helm seems like the way to go, but what you need to think about, in my opinion, is how you will deliver updates to your software. For example, will you provide a single 'version' of your whole stack that translates into a particular composition of infra setup and microservice versions, or will you allow your customers to upgrade individual microservices as they are released? You can have one huge Helm chart for everything, or you can use, like I do in most cases, an "umbrella" chart that contains subcharts for all the microservices.
My usual setup contains a subchart for every service; service names are then correctly namespaced, so they can be referenced as .Release.Name-subchart[-optional]. When I need to upgrade, I just upgrade the whole chart with something like --reuse-values --set subchart.image.tag=v1.x.x, which gives granular control over each service's version. I also gate each subchart's resources with if .Values.enabled so I can individually enable/disable each subchart's resources (see the sketch below).
The ugly side of this is that if you want to release a single-service upgrade, you still need to run the whole umbrella chart, leaving more surface for some kind of error. On the other hand, it gives you the ability to deploy the whole solution with one command (the default tags are :latest, so a clean install will always install the latest published versions, which then get updated with tagged releases).
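A minimal sketch of that umbrella layout (chart and service names are made up): the subcharts live in the parent's charts/ directory, each subchart gates its resources behind an enabled flag, and a single --set flag upgrades or disables just one service.

    # Umbrella chart layout
    umbrella/
      Chart.yaml
      values.yaml
      charts/
        service-a/        # one subchart per microservice
        service-b/

    # charts/service-a/templates/deployment.yaml -- top and bottom of the file
    {{- if .Values.enabled }}
    apiVersion: apps/v1
    kind: Deployment
    # ... the rest of the Deployment ...
    {{- end }}

    # Upgrade only service-a to a tagged release, keeping everything else as-is
    helm upgrade my-product ./umbrella --reuse-values --set service-a.image.tag=v1.2.3

    # Disable service-b's resources on the next upgrade
    helm upgrade my-product ./umbrella --reuse-values --set service-b.enabled=false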

How to update a Kubernetes cluster to the latest available version?

I began trying Google Container Engine recently. I would like to upgrade the Kubernetes cluster to the latest available version, if possible without downtime. Is there a way to do this?
Unfortunately, the best answer we currently have is to create a new cluster and move your resources over, then delete the old one.
We are very actively working on making cluster upgrades reliable (both nodes and the master), but upgrades are unlikely to work for the majority of currently existing clusters.
We now have a checked-in upgrade tool for master and nodes: https://github.com/GoogleCloudPlatform/kubernetes/blob/master/cluster/gce/upgrade.sh
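Hand-rolled, that "create a new cluster, move your resources over, delete the old one" flow looks roughly like this (a sketch with placeholder names; stateful data needs its own migration plan):

    # Create the replacement cluster on the newer Kubernetes version
    gcloud container clusters create new-cluster --cluster-version=<target-version>

    # Point kubectl at the new cluster and re-apply your declarative manifests
    gcloud container clusters get-credentials new-cluster
    kubectl apply -f ./manifests/

    # Once traffic has been cut over, remove the old cluster
    gcloud container clusters delete old-cluster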