What is the proper procedure for upgrading a multi-node Crate cluster? - upgrade

I have a Crate cluster consisting of multiple nodes. The cluster is currently running 0.39.1, installed from the Ubuntu stable repository. I would like to upgrade to 0.40.2 with no downtime on the cluster.
Is it wise to simply use the ES rolling upgrade process (given that we have the ES API enabled) referenced here: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-upgrade.html#rolling-upgrades

We cannot give any guarantees that the ES rolling upgrade works in every case and for every setup and dataset; we would have to do extensive testing to verify that. Additionally, you have to enable the ES REST API for it to work.
Nonetheless, the following is true:
Crate has used ES 1.x since version 0.24.0.
There are no breaking changes in Crate between 0.38.3 and the current stable release 0.40.2.
According to the ES documentation you referenced, rolling upgrades should be supported for all minor/maintenance releases since 1.0.
We've put support for rolling upgrades (without using the ES API) on our backlog.
Stay tuned.
Update:
Since release 0.44.0, Crate supports zero downtime upgrades via a rolling upgrade with strong data availability guarantees. You don't have to enable the ES REST API. See the documentation about Zero Downtime Upgrades.
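For reference, the rolling upgrade revolves around Crate's graceful-stop cluster settings, which control how a node hands off its shards before it is stopped and upgraded. The crate.yml snippet below is only a sketch; the setting names are taken from the Crate documentation of that era and should be verified against the docs for your exact version.
# crate.yml - sketch of the graceful-stop settings used during a rolling upgrade
# (verify names and defaults against the docs for your Crate version)
cluster.graceful_stop.min_availability: full   # keep primaries and replicas available while a node drains
cluster.graceful_stop.timeout: 2h              # how long a node may take to hand off its shards
cluster.graceful_stop.force: false             # don't shut down anyway if the hand-off times out
With those settings in place, the general loop is: gracefully stop one node, upgrade its package, start it again, wait for the cluster to report full health, then move on to the next node.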

Related

I want to roll back to previous version of Kubernetes

I want to roll back to a specific version of Kubernetes. My current version is 1.21.
Are there any system specifications for Kubernetes?
If you are using a managed service, you probably won't be able to roll back, and I would strongly recommend AGAINST rolling back even if you can.
Managed services like GKE, AKS and EKS will only allow you to pick from the latest couple of versions (normally 3-4 minor versions), but will not allow you to downgrade a minor version (e.g. you can't downgrade from 1.21 to 1.20; see here for a GKE example).
Rolling back a version will re-introduce any bugs and security issues that were fixed by the upgrade. So essentially, you are making your cluster less secure by downgrading.
Clients such as kubectl will also flag up skew warnings, such as in this question, and the rolled-back cluster will start rejecting deployments if you've already updated them for new apiVersions.
For example, if the version you migrated from used an API version something/v1beta and the new version required you to use something/v1, then a deployment that uses something/v1 (to meet the new cluster version) would be rejected by the rolled-back cluster.
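To make that concrete, here is a hedged sketch using the well-known move of Deployments from extensions/v1beta1 to apps/v1; the resource names are placeholders, and the same pattern applies to any group that graduated out of a beta version.
# Old manifest: accepted by clusters that still serve extensions/v1beta1,
# rejected once that version is removed
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: web                # hypothetical name
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.21

# New manifest: required by newer clusters, but a rolled-back cluster that
# does not serve apps/v1 would reject it
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 2
  selector:                # selector is mandatory in apps/v1
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.21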

Updating Deprecated apiVersions in helm charts

We have many applications for which we have created helm charts.
Now we need to upgrade our k8s clusters to v1.22. Is there any efficient way to update the charts to support the latest APIs in v1.22? Are there any tools or tips to script the above functionality...?
There are tools available, such as https://github.com/rikatz/kubepug, to detect deprecated APIs. Still, in the correction phase I encourage you to perform a manual assessment of each deployment and modify it accordingly. Some changes will be a simple keyword change, others will imply some redesign. Features are also deprecated but still usable before they are removed, so correct them at the deprecation release rather than waiting until you bump into the removal phase.
https://kubernetes.io/docs/reference/using-api/deprecation-guide/#v1-22
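As an illustration, one of the most common fixes when moving charts to v1.22 is the Ingress API, since networking.k8s.io/v1beta1 was removed in that release. The sketch below shows the shape of the change; the names are hypothetical.
# Before: served up to v1.21, removed in v1.22
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: example-ingress            # hypothetical name
spec:
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            backend:
              serviceName: example-svc
              servicePort: 80

# After: the only served version from v1.22 onwards
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
spec:
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix       # pathType is mandatory in v1
            backend:
              service:             # backend is now service.name / service.port
                name: example-svc
                port:
                  number: 80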

Micro Services and Version Control how to handle deployment

I am currently trying to figure out how to handle version control with microservices.
From what I have read the best strategy is to have a separate git repository for each microservice.
However when it comes to deployment having to upload multiple git repositories seems pretty complex.
Specifically I am scratching my head as how I would deploy an update where multiple microservices require changes that depend on each other, and how to roll back to the previous versions should there be an issue with a production deployment.
This seems like a headache that most developers who use micro services have had to deal with.
Any advice would be greatly appreciated, especially if this could be done with an existing library rather than building something from scratch,
thanks,
Simon
There is no easy answer or library that could solve the problem; however, there are strategies that can help. I have outlined a few below.
Backward compatibility of service - Whenever you release, make sure that your API (REST or otherwise) still works with previous consumers; this can be done by providing default values for the newer attributes.
Versioning of API - When the changes you are making are not small and are breaking, introduce a new version of the API so that older consumers can continue to work with the previous version.
Canary Deployment - When you deploy a new version of a micro-service, route only a small percentage of calls to the new service and the rest to the previous version. Observe the behavior and roll back if required.
Blue Green deployment - Have two production environments: blue, which is proven and working, and green, which is staging and contains the latest release. When testing of the green environment is done and you have enough confidence, route all the calls to green (a minimal sketch of this cut-over is shown just below this list).
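A minimal sketch of the blue-green cut-over on Kubernetes, assuming two Deployments already exist labelled track: blue (current) and track: green (new release); all names are hypothetical. The Service sends all traffic to whichever track its selector points at, so editing the selector performs the cut-over and flipping it back is the rollback.
apiVersion: v1
kind: Service
metadata:
  name: orders                 # hypothetical service name
spec:
  selector:
    app: orders
    track: blue                # change to "green" once the green stack passes testing
  ports:
    - port: 80
      targetPort: 8080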
References
Micro-services versioning
Canary deployment
Blue green deployment
Here's a plugin I wrote using some references: https://github.com/simrankadept/serverless-ssm-version-tracker
NPM package:
https://www.npmjs.com/package/serverless-ssm-version-tracker
The version format supported is YYYY.MM.DD.REVISION
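If it helps, the wiring for a Serverless Framework plugin like this is usually just an entry in the plugins list of serverless.yml; the snippet below is a generic sketch with placeholder service and runtime values, so check the plugin's README for any options it actually supports.
# serverless.yml - generic plugin wiring (service name and runtime are placeholders)
service: my-service
provider:
  name: aws
  runtime: nodejs14.x
plugins:
  - serverless-ssm-version-tracker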

Updating StatefulSets in Kubernetes with a proprietary vendor?

I may not be understanding Kubernetes correctly, but our application relies on a proprietary closed-source vendor product that in turn relies on Solr. I've read articles on rolling updates with StatefulSets, but they seem to depend on the application being aware of and accounting for new schema versions, which we have no ability to do without decompiling and jumping through a lot of hoops. Let me describe what we're trying to do:
WebService Version 1 needs to be upgraded to WebService Version 2; this upgrade involves none of our code, only the vendor code our code relies on. Think of it like updating the OS.
However, WebService Version 1 relies on Solr Version 1. The managed schema is different and there are breaking changes between Solr Version 1 and 2; both the Solr version and the schemas differ. If WebService Version 1 hits Solr Version 2 it won't work, or worse, it will break Solr Version 2. The same is true in reverse: if we update to WebService Version 2 and it gets Solr Version 1, it will break that.
The only thing I can think of is to get Kubernetes to basically spin up a pod for each version and not bring down 1 until 2 is up for both WebService and Solr.
This doesn't seem right; am I understanding this correctly?
This is not really a problem Kubernetes can solve. First work out how you would do it by hand; then you can start working out how to automate it. If zero downtime is a requirement, the best thing I can imagine is launching the new Solr cluster separately rather than doing an in-place upgrade, then launching the new app separately, pointing at the new Solr. You will, however, need to work out how to sync data between the two Solr clusters in real time during the upgrade. But again, Kubernetes neither helps nor hinders here; the problems are not in launching or managing the containers, it's a logistical issue in your architecture.
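A hedged sketch of that parallel-stack idea: give the new Solr cluster its own Service name and point the new web-service Deployment at it explicitly, so neither version can ever reach the wrong Solr. All names, images and the environment variable are hypothetical; the vendor application will have its own way of configuring the Solr endpoint.
apiVersion: v1
kind: Service
metadata:
  name: solr-v2                     # separate from the existing solr-v1 Service
spec:
  selector:
    app: solr
    version: v2
  ports:
    - port: 8983
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webservice-v2
spec:
  replicas: 2
  selector:
    matchLabels:
      app: webservice
      version: v2
  template:
    metadata:
      labels:
        app: webservice
        version: v2
    spec:
      containers:
        - name: webservice
          image: vendor/webservice:2        # hypothetical image
          env:
            - name: SOLR_URL                # hypothetical setting name
              value: http://solr-v2:8983/solr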
It seems that what the canary release strategy with Solr suggests is simply having a new StatefulSet with the same labels as the one running the previous version.
Since labels can be assigned to many objects, and Services route requests at the network level based on these labels, requests will be redirected to both StatefulSets, emulating the canary release model.
Following this logic, you can have a v1 StatefulSet with, say, 8 replicas and another v2 StatefulSet with 2. So roughly 80% of requests should hit v1 and 20% v2 (not exactly, just to illustrate).
From there, you can play with the number of replicas of each StatefulSet until you "roll out" 100% of replicas of v2, with no downtime.
Now, this can work in your scenario if you label each duo (application + Solr version) in the aforementioned way.
Each duo would receive roughly N% of requests, depending on the number of replicas it has. You can slowly decrease the number of replicas of duo v1 and increase them for the next, updated version.
This approach has the downside of using more resources as you will be running two versions of your full application stack. However, there is no downtime when upgrading the whole stack and you can control the percentage of "roll out".
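A sketch of that label-sharing canary, with all names and images hypothetical: one Service selects only on the shared app label, while each duo's StatefulSet adds its own version label and replica count, so traffic splits roughly in proportion to replicas.
apiVersion: v1
kind: Service
metadata:
  name: search
spec:
  selector:
    app: search                  # shared by both StatefulSets, so both receive traffic
  ports:
    - port: 8983
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: search-v1                # old duo: ~80% of traffic via 8 replicas
spec:
  serviceName: search            # in practice this usually points at a headless Service
  replicas: 8
  selector:
    matchLabels:
      app: search
      version: v1
  template:
    metadata:
      labels:
        app: search
        version: v1
    spec:
      containers:
        - name: solr
          image: solr:8          # hypothetical image/tag
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: search-v2                # new duo: ~20% of traffic via 2 replicas
spec:
  serviceName: search
  replicas: 2
  selector:
    matchLabels:
      app: search
      version: v2
  template:
    metadata:
      labels:
        app: search
        version: v2
    spec:
      containers:
        - name: solr
          image: solr:9          # hypothetical image/tag
Scaling search-v1 down and search-v2 up step by step then performs the gradual roll-out with no downtime.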

AWS Elasticsearch domain deployed through CloudFormation. How to update ES version without replacement?

We have an AWS Elasticsearch domain we created through CloudFormation running version 6.3 of ES. When we update the ElasticsearchVersion property in the template, it replaces the Elasticsearch domain with a new one running the new version instead of updating the existing one.
How does anyone upgrade their Elasticsearch domains that were deployed with CF if it doesn't do an in-place upgrade? I am almost thinking at this point I need to create and manage my ES domains through boto3.
Any insight or ideas would be greatly appreciated.
This is now possible (as of 25/11/2019) by setting an UpdatePolicy with EnableVersionUpgrade: True.
For example:
ElasticSearchDomain:
  Type: AWS::Elasticsearch::Domain
  Properties: ...
  UpdatePolicy:
    EnableVersionUpgrade: true
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-attribute-updatepolicy.html#cfn-attributes-updatepolicy-upgradeelasticsearchdomain
Received correspondence back from AWS Support regarding an ES in-place upgrade through CloudFormation.
tl;dr It is currently not supported but a feature request is already active for this functionality.
You are correct in saying that an ES in-place upgrade is not supported by CFN at this moment. Upgrading ES from 6.3 to 6.4 via the CLI or AWS Console will keep the existing domain, but with CloudFormation it will launch a new domain and discard the existing one.
I see that there is already an active feature request for this. I will go ahead and pass your feedback about this matter to our internal team as well.
Unfortunately, AWS Support does not have visibility into the service enhancement implementation roadmap, so I would not be able to provide you with an exact time frame.