I may not be understanding Kubernetes correctly, but our application relies on a proprietary closed-source vendor product that in turn relies on Solr. I've read articles on rolling updates with StatefulSets, but they seem to depend on the application being aware of and accounting for new schema versions, which we have no ability to do without decompiling and jumping through a lot of hoops. Let me describe what we're trying to do:
WebService Version 1 needs to be upgraded to WebService Version 2. This upgrade involves none of our code, only the vendor code our code relies on. Think of it like updating the OS.
However, WebService Version 1 relies on Solr Version 1. The managed schema is different and there are breaking changes between Solr Version 1 and 2: both the Solr version and the schema change. If WebService Version 1 hits Solr Version 2 it won't work, or worse, it will break Solr Version 2. The same is true in reverse: if we upgrade to WebService Version 2 and it gets Solr Version 1, it will break that.
The only thing I can think of is to get Kubernetes to spin up pods for each version and not bring down version 1 until version 2 is up, for both WebService and Solr.
This doesn't seem right; am I understanding this correctly?
This is not really a problem Kubernetes can solve. First work out how you would do it by hand; then you can start working out how to automate it. If zero downtime is a requirement, the best thing I can imagine is launching the new Solr cluster separately rather than doing an in-place upgrade, then launching the new app separately, pointing at the new Solr. But you will need to work out how to sync data between the two Solr clusters in real time during the upgrade. Again, Kubernetes neither helps nor hinders here: the problem is not in launching or managing the containers, it's a logistical issue in your architecture.
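As a rough sketch of that "separate stacks" idea on Kubernetes (all names, images, and the SOLR_URL variable below are hypothetical placeholders; how the vendor app actually locates Solr depends on its own configuration), you would give the new Solr cluster its own Service and point only the v2 app at it, so the two version pairs never cross:

```yaml
# Hypothetical names throughout (solr-v2, webservice-v2, SOLR_URL).
apiVersion: v1
kind: Service
metadata:
  name: solr-v2                  # separate Service, so only the v2 app ever resolves it
spec:
  selector: {app: solr, solrVersion: "2"}
  ports: [{port: 8983}]
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: solr-v2
spec:
  serviceName: solr-v2
  replicas: 3
  selector:
    matchLabels: {app: solr, solrVersion: "2"}
  template:
    metadata:
      labels: {app: solr, solrVersion: "2"}
    spec:
      containers:
        - name: solr
          image: solr:2.0        # placeholder tag standing in for "Solr Version 2"
          ports: [{containerPort: 8983}]
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webservice-v2
spec:
  replicas: 2
  selector:
    matchLabels: {app: webservice, appVersion: "2"}
  template:
    metadata:
      labels: {app: webservice, appVersion: "2"}
    spec:
      containers:
        - name: webservice
          image: vendor/webservice:2.0   # placeholder image
          env:
            - name: SOLR_URL             # stand-in for however the vendor app is told where Solr is
              value: http://solr-v2:8983/solr
```

The v1 stack stays untouched and keeps serving traffic until the v2 pair is up and verified; syncing or re-indexing data between the two Solr clusters is still on you, as noted above.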
It seems that what the canary release strategy suggests here is simply having a new StatefulSet with the same labels as the one running the previous version.
Since labels can be assigned to many objects, and Services route requests at the network level based on those labels, requests will be distributed across both StatefulSets, emulating the canary release model.
Following this logic, you can have a v1 StatefulSet with, say, 8 replicas and another v2 StatefulSet with 2, so roughly 80% of requests should hit v1 and 20% v2 (not exactly those numbers, but it illustrates the idea).
From there, you can play with the number of replicas of each StatefulSet until you "roll out" 100% of replicas of v2, with no downtime.
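As a rough sketch of that split (hypothetical names and images; the exact ratio also depends on readiness and load balancing), the Service selects only the shared label, so it matches pods from both StatefulSets:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: webservice
spec:
  selector: {app: webservice}    # shared label only, so both StatefulSets' pods match
  ports: [{port: 80}]
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: webservice-v1
spec:
  serviceName: webservice        # a headless Service would normally back this; omitted for brevity
  replicas: 8                    # ~80% of the pods behind the Service
  selector:
    matchLabels: {app: webservice, version: "1"}
  template:
    metadata:
      labels: {app: webservice, version: "1"}
    spec:
      containers:
        - name: webservice
          image: vendor/webservice:1.0    # placeholder image
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: webservice-v2
spec:
  serviceName: webservice
  replicas: 2                    # ~20% of the pods behind the Service
  selector:
    matchLabels: {app: webservice, version: "2"}
  template:
    metadata:
      labels: {app: webservice, version: "2"}
    spec:
      containers:
        - name: webservice
          image: vendor/webservice:2.0    # placeholder image
```

Shifting traffic is then just scaling: 6/4, 4/6, and so on down to 0/10, after which the v1 StatefulSet can be deleted.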
Now, this can work in your scenario if you label each duo (application version + Solr version) in the way described above.
Each duo would receive roughly N% of requests, depending on the number of replicas it has. You can slowly decrease the number of replicas of *duo* v1 and increase those of the updated version.
This approach has the downside of using more resources as you will be running two versions of your full application stack. However, there is no downtime when upgrading the whole stack and you can control the percentage of "roll out".
Related
I want to roll back to a specific version of Kubernetes. My current version is 1.21.
If you are using a managed service, you probably won't be able to roll back, and I would strongly recommend AGAINST rolling back even if you can.
Managed services like GKE, AKS and EKS will only allow you to pick from the latest few versions (normally 3-4 minor versions), but will not allow you to downgrade a minor version (e.g. you can't downgrade from 1.21 to 1.20; see the GKE documentation for an example).
Rolling back a version will re-introduce any bugs and security issues that were fixed by the upgrade. So essentially, you are making your cluster less secure by downgrading.
Clients such as kubectl will also flag version-skew warnings, and the rolled-back cluster will start rejecting deployments if you've already updated your manifests for new apiVersions.
For example, if the version you migrated from served an API at something/v1beta and the new version required you to use something/v1, then a Deployment manifest written against something/v1 (to satisfy the new cluster version) would be rejected by the rolled-back cluster.
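For illustration, here is that situation with the Ingress API, which moved from networking.k8s.io/v1beta1 to networking.k8s.io/v1 (a hypothetical manifest, but the rejection behaviour is the general pattern):

```yaml
# Written for a cluster that serves the newer API (networking.k8s.io/v1).
# An API server that only serves v1beta1 rejects it, typically with an error like:
#   no matches for kind "Ingress" in version "networking.k8s.io/v1"
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example
spec:
  rules:
    - host: example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:              # v1-style backend; v1beta1 used serviceName/servicePort
                name: example
                port: {number: 80}
```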
I am currently trying to figure out how to handle version control with microservices.
From what I have read the best strategy is to have a separate git repository for each microservice.
However, when it comes to deployment, having to deploy from multiple git repositories seems pretty complex.
Specifically, I am scratching my head as to how I would deploy an update where multiple microservices require changes that depend on each other, and how to roll back to the previous versions should there be an issue with a production deployment.
This seems like a headache that most developers who use micro services have had to deal with.
Any advice would be greatly appreciated, especially if this could be done with an existing library rather than building something from scratch,
thanks,
Simon
There is no easy answer or library that solves this problem; however, there are strategies that can help. I have outlined a few below:
Backward compatibility of service - Whenever you release, make sure that your API (REST or otherwise) still works with previous consumers; this can be done by providing default values for the newer attributes.
Versioning of API - When the changes you are making are not small and are breaking, introduce a new version of the API so that older consumers can continue to work with the previous version.
Canary Deployment - When you deploy a new version of a microservice, route only a small percentage of calls to the new version and the rest to the previous version. Observe the behavior and roll back if required.
Blue Green deployment - Have two production environments: blue, which is proven and working, and green, which stages the latest release. When testing on the green environment is done and you have enough confidence, route all the calls to green.
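If these services happen to run on Kubernetes, a minimal sketch of the blue/green switch (hypothetical names; this is just one way to realize it) is a Service whose selector points at one slot, plus a second Deployment for the staged release:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: orders                          # hypothetical microservice name
spec:
  selector: {app: orders, slot: blue}   # flip "blue" to "green" to cut over
  ports: [{port: 80}]
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-green                    # latest release, staged but idle until the switch
spec:
  replicas: 3
  selector:
    matchLabels: {app: orders, slot: green}
  template:
    metadata:
      labels: {app: orders, slot: green}
    spec:
      containers:
        - name: orders
          image: example/orders:2.0.0   # placeholder image
```

The cut-over (and a rollback) is then a single selector change, e.g. `kubectl patch service orders -p '{"spec":{"selector":{"slot":"green"}}}'`.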
Here's a plugin I wrote using some references: https://github.com/simrankadept/serverless-ssm-version-tracker
NPM package:
https://www.npmjs.com/package/serverless-ssm-version-tracker
The version format supported is YYYY.MM.DD.REVISION
I've read about partial upgrades, but they always require changing some part of the packaged application. I'd like to know if there's a way to redeploy a package without a version change, similar to what VS does when deploying to the dev cluster.
On your local dev cluster, VS simply deletes the application before it starts the re-deployment. You could do the same in your production cluster; however, this results in downtime, since the application is not accessible during that time.
Why wouldn't you want to use the regular monitored upgrade? It has many advantages, like automatic rollbacks and so on.
I'd like to use ZooKeeper in one of my applications for distributed configuration management. The application currently runs in a distributed environment, and having to restart nodes for configuration file changes is a headache.
However, we want the ZooKeeper process to be started from within the application. The point is to reduce startup dependencies and operational cost. We already have startup/shutdown scripts for the application, and we need to minimize the impact on the operations team.
Has anyone done something similar? Is this setup recommended, or are there better solutions? Any tips or feedback are appreciated.
I have a blog post that describes how to embed ZooKeeper in an application. The ZooKeeper developers don't recommend it, though, and I now tend to agree, even though I had the same rationale for embedding it that you do: to reduce the number of moving parts.
You want to keep your ZK cluster stable, but you will need to restart your app for code updates and the like, which impacts the ZK cluster's stability.
Ultimately you will end up using your ZK cluster for multiple apps and those extra moving parts will be amortized over a number of projects.
We have a J2EE-based application; basically, it is a small e-commerce app that runs globally (across multiple time zones). Whenever we have to deploy a patch it takes around 3 hours (DB backup, DB changes, Java changes, QA smoke testing). I know that's too high; I want to bring this deployment time down to less than 30 minutes.
Now, a brief on our application infrastructure: we have two JBoss servers and a single DB, with a load balancer configured in front of both JBoss servers. It is not a clustered environment.
Currently, what we do:
We bring down both JBoss servers and the DB
Take a DB backup
Make the DB changes, run some scripts
Make the Java changes, apply patches
The above steps take around 2 hours for us.
Then QA tests for one hour, and then we bring the servers back up.
Can you suggest a better approach to achieve this? My main question: when we have multiple JBoss servers and a single DB, how do we make deployment smooth?
One approach I've heard that Netflix uses, but have not had a chance to use myself:
Make all of your DB schema changes both forward and backward compatible with the current version of software running, and the one you are about to deploy. Make the new software version continue to write any data the old version needs. Hopefully this is a minimal set.
Back up your running DB (most DBs don't require downtime for backups), and deploy your database schema updates at least a week prior to your software deploy.
Once your DB changes have burnt in and seem to be bug-free with the currently running version, reconfigure your load balancer to point to only one of your JBoss instances. Deploy your updated software to the other instance and have QA smoke test it offline while the first server continues to serve production requests.
When QA is happy with the results, point the LB to just the offline JBoss server (with the new software). When that comes online, update the software on the newly offline JBoss server, and have QA smoke test if desired. If successful, point the LB to both JBoss instances.
If QA finds major bugs, and a quick bug fix and "roll-forward" is not possible, roll back to the previous version of the deployed software. Since your schema and new code are backward compatible, you won't have lost data.
On your next deploy, remove any garbage from your schema (like columns unused by the current deploy) in a way that makes it still backward and forward compatible.
Although more complex than your current approach, this approach should reduce your deployment downtime with minimal risk.