High Available Kubernetes cluster - kubernetes

We are starting migrating our system from Azure web app services to AKS infrastructure and currently we had an incident with our test cluster and connection to all our environments were lost. It was due to the upgrading version of Kubernetes and adding additional node pool which broke the route table and they lost communication between themselves.
So as a result we came up with the next HA infrastructure for our environments:
But that eventually adds more work on the CI/CD pipelines and doesn't look very logically as Kubernetes itself should be reliable.
Can I have your comments and thoughts if it is best practice or proper way of moving forward?

Related

Good solutions to automate infrastructure deployment locally?

I have recently been reading more about infrastructure as a service (IaaS) and platform as a service (PaaS) and had some questions. I see when we opt for a PaaS solution, it is generally very easy to create the infrastructure as the cloud providers handle that for us and we can even automate the deployment using an infrastructure as code solution like Terraform.
But if we use an IaaS solution or even a local on premise cluster, we lose a lot of the automation it seems that PaaS allows. So I was curious, are there any good tools out there for automating infrastructure deployment on a local cluster that is not in the cloud?
The best thing I could think of was to run a local Kubernetes cluster and then Dockerize each of the infrastructure components, but this seems difficult as each node in the cluster will need its own specific configuration files.
From my basic Googling, it seems like there is not a good solution to this.
Edit:
I was not clear enough with my original intentions. I have two problems I am trying to solve.
How do I automate infrastructure deployment locally? For example, suppose I wanted to create a Hadoop HDFS cluster. I would need to configure one node to be the namenode with an accessible IP, and the other nodes to be datanodes that are aware of the namenode's IP. At the moment, I have to do this manually by logging into each node, checking it's IP, and then configuring each one. How would I automate this? If I were to use a Kubernetes approach, how do I specify that one of the running pods needs to be the namenode and the others are datanodes? How do I find the pods' IPs and have them be aware of the namenode IP?
The next problem I have is very similar to the first, but a slight modification. How would I deploy specific configuration files to each node. For instance in Kafka, the configuration file for one node, requires the IPs of the Zookeeper nodes, as well as the IP it should listen on. This may be different for every node in the cluster. Is there a good way to make these config files pod specific, so that I do not have to do bash text processing to insert the correct contents into each pod's config files?
You can use Terraform for all of your on-premise Infra. Automation, and Ansible for configuration management.
Let's say you have three HPE servers, Install K8s or VMware on them using Ansible, then you can treat them as three Avvaliabilty zones in one region, same as AWS. from this you can start deploying dockerize apps, or helm charts using Terraform.
Summary:
Ansbile for installing and configuration K8s.
Terraform for provisioning K8s.
Helm for installing apps on K8s.
After this you gonna have a base automated on-premise Infra.

Terraform, Kubernetes, Mesos etc - how are they connected

Reading a lot on internet but the information is not clear or mixedup so I thought I will ask the question here.
I am trying to understand how Terraform is same or different from container orchestration tools like Kubernetes, Mesos etc.
Can Terraform work independently or Kube and Mesos? How is it connected to docker containers?
Can someone please shed the light?
Thanks!!!
I don't know enough about Mesos as I would like, but I do know about Kubernetes and Terraform. Despite I'm not an expert the general basics between this tools have a different purpose. While Terraform deals with the generation of the infrastructure in the cloud by using their apis, Kubernetes deals with the administration and orchestration of containers in the undergroown infrastructure by using the api of the container daemon such the docker daemon.
So generally talking the Terraform main point is to make transparent the creation of the cloud infrastructure where you write what you want to have, servers, network, security policies, some PaaS Service and Kubernetes is the orchestrator of containers.
Hope this helps you. Please, in the case of someone saws a mistake. Remark it so we all improves.
Terraform - A Tools to Build your Infrastructure an open-source project Hashicorp labs if you are aware with AWS and heard of CloudFormation both work in same manner but Terraform have some better feature you can write your whole Infrastructure as a Code run it in one click and decommissioned it in one click.
For more you must visit the site: https://www.terraform.io
Now Kubernetes (An open source project by Google) and Apache-Mesos( Or DC/OS) An project by Apache foundation both are used for Container orchestration (and I’m purposely avoiding using the word Docker) is not for everyone and does not answer every need.
Mesos was launched first but it was really hard to manage Mesos networking that time. and In 2014 there was the first Release of Kubernetes comes in.
Now, DC/OS (the Distributed Cloud Operating System) is an open-source, distributed operating system based on the Apache Mesos distributed systems kernel.
It's in the race with Kubernetes .
I would suggest you must go through this article to get a better understanding of Kubernetes vs Mesos : https://logz.io/blog/kubernetes-vs-mesos/
And Yes they are not related to Terraform at all.
Thanks

Azure Service Fabric - connect to local service fabric cluster from outside the VM it's running on?

We have a 5-node Azure Service Fabric Cluster as our main Production microservices hub. Up until now, for testing purposes, we've just been pushing out separate versions of our applications (the production application with ".Test" appended to the name) to that production SFC.
We're looking for a better approach, namely a separate test Service Fabric Cluster. But the issue comes down to costs. The smallest SFC you can create in Azure is 3 nodes. Further, you can't shutdown a SFC when it's not being used, which we would also need to do to save on costs.
So now I'm looking at just spinning up a plain Windows VM in Azure and installing the local Service Fabric Cluster app (which allows just one-node setup). Is it possible to do this and be able to communicate with the cluster from outside the VM?
What you are trying to accomplish is setup a standalone cluster. The steps to do it is documented in this docs.
Yes, you can access the cluster from outside the VM, In simple terms enable access to the network and open the firewall ports.
Technically both deployments(Guide and DevCluster) are very similar, the main difference is that you have better control on the templates following the standalone guide, using the development setup you don't have much options and all the process is automated.
PS: I would highly recommend you have a UAT\Staging cluster with the
exact same specs as the production version, the approach you used
could be a good idea for staging environment. Having different
environments increase the risk of issues, mainly related to
configuration and concurrency.

Feasibility of using multi master Kubernetes cluster architecture

I am trying to implement CI/CD pipeline using Kubernetes and Jenkins. In my application I have 25 Micro services. And need to deploy it for 5 different clients. The microservice code is unique. But configuration for each client is different.
So here I am configuring Spring cloud config server with 5 different Profiles/Configuration. And When I am building Docker images, I will define which is the active config server profile by adding active profile in Docker file. So from 25 microservices I am building 25 * 5 number of Docker images and deploying that. So total 125 microservices I need to deploy in Kubernetes cluster. And these microservice are calling from my Angular 2 front end application.
Here when I am considering the performance of application and speed of response, the single master is enough of this application architecture? Or Should I definitely need to use the multi master Kubernetes cluster? How I can manage this application?
I am new to these cloud and CI/CD pipeline architecture tasks. So I have confusion related with designing of workflow. If single master is enough, then I can continue with current. Otherwise I need to implement the multi master Kubernetes HA cluster.
The performance of the application and/or the speed do not depend on the number of master nodes. It resolves High Availability issues, but not performance. Now, you should still consider having at least 3 masters for this implementation you are working on. If the master goes down, your cluster is useless.
In Kubernetes, the master gets the API calls and acts upon them, by setting the desired state of the cluster to the current state. But in the end that's the nodes (slaves) doing the heavy work. So your performance issues will depend mostly, if not exclusively, on your nodes. If you have enough memory and CPU, you should be fine.
Multi master sounds like a good idea for HA.
You could also look at using Helm which lets you configure microservices in a per installation basis so that you don't have to keep re-releasing docker images each time to configure a new environment. You can then inject the helm configuration into, say, a ConfigMap that mounts the content as an application.yml so that Spring Boot automatically loads the settings

Multiple environments (Staging, QA, production, etc) with Kubernetes

What is considered a good practice with K8S for managing multiple environments (QA, Staging, Production, Dev, etc)?
As an example, say that a team is working on a product which requires deploying a few APIs, along with a front-end application. Usually, this will require at least 2 environments:
Staging: For iterations/testing and validation before releasing to the client
Production: This the environment the client has access to. Should contain stable and well-tested features.
So, assuming the team is using Kubernetes, what would be a good practice to host these environments? This far we've considered two options:
Use a K8s cluster for each environment
Use only one K8s cluster and keep them in different namespaces.
(1) Seems the safest options since it minimizes the risks of potential human mistake and machine failures, that could put the production environment in danger. However, this comes with the cost of more master machines and also the cost of more infrastructure management.
(2) Looks like it simplifies infrastructure and deployment management because there is one single cluster but it raises a few questions like:
How does one make sure that a human mistake might impact the production environment?
How does one make sure that a high load in the staging environment won't cause a loss of performance in the production environment?
There might be some other concerns, so I'm reaching out to the K8s community on StackOverflow to have a better understanding of how people are dealing with these sort of challenges.
Multiple Clusters Considerations
Take a look at this blog post from Vadim Eisenberg (IBM / Istio): Checklist: pros and cons of using multiple Kubernetes clusters, and how to distribute workloads between them.
I'd like to highlight some of the pros/cons:
Reasons to have multiple clusters
Separation of production/development/test: especially for testing a new version of Kubernetes, of a service mesh, of other cluster software
Compliance: according to some regulations some applications must run in separate clusters/separate VPNs
Better isolation for security
Cloud/on-prem: to split the load between on-premise services
Reasons to have a single cluster
Reduce setup, maintenance and administration overhead
Improve utilization
Cost reduction
Considering a not too expensive environment, with average maintenance, and yet still ensuring security isolation for production applications, I would recommend:
1 cluster for DEV and STAGING (separated by namespaces, maybe even isolated, using Network Policies, like in Calico)
1 cluster for PROD
Environment Parity
It's a good practice to keep development, staging, and production as similar as possible:
Differences between backing services mean that tiny incompatibilities
crop up, causing code that worked and passed tests in development or
staging to fail in production. These types of errors create friction
that disincentivizes continuous deployment.
Combine a powerful CI/CD tool with helm. You can use the flexibility of helm values to set default configurations, just overriding the configs that differ from an environment to another.
GitLab CI/CD with AutoDevops has a powerful integration with Kubernetes, which allows you to manage multiple Kubernetes clusters already with helm support.
Managing multiple clusters (kubectl interactions)
When you are working with multiple Kubernetes clusters, it’s easy to
mess up with contexts and run kubectl in the wrong cluster. Beyond
that, Kubernetes has restrictions for versioning mismatch between the
client (kubectl) and server (kubernetes master), so running commands
in the right context does not mean running the right client version.
To overcome this:
Use asdf to manage multiple kubectl versions
Set the KUBECONFIG env var to change between multiple kubeconfig files
Use kube-ps1 to keep track of your current context/namespace
Use kubectx and kubens to change fast between clusters/namespaces
Use aliases to combine them all together
I have an article that exemplifies how to accomplish this: Using different kubectl versions with multiple Kubernetes clusters
I also recommend the following reads:
Mastering the KUBECONFIG file by Ahmet Alp Balkan (Google Engineer)
How Zalando Manages 140+ Kubernetes Clusters by Henning Jacobs (Zalando Tech)
Definitely use a separate cluster for development and creating docker images so that your staging/production clusters can be locked down security wise. Whether you use separate clusters for staging + production is up to you to decide based on risk/cost - certainly keeping them separate will help avoid staging affecting production.
I'd also highly recommend using GitOps to promote versions of your apps between your environments.
To minimise human error I also recommend you look into automating as much as you can for your CI/CD and promotion.
Here's a demo of how to automate CI/CD with multiple environments on Kubernetes using GitOps for promotion between environments and Preview Environments on Pull Requests which was done live on GKE though Jenkins X supports most kubernetes clusters
It depends on what you want to test in each of the scenarios. In general I would try to avoid running test scenarios on the production cluster to avoid unnecessary side effects (performance impact, etc.).
If your intention is testing with a staging system that exactly mimics the production system I would recommend firing up an exact replica of the complete cluster and shut it down after you're done testing and move the deployments to production.
If your purpose is testing a staging system that allows testing the application to deploy I would run a smaller staging cluster permanently and update the deployments (with also a scaled down version of the deployments) as required for continuous testing.
To control the different clusters I prefer having a separate ci/cd machine that is not part of the cluster but used for firing up and shutting down clusters as well as performing deployment work, initiating tests, etc. This allows to set up and shut down clusters as part of automated testing scenarios.
It's clear that by keeping the production cluster appart from the staging one, the risk of potential errors impacting the production services is reduced. However this comes at a cost of more infrastructure/configuration management, since it requires at least:
at least 3 masters for the production cluster and at least one master for the staging one
2 Kubectl config files to be added to the CI/CD system
Let’s also not forget that there could be more than one environment. For example I've worked at companies where there are at least 3 environments:
QA: This where we did daily deploys and where we did our internal QA before releasing to the client)
Client QA: This where we deployed before deploying to production so that the client could validate the environment before releasing to production)
Production: This where production services are deployed.
I think ephemeral/on-demand clusters makes sense but only for certain use cases (load/performance testing or very « big » integration/end-to-end testing) but for more persistent/sticky environments I see an overhead that might be reduced by running them within a single cluster.
I guess I wanted to reach out to the k8s community to see what patterns are used for such scenarios like the ones I've described.
Unless compliance or other requirements dictate otherwise, I favor a single cluster for all environments. With this approach, attention points are:
Make sure you also group nodes per environment using labels. You can then use the nodeSelector on resources to ensure that they are running on specific nodes. This will reduce the chances of (excess) resource consumption spilling over between environments.
Treat your namespaces as subnets and forbid all egress/ingress traffic by default. See https://kubernetes.io/docs/concepts/services-networking/network-policies/.
Have a strategy for managing service accounts. ClusterRoleBindings imply something different if a clusters hosts more than one environment.
Use scrutiny when using tools like Helm. Some charts blatantly install service accounts with cluster-wide permissions, but permissions to service accounts should be limited to the environment they are in.
I think there is a middle point. I am working with eks and node groups. The master is managed, scaled and maintained by aws. You could then create 3 kinds of node groups (just an example):
1 - General Purpose -> labels: environment=general-purpose
2 - Staging -> labels: environment=staging (taints if necessary)
3 - Prod -> labels: environment=production (taints if necessary)
You can use tolerations and node selectors on the pods so they are placed where they are supposed to be.
This allows you to use more robust or powerful nodes for production's nodegroups, and, for example, SPOT instances for staging, uat, qa, etc... and has a couple of big upsides:
Environments are physically separated (and virtually too, in namespaces)
You can reduce costs by sharing not only the masters, but also some nodes with pods shared by the two environments and by using spot or cheaper instances in staging/uat/...
No cluster-management overheads
You have to pay attention to roles and policies to keep it secure. You can implement network policies using, for example eks+calico.
Update:
I found a doc that may be useful when using EKS. It has some details on how to safely run multi-tenant cluster, and some of this details may be useful to isolate production pods and namespaces from the ones in staging.
https://aws.github.io/aws-eks-best-practices/security/docs/multitenancy/
Using multiple clusters is the norm, at the very least to enforce a strong separation between production and "non-production".
In that spirit, do note that GitLab 13.2 (July 2020) now includes:
Multiple Kubernetes cluster deployment in Core
Using GitLab to deploy multiple Kubernetes clusters with GitLab previously required a Premium license.
Our community spoke, and we listened: deploying to multiple clusters is useful even for individual contributors.
Based on your feedback, starting in GitLab 13.2, you can deploy to multiple group and project clusters in Core.
See documentation and issue.
A few thoughts here:
Do not trust namespaces to protect the cluster from catastrophe. Having separate production and non-prod (dev,stage,test,etc) clusters is the minimum necessary requirement. Noisy neighbors have been known to bring down entire clusters.
Separate repositories for code and k8s deployments (Helm, Kustomize, etc.) will make best practices like trunk-based development and feature-flagging easier as the teams scale.
Using Environments as a Service (EaaS) will allow each PR to be tested in its own short-lived (ephemeral) environment. Each environment is a high-fidelity copy of production (including custom infrasture like database, buckets, dns, etc.), so devs can remotely code against a trustworthy environment (NOT minikube). This can help reduce configuration drift, improve release cycles, and improve the overall dev experience. (disclaimer: I work for an EaaS company).
I think running a single cluster make sense because it reduces overhead, monitoring. But, you have to make sure to place network policies, access control in place.
Network policy - to prohibit dev/qa environment workload to interact with prod/staging stores.
Access control - who have access on different environment resources using ClusterRoles, Roles etc..