I am looking at ways to shift from our monolithic system to a more flexible, microservice-based one, and from the standpoint of managing the (containerized) applications, Kubernetes is the frontrunner.
In our ecosystem, there are some hardware devices that will remain part of the system as they are. From my (admittedly time-limited) study of Kubernetes, it is not clear to me whether managing this hardware with Kubernetes is possible or not. I explored CRDs, add-ons, etc., but those approaches did not look promising for my use case of managing HW nodes.
My use case for managing HW nodes includes:
1. Discovery of the HW devices by K8s
2. Possibly managing them over a REST API through K8s
* High availability of the HW devices is not in scope; however, any thoughts are welcome.
Kubernetes was developed as a tool for automating the deployment, orchestration, and management of containerized applications. Discovering and managing hardware nodes is not part of that role; Kubernetes' own internal components are concerned with distributing services and workloads across the Nodes in the cluster.
That said, for compute, storage, and other devices that require vendor-specific setup in a Kubernetes cluster, you can look at the Device Plugin framework.
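With a device plugin in place, the hardware shows up as an extended resource on the node, and Pods request it the same way they request CPU or memory. A minimal sketch, assuming a hypothetical plugin that registers a resource named example.com/custom-hw:

```yaml
# Hypothetical Pod requesting a device advertised by a device plugin.
# "example.com/custom-hw" is an assumed resource name; your plugin defines
# the actual name it registers with the kubelet.
apiVersion: v1
kind: Pod
metadata:
  name: hw-consumer
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "infinity"]
      resources:
        limits:
          example.com/custom-hw: 1   # one device; the scheduler only picks nodes that advertise it
```

Discovery and health reporting of the devices themselves remain the plugin's job; Kubernetes only schedules against the device counts the plugin reports.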
Related
Our store applications are not distributed applications. We deploy them on each node and then configure them to store specific details, so they are tightly coupled to the node. Can I use Kubernetes for this use case? Would I get benefits from it?
Based on this information alone, it is hard to tell. But Kubernetes is designed so that it should be easy to migrate existing applications. For example, you can use a PersistentVolumeClaim for the directories where your application stores information.
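A minimal sketch of that approach; the claim name, size, image, and mount path below are chosen purely for illustration:

```yaml
# Hypothetical PersistentVolumeClaim for the application's data directory.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: store-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
# Pod mounting the claim at the path the application already writes to.
apiVersion: v1
kind: Pod
metadata:
  name: store-app
spec:
  containers:
    - name: store
      image: my-store-app:latest     # placeholder image
      volumeMounts:
        - name: data
          mountPath: /var/lib/store  # assumed data directory
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: store-data
```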
That said, it will probably be challenging. A cluster administrator wants to treat the Nodes in the cluster as "cattle" and throw them away when it is time to upgrade. If your app only has one instance, it will have some downtime, and your PersistentVolume should be backed by a storage system accessed over the network; otherwise the data will be lost when the node is thrown away.
If you want to run more than one instance for fault tolerance, the app needs to be stateless, but it is likely not stateless if it stores local data on disk.
There are several ways to have applications running on fixed nodes of the cluster. It really depends on how those applications behave and why they need to run on a fixed node of the cluster.
Usually such applications are stateful and may need to interact with a specific node's resources, or write directly to a volume mounted on specific nodes for performance reasons, and so on.
This can be achieved with a simple nodeSelector or with node affinity (https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/); see the sketch below.
Or with local persistent volumes (https://kubernetes.io/blog/2019/04/04/kubernetes-1.14-local-persistent-volumes-ga/).
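For instance, a plain nodeSelector pins a Pod to nodes carrying a given label; the label key/value and image below are placeholders:

```yaml
# First label the node, e.g.: kubectl label nodes <node-name> app-role=store
apiVersion: v1
kind: Pod
metadata:
  name: store-on-fixed-node
spec:
  nodeSelector:
    app-role: store                # only nodes with this label are eligible
  containers:
    - name: store
      image: my-store-app:latest   # placeholder image
```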
With this said, if all the applications to be run on the Kubernetes cluster are apps that need to run on a single node, you lose a lot of the benefits, as Kubernetes works really well with stateless applications, which can be moved between nodes to obtain high availability and strong resilience to node failure.
The thing is that Kubernetes is complex and brings you a lot of tools to work with, but if you end up using only a small fraction of them, I think it's overkill.
I would weigh the benefits you could get from adopting Kubernetes (an easy way to check the whole cluster's health; easy monitoring of logs, metrics, and resource usage; strong resilience to node failure for stateless applications; load balancing of requests; and a lot more) against the cons and the complexity it may bring (in particular, migrating to it can require a good amount of effort if you weren't already hosting your applications in containers, and so on).
I don't believe that it is a good idea to host a cluster of message queues that persist messages inside Kubernetes.
I don't think hosting any cluster, or any stateful system, inside Kubernetes is a good idea.
Kubernetes is designed for stateless RESTful services that scale in and out, etc.
First of all, am I right?
If I am right, then I just need some good, valid points to convince some of my colleagues.
Thanks
Kubernetes was primarily designed for stateless applications, and managing state in Kubernetes introduces additional complexities that you have to manage (persistence, backups, recovery, replication, high availability, etc.).
Let's cite some sources (emphases added):
While it’s perfectly possible to run stateful workloads like databases in Kubernetes with enterprise-grade reliability, it requires a large investment of time and engineering that it may not make sense for your company to make [...] It’s usually more cost-effective to use managed services instead.
Cloud Native DevOps with Kubernetes, p. 13, O'Reilly, 2019.
The decision to use Statefulsets should be taken judiciously because usually stateful applications require much deeper management that the orchestrator cannot really manage well yet.
Kubernetes Best Practices, Chapter 16, O'Reilly, 2019.
But...
Support for running stateful applications in Kubernetes is steadily increasing, the main tools being StatefulSets and Operators:
Stateful applications require much more due diligence, but the reality of running them in clusters has been accelerated by the introduction of StatefulSets and Operators.
Kubernetes Best Practices, Chapter 16, O'Reilly, 2019.
Managing stateful applications such as database systems in Kubernetes is still a complex distributed system and needs to be carefully orchestrated using the native Kubernetes primitives of pods, ReplicaSets, Deployments, and StatefulSets, but using Operators that have specific application knowledge built into them as Kubernetes-native APIs may help to elevate these systems into production-based clusters.
Kubernetes Best Practices, Chapter 16, O'Reilly, 2019.
Conclusion
As a best practice at the time of this writing, I would say:
Avoid managing state inside Kubernetes if you can. Use external services (e.g. cloud services, like DynamoDB or CloudAMQP) for managing state.
If you have to manage state inside the Kubernetes cluster, check whether an Operator exists for the type of application you want to run, and if so, use it.
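And if you do have to run state in-cluster and no Operator exists for your workload, the usual building block is a StatefulSet with per-replica volumes. A minimal sketch, with all names, images, and sizes chosen only for illustration:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: queue
spec:
  serviceName: queue               # assumes a headless Service "queue" for stable network identities
  replicas: 3
  selector:
    matchLabels:
      app: queue
  template:
    metadata:
      labels:
        app: queue
    spec:
      containers:
        - name: broker
          image: my-broker:latest        # placeholder image
          volumeMounts:
            - name: data
              mountPath: /var/lib/broker # assumed data directory
  volumeClaimTemplates:            # each replica gets its own PersistentVolumeClaim
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 20Gi
```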
I'm migrating a number of applications from AWS ECS to Azure AKS, and since this is my first production Kubernetes deployment I'd like to ensure that it's set up correctly from the off.
The applications being moved all use resources at varying degrees with some being more memory intensive and others being more CPU intensive, and all running at different scales.
After some research, I'm not sure which would be the best approach out of running a single large cluster and running them all in their own Namespace, or running a single cluster per application with Federation.
I should note that I'll need to monitor resource usage per application for cost management (amongst other things), and communication is needed between most of the applications.
I'm able to set up both layouts and I'm sure both would work, but I'm not sure of the pros and cons of each approach, whether I should avoid one altogether, or whether I should be considering other options.
Because you are at the beginning of your Kubernetes journey, I would go with separate clusters for each stage you have (or at least separate dev and prod). It is very easy to take a cluster down (I did it several times through resource starvation). Also, if network policies are not set up correctly, you might find that services from different stages/namespaces (like test and sandbox) communicate with each other, or that pipelines which should deploy to dev end up changing something in another namespace.
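If you do end up sharing a cluster across stages, a default-deny-style NetworkPolicy per namespace is the usual guard against that kind of cross-namespace traffic. A minimal sketch (the namespace name is just a placeholder, and your CNI plugin must support NetworkPolicy):

```yaml
# Allow ingress into the "test" namespace only from Pods in the same namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace-only
  namespace: test
spec:
  podSelector: {}            # applies to every Pod in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}    # empty podSelector here = any Pod in this same namespace
```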
Why risk production being affected by dev work?
Even if you don't have to upgrade the control plane yourself, AKS still has its own versions and flags, and it is better to test them on a separate cluster before moving to production.
So my initial decision would be to set some hard boundaries: different clusters. Later, once you have gained more knowledge of AKS and Kubernetes, you can review that decision.
As you said that communication is needed among the applications, I suggest you go with one cluster. Application isolation can be achieved by deploying each application in a separate namespace. You can collect metrics at the namespace level and set resource quotas at the namespace level, which lets you take action at the application level.
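For example, a ResourceQuota per application namespace caps what each application can consume and gives you a per-application view for cost management; all values below are placeholders:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: app-a-quota
  namespace: app-a           # hypothetical namespace, one per application
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "50"
```

kubectl describe resourcequota -n app-a then shows current usage against those limits.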
A single cluster (with namespaces and RBAC) is easier to set up and manage, and a single Kubernetes cluster does support high load.
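As a sketch of the namespaces-plus-RBAC idea, a namespaced Role and RoleBinding can limit a team to its own application's namespace; every name here is hypothetical:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: app-a-developer
  namespace: app-a
rules:
  - apiGroups: ["", "apps"]
    resources: ["pods", "services", "deployments"]
    verbs: ["get", "list", "watch", "create", "update", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-a-developer-binding
  namespace: app-a
subjects:
  - kind: Group
    name: app-a-team         # placeholder group from your identity provider
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: app-a-developer
  apiGroup: rbac.authorization.k8s.io
```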
If you really want multiple clusters, you could also try Istio multi-cluster (the Istio service mesh spanning multiple clusters).
It depends... Be aware that AKS still doesn't support multiple node pools (it's on the short-term roadmap), so you'll need to run those workloads on a single pool's VM type. Also, when thinking about multiple clusters, consider multi-tenancy requirements and the blast radius of a single cluster. I typically see users deploying multiple clusters even though there is some management overhead, but good SCM and configuration-management practices can help with that overhead.
I have a greenfield application that I would like to host on Service Fabric, but high availability will be a key concern. Is there a way for a cluster to span across Azure regions? Does Service Fabric recognize Availability Zones?
Is there a way for a cluster to span across Azure regions?
Yes, it's in the FAQ.
The core Service Fabric clustering technology can be used to combine machines running anywhere in the world, so long as they have network connectivity to each other. However, building and running such a cluster can be complicated.
If you are interested in this scenario, we encourage you to get in contact either through the Service Fabric GitHub Issues List or through your support representative in order to obtain additional guidance. The Service Fabric team is working to provide additional clarity, guidance, and recommendations for this scenario.
Does Service Fabric recognize Availability Zones?
Yes, but limited.
Make sure to check out this example (marked as 'not production ready'). And check out SF Mesh as well, which allows you to deploy to multiple AZs (but is currently in preview).
I would like to set up an HA Swarm / Kubernetes cluster based on low-power (ARM) hardware.
My main objective is to learn how an HA web cluster works, how it reacts to failures and recovers from them, and how easy it is to scale.
I would like to host a blog on it, as well as other services once it is working (git / custom services / home automation / CI server / ...).
Here are my first questions:
Regarding the hardware, which is more appropriate: RPi3, Odroid-C2, or something else? I intend to start with 4-6 nodes. Low power consumption is important to me since it will be running 24/7 at home.
What is the best architecture to follow? I would like to run everything in containers (for scalability and redundancy) and have redundant load balancers, web servers, and databases. Something like this: (architecture diagram)
Would it be possible to have the web servers / databases distributed across the whole cluster, and load balancing on 2-3 nodes? Or is it better to separate them physically?
Which technology is the most suitable (Swarm / Kubernetes / Ansible for deployment / Flocker for storage)? I have read about this topic a lot lately, but there are a lot of choices.
Thanks for your answers!
EDIT1: infrastructure deployment and management
I have almost all the hardware, and I am now looking for a way to easily manage and deploy the 5 (or more) Pis. I want the procedure to be as scalable as possible.
Is there some way to:
retrieve an image over the network on first boot (PXE-boot style)
apply custom settings to each node: network config (IP), SSH access, ...
automatically deploy / update new software on the servers
easily add new nodes to the cluster
I can have a dedicated Pi or my PC act as the deployment server.
Thanks for your input!
Raspberry Pi, ODroid, CHIP, BeagleBoard are all suitable hardware.
Note that flash cards have a limited lifetime if you constantly read from and write to them.
Kubernetes is a great option to learn clustering containers.
Docker Swarm is also good.
None of these solutions provides distributed storage, so if you're talking about a PHP-type web server and an SQL database which are not distributed, then you can't really be redundant, even with Kubernetes or Swarm.
To be effectively redundant, you need a master/slave setup for the DB or, better, a clustered database like Elasticsearch, or perhaps the clustered version of MariaDB for SQL, so that redundancy is provided by the database cluster itself (which is not a replacement for backups, but it is better than a single container).
For real distributed storage, you need to look at technologies like Ceph or GlusterFS. These do not work well with Kubernetes or Swarm because they need to be tied to the hardware. There is a Docker/Kubernetes Ceph project on GitHub, but I'd say it is still a bit hacky.
Better to provision this separately, or directly on the host.
As far as load balancing is concerned, you may want to have a couple of nodes with external load balancers for redundancy. If you build a Kubernetes cluster, you don't really choose what else may run on the same node, except by specifying CPU/RAM quotas and limits, or affinity.
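To illustrate the quota/limits point, here is a minimal container spec with explicit requests and limits; the values are placeholders, not recommendations:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
    - name: web
      image: nginx:stable
      resources:
        requests:        # what the scheduler reserves on the node
          cpu: 250m
          memory: 128Mi
        limits:          # hard cap enforced at runtime
          cpu: 500m
          memory: 256Mi
```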
If you want to give Raspberry Pi 3 a try with Kubernetes, here is a step-by-step tutorial to set up your Kubernetes cluster with Raspberry Pi 3:
To prevent the read/write wear issue, you might consider purchasing an additional NAS device and mounting it as a volume in your pods.
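If the NAS exposes an NFS export, it can be mounted directly as a pod volume; a minimal sketch, where the server address, export path, and image are pure placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: blog
spec:
  containers:
    - name: blog
      image: nginx:stable
      volumeMounts:
        - name: nas
          mountPath: /usr/share/nginx/html
  volumes:
    - name: nas
      nfs:
        server: 192.168.1.50    # placeholder NAS address
        path: /export/blog      # placeholder export path
```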
Totally agree with MrE about the distributed storage for PHP-like setups. A volume's lifespan is per pod and is tied to the pod, so you cannot share one volume between pods.