I have an application with 5 microservices (iam, courses, ...). I want to know the best approach for migrating them to Kubernetes. I was thinking of creating namespaces per environment, as Google recommends:
1. prod
2. dev
3. staging
Then I thought it might be better to create a namespace per environment and microservice:
1. iam-prod
2. iam-dev
3. iam-staging
4. courses-prod
5. courses-dev
6. courses-staging
...
But this approach could be harder to manage, because the microservices need to communicate with each other.
Which approach do you think is better?
As the other answer suggests, you should create namespace isolation for prod, dev and staging. This takes care of a couple of concerns:
Ideally, your pods in one environment should not be talking to pods in another environment.
You can manage your network policies in a much cleaner and more maintainable way with this organization of Kubernetes objects.
You can run multiple microservices in the same namespace, so I would go with prod, dev and staging namespaces, each containing one or more instances of every microservice.
That said, if you do use separate namespaces per microservice and environment, the services can still communicate with each other through their Services. The DNS name is SERVICE_NAME.NAMESPACE.svc.cluster.local.
ref: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/
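For illustration, here is a rough sketch of what that cross-namespace addressing could look like, assuming a hypothetical iam Service listening on port 8080 in an iam-prod namespace and a courses Deployment in courses-prod (names and image are placeholders):

```yaml
# Hypothetical example: a pod in the courses-prod namespace reaching the
# "iam" Service that lives in the iam-prod namespace via cluster DNS.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: courses
  namespace: courses-prod
spec:
  replicas: 1
  selector:
    matchLabels:
      app: courses
  template:
    metadata:
      labels:
        app: courses
    spec:
      containers:
        - name: courses
          image: example.com/courses:latest   # placeholder image
          env:
            # Fully qualified form: <service>.<namespace>.svc.cluster.local
            - name: IAM_URL
              value: "http://iam.iam-prod.svc.cluster.local:8080"
```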
If you go with the second approach you will create unnecessary complexity without gaining any benefit. Also think about what happens as the number of microservices grows: are you going to create a new set of namespaces for each one? This is not recommended at all.
The concept of a namespace should not be linked to applications; it is related to users. Refer to the Kubernetes docs:
"Namespaces are intended for use in environments with many users spread across multiple teams, or projects. For clusters with a few to tens of users, you should not need to create or think about namespaces at all. Start using namespaces when you need the features they provide."
Also, even though the first approach is recommended, please keep a separate cluster for prod, since prod should be more secure and highly available, with a proper disaster recovery plan in place and tested.
Go with one namespace for each environment. You can also define a resource quota per namespace; that way each application environment can be managed independently.
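As a rough sketch of that idea (the numbers are arbitrary and would need to be sized for your workloads), a ResourceQuota scoped to a dev namespace could look like this:

```yaml
# Illustrative only: a ResourceQuota applied to the dev namespace;
# the limits are placeholders, not recommendations.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: dev-quota
  namespace: dev
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "50"
```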
None of the above are ideal solutions. I’ll go over why.
Security
Namespaces are the easiest boundary to use for managing RBAC permissions. In general, you will want to use the pre-provisioned admin and edit cluster roles to constrain users' access per namespace. This means that people and services sharing a namespace also share visibility of its secrets, so the namespace becomes the blast radius for compromised secrets.
To reduce the blast radius of secrets exposure, you can either micromanage resource-level role bindings (which is unreasonable overhead without additional automation and tooling) or segregate services across namespaces so that only tightly coupled services share a namespace.
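For context, a typical namespace-scoped binding of the built-in edit role looks roughly like this (the group name and namespace are hypothetical); anyone bound this way can read every Secret in that namespace, which is the blast radius described above:

```yaml
# Sketch: granting a hypothetical "iam-team" group the built-in "edit"
# ClusterRole, scoped to one namespace via a RoleBinding.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: iam-team-edit
  namespace: prod
subjects:
  - kind: Group
    name: iam-team          # hypothetical group name
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: edit                # built-in ClusterRole; includes read access to Secrets
  apiGroup: rbac.authorization.k8s.io
```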
Isolation
Kubernetes resource isolation is relatively poor between namespaces. There’s no way to force a namespace to deploy into a different node pool than another namespace without custom admission controllers. So resource isolation is effectively opt-in, which is both insecure and unenforceable.
Because of this, it is actually more secure and better isolated to have the different environments (dev, staging, prod) in separate Kubernetes clusters altogether. But this is obviously more expensive and more management overhead, so it's only cost effective when you have many services and enough resource usage to justify the added overhead.
The consequence of poor resource isolation is that your dev and staging workloads can effectively DoS your prod workloads simply by consuming shared resources. CPU, memory, and disk are the obvious culprits; these can be constrained with custom admission controllers. But the more insidious problem is sharing ingress proxies, load balancers, and networking, which is harder to isolate between namespaces.
Another consequence of poor isolation is that dev services with poor security can be compromised, allowing horizontal access to prod services. Realistically, no one deploys dev apps as production ready and secure. So without hard isolation, your security is at risk too.
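If you do keep environments in one cluster, a NetworkPolicy that only admits traffic from the same namespace is the usual partial mitigation for that horizontal movement. A sketch, assuming a CNI that enforces NetworkPolicy (e.g. Calico or Cilium) and a namespace simply called prod:

```yaml
# Sketch: restrict ingress for every pod in the prod namespace to traffic
# originating inside prod. The kubernetes.io/metadata.name label is set
# automatically on recent Kubernetes versions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace-only
  namespace: prod
spec:
  podSelector: {}            # applies to every pod in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: prod
```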
Quotas
Quotas are managed at the namespace level. So if you want to isolate quota by environment AND team, you can't use namespaces for both. And if you want quota by project, you'd need a namespace per project. The only way to handle all three is with multiple clusters, multiple namespaces, and multiple node pools, with custom deployment/admission enforcement that creates a makeshift hierarchy or matrix.
Namespace Hierarchy
Namespaces are flat. If you use them for environments, you can't use them for org- or team-level access control. If you use them for team-level access control, your engineers can't use them for component/project/system-level abstraction boundaries. You can only choose one, or the chaos will be unmanageable.
Conclusion
Unfortunately, the namespace abstraction is being used for 3 or 4 different purposes in the Kubernetes community, and it's not really ideal for any of them. So either you pick a non-ideal use case to optimize for, or you manage multiple clusters and write a bunch of custom automation to handle all the use cases.
Related
Sorry if this question sounds "convoluted", but here goes...
I'm currently designing a k8s solution based on Firecracker and Kata-containers. I'd like the environment to be as isolated/secure as possible. My thoughts around this are:
1. Deploy k8s masters as Firecracker nodes running the API server, controller manager, scheduler and etcd.
2. Deploy k8s workers as Firecracker nodes running the kubelet and kube-proxy, and using Kata Containers + Firecracker for the deployed workload. The workload will be a combination of MQTT cluster components and in-house developed FaaS components (probably using OpenFaaS).
It's point 2 above which makes me feel a little awkward/convoluted. Am I overcomplicating things, introducing complexity that will cause problems related to (CNI) networking among worker nodes, etc.? Isolation and minimizing attack vectors are all important, but maybe I'm trying "to be too much of a s.m.a.r.t.a.s.s" here :)
I really like the concept of Firecracker's microVM architecture, with its reduced security risks and reduced footprint, and it would make for a wonderful solution for tenant isolation. However, am I better off using another CRI-conforming runtime together with Kata for the actual workload being deployed on the workers?
Many thanks in advance for your thoughts/comments on this!
You might want to take a look at https://github.com/weaveworks-liquidmetal and consider whether contributing to that would get you further towards your goal. Alternative runtimes (like Kata) for different workloads are welcomed as PRs. There is a liquid-metal Slack channel in the Weaveworks user group if you have any queries. Disclosure: I currently work at Weaveworks :)
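If you do end up mixing runtimes for different workloads, RuntimeClass is the standard mechanism for routing individual pods to a Kata/Firecracker runtime. A minimal sketch, assuming the runtime handler configured in containerd/CRI-O on your workers is named kata and using placeholder workload names and images:

```yaml
# Sketch: a RuntimeClass that routes selected pods to a Kata/Firecracker
# runtime; the handler name must match the node's CRI configuration.
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata
handler: kata
---
# A pod opting into that runtime; everything else on the node can keep
# using the default runc-based runtime.
apiVersion: v1
kind: Pod
metadata:
  name: mqtt-broker            # hypothetical workload name
spec:
  runtimeClassName: kata
  containers:
    - name: broker
      image: eclipse-mosquitto:2   # example image
```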
This is to be the topic of my bachelor thesis. At the moment I am looking for literature or general information, but I can't really find any. Do you have more information on this topic? I want to find out whether it makes sense to run dev and test stages on one cluster instead of running each stage on its own cluster.
I also want to find out, if it is a good idea, how I can consolidate the clusters.
That's a nice question. It's actually a huge topic to cover; in short, yes and no, you can set up a single cluster for all of your environments.
But in general, you need to consider various things before merging all the environments into a single cluster. Some of them include: the number of services you are running on k8s, the number of Ops engineers you have on hand to manage and maintain the existing cluster without issues, and the locations of the different teams who use the cluster (if you take latency into consideration).
There are many advantages and disadvantages in merging everything into one.
Advantages include the ease of managing and maintaining a single, smaller cluster: you can spread out your nodes with labels and deploy your applications with a dev label onto dev nodes, and so on (see the sketch below). But this can also waste resources if you restrict where pods can be deployed instead of letting k8s make the scheduling decision. People can argue about this topic for hours and hours if we set up a debate.
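A minimal sketch of that label-based spreading, assuming nodes labelled with a site-specific env=dev label (e.g. via kubectl label node <node> env=dev) and a placeholder image:

```yaml
# The dev Deployment opts into dev-labelled nodes via nodeSelector.
# The "env" label key is an assumption, not a Kubernetes convention.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: courses
  namespace: dev
spec:
  replicas: 1
  selector:
    matchLabels:
      app: courses
  template:
    metadata:
      labels:
        app: courses
    spec:
      nodeSelector:
        env: dev             # only schedule onto nodes labelled env=dev
      containers:
        - name: courses
          image: example.com/courses:latest   # placeholder image
```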
Cost of resources: imagine you have a cluster of 3 nodes and 3 masters for prod, a cluster of 2 nodes and 3 masters for dev, and a cluster of 2 nodes and 3 masters for test. The cost is large because you are allocating 9 masters; if you merge dev and test into one cluster, you save the cost of 3 VMs.
K8s is still very new to many DevOps engineers in many organisations, and many of us need a place to experiment and figure things out with the latest versions of the software before they can be implemented in prod. This is the biggest point of all, because downtime is very costly and many cannot afford downtime no matter what. If everything is in a single cluster, it's difficult to debug problems. One example is upgrading from Helm 2 to Helm 3, as this involves losing Helm's release data; you need to research it and work it out.
As I said, team location is another factor. Imagine you have an offshore testing team and you merge dev and test into a single cluster: there might be network latency for the testing team when working with the product, and all of us have deadlines. This is arguable, but you still need to consider network latency.
In short, yes and no: this is a very debatable question and we could keep adding pros and cons to the list forever, but it is advisable to have a different cluster for each environment until you become the kind of Kubernetes guru who understands each and every packet of data inside your cluster.
This is already achievable using namespaces:
https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/
With namespaces you can isolate the stages; merging them into one cluster, each stage in its own namespace, would achieve what you're thinking of.
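For completeness, creating those stage namespaces declaratively is just a couple of manifests (the env labels are an optional convention, not something Kubernetes requires):

```yaml
# Minimal sketch: one namespace per stage.
apiVersion: v1
kind: Namespace
metadata:
  name: dev
  labels:
    env: dev                 # illustrative label for selectors/policies
---
apiVersion: v1
kind: Namespace
metadata:
  name: test
  labels:
    env: test
```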
I have some apps in production running in Azure. All these applications belong to the same company and communicate with each other. I want to migrate them to Kubernetes.
My question is: what are the best practices in this case, and why?
Some people recommend one cluster with multiple namespaces, and I don't know why.
For example, https://www.youtube.com/watch?v=xygE8DbwJ7c recommends running apps within one cluster using intra-cluster multi-tenancy, but the arguments for this choice are not enough for me.
The answer is: it depends...
To try to summarize it from our experience:
A cluster for each app is usually quite a waste of resources, especially given HA cluster requirements, and it is mainly justified when a single app is composed of a large number of microservices that naturally cluster together, or when special security considerations have to be taken into account. That is, however, in our experience, rarely the case (but it depends)...
Namespaces for apps in a cluster are more in line with our experience and needs, but again, this should not be overdone either (so, again, it depends), since, for example, your CNI can become a bottleneck, with one rogue app (or setup) degrading performance for other, seemingly unrelated apps. Load balancing and rollout downtimes, clashes over resources and other problems can happen if everything is crammed into one cluster at all costs. So this has its limits as well.
Best of both worlds: we started with a single cluster, and when we reached naturally separate (and separately performing) use cases (say, qa, dev and stage environments, or a different client with special security considerations, etc.), we migrated to more clusters, keeping reasonably namespaced apps in each cluster.
So, all in all: depending on the available machine pool (number of nodes), the size of the cluster, the size of the apps themselves (microservice/service complexity), HA requirements, redundancy, security considerations, etc., you might want to fit everything into one cluster with namespaced apps, or separate it into several clusters (again with namespaced apps within each cluster), or keep everything totally separate with one app per cluster. So: it depends.
It really depends on the scenario. I can think of one scenario where some of the apps need dedicated, higher-spec nodes (say, with GPUs).
In such scenarios, having a dedicated cluster with GPU nodes can be beneficial for those apps, with normal CPU nodes for the other, ordinary apps.
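Whichever way the GPU nodes are provisioned (a dedicated cluster or a dedicated node pool), the GPU apps typically target them along these lines. This sketch assumes the NVIDIA device plugin is installed (it exposes the nvidia.com/gpu resource) and that the GPU nodes carry a site-specific gpu=true:NoSchedule taint; the workload name and image are placeholders:

```yaml
# Sketch: a pod that tolerates the GPU node taint and requests one GPU,
# so the scheduler only considers nodes that actually expose GPUs.
apiVersion: v1
kind: Pod
metadata:
  name: trainer                # hypothetical GPU workload
spec:
  tolerations:
    - key: gpu
      operator: Equal
      value: "true"
      effect: NoSchedule
  containers:
    - name: trainer
      image: example.com/trainer:latest   # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 1    # request one GPU
```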
Is it best practice to place monitoring tools like Prometheus and Grafana inside a Kubernetes cluster or outside a Kubernetes cluster?
I can see the case for either. It seems very easy to place it inside the cluster. But that seems less robust.
It seems people typically do run it inside the cluster, likely because they are running everything in their environment or app under Kubernetes. From a bigger-picture view, if you have use cases outside of one specific app, it likely makes sense to run monitoring on separate infrastructure. The reason is that Prometheus doesn't support clustering: you can write to two instances, but there is not really an HA story for the product. To me, that's a major problem for a monitoring technology.
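For what it's worth, the usual workaround for the lack of clustering is an HA pair: two identical Prometheus instances scraping the same targets with the same configuration (deduplication and long-term storage still need something like Thanos or Cortex on top). A minimal sketch of such a shared scrape config, assuming the instances run in-cluster and that pods opt in via the common prometheus.io/scrape annotation:

```yaml
# prometheus.yml shared by both replicas of the HA pair.
global:
  scrape_interval: 30s
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod   # in-cluster discovery; external Prometheus would also need api_server/credentials here
    relabel_configs:
      # Only scrape pods that opt in via the prometheus.io/scrape annotation.
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
```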
Most organizations that use this tool heavily end up unable to cover use cases that APM (transaction tracing and diagnostics) handles. Additionally, you'll need to deploy an ELK/Splunk-type stack for logs, so it gets very complex. They also find it difficult to manage and often look towards Datadog, SignalFx, Sysdig, or another system which can do more and is fully managed. Naturally, most of what I have said here has a cost, so if you do not have a budget then you'll have to spend your time (time = money) doing all the work.
After weeks of developing my various microservices, GC Pub/Sub and GC Functions using a basic MongoDB server, I would like to test the entire data flow using what I would use in production: a sharded MongoDB cluster. I've never used these and would like to get myself familiar with setting them up, updating, etc.
Costs are an issue at this stage, especially for testing. Therefore, what is the most cost-effective way to setup a (test) MongoDB sharded cluster on Google Compute Engine?
The easiest approach for you is to use Cloud Launcher for your deployment. It lets you choose the number of nodes and the machine types, so you can deploy something that suits your budget. You will get billed according to the resources you deploy and can use the online pricing calculator to get an estimate. A drawback is that there does not seem to be a direct way to add nodes or change machine types without manual reconfiguration.
While configuring your deployment, the appropriate number of nodes and an arbiter will be created. Once you have tested, you might want to think about more complex architectures that are redundant against failures in one region (those will certainly increase your cost, since they mean having additional nodes).
You can also consider running Mongo on GKE; it would be easier to scale, but it requires that you get familiar with Kubernetes. Kubernetes Engine is also charged according to the resources used by the cluster.
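If you do go the GKE route, a very rough sketch of the usual starting point is a StatefulSet plus a headless Service, so each mongod member gets a stable DNS name and its own persistent disk. A full sharded cluster would additionally need config servers and mongos routers, and the image tag and disk size below are placeholders:

```yaml
# Sketch: three MongoDB replica-set members with per-member storage.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongodb
spec:
  serviceName: mongodb
  replicas: 3
  selector:
    matchLabels:
      app: mongodb
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
        - name: mongod
          image: mongo:6.0
          args: ["--replSet", "rs0", "--bind_ip_all"]   # replica set still needs rs.initiate()
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: data
              mountPath: /data/db
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
---
# Headless Service giving each member a stable DNS name
# (mongodb-0.mongodb, mongodb-1.mongodb, ...).
apiVersion: v1
kind: Service
metadata:
  name: mongodb
spec:
  clusterIP: None
  selector:
    app: mongodb
  ports:
    - port: 27017
```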