What is the scope of learning Kubernetes? [closed]

I came across the word "Kubernetes" recently while searching for some online courses. I understand that if I learn Kubernetes, I will learn about containers, container orchestration, and how easily microservices can be scaled. But I wanted to know: after learning Kubernetes, is there anything else to learn to become an expert in that line?
My question is about the career stream I could select if I learn this, in the same way that learning Python or R helps you become a data analyst or work in another data-related stream.
I am very new to this and really appreciate your help in understanding it.
Thanks in advance

The main prerequisite for Kubernetes is Docker. Once you learn Docker, you learn how to package environments into containers and deploy them. Once you've learnt how to build Docker images, you need to 'orchestrate' them. What does that mean?
It means that if you have a bunch of microservices (in the form of containers), you can spin up multiple machines and tell Kubernetes which image/container goes where. You orchestrate your app using Docker images (packaged environments), with Kubernetes as the underlying resource provider that runs these containers and controls when they are spun up or killed.
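To make that concrete, here is a minimal sketch of a Kubernetes Deployment manifest; the service name, image, and replica count are placeholders. It tells the cluster which image to run and how many copies to keep alive:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: orders-service            # hypothetical microservice
    spec:
      replicas: 3                     # Kubernetes keeps 3 copies running
      selector:
        matchLabels:
          app: orders
      template:
        metadata:
          labels:
            app: orders
        spec:
          containers:
          - name: orders
            image: myregistry/orders:1.0   # the Docker image you built
            ports:
            - containerPort: 8080

Apply it with kubectl apply -f deployment.yaml and Kubernetes decides which machines the 3 containers land on, restarting them if they die.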
Assuming you don't have a massive cluster on-prem (or at home), Kubernetes on a single personal computer is rather useless. You would need to learn a cloud platform (or invest in a server) to utilise Kubernetes efficiently.
Once you learn this, you would possibly need to find a way for your containers to communicate with one another. In my opinion, the two most important things any amateur programmer needs to know are:
Message brokers
REST
Message brokers: Kafka, RabbitMQ (personal fave), Google Pub/Sub, etc.
REST: Basically sending/receiving data via HTTP requests.
Once all of this is done, you've learnt how to build images, orchestrate them, have them communicate with one another, and use resources from other machines (utilising the cloud or on-prem servers).
There are many other uses for Kubernetes, but in my opinion, this should be enough to entice you to learn this key skill.
Kubernetes and Docker are the future, because they remove the need to worry about environments. If you have a Docker image, you can run it on Mac, Linux, Windows, or basically any machine that can run Docker (on Mac and Windows this works through a lightweight VM). This increases portability and decreases the overhead of setting up environments each time. It also allows you to spin up 1, 100, 1,000, or 10,000 containers (excellent for scalability!)

Yes, if you are looking to explore further, security is another area you can learn. It is in demand these days: various clients want security leaks checked at the level of containers, of container registries, and even of Kubernetes itself.
You can move toward DevSecOps with a couple of certifications.
As for your latter question, I can't envisage a single stream, because with Kubernetes you just deploy containers; you could even deploy some Python code there that collects data from sensors and does some computations.
Please comment if you have a more specific question.


How should Kubernetes clusters be segregated for a production environment? [closed]

I'm wondering about the best practices for architecting my Kubernetes clusters.
For one environment (e.g. production), how should I organise my clusters?
Examples: one cluster per technology stack, one cluster per exposure area (internet, private...), one cluster with everything...?
Thanks for your help
I'm not a Kubernetes expert, so I'll give you some generic guidance to help until someone who knows more weighs in.
By technology stack - no. That wouldn't provide any value I can think of.
By 'exposure' - yes. If one cluster is compromised the damage will hopefully be limited to that cluster only.
By solution - yes.
Solution vs Technology Stack
"Solution" is where you have a number of systems that exist to addresses a specific business problem or domain. This could be functional e.g. finance vs CRM vs HR.
Technology stacks in the literal sense is not likely to be relevant. True, it's not uncommon for different solutions & systems to be comprised of different technology (is that what you were meaning?) - but that's usually a by-product, not the primary driver.
Let's say you have two major solutions (e.g. the finance and CRM). It's likely that you will have situations that impacts one but shouldn't impact the other.
Planned functional changes: e.g. rolling out a major release. Object Orientated programmers and architects have had this nailed for years through designing systems that are cohesive but loosely-coupled (see: Difference Between Cohesion and Coupling), and through stuff like the Stable Dependencies Principle. Having both solutions dependent on the same cluster makes them coupled in that respect, which.
Planned infrastructure changes: e.g. patching, maintenance, resource reallocation, etc.
Unplanned changes: e.g. un-planned outage, security breaches.
Conclusion
Look at what will be running on the cluster(s), and what solutions they are part of, and consider separation along those lines.
The final answer might be a combination of both, some sort of balance between security concerns and solution (i.e. change) boundaries.
The best way is to have one Kubernetes cluster and have the worker nodes in private subnets. You can choose to have the control plane in a public subnet with restricted access, e.g. limited to your VPN CIDR.
If you have multiple teams or application stacks, I'd suggest a different namespace for each stack, as this creates a logical separation of resources.
Also, check the resource limits and quotas that you can apply in Kubernetes to prevent over-consumption of resources.
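As a sketch (the team name and limits here are made up), a namespace plus a ResourceQuota for one stack could look like this:

    apiVersion: v1
    kind: Namespace
    metadata:
      name: team-a                # one namespace per team/stack
    ---
    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: team-a-quota
      namespace: team-a           # the quota applies inside this namespace
    spec:
      hard:
        requests.cpu: "4"         # total CPU the namespace may request
        requests.memory: 8Gi
        limits.cpu: "8"
        limits.memory: 16Gi
        pods: "20"                # cap on the number of pods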
And, as you mentioned multiple application stacks, I am assuming you would have multiple services exposed for each application, or something similar. I would highly recommend using an ingress controller (NGINX or another) as the single point of entry for each application. You can have more than one application listening behind one load balancer, as in the sketch below.
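For example, a single Ingress can route two hypothetical applications behind one load balancer by hostname (this assumes the NGINX ingress controller is installed; the hosts and service names are placeholders):

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: apps-ingress
    spec:
      ingressClassName: nginx
      rules:
      - host: app1.example.com
        http:
          paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app1-svc    # Service in front of application 1
                port:
                  number: 80
      - host: app2.example.com
        http:
          paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app2-svc    # Service in front of application 2
                port:
                  number: 80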
Also, have Prometheus or ELK monitoring in place, as they are great at monitoring k8s components and metrics.
And I would highly recommend the tools Kubecost and kube-bench for improving your k8s cluster.
Kubecost provides cost analytics and reporting for k8s components, and kube-bench audits your cluster against the CIS benchmarks and gives you a report on what improvements are required and where.
Please note that the above recommendations are based on best practices and cost efficiency.

Kubernetes - Running the CI/CD pipeline on the prod cluster [closed]

First of all, a disclaimer - I know that this question might be too "open-ended" for SO, but I honestly could not find a better place for it (and the K8s docs specifically recommend directing any questions to SO).
My company has decided to switch the main production infrastructure to Kubernetes. However, there is some significant pushback from the developers, who would prefer not to run the CI/CD pipeline on the same cluster as the production workloads. They prefer to keep some dedicated VMs for that purpose, the main reason given being that "we should not put all our eggs in one basket".
With me coming from the other side of devops (the "ops" side), I would prefer to have everything in one place, managed using the same set of tools. Unfortunately, I cannot find any documented best practices stating one way or another.
So my questions are:
Based on personal experience, would you recommend one type of deployment over another? Why?
Can anyone point me to a link making the argument one way or another? Any recommendations that we should follow in such a case?
Unfortunately, I cannot find any documented best practices stating one way or another.
This all depends on how strong a separation you want. In Kubernetes you can separate environments by using separate namespaces, but for professional company environments you typically want stronger separation. If you use a cloud provider, it is common to separate with a different account for "production", also with different access rights.
developers, who would prefer not to run the CI/CD pipeline on the same cluster as the production workloads.
If this is for a professional organization, I agree with them. You want completely separate VMs, networks, and load balancers. If you use a cloud provider, it is also good to use a different cloud account and VPC (virtual private cloud network).
Recommendation
With me coming from the other side of devops (the "ops" side), I would prefer to have everything in one place, managed using the same set of tools.
I agree with both you and your developers. Use a dedicated cluster for production and a different cluster for development. Do all changes in the production cluster via CI/CD pipelines. Restrict access (at least, write access) to the production environment.
With that setup, you only have two clusters that are in active use, not more - but also strong separation for the production environment.
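As an illustration (the cluster endpoints and user names are placeholders), a kubeconfig with two contexts keeps that separation explicit while the ops side still manages everything with the same tools:

    apiVersion: v1
    kind: Config
    clusters:
    - name: dev
      cluster:
        server: https://dev.example.com:6443
    - name: prod
      cluster:
        server: https://prod.example.com:6443
    contexts:
    - name: dev
      context: {cluster: dev, user: developer}
    - name: prod
      context: {cluster: prod, user: ci-pipeline}   # only the pipeline writes to prod
    current-context: dev
    users:
    - name: developer
      user: {}                    # credentials omitted in this sketch
    - name: ci-pipeline
      user: {}

Switching with kubectl config use-context prod (or a --context flag inside the pipeline) then makes every production change go through the restricted identity.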
References
See Best practices for enterprise organizations for a good document on the subject.

Is it recommended to use containers for databases? [closed]

I have searched for information about this but still cannot find anything convincing.
I have multiple containerized websites with Apache and PHP, exposed through a reverse proxy with a virtual host for each container. But I've been thinking about the database: most sites use MariaDB 5.5, but one website requires MariaDB 10.
I was wondering whether it is a good idea for each website's container to embed its own MariaDB instance, or whether to create a single dedicated container for the database, but I have some doubts.
MariaDB uses its own load-balancing system; would containers interfere with it if multiple instances of the same database were raised, all using the same data directory? I'm wondering whether the engine would have to do the same indexing multiple times, or whether there would be conflicts in file usage.
Having the website in a container poses no problems, because its files do not change and the logs and uploaded files are stored in persistent volumes. The database is different, because I don't know whether it is a good idea for multiple engines to use the same data directory.
In a production environment where the database has a high query load, is it recommended to use a container? Or is it better to embed the database inside the website container, or to do a native installation on the server?
In which cases should I choose one option or the other?
Absolutely do not have two databases share the same data directory. Each data volume should be managed by exactly one database server.
If you need more database instances because you want high availability or are worried about load, each needs to manage its own data directory and sync with the others via replication.
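In Kubernetes terms, this is roughly what a StatefulSet gives you: each replica gets its own PersistentVolumeClaim instead of sharing one data directory. A sketch (the password handling is simplified, and MariaDB replication between the instances still has to be configured separately):

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: mariadb
    spec:
      serviceName: mariadb
      replicas: 2
      selector:
        matchLabels:
          app: mariadb
      template:
        metadata:
          labels:
            app: mariadb
        spec:
          containers:
          - name: mariadb
            image: mariadb:10
            env:
            - name: MYSQL_ROOT_PASSWORD
              value: changeme     # use a Secret in real deployments
            volumeMounts:
            - name: data
              mountPath: /var/lib/mysql
      volumeClaimTemplates:       # creates one PVC per replica: data-mariadb-0, data-mariadb-1
      - metadata:
          name: data
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 10Gi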
I'd say that using containers for database engines is a bit uncommon outside of development setups, but not unheard of, especially if you want to be able to scale fast. I don't think it's super easy to automate all of this.
Databases are critical services.
In my opinion, you shouldn't use Docker for production databases.
But you wouldn't think twice about using Docker for a database in a local development environment.

Openshift vs Rancher, what are the differences? [closed]

I am totally new to these two technologies (I know Docker and Kubernetes, btw).
I haven't found much on the web about this comparison topic.
I have read that OpenShift is used by more companies, but that it is a nightmare to install, pricier, and that data loss can occur on upgrade.
But nothing else.
What should be the deciding factor in choosing one of them for Kubernetes cluster orchestration?
I currently work for Rancher. I've also been building Internet infrastructure since 1996 and owned an MSP for 14 years that built and managed Internet datacenters for large US media companies. I've been working with containers since 2014, and since then I've tried pretty much everything that exists for managing containers and Kubernetes.
"The deciding factor" varies by individual and organization. Many companies use OpenShift. Many companies use Rancher. Many companies use something else, and everyone will defend their solution because it fits their needs, or because of the psychological principle of consistency, which states that because we chose to walk a certain path, that path must be correct. More specifically, the parameters around the solution we chose must be what we need because that was the choice we made.
Red Hat's approach to Kubernetes management comes from OpenShift being a PaaS before it was ever a Kubernetes solution. By virtue of being a PaaS, it is opinionated, which means it's going to be prescriptive about what you can do and how you can do it. For many people, this is a great solution -- they avoid the "analysis paralysis" that comes from having too many choices available to them.
Rancher's approach to Kubernetes management comes from a desire to integrate cloud native tooling into a modular platform that still lets you choose what to do. Much like Kubernetes itself, it doesn't tell you how to do it, but rather gives fast access to the tooling to do whatever you want to do.
Red Hat's approach is to create large K8s clusters and manage them independently.
Rancher's approach is to unify thousands of clusters into a single management control plane.
Because Rancher is designed for multi-cluster management, it applies global configuration where it benefits the operator (such as authentication and identity management) but keeps tight controls on individual clusters and namespaces within them.
Within those security boundaries, Rancher gives developers access to clusters and namespaces, easy app deployment, monitoring and metrics, service mesh, and access to Kubernetes features without having to go and learn all about Kubernetes first.
But wait! Doesn't OpenShift give developers those things too?
Yes, but often with Red Hat-branded solutions that are modified versions of open source software. Rancher always deploys unadulterated versions of upstream software and adds management value to it from the outside.
The skills you learn using software with Rancher will transfer to using that same software anywhere else. That's not always the case with skills you learn while using OpenShift.
There are a lot of things in Kubernetes that are onerous to configure, independent of the value of using the thing itself. It's easy to spend more time fussing around with Kubernetes than you do using it, and Rancher wants to narrow that gap without compromising your freedom of choice.
What is it that you want to do, not only now, but in the future? You say that you already know Kubernetes, but something has you seeking a management solution for your K8s clusters. What are your criteria for success?
No one can tell you what you need to be successful. Not me, not Red Hat, not Rancher.
I chose to use Rancher and to work there because I believe that they are empowering developers and operators to hit the ground running with Kubernetes. Everything that Rancher produces is free and open source, and although they're a business, the vast majority of Rancher deployments make no money for Rancher.
This forces Rancher to create a product that has true value, not a product that they can convince other people to buy.
The proof is in the deployments - Red Hat has roughly 1,000 OpenShift customers, which means roughly 1,000 OpenShift deployments. Rancher has fewer paying customers than Red Hat, but Rancher has over 30,000 deployments that we know about.
You can be up and running with Rancher in under ten minutes, and you can import the clusters you already have and start working with them a few minutes later. Why not just take it for a spin and see if you like it?
I also invite you to join the Rancher Users slack. There you will not only find a community of Rancher users, but you will be able to find other people who compared Rancher and OpenShift and chose Rancher. They will be happy to help you with information that will lead you to feel confident about whatever choice you make.

Multi-Cluster Kubernetes - cross cluster communication [closed]

Not sure if this is the right place, please point me to a different forum if not.
In a multi-cluster kubernetes setup, is cross-cluster communication a valid design? In particular, a pod in one cluster relying on a pod in another cluster.
Or are there limitations or anti-patterns associated with this that we should avoid?
If not, what tools do you use to manage this deployment and monitor load on each cluster?
Multicluster deployments give you a greater degree of isolation and availability but increase complexity. If your systems have high availability requirements, you likely need clusters across multiple zones and regions. You can canary configuration changes or new binary releases in a single cluster, where the configuration changes only affect a small amount of user traffic. Additionally, if a cluster has a problem, you can temporarily route traffic to nearby clusters until you address the issue.
Multiple meshes afford the following capabilities beyond that of a single mesh:
Organizational boundaries: lines of business
Service name or namespace reuse: multiple distinct uses of the default namespace
Stronger isolation: isolating test workloads from production workloads
I have found some very good YouTube videos from KubeCon; check them out because they really explain how multi-cluster works, especially the first one with Matt Turner.
https://www.youtube.com/watch?v=FiMSr-fOFKU
https://www.youtube.com/watch?v=-zsThiLvYos
Check out Admiral, which provides automatic configuration and service discovery for multicluster Istio service meshes.
Istio has a very robust set of multi-cluster capabilities, but managing this configuration across multiple clusters at scale is challenging. Admiral takes an opinionated view of this configuration and provides automatic provisioning and syncing across clusters. This removes the complexity from developers and mesh operators, pushing it into automation.
In a multi-cluster kubernetes setup, is cross-cluster communication a valid design? In particular, a pod in one cluster relying on a pod in another cluster.
Based on the links provided and my knowledge, everything should work fine; a pod can rely on a pod in another cluster.
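As a rough sketch of what this looks like in Istio's (older) replicated control-plane model, the local cluster gets a ServiceEntry whose endpoint is the remote cluster's ingress gateway; the hostname, port, and address below are illustrative:

    apiVersion: networking.istio.io/v1alpha3
    kind: ServiceEntry
    metadata:
      name: httpbin-remote
    spec:
      hosts:
      - httpbin.bar.global        # stub hostname resolved inside the local mesh
      location: MESH_INTERNAL
      ports:
      - number: 8000
        name: http
        protocol: HTTP
      resolution: DNS
      endpoints:
      - address: 203.0.113.10     # remote cluster's ingress gateway (placeholder)
        ports:
          http: 15443             # Istio's multi-cluster gateway port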
More useful links:
https://istio.io/docs/ops/deployment/deployment-models/#multiple-clusters
https://banzaicloud.com/blog/istio-multicluster-federation-2/
https://github.com/istio-ecosystem/coddiwomple
https://github.com/istio-ecosystem/multi-mesh-examples
EDIT
how do the different frameworks of Kubefed and Admiral fit with each other? Can we use both or only use one?
I would not use kubefed, since it's in alpha as far as I know, unless you really need it. I don't know how the two would work together; I can only assume that they should both work.
what considerations should we have in deciding between different mesh architecture to facilitate cross-cluster communication?
Above, there is a link to the YouTube video Istio Multi-Cluster Service Mesh Patterns Explained. I would say it's up to you to decide which one you want to use based on your needs; the simplest one is the first described in the video: single control plane, single network. More about it there.