If I have a multi-tier application (say web / logic / database), where each tier has its own container, and I need to deploy all of these en bloc, do they all have to go into the same pod?
And if they are in the same pod, does this have any implications in terms of the maximum size of application that can be run?
Or is there some higher level abstraction that I can use to start all three layers, but have them running on different minions?
Why do you need to deploy all of the components together? In a microservices architecture, you would want to reduce the dependencies between each layer to a clean interface and then allow each layer to be deployed and scaled separately from the others.
If you need to deploy them together (e.g. they share local disk or localhost networking) then you need to deploy them as a single pod. A single pod is an atomic scheduling unit, so it will be deployed onto a single host machine. Since it lands on a single host, this limits the scalability of your application to the size of a single host (not allowing you to scale out as your traffic increases).
If your three layers are not tightly coupled, then you can run them in different pods, which allows them to be scheduled across multiple hosts (or on the same host if, for example, you are doing local development). To connect the pods together, you can define services.
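As a rough sketch of the "different pods plus services" layout, the snippet below builds one Deployment and one Service per tier and prints them as JSON, which kubectl can apply just like YAML. The tier names, images, ports and replica counts are made up for illustration, and Deployment objects are my choice here rather than something the answer prescribes.

```python
import json

def deployment(name, image, port, replicas):
    """Build a Deployment manifest whose pods carry the label app=<name>."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }

def service(name, port):
    """Build a Service that routes to the pods labelled app=<name>."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "selector": {"app": name},
            "ports": [{"port": port, "targetPort": port}],
        },
    }

# Hypothetical tiers: (name, image, port, replicas). Each tier scales on its own.
TIERS = [
    ("web", "example/web:1.0", 80, 3),
    ("logic", "example/logic:1.0", 8080, 2),
    ("database", "example/db:1.0", 5432, 1),
]

manifests = []
for tier_name, tier_image, tier_port, tier_replicas in TIERS:
    manifests.append(deployment(tier_name, tier_image, tier_port, tier_replicas))
    manifests.append(service(tier_name, tier_port))

if __name__ == "__main__":
    # Wrap everything in a v1 List so it can be applied in one go.
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2))
```

Each tier's pods can then land on any host, and the other tiers reach them through the service name (for example `logic` on port 8080) rather than through localhost.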
You should take a look at the guestbook example which illustrates how to define pods and services for a simple multi-tier web application running on Kubernetes.
Related
I see that Kubernetes uses pods, and that each pod can contain multiple containers.
For example, I create a pod with:
Container 1: Django server - running at port 8000
Container 2: Reactjs server - running at port 3000
Since the containers inside a pod can't have port conflicts, isn't it better to put all of them in one container? As I see it, the advantage of using containers is that there is no need to worry about port conflicts.
Container 1: BOTH Django server - running at port 8000 and Reactjs server - running at port 3000
No need of container2.
and also
When I run different Docker containers on my PC, I can't reach one from another via localhost.
But then how is this possible inside a pod with multiple containers?
What's the difference between Docker containers run on a PC and containers inside a pod?
The typical way to think about this delineation is "which parts of my app scale together?"
So for your example, you probably wouldn't even choose a common pod for them. You should have a Django pod and separately, a ReactJS server pod. Thus you can scale these independently.
The typical case for deploying pods with multiple containers is a pattern called "sidecar", where the added container enhances some aspect of the deployed workload, and always scales right along with that workload container. Examples are:
Shipping logs to a central log server
Security auditing
Purpose-built Proxies - e.g. handles DB connection details
Service Mesh (intercepts all network traffic and handles routing, circuit breaking, load balancing, etc.)
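To make the first example concrete, here is a minimal sketch of a pod with a log-shipping sidecar, written as a Python dict that can be dumped to JSON for kubectl. The image names, the pod name and the /var/log/app path are assumptions for illustration only.

```python
import json

def pod_with_log_sidecar(app_image, shipper_image):
    """Build a Pod where a sidecar reads the log files the app container writes."""
    log_mount = {"name": "logs", "mountPath": "/var/log/app"}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "app-with-log-shipper"},
        "spec": {
            # emptyDir lives as long as the pod and is visible to both containers.
            "volumes": [{"name": "logs", "emptyDir": {}}],
            "containers": [
                {"name": "app", "image": app_image, "volumeMounts": [log_mount]},
                {
                    "name": "log-shipper",
                    "image": shipper_image,
                    # The sidecar only needs to read what the app writes.
                    "volumeMounts": [dict(log_mount, readOnly=True)],
                },
            ],
        },
    }

pod = pod_with_log_sidecar("example/django-app:1.0", "example/log-shipper:1.0")

if __name__ == "__main__":
    print(json.dumps(pod, indent=2))
```

Because both containers are in one pod, every replica of the app gets its own shipper, which is the "always scales right along with the workload" property described above.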
As for deploying the software into the same container, this would only be appropriate if the two pieces being considered for co-deployment into the same container are developed by the same team and address the same concerns (that is - they really are only one piece when you think about it). If you can imagine them being owned/maintained by distinct teams, let those teams ship a clean container image with a contract to use networking ports for interaction.
Some of the details:
Pods are a shared Networking and IPC namespace. Thus one container in a pod can modify iptables and the modification applies to all other containers in that pod. This may help guide your choice: Which containers should have that intimate a relationship to each other?
Specifically I am referring to Linux Namespaces, a feature of the kernel that allows different processes to share a resource but not "see" each other. Containers are normal Linux processes, but with a few other Linux features in place to stop them from seeing each other. This video is a great intro to these concepts. (timestamp in link lands on a succinct slide/moment)
Edit - I noticed the question was edited to be more succinctly about networking. The answer is in the Namespace feature of the Linux kernel that I mentioned. Every process belongs to a network namespace. Without doing anything special, it would be the default network namespace. Containers usually launch into their own network namespace, depending on the tool you use to launch them. Linux then includes a feature where you can virtually connect two namespaces - this is called a veth pair (a pair of connected virtual Ethernet devices). After a veth pair is set up between the default namespace and the container's namespace, both get a new eth device and can talk to each other. Not all tools will set up that veth pair by default (example: Kubernetes will not do this by default). You can, however, tell Kubernetes to launch your pod in "host" networking mode, which just uses the system's default network namespace, so the veth pair is not even required.
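The effect of sharing a network namespace can be imitated without Kubernetes: anything running in the same namespace sees the same loopback interface and the same port space. The sketch below (plain Python sockets, helper names are my own) shows both consequences: a listener on 127.0.0.1 is reachable via localhost, and a second listener on the same port is refused. Containers in one pod are in exactly this situation, while two separately launched Docker containers normally are not.

```python
import socket

def listen_on_localhost(port=0):
    """Open a TCP listener on 127.0.0.1; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind(("127.0.0.1", port))
        sock.listen(1)
    except OSError:
        sock.close()
        raise
    return sock

def can_connect(port):
    """Return True if something in this network namespace answers on localhost:port."""
    try:
        with socket.create_connection(("127.0.0.1", port), timeout=1):
            return True
    except OSError:
        return False

def port_is_taken(port):
    """Return True if a second listener on the same port is rejected."""
    try:
        probe = listen_on_localhost(port)
    except OSError:
        return True
    probe.close()
    return False

def demo():
    """Stand-in for 'container 1' listening while 'container 2' shares its namespace."""
    first = listen_on_localhost()
    port = first.getsockname()[1]
    try:
        return can_connect(port), port_is_taken(port)
    finally:
        first.close()

if __name__ == "__main__":
    reachable, conflict = demo()
    print("reachable via localhost:", reachable)
    print("second bind on same port rejected:", conflict)
```

This is also why the Django and ReactJS containers in the question must use different ports if they share a pod.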
I am currently studying distributed systems and have seen that many businesses rely on the sidecar proxy pattern for their services. For example, I know a company that uses an nginx proxy for authentication, roles and permissions instead of including this business logic within their services.
Another one makes use of a cloud-sql-proxy on GKE to use the Cloud SQL offering that comes with Google Cloud. So on top of deploying their services in a container which runs in a pod, there is a proxy just for communicating with the database.
There is also istio which is a service mesh solution which can be deployed as a side-car proxy in a pod.
I am pretty sure there are other commonly known use cases where this pattern is used, but at what point is it too many sidecar proxies? How heavy are they on the pod running them, and what complexity comes with using 2, 3, or even 4 sidecar proxies on top of your service container?
I recommend that you define what you really need and continue your research based on that, since this topic is too broad and doesn't have one correct answer.
Due to this, I decided to post a community wiki answer. Feel free to expand it.
There can be various reasons for running some containers in one pod. According to Kubernetes documentation:
A Pod can encapsulate an application composed of multiple co-located
containers that are tightly coupled and need to share resources. These
co-located containers form a single cohesive unit of service—for
example, one container serving data stored in a shared volume to the
public, while a separate sidecar container refreshes or updates
those files. The Pod wraps these containers, storage resources, and an
ephemeral network identity together as a single unit.
In its simplest form, a sidecar container can be used to add functionality to a primary application that might otherwise be difficult to improve.
Advantages of using sidecar containers
sidecar container is independent from its primary application in terms of runtime environment and programming language;
no significant latency during communication between primary application and sidecar container;
the sidecar pattern entails designing modular containers. The modular container can be plugged in more than one place with minimal modifications, since you don't need to write configuration code inside each application.
Notes regarding usage of sidecar containers
consider making a small sidecar container that doesn't consume many resources. The strong point of sidecar containers lies in their ability to be small and pluggable. If sidecar container logic is getting more complex and/or becoming more tightly coupled with the main application container, it may be better to integrate it with the main application's code instead.
to ensure that any number of sidecar containers can work successfully with the main application, it's necessary to sum up the resource requests/limits of all containers when sizing the pod, because all the containers run in parallel. The pod as a whole works only if both types of containers are running successfully; most of the time, sidecar containers are simple and small and consume fewer resources than the main container.
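The "sum up the resources" point can be sketched in a few lines. The scheduler has to find a single node with room for the combined requests of all containers that run side by side, so every sidecar adds to the pod's footprint. The container names and numbers below are hypothetical, and init containers (which are accounted for differently) are left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Container:
    name: str
    cpu_millicores: int  # e.g. 500 means half a CPU core
    memory_mib: int

def pod_requests(containers):
    """Total (cpu_millicores, memory_mib) requested by containers that run in parallel."""
    cpu = sum(c.cpu_millicores for c in containers)
    memory = sum(c.memory_mib for c in containers)
    return cpu, memory

def fits_on_node(containers, node_cpu_millicores, node_memory_mib):
    """A pod is scheduled as one unit, so the whole sum must fit on one node."""
    cpu, memory = pod_requests(containers)
    return cpu <= node_cpu_millicores and memory <= node_memory_mib

# Hypothetical pod: one main service plus three small sidecars.
example_pod = [
    Container("service", 500, 512),
    Container("auth-proxy", 100, 64),
    Container("sql-proxy", 100, 64),
    Container("mesh-proxy", 100, 64),
]

if __name__ == "__main__":
    print(pod_requests(example_pod))
    print(fits_on_node(example_pod, node_cpu_millicores=1000, node_memory_mib=1024))
```

Three sidecars that look negligible on their own still raise this pod's CPU request from 500m to 800m, which is the kind of overhead worth measuring before adding a fourth.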
I want to deploy multiple guest executables in one node, but I'm not sure how it works behind the scenes. How are the VM's resources divided between the executables? Is it done in an efficient way? Do I need to configure something to get all the executables well packed in the VM to save memory? How can I know how many executables can be run on the same VM?
In order to understand how Service Fabric allocates services inside nodes you need to understand its hosting model.
Service Fabric hosting model is very powerful when using the Service Fabric programming model (Stateless and Stateful services and Actors), but more limited with Guest Executables and Containers, although it's still efficient.
Basically, Service Fabric will activate a ServicePackage (a process) for each Guest Executable (service). Depending on the configured number of instances for each service, Service Fabric will either run one instance of each service per node (if InstanceCount = -1) or it will distribute the instances across all the nodes (if InstanceCount is a number lower than the number of nodes).
There is no limitation on the number of services you can run on a single node, but each service will consume resources (CPU, RAM, ports, etc) and it can become a problem. In that case, you have several options:
Increase the size of the VM for that node type
Scale out the node type (add more nodes) and specify a lower number of instances per service (so that not every node has all services)
Create more node types and organize the services accordingly using Placement Constraints (for example, you could have High Compute nodes, High Ram nodes, Public Nodes, Backend Nodes, Financial Nodes, Analytics Nodes... it depends on what makes sense for your scenario).
How can I know how many executables can be run on the same VM?
As I mentioned before, this does not depend on the quantity, but on the resources used by these executables and the size of the VM. Depending on your services, you might be able to estimate the resources they need, but you'll definitely need to test and monitor your cluster because no amount of calculation beats reality.
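For a first back-of-the-envelope estimate before testing, something like the following can help. It is only arithmetic, not anything Service Fabric does for you: the function name, the headroom parameter and all the numbers are assumptions, and real usage will vary under load.

```python
import math

def max_instances(vm_capacity, per_instance_usage, headroom=0.25):
    """Estimate how many identical executables fit on one VM.

    vm_capacity and per_instance_usage map a resource name to an amount in the
    same unit. headroom is the fraction of each resource kept free for the OS
    and the Service Fabric runtime (an assumed figure, tune it from monitoring).
    """
    counts = []
    for resource, usage in per_instance_usage.items():
        if usage <= 0:
            continue  # a resource the executable does not use cannot be the limit
        usable = vm_capacity[resource] * (1 - headroom)
        counts.append(math.floor(usable / usage))
    if not counts:
        raise ValueError("per_instance_usage must contain at least one positive value")
    # The scarcest resource decides how many instances fit.
    return min(counts)

# Hypothetical VM and executable.
vm = {"cpu_cores": 4, "memory_mib": 16384}
exe = {"cpu_cores": 0.25, "memory_mib": 2048}

if __name__ == "__main__":
    print(max_instances(vm, exe))
```

With these made-up figures memory runs out first (6 instances) long before CPU does (12), which also hints at which VM size or node type to pick.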
Update: add interesting links
You can help Service Fabric manage your cluster more efficiently by making your services report dynamic metrics and also by setting limits on the resources a single service can take (to prevent a service from consuming all the memory of a node, for example):
Custom Metrics
Resource Governance
I'm migrating a number of applications from AWS ECS to Azure AKS and being the first production deployment for me in Kubernetes I'd like to ensure that it's set up correctly from the off.
The applications being moved all use resources at varying degrees with some being more memory intensive and others being more CPU intensive, and all running at different scales.
After some research, I'm not sure which would be the best approach out of running a single large cluster and running them all in their own Namespace, or running a single cluster per application with Federation.
I should note that I'll need to monitor resource usage per application for cost management (amongst other things), and communication is needed between most of the applications.
I'm able to set up both layouts and I'm sure both would work, but I'm not sure of the pros and cons of each approach, whether I should be avoiding one altogether, or whether I should be considering other options?
Because you are at the beginning of your Kubernetes journey, I would go with separate clusters for each stage you have (or at least separate dev and prod). You can very easily take your cluster down (I did it several times through resource starvation). Also, if the network policies are not set correctly, you might find that services from different stages/namespaces (like test and sandbox) communicate with each other, or that pipelines meant to deploy to dev change something in another namespace.
Why risk production being affected by dev work?
Even if you don't have to upgrade the control plane yourself, AKS still has its versions and flags, and it is better to test them on a separate cluster before moving to production.
So my initial decision would be to set some hard boundaries: different clusters. Later, once you have more experience with AKS and Kubernetes, you can review your decision.
Since you said that communication is needed among the applications, I suggest you go with one cluster. Application isolation can be achieved by deploying each application in a separate namespace. You can collect metrics at the namespace level and set resource quotas at the namespace level. That way you can take action at the application level.
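A minimal sketch of the namespace-per-application idea: one Namespace plus one ResourceQuota per application, generated as JSON for kubectl. The application names and quota sizes are invented for illustration; real values would come from what each application used on ECS.

```python
import json

def namespace_with_quota(app, cpu, memory):
    """Build a Namespace and a ResourceQuota capping the app's total requests and limits."""
    namespace = {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": app, "labels": {"app": app}},
    }
    quota = {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{app}-quota", "namespace": app},
        "spec": {
            "hard": {
                "requests.cpu": cpu,
                "requests.memory": memory,
                "limits.cpu": cpu,
                "limits.memory": memory,
            }
        },
    }
    return [namespace, quota]

# Hypothetical applications: one CPU-heavy, one memory-heavy.
APPS = {
    "reporting": ("8", "8Gi"),
    "cache-api": ("2", "32Gi"),
}

quota_manifests = []
for app_name, (app_cpu, app_memory) in APPS.items():
    quota_manifests.extend(namespace_with_quota(app_name, app_cpu, app_memory))

if __name__ == "__main__":
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": quota_manifests}, indent=2))
```

Since each application lives in its own namespace, usage metrics grouped by namespace double as per-application cost figures.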
A single cluster (with namespaces and RBAC) is easier to set up and manage. A single k8s cluster does support high load.
If you really want multiple clusters, you could try istio multi-cluster (istio service mesh for multiple cluster) too.
Depends... Be aware AKS still doesn't support multiple node pools (On the short-term roadmap), so you'll need to run those workloads in single pool VM type. Also when thinking about multiple clusters, think about multi-tenancy requirements and the blast radius of a single cluster. I typically see users deploying multiple clusters even though there is some management overhead, but good SCM and configuration management practices can help with this overhead.
I am trying to implement a CI/CD pipeline using Kubernetes and Jenkins. My application has 25 microservices, and I need to deploy it for 5 different clients. The microservice code is the same for every client, but the configuration for each client is different.
So I am configuring a Spring Cloud Config server with 5 different profiles/configurations. When I build the Docker images, I define which config server profile is active by adding the active profile to the Dockerfile. So from 25 microservices I am building 25 * 5 Docker images and deploying them, which means a total of 125 microservices to deploy in the Kubernetes cluster. These microservices are called from my Angular 2 front-end application.
Considering the performance and response speed of the application, is a single master enough for this architecture? Or do I definitely need to use a multi-master Kubernetes cluster? How can I manage this application?
I am new to cloud and CI/CD pipeline architecture, so I am confused about how to design the workflow. If a single master is enough, I can continue with my current setup. Otherwise I need to implement a multi-master Kubernetes HA cluster.
The performance and speed of the application do not depend on the number of master nodes. Running multiple masters addresses high availability, not performance. That said, you should still consider having at least 3 masters for the implementation you are working on. If the only master goes down, you can no longer manage the cluster.
In Kubernetes, the master receives the API calls and acts upon them by driving the current state of the cluster toward the desired state. But in the end it is the worker nodes that do the heavy work. So your performance will depend mostly, if not exclusively, on your nodes. If you have enough memory and CPU, you should be fine.
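To illustrate what "acting upon API calls" means, here is a toy reconciliation step in Python. It is not Kubernetes code, just a sketch of the idea: compare the desired replica counts with what is currently running and work out what has to change. The service names and counts are made up.

```python
def reconcile(desired, current):
    """Return the actions needed to move `current` replica counts to `desired`.

    Both arguments map a workload name to a replica count. Each action is a
    tuple (verb, name, how_many).
    """
    actions = []
    for name, want in sorted(desired.items()):
        have = current.get(name, 0)
        if have < want:
            actions.append(("start", name, want - have))
        elif have > want:
            actions.append(("stop", name, have - want))
    # Anything running that nobody asked for gets removed.
    for name in sorted(set(current) - set(desired)):
        if current[name] > 0:
            actions.append(("stop", name, current[name]))
    return actions

if __name__ == "__main__":
    desired_state = {"orders": 3, "billing": 2}
    current_state = {"orders": 1, "billing": 2, "legacy": 1}
    for action in reconcile(desired_state, current_state):
        print(action)
```

The master only makes decisions like these; the containers it starts run on the nodes, which is why request throughput depends on node capacity rather than on the number of masters.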
Multi master sounds like a good idea for HA.
You could also look at using Helm, which lets you configure microservices on a per-installation basis so that you don't have to keep re-releasing Docker images each time you configure a new environment. You can then inject the Helm configuration into, say, a ConfigMap that mounts the content as an application.yml so that Spring Boot automatically loads the settings.
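The shape of that setup can be sketched without Helm itself. The Python below stands in for what a chart's templates would render: one image per microservice, one ConfigMap per client, and a pod spec that mounts the client's ConfigMap as application.yml. The client and service names, the image naming scheme and the /config mount path are assumptions; pointing Spring Boot at the directory through SPRING_CONFIG_ADDITIONAL_LOCATION is one possible way to load it and depends on your Spring Boot version.

```python
def client_configmap(client, application_yml):
    """One ConfigMap per client, holding that client's application.yml."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": f"{client}-config"},
        "data": {"application.yml": application_yml},
    }

def pod_spec(service, client):
    """Pod spec that runs the shared image with the client's settings mounted."""
    return {
        "containers": [
            {
                "name": service,
                # Same image for every client; only the mounted config differs.
                "image": f"example/{service}:1.0",
                "env": [
                    {
                        "name": "SPRING_CONFIG_ADDITIONAL_LOCATION",
                        "value": "file:/config/",
                    }
                ],
                "volumeMounts": [{"name": "config", "mountPath": "/config"}],
            }
        ],
        "volumes": [{"name": "config", "configMap": {"name": f"{client}-config"}}],
    }

# Hypothetical names matching the numbers in the question.
SERVICES = [f"service-{n:02d}" for n in range(1, 26)]
CLIENTS = [f"client-{c}" for c in "abcde"]

configmaps = [client_configmap(c, f"client:\n  name: {c}\n") for c in CLIENTS]
pod_specs = {(s, c): pod_spec(s, c) for s in SERVICES for c in CLIENTS}
images = {spec["containers"][0]["image"] for spec in pod_specs.values()}

if __name__ == "__main__":
    print(len(pod_specs), "deployed services,", len(images), "images,", len(configmaps), "ConfigMaps")
```

There are still 125 running services, but only 25 images to build, and adding a sixth client means adding one more set of configuration values rather than 25 more image builds.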