What are the major differences between Native Kubernetes and Kubernetes deployments?
I'm new to Kubernetes and I'm trying to understand how the Flink deployments differ between them.
Any insight into the internals would be of great help.
In a Kubernetes session or per-job deployment, Flink has no idea it's running on Kubernetes. In this mode, Flink behaves as it does in any standalone deployment (where there is no cluster framework available to do resource management). Kubernetes just happens to be how the infrastructure was created, but as far as Flink is concerned, it could have been bare metal. You will have to arrange for Kubernetes to create the infrastructure that you have configured Flink to expect.
In a Native Kubernetes session deployment, Flink uses its KubernetesResourceManager, which submits a description of the cluster it wants to the Kubernetes API server, which creates it. As jobs come and go, and the requirements for TaskManagers (and slots) go up and down, Flink is able to obtain and release resources from Kubernetes as appropriate.
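To make this concrete, here is a minimal sketch of starting a native Kubernetes session cluster and submitting a job to it, assuming a recent Flink distribution and a hypothetical cluster id my-session:

    # Start a session cluster; Flink's KubernetesResourceManager talks to the API server itself.
    ./bin/kubernetes-session.sh -Dkubernetes.cluster-id=my-session

    # Submit a job to that session; TaskManager pods are requested (and later released) on demand.
    ./bin/flink run --target kubernetes-session \
        -Dkubernetes.cluster-id=my-session \
        ./examples/streaming/TopSpeedWindowing.jar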
In Application Mode (blog post) (details) you end up with Flink running as a Kubernetes application, which will automatically create and destroy cluster components as needed for the job(s) in one Flink application.
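A similar sketch for Application Mode, assuming the job jar has been baked into a custom image (the image name and jar path below are placeholders):

    ./bin/flink run-application --target kubernetes-application \
        -Dkubernetes.cluster-id=my-application \
        -Dkubernetes.container.image=<registry>/my-flink-job:latest \
        local:///opt/flink/usrlib/my-job.jar

When the application finishes, Flink tears down the cluster components it created.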
Related
Per Flink's doc, we can deploy a standalone Flink cluster on top of Kubernetes, using Flink’s standalone deployment,
or deploy Flink on Kubernetes using native Kubernetes deployments.
The document says
We generally recommend new users to deploy Flink on Kubernetes using native Kubernetes deployments.
Is it because native Kubernetes is easier to get started with, or is it because standalone mode is kind of legacy?
In native Kubernetes mode, Flink is able to dynamically allocate and de-allocate TaskManagers depending on the required resources, whereas in standalone mode TaskManagers have to be provisioned manually.
It sounds to me that native Kubernetes mode is a better choice.
Posted as a community wiki based on other answers: David Anderson's answer and austin_ce's answer. Feel free to expand it.
A good explanation from David Anderson's answer:
Standalone mode:
In a Kubernetes session or per-job deployment, Flink has no idea it's running on Kubernetes. In this mode, Flink behaves as it does in any standalone deployment (where there is no cluster framework available to do resource management). Kubernetes just happens to be how the infrastructure was created, but as far as Flink is concerned, it could have been bare metal. You will have to arrange for Kubernetes to create the infrastructure that you have configured Flink to expect.
Native mode:
Session deployment
In a Native Kubernetes session deployment, Flink uses its KubernetesResourceManager, which submits a description of the cluster it wants to the Kubernetes API server, which creates it. As jobs come and go, and the requirements for TaskManagers (and slots) go up and down, Flink is able to obtain and release resources from Kubernetes as appropriate.
Application mode
In Application Mode (blog post) (details) you end up with Flink running as a Kubernetes application, which will automatically create and destroy cluster components as needed for the job(s) in one Flink application.
The native mode is recommended because it is simply easier to use; I would not say standalone mode is legacy:
The Native mode is the current recommendation for starting out on Kubernetes as it is the simplest option, like you noted. In Flink 1.13 (to be released in the coming weeks), there is added support for specifying Pod templates. One of the drawbacks to this approach is its limited ability to integrate with CI/CD.
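For reference, a pod template in native mode might look roughly like the sketch below (the file is referenced via the kubernetes.pod-template-file option; the env var is a placeholder, and the main container has to be named flink-main-container so Flink can overlay its own settings onto it):

    # pod-template.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: pod-template
    spec:
      containers:
        # Flink merges its generated container spec into the container with this exact name.
        - name: flink-main-container
          env:
            - name: MY_CUSTOM_ENV   # placeholder
              value: "example"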
For my microservice based application, I am designing a component which is as follows:
The task that we want to execute is periodic in nature, so I planned to make use of Kubernetes CronJobs. It executes the job every hour. This works perfectly fine.
In a few scenarios, I want to execute this task on demand (instead of waiting for the next hourly window). For example, if the next job time is 2:00 pm, I want to execute it early, say at 1:20 pm.
There is a related question - How can I trigger a Kubernetes Scheduled Job manually?
But I am not looking for a manual way of achieving it or for explicitly calling kubectl commands. Is there a way to do it automatically, based on events/queues?
Our application is deployed on AWS EKS and Azure AKS. Can I integrate the k8s clusters to read from some queues/pub-subs (e.g. AWS SQS, AWS SNS) and trigger it dynamically?
Your help would be immensely appreciated!
If your application is running on Kubernetes and you don't want to migrate it to a serverless function but would rather keep everything inside the Kubernetes cluster, you can use Knative.
Scale to Zero With Knative
Knative is a serverless platform that is built on top of Kubernetes. It provides higher-level abstractions for common application use cases.
One key feature is its ability to run generic (micro)service-based applications as serverless with the help of built-in scale-to-zero support. Knative has introduced its own autoscaler, the Knative Pod Autoscaler (KPA), which supports scale to zero for any service that uses non-CPU-based scaling metrics.
Updating your microservice to run with Knative involves only minor changes, and you can keep running it on Kubernetes.
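As an illustration only, a minimal Knative Service manifest might look like this (the name and image are placeholders); with the default Knative Pod Autoscaler it scales down to zero pods when no requests arrive and back up when traffic returns:

    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: my-microservice                                        # placeholder name
    spec:
      template:
        spec:
          containers:
            - image: registry.example.com/my-microservice:latest   # placeholder image
              ports:
                - containerPort: 8080                              # port your service listens on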
We are running Flink jobs on Kubernetes in Application mode. The problem is that when the job is completed/stopped, the JobManager container exits, but the TaskManager deployment, the JobManager service, and the ConfigMaps are still there unless we run kubectl delete to clean them up.
This is not a big deal if we stop the job manually, but if our Flink job is a batch job that completes on its own at some later time, it means we need an external service to keep monitoring the JobManager container and clean up the remaining resources when it's done, which is not very practical.
I wonder what the best practice is here. Is running Flink batch jobs on Kubernetes supported? If so, there should be a way for the Flink job itself to clean up everything when it completes, right?
I assume that you are running a standalone Flink application on Kubernetes. In that mode, Flink is not aware of the Kubernetes cluster, so users have to leverage some external tools (e.g. kubectl, a Kubernetes operator) to manage the lifecycle of Flink clusters. This means that you need to delete the TaskManager deployment, ConfigMaps, and Services manually.
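For example, the cleanup typically boils down to something like the following, where the resource names are placeholders matching a typical standalone setup:

    kubectl delete deployment flink-taskmanager
    kubectl delete service flink-jobmanager flink-jobmanager-rest
    kubectl delete configmap flink-config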
I think this situation could be improved in the following two ways:
Set the owner reference of the TaskManager deployment, ConfigMaps, and Services to the JobManager job (a sketch follows below). However, you still need to delete the Kubernetes Job manually after the application finishes.
Try the native Kubernetes integration. Flink has an embedded Kubernetes client and can delete the resources automatically when the application finishes.
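For the first suggestion, a sketch of only the relevant TaskManager deployment metadata might look like this (all names are placeholders, and the uid must be copied from the actual JobManager Job so that Kubernetes garbage collection removes the deployment when that Job is deleted):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: flink-taskmanager            # placeholder
      ownerReferences:
        - apiVersion: batch/v1
          kind: Job
          name: flink-jobmanager         # placeholder: the JobManager Job's name
          uid: <jobmanager-job-uid>      # must match that Job's metadata.uid
    spec:
      # TaskManager pod template unchanged from the standalone setup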
I'm trying to deploy a highly available Flink cluster on Kubernetes. In the examples below, the worker nodes are replicated, but we have only one master pod.
https://github.com/apache/flink-statefun
As far as I understand, there are 2 approaches to making the JobManager highly available:
https://ci.apache.org/projects/flink/flink-docs-stable/ops/jobmanager_high_availability.html
https://medium.com/hepsiburadatech/high-available-flink-cluster-on-kubernetes-setup-73b2baf9200e
In the first example we deploy another JobManager to switch to in case of failure.
In the second example Kubernetes redeploys the JobManager pod in case of failure.
So I have a few questions:
For both examples, what happens to the running jobs when the active JobManager fails?
Can the first scenario be applied on Kubernetes?
In the second scenario, in case of JobManager failure, the Flink UI will be unavailable until the pod recovers, but in the first scenario it will stay available; am I right?
What are the pros/cons of the two scenarios?
There is really only one approach to making the JobManager HA: both of your links use JobManager HA backed by a ZooKeeper cluster to build an active/standby architecture for the JobManager.
When the JobManager fails there is a failover, as described in the Apache Flink documentation (your first link), and a standby JobManager becomes the active one.
Of course; Kubernetes is just the deployment target for the whole Flink cluster, and you can still use the HA cluster mode with ZooKeeper.
No, both will perform the failover and a standby JobManager will become active.
Keep in mind that Kubernetes is only the cluster Flink is deployed on. Just as you can deploy Flink on physical/virtual servers, you can deploy it on Kubernetes, but things like high availability stay the same.
EDIT:
You can run 2 or more JobManager pods in Kubernetes, and then it will be equivalent to the first solution.
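For completeness, the ZooKeeper-based HA setup from the first link boils down to a few flink-conf.yaml entries along these lines (hosts and paths are placeholders):

    high-availability: zookeeper
    high-availability.zookeeper.quorum: zk-0:2181,zk-1:2181,zk-2:2181   # placeholder ZK quorum
    high-availability.storageDir: s3://my-bucket/flink/ha               # placeholder durable storage for JM metadata
    high-availability.cluster-id: /my-flink-cluster                     # isolates this cluster's state in ZooKeeper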
I installed and configured a 3-node K8s cluster. The worker nodes are Windows nodes. We have one .NET application that we want to containerize. This application internally uses Apache Ignite for its distributed cache.
We built a Docker image for this application, wrote a deployment file, and deployed it to the K8s cluster. The deployment also creates a Service of type LoadBalancer. Using this Service we connect to the application from the outside world. All is good so far.
Coming to the issue: since we are using Apache Ignite for the distributed cache, one of the pods will be the master. We want to always forward the traffic to the pod that is acting as the master node in the Apache Ignite cluster, and the Apache Ignite master node identification must be dynamic.
I have gone through the link below. There, the pod configuration is static. We want to dynamically identify the master pod and forward the traffic to it. What do we have to do on the Service side?
https://appscode.com/products/voyager/7.4.0/guides/ingress/http/statefulset-pod/
Any help on how to forward the traffic to the pod is greatly appreciated.
Given that you have a leader/follower topology, the ask to direct traffic to a particular node (the master node) is flawed for a couple of reasons:
What happens when the current leader fails over and there is a new election to select a new leader?
Pods are ephemeral, so they should not have major roles to play in production; instead, work with Deployments and their replicas. What you are trying to achieve is an anti-pattern.
In any case, if this is what you want, you may want to read about gateways in Istio, which can be found here.