Flink and k8s deployment - kubernetes

Per Flink's doc, we can deploy a standalone Flink cluster on top of Kubernetes, using Flink’s standalone deployment,
or deploy Flink on Kubernetes using native Kubernetes deployments.
The document says
We generally recommend new users to deploy Flink on Kubernetes using native Kubernetes deployments.
Is it because native Kubernetes is easier to get start with, or is it because standalone mode is kind of legacy ?
In native Kubernetes mode, Flink is able to dynamically allocate and de-allocate TaskManagers depending on the required resources. While in standalone mode, task managers have to be provisioned manually.
It sounds to me that native Kubernetes mode is a better choice.

Posted community wiki based on other answers - David Anderson answer and austin_ce answer. Feel free to expand it.
Good explanation from the David Anderson answer:
Standalone mode:
In a Kubernetes session or per-job deployment, Flink has no idea it's running on Kubernetes. In this mode, Flink behaves as it does in any standalone deployment (where there is no cluster framework available to do resource management). Kubernetes just happens to be how the infrastructure was created, but as far as Flink is concerned, it could have been bare metal. You will have to arrange for kubernetes to create the infrastructure that you will have configured Flink to expect.
Native mode:
Session deployment
In a Native Kubernetes session deployment, Flink uses its KubernetesResourceManager, which submits a description of the cluster it wants to the Kubernetes ApiServer, which creates it. As jobs come and go, and the requirements for task managers (and slots) go up and down, Flink is able to obtain and release resources from kubernetes as appropriate.
Application mode
In Application Mode (blog post) (details) you end up with Flink running as a kubernetes application, which will automatically create and destroy cluster components as needed for the job(s) in one Flink application.
The native mode is recommended because it is just simpler, I would not say it is legacy:
The Native mode is the current recommendation for starting out on Kubernetes as it is the simplest option, like you noted. In Flink 1.13 (to be released in the coming weeks), there is added support for specifying Pod templates. One of the drawbacks to this approach is its limited ability to integrate with CI/CD.

Related

Can we spin off a kubernetes cronjob automatically and dynamically? How can we do it in AWS EKS, Azure AKS based on queues or notifications?

For my microservice based application, I am designing a component which is as follows:
Task that we want to execute is of periodic nature. For it, i planned to make use of the Kubernetes cron-jobs. It executes the job every 1 hour. This works perfectly fine.
In few scenarios, i want to execute this task on-demand (in stead of waiting for next hour window). For example, if next job time is 2:00pm, i want to execute it early, say 1:20pm.
There is a related question - How can I trigger a Kubernetes Scheduled Job manually?
But I am not looking for a manual way of achieving it or explicitly calling kubectl
commands. Is there a way do it automatically, based on events/queues?
Our application is deployed on AWS EKS and Azure AKS. Can I integrate the k8 clusters to read onto some queues/pub-subs (ex. aws-sqs, aws-sns) and do it dynamically?
Your help would be immensely appreciated!
If you application is running on Kubernetes and don't want to get migrated to serverless function and keep everything inside the Kubernetes cluster you can use the Knative.
Scale to Zero With Knative
Knative is a serverless platform that is built on top of Kubernetes. It provides higher-level abstractions for common application use cases.
One key feature is its ability to run generic (micro) service-based applications as serverless with the help of built-in scale to zero support. Knative has introduced its own autoscaler, Knative Pod Autoscaler (KPA), that supports scale to zero for any service that uses non-CPU-based scaling matrics.
update your micro service to running with Knative minor change will be there and you can run it on Kubernetes.

Cassandra health check in phantom

In our containerised scala application , we are using phantom library for persisting and retrieving data from Cassandra. We have a requirement to do regular health check on Cassandra.
Presently, on bootstrap of application when there is a deployment in any new kubernetes pod, we check for an active Cassandra session and then later run a scheduled check on Cassandra health.
Appreciate if could share alternatives to do health check on Cassandra.
If you're using the DataStax Cassandra Operator (cass-operator), the health check is already done for you automatically. If a pod goes down, the cass-operator will automatically attempt to recover it for you.
If you haven't already seen it, have a look at open-source K8ssandra. It is a ready-made platform for running Apache Cassandra in Kubernetes using the DataStax Cassandra Operator under the hood but with all the tooling built-in:
Reaper for automated repairs
Medusa for backups and restores
Metrics Collector for monitoring with Prometheus + Grafana
Traefik templates for k8s cluster ingress
Since all these components are open-source, they are all free to use and don't require a licence or paid subscription but still comes with a robust community support. Cheers!

How different is the Flink deployment on Kubernetes and Native Kubernetes

What are the major difference b/w Native Kubernetes and Kubernetes deployments?
I'm new to Kubernetes and trying to understand how different is the Flink deployments on them.
If any insight into internals is given it will be of great help.
In a Kubernetes session or per-job deployment, Flink has no idea it's running on Kubernetes. In this mode, Flink behaves as it does in any standalone deployment (where there is no cluster framework available to do resource management). Kubernetes just happens to be how the infrastructure was created, but as far as Flink is concerned, it could have been bare metal. You will have to arrange for kubernetes to create the infrastructure that you will have configured Flink to expect.
In a Native Kubernetes session deployment, Flink uses its KubernetesResourceManager, which submits a description of the cluster it wants to the Kubernetes ApiServer, which creates it. As jobs come and go, and the requirements for task managers (and slots) go up and down, Flink is able to obtain and release resources from kubernetes as appropriate.
In Application Mode (blog post) (details) you end up with Flink running as a kubernetes application, which will automatically create and destroy cluster components as needed for the job(s) in one Flink application.

Triggering a Kubernetes-based application from AppEngine

I'm currently looking into triggering some 3D rendering from an AppEngine-based service.
The idea is that input data is submitted by an API client to this web service, which then invokes an internal Kubernetes GPU enabled application ("rendering backend") to do the hard work.
GPU-enabled clusters are relatively expensive ($$$), so I really want the cluster to be up and running on demand. I am trying to achieve that by setting the autoscaling minimum to 0 for the rendering backend.
The only pretty way of "triggering" a rendering task on such a cluster I could think of is via Pub/Sub Push. Basically, I need something like Cloud Tasks, but those seem to be aimed at long running tasks executed in AppEngine, not Kubernetes. Plus I like the way Pub/Sub decouples the web service from the rendering backend.
Google's Pub/Sub only allows pushing via HTTPS and only to a validated domain. It appears that Google is forcing me to completely "expose" my internal rendering backend by assigning a domain name to it, which feels ridiculous. I cannot just tell Pub/Sub to invoke http://loadbalancer.IP.address/handle_push.
This is making me doubt my architecture.
How would you go about building something like this on GCP?
From the GKE perspective:
You can have a cluster with a dedicated GPU-based nodepool and schedule your pods there using Taints and tolerations. Additionally, you can control the number of nodes in your nodepool using Autoscaling so that, you can use them only when your pods are to be scheduled/run.
Consider that this requires an additional default-non-GPU-based nodepool, where system pods are being run.
For triggering, as long as your default pool is running, you'd be able to deploy your application and the autoscaling should start automatically. For deploying from an App Engine application, you might want to consider talking to the Kubernetes API directly through a library.
Finally and considering the nature of your current goal (3D rendering), it might be best to use Kubernetes Jobs. With these, you can complete an sporadic computational load, allowing the nodepool to downsize once is finished.
Wrapping up, you can have a minimum cluster with a zero-sized GPU-based nodepool that will autoscale when a tainted job is requested to be run there, and once the workload is finished, it should automatically downscale. These actions can be triggered from GAE, using one of the client libraries.

Terraform, Kubernetes, Mesos etc - how are they connected

Reading a lot on internet but the information is not clear or mixedup so I thought I will ask the question here.
I am trying to understand how Terraform is same or different from container orchestration tools like Kubernetes, Mesos etc.
Can Terraform work independently or Kube and Mesos? How is it connected to docker containers?
Can someone please shed the light?
Thanks!!!
I don't know enough about Mesos as I would like, but I do know about Kubernetes and Terraform. Despite I'm not an expert the general basics between this tools have a different purpose. While Terraform deals with the generation of the infrastructure in the cloud by using their apis, Kubernetes deals with the administration and orchestration of containers in the undergroown infrastructure by using the api of the container daemon such the docker daemon.
So generally talking the Terraform main point is to make transparent the creation of the cloud infrastructure where you write what you want to have, servers, network, security policies, some PaaS Service and Kubernetes is the orchestrator of containers.
Hope this helps you. Please, in the case of someone saws a mistake. Remark it so we all improves.
Terraform - A Tools to Build your Infrastructure an open-source project Hashicorp labs if you are aware with AWS and heard of CloudFormation both work in same manner but Terraform have some better feature you can write your whole Infrastructure as a Code run it in one click and decommissioned it in one click.
For more you must visit the site: https://www.terraform.io
Now Kubernetes (An open source project by Google) and Apache-Mesos( Or DC/OS) An project by Apache foundation both are used for Container orchestration (and I’m purposely avoiding using the word Docker) is not for everyone and does not answer every need.
Mesos was launched first but it was really hard to manage Mesos networking that time. and In 2014 there was the first Release of Kubernetes comes in.
Now, DC/OS (the Distributed Cloud Operating System) is an open-source, distributed operating system based on the Apache Mesos distributed systems kernel.
It's in the race with Kubernetes .
I would suggest you must go through this article to get a better understanding of Kubernetes vs Mesos : https://logz.io/blog/kubernetes-vs-mesos/
And Yes they are not related to Terraform at all.
Thanks