We have a kind of evaluation job which consists of several thousand invocations of a legacy binary with various inputs, each of which runs for about a minute. The individual runs are perfectly parallelizable (one instance per core).
What is the state of the art to do this in a hybrid cloud scenario?
Kubernetes itself does not seem to provide an interface for prioritizing or managing waiting jobs. Jenkins would be good on these points, but feels like a hack. Of course, we could hack something together ourselves, but the problem should be generic enough to already have an out-of-the-box solution.
There are a lot of frameworks that help manage jobs in a Kubernetes cluster. The most popular are:
Argo - for orchestrating parallel jobs on Kubernetes. Workflows are implemented as a Kubernetes CRD (Custom Resource Definition).
Airflow - has a modular architecture and uses a message queue to orchestrate an arbitrary number of workers. Also take a look at its Kubernetes executor (see the sketch below).
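To make the fan-out pattern concrete, here is a minimal, hypothetical sketch of the Airflow approach for the workload in the question (the image name, binary path, and input list are placeholders, and the KubernetesPodOperator import path varies between Airflow versions): the DAG creates one short-lived pod per input, so the cluster runs as many in parallel as it has free cores and queues the rest.

    from datetime import datetime

    from airflow import DAG
    # Import path varies across Airflow versions; this is the providers-package layout.
    from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator

    # Hypothetical inputs; in practice these would come from the evaluation spec.
    INPUTS = [f"case-{i:04d}" for i in range(2000)]

    with DAG(
        dag_id="legacy_binary_eval",
        start_date=datetime(2021, 1, 1),
        schedule_interval=None,  # triggered manually rather than on a schedule
        catchup=False,
    ) as dag:
        for case in INPUTS:
            # One short-lived pod per invocation; Kubernetes packs pods onto free
            # cores and keeps the remainder pending until capacity frees up.
            KubernetesPodOperator(
                task_id=f"run_{case}",
                name=f"run-{case}",
                image="registry.example.com/legacy-binary:latest",  # placeholder
                cmds=["/opt/eval/legacy-binary"],                   # placeholder
                arguments=[case],
            )

Note that Airflow gets heavy with thousands of tasks in a single DAG, so for this many runs you might batch several inputs per pod instead of one each.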
I recommend watching this video, which describes each framework and will help you decide which one is better for you.
You may be interested in the following articles about using Mesos for hybrid cloud:
Xue, Noha; Haugerud, Hårek; Yazidi, Anis (2017). Towards a Hybrid Cloud Platform Using Apache Mesos. pp. 143-148. doi:10.1007/978-3-319-60774-0_12.
Hybrid cloud technology is becoming increasingly popular as it merges private and public clouds to bring the best of two worlds together. However, due to the heterogeneous cloud installation, facilitating a hybrid cloud setup is not simple. Despite the availability of some commercial solutions to build a hybrid cloud, an open source implementation is still unavailable. In this paper, we try to bridge the gap by providing an open source implementation by leveraging the power of Apache Mesos. We build a hybrid cloud on the top of multiple cloud platforms, private and public.
Apache Mesos For All Your Hybrid Cloud Needs
Choosing the Best Approach to Hybrid Cloud
As I dive into the world of Cloud Composer, Airflow, Google Kubernetes Engine, and Kubernetes, I've not yet found a good answer to what exactly makes Cloud Composer better than Helm and GKE.
Here are some things I've found that could be unique to Composer but mostly seem like they could be handled by GKE.
On their homepage:
End-to-end integration with Google Cloud products including BigQuery, Dataflow, Dataproc, Datastore, Cloud Storage, Pub/Sub, and AI Platform gives users the freedom to fully orchestrate their pipeline.
On the features page:
Identity-Aware Proxy protects the interface
Cloud Composer associates a Cloud Storage bucket with the environment. The associated bucket stores the DAGs, logs, custom plugins, and data for the environment.
The downsides of Composer I've seen include:
It takes many hours to spin up a new instance
It doesn't support Kubernetes Executor
It is risky to change the underlying GKE config because it could be reverted by a Composer update
Errors often occur during auto-scaling, but they are documented as known issues
Upgrading environments is still in beta
To be clear, I'm not saying Cloud Composer is bad. I'm just having trouble seeing why people like it. When I've asked folks why it is better than Helm + GKE, they haven't had any compelling answers, even though they can tell many stories of Composer being unpredictable and having lots of issues.
Are you comparing the same things?
On one side, GKE, you have a container orchestrator. Declare what you want, and it will deploy and maintain the stability of the cluster according to the declared configuration. This configuration can be packaged with Helm to make it easier to write. Because you deploy containers, you can use whatever language you want in your services.
On the other side, you have a workflow manager with a scheduler, retry policies, parallel tasks, and context forwarding. You write DAGs in Python (only!) and you have operators to interact with external products/services. It's mainly designed for data processing and is used a lot by data science and data engineering teams.
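For illustration, here is a minimal DAG sketch (hypothetical task names; the PythonOperator import path differs between Airflow 1.x and 2.x) showing the scheduler, retry policy, and context forwarding just mentioned:

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.python import PythonOperator  # Airflow 2.x path

    def extract(**context):
        # "Context forwarding": push a value for downstream tasks via XCom.
        context["ti"].xcom_push(key="rows", value=42)

    def load(**context):
        rows = context["ti"].xcom_pull(task_ids="extract", key="rows")
        print(f"loaded {rows} rows")

    with DAG(
        dag_id="example_pipeline",
        start_date=datetime(2021, 1, 1),
        schedule_interval="@daily",  # the scheduler triggers a run every day
        catchup=False,
        default_args={
            "retries": 3,                        # per-task retry policy
            "retry_delay": timedelta(minutes=5),
        },
    ) as dag:
        t_extract = PythonOperator(task_id="extract", python_callable=extract)
        t_load = PythonOperator(task_id="load", python_callable=load)
        t_extract >> t_load  # dependency; independent tasks run in parallel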
Note: Cloud Composer is deployed on top of GKE (scheduler and workers), Redis, App Engine, and Cloud SQL.
You are comparing two different worlds: the Ops world (GKE/Helm) and the App/Data world (Composer/Airflow). Have a look at this new video.
Update 1:
My bad, I misunderstood! Anyway, personally I don't want to manage things myself: a cluster, K8s updates, VM patching, replicas, snapshots, backup/restore, ...
If someone can do this for me, I'd prefer that, and managed services are perfect for me!
Do you ask yourself this question about Cloud SQL versus a database you manage yourself on a Compute Engine instance? If not (because Cloud SQL solves a lot of boring issues), my opinion is the same for Composer.
But it's just an opinion; I haven't tested both and compared performance, cost, and ease of use.
Hoping that there is some good insight into how to handle orchestration among microservices in a smaller company's on-prem environment. Currently, we are looking to convert our systems from monolithic to microservices, like the rest of the world :).
The problem I'm having as an architect is justifying the big learning curve and server requirements given the resources we have at the moment. I can easily see us having 50-ish microservices, which I feel is right on the line of whether to use Kubernetes or not.
The thing is, if we don't use it, how do we monitor everything on-prem? We do use Azure DevOps, so I'm wondering if that would suffice for the deployment parts.
Thanks!
This comes down to a debate over essential vs. accidental complexity. The verdict from industry is that k8s strikes a good balance; Swarm and the other orchestrators are barely talked about anymore.
https://www.reactiveops.com/blog/is-kubernetes-overkill
The platforms that build on Kubernetes to offer a simpler interface for those wanting a higher level of abstraction are still emerging and aren't mature yet. GKE offers a very easy way to just deal with workloads; AKS is still maturing, so you will likely face some bugs, but it is tightly integrated with Azure DevOps.
Microsoft is all-in on k8s, although their on-prem offering doesn't seem fully fledged yet. GKE On-Prem and OpenShift 4.1 offer fully managed on-prem clusters (if using vSphere) for a list price of $1200/core/year. https://nedinthecloud.com/2019/02/19/azure-stack-kubernetes-cluster-is-not-aks/
Other ways of deploying on-prem are emerging, so long as you're comfortable with managing the compute, storage, and network yourself. Installing and upgrading are becoming easier (see e.g. https://github.com/kubermatic/kubeone, which builds on the cluster-api abstraction). For bare metal, ambitious projects like Talos are building k8s-specific immutable OSes (https://github.com/talos-systems/talos).
AWS is still holding out hope for lock-in with ECS and Fargate, but it remains to be seen whether that will succeed.
I'm fairly new to Elixir and container orchestration technologies like Kubernetes. I know that Elixir runs on the BEAM, which I've heard makes it possible to easily run programs on a network of computers. From what I understand of Kubernetes, Kubernetes manages a swarm of containers in a virtual network.
My question: would it be a good idea to use Kubernetes to manage a scalable swarm of containers running an Elixir program? If so, are there any pre-configured Kubernetes setups that make this easy?
We just moved to Kubernetes for container management at my startup. I have a couple of opinions on this:
If the goal is simply to distribute an Elixir application over a "swarm" of nodes and/or machines, Kubernetes adds quite a bit of complexity and plumbing toward that goal. You end up with not just the learning curve of BEAM distribution, but that of the Kubernetes ecosystem itself.
However, Kubernetes and Elixir applications can work really well together, especially if your application deployment is more complex than a pure Elixir project. For example, we have custom services written in JavaScript, Python, Elixir, and Golang in our application deployment (not to mention a dozen random tools).
Kubernetes helps extend the ability to distribute and manage nodes for all of these services with one strategy.
To answer your question: it can be a good idea, but I'd recommend first learning to manage the distributed Elixir program by itself. It's not quite as simple as everyone makes it out to be, although it's better than the equivalent for most other languages, IMO. Here's a good guide to getting started with distributed Elixir/Erlang: http://engineering.pivotal.io/post/how-to-set-up-an-elixir-cluster-on-amazon-ec2/.
You can isolate that learning curve, and once you feel comfortable with it, move on to Kubernetes. Here's a really good resource on getting started with clustering via Kubernetes: http://bitwalker.org/posts/2016-08-04-clustering-in-kubernetes/
Oops, one final comment: the advantages of tackling both learning curves are that you end up with the power of distributed Elixir, the rest of your application deployment can keep up, and the Kubernetes ecosystem enables a lot of very cool tech like auto-scaling, helping avoid lock-in to a cloud provider, etc.
Our team has used Elixir-built binaries with Docker/Consul for three years.
When each service runs on fewer than three servers, just using containers to manage them is fine.
Otherwise, it is painful to scale your services (servers are expensive and you cannot easily gauge the load on your services).
I experienced this and tried using Swarm. It works like a charm! You can easily scale your services and manage servers.
So, when you are dealing with many servers, it is a good idea to use Kubernetes to manage a scalable swarm of containers running an Elixir program.
What does Apache Mesos do that Kubernetes can't do or vice-versa?
Mesos is a two-level scheduler. Sure, it grabs resource information from every machine and offers it to the top-level scheduler so that frameworks like Kubernetes can use it to schedule containers across machines, but Kubernetes can itself schedule containers across machines (no need for Mesos in this regard). So what are a few things that Apache Mesos can do that Kubernetes cannot, or vice versa?
Both Mesos and Kubernetes are container orchestrators. This means you can achieve the same features with either, but some kinds of tasks can be done more easily (read: better) on one of them. In fact, you can run Kubernetes on Mesos and vice versa.
Let's go through the main differences that give some clues when you need to make a decision:
Architecture
As you pointed out, Mesos is a two-level scheduler, and this is the main architectural difference. It gives you the ability to create your own custom scheduler (aka framework) to run your tasks. What's more, you can have more than one scheduler. All your schedulers compete for resources that are fairly distributed using the Dominant Resource Fairness algorithm (which can be replaced with a custom allocator). You can also assign roles to frameworks and tasks, and assign weights to these roles to prioritize some schedulers. Roles are tightly connected with resources. The above features give you the ability to create your own way of scheduling for different applications (e.g., Fenzo) with different heuristics based on the type of tasks you want to run. For example, when running batch tasks it's good to place them near the data, and time-to-start is not so important. On the other hand, stateless services are independent of particular nodes and it's more critical to start them ASAP.
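To make Dominant Resource Fairness more concrete, here is a toy Python model (not real Mesos code; the capacities and demands are borrowed from the example in the original DRF paper): each framework's dominant share is its largest fraction of any single resource, and the allocator keeps offering resources to the framework with the smallest dominant share.

    # Cluster capacity and per-task demands (numbers from the DRF paper's example).
    TOTAL = {"cpus": 9.0, "mem_gb": 18.0}
    DEMANDS = {
        "framework_a": {"cpus": 1.0, "mem_gb": 4.0},  # memory-dominant tasks
        "framework_b": {"cpus": 3.0, "mem_gb": 1.0},  # CPU-dominant tasks
    }
    allocated = {f: {r: 0.0 for r in TOTAL} for f in DEMANDS}

    def dominant_share(f):
        # A framework's dominant share is its largest fraction of any one resource.
        return max(allocated[f][r] / TOTAL[r] for r in TOTAL)

    def fits(demand):
        used = {r: sum(a[r] for a in allocated.values()) for r in TOTAL}
        return all(used[r] + demand[r] <= TOTAL[r] for r in TOTAL)

    # Keep offering resources to the framework with the lowest dominant share.
    while True:
        for f in sorted(DEMANDS, key=dominant_share):
            if fits(DEMANDS[f]):
                for r in TOTAL:
                    allocated[f][r] += DEMANDS[f][r]
                break
        else:
            break  # no framework's next task fits; the cluster is saturated

    for f in DEMANDS:
        print(f, allocated[f], f"dominant share = {dominant_share(f):.2f}")
    # Both frameworks end up with equal dominant shares (2/3) - the DRF property.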
Kubernetes has a single-level scheduler architecture. That means decisions about where a pod will run are made in a single component. There is no such thing as a resource offer. On the other hand, almost everything there is pluggable and built with a layered design.
Origin
Mesos was created to support scale at Twitter (it originated at Berkeley, but the first production usage was at Twitter).
In March 2010, about a year into the Mesos project, Hindman and his Berkeley colleagues gave a talk at Twitter. At first, he was disappointed. Only about eight people showed up. But then Twitter's chief scientist told him that eight people was a lot – about ten percent of the company's entire staff. And then, after the talk, three of those people approached him.
Soon, Hindman was consulting at Twitter, working hand-in-hand with those ex-Google engineers and others to expand the project. Then he joined the company as an intern. And, a year after that, he signed on as a full-time employee.
source
Kubernetes was created by Google to bring users to their cloud, promising a no-lock-in experience. This is the same technique Amazon used with the Kindle: you can read any book on it, but using it with Amazon gives you the best experience. The same is true for Google: you can run Kubernetes on any cloud (public or private), but the best tooling, integration, and support you'll get only on Google Cloud.
But Google and Microsoft are different. Microsoft wants to support everything on Azure, while Google wants Kubernetes everywhere. (In a sense, Microsoft is living up to the Borg name, assimilating all orchestrators, more than Google is.) And quite literally, Kubernetes is how Google is playing up to the on-premises cloud crowd giving it differentiation from AWS (which won’t sell its infrastructure as a stack with a license, although it says VMware is its private cloud partner) and Microsoft (which still doesn’t have its Azure Stack private cloud out the door). source
Community
Judging a project simply by its community size can be misleading. It's like saying that PHP is a great language because it has a large community.
The Mesos community is much smaller than the Kubernetes one. That's a fact. Kubernetes has financial support from many big companies including Google, Intel, Mirantis, Red Hat, and more, while Mesos is developed mainly by Mesosphere with some support from Apple and Microsoft. Although Mesos is a mature project, its development is slow but stable. On the other hand, Kubernetes is much younger, but rapidly developed.
Mesos contributors by origin
The Kubernetes Community - Ian Lewis, Developer Advocate, Google
Scale
Mesos was targeted at big customers from the very beginning. It is used at Twitter, Apple, Verizon, Yelp, and Netflix to run hundreds of thousands of containers on thousands of servers.
Kubernetes was started by Google to give developers the Google infrastructure experience (GIFEE, "Google Infrastructure For Everyone Else"). From the beginning, it was built for small scale, up to hundreds of machines. This limit has been raised with every release, but they started small in order to grow big. There is no public data about the biggest Kubernetes installation.
Hype
Due to these scale limits, Kubernetes first became popular among smaller companies (not cloud scale), while Mesos was targeted at enterprise users. Kubernetes is supported by the Cloud Native Computing Foundation, while Mesos is an Apache Software Foundation project. These two foundations have different funding and sponsors. Generally, more money buys you better marketing, and Kubernetes has definitely done that right.
https://g.co/trends/RUuhA
Conclusion
It looks like Kubernetes has already won the container orchestrator war. But if you have custom workloads and really big scale, Mesos could be a good choice.
The main difference is in the community size and the open-source model: DCOS is supported by Mesosphere, which provides its enterprise features only in a commercial product (because Mesosphere isn't a philanthropist), while K8S has a larger community with strong contributions from different companies, resulting in many more integrated enterprise features (multi-tenancy, RBAC, quotas, preemption, gateways...). That means these features are easier to use in K8S, not that they don't exist in DCOS.
I would say, globally, that:
DCOS is more battle-tested for stateful and big-data workloads, but lacks integration with peripheral components, including plug-and-play central monitoring and logging, and enterprise features like a security model, multi-tenancy, auto updates... It was a very hard road to integrate everything into a production-grade platform.
K8S is more battle-tested for stateless apps and provides lots of plug-and-play tools like Prometheus, EFK, Helm... which makes implementing a production-grade platform much easier. Beyond that, there is a big move on stateful workloads with StatefulSets and the operator pattern, which is comparable to Mesos frameworks; but again, K8S provides lots of tools to develop them at a lower cost, because lots of functionality is provided out of the box. It took me 2 months to develop a MongoDB operator providing MongoDB as a service in a multi-tenant and secured way, and I needed to learn Golang at the same time.
source
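As a flavor of the operator pattern mentioned above, here is a minimal sketch using the kopf framework in Python rather than Golang (the mongodbs custom resource under the example.com group is hypothetical, and a real operator would also handle updates, deletion, and reconciliation):

    import kopf

    # React whenever a new "MongoDB" custom resource is created in the cluster.
    @kopf.on.create("example.com", "v1", "mongodbs")
    def create_fn(spec, name, namespace, logger, **kwargs):
        replicas = spec.get("replicas", 3)
        # A real operator would create here the StatefulSet, Services, users,
        # backups, etc. that back the requested MongoDB instance.
        logger.info(f"Provisioning MongoDB {name} in {namespace} with {replicas} replicas")
        return {"phase": "Provisioning"}  # stored under the resource's status

You would run it with kopf run against a cluster where the corresponding CRD is installed.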
https://www.infoworld.com/article/3118345/cloud-computing/why-kubernetes-is-winning-the-container-war.html
https://www.theregister.co.uk/2017/10/17/docker_ee_kubernetes_support
https://www.techrepublic.com/article/these-two-vendors-are-most-likely-to-bring-kubernetes-containers-to-the-enterprise
https://www.cloudhealthtech.com/blog/container-wars-are-over-kubernetes-has-won
https://news.ycombinator.com/item?id=12462261
I see these both in Bluemix, but what is the difference between them?
Cloud Foundry and OpenWhisk are two Bluemix compute models that a developer can use to power an application's workload.
I'll give a very high-level summary of both services and when I would use them...
Cloud Foundry
IBM Bluemix was originally based on Cloud Foundry's open technology. It is a cloud computing platform as a service that supports the full lifecycle, from initial development, through all testing stages, to deployment.
Cloud Foundry has a CLI program called cf, which is the primary tool for interacting with Bluemix (Bluemix also provides a web GUI for this).
Cloud Foundry introduces the concepts of Organizations, which contain Spaces, which you can think of as workspaces. Different spaces typically correspond to different lifecycle stages for an application.
Cloud Foundry introduces the concepts of Services and Applications. A Cloud Foundry service usually performs a particular function (like a database service), and an application usually has services and their keys bound to it.
OpenWhisk
OpenWhisk is a brand-new distributed, event-driven compute model developed by IBM.
It has a distributed, automatically scaling, serverless architecture that executes application logic in response to events.
OpenWhisk also has a CLI program called wsk which can be used to run your code snippets, or actions, on OpenWhisk.
OpenWhisk introduces the concepts of Triggers, Actions, and Rules.
Triggers are a class of events emitted by event sources.
Actions encapsulate the actual code to be executed. They support multiple language bindings, including Node.js, Swift, and arbitrary binary programs encapsulated in Docker containers, and can invoke any part of an open ecosystem, including existing Bluemix services for analytics, data, cognitive, or any other third-party service. (See the example after this list.)
Rules are an association between a trigger and an action.
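To make this concrete, here is roughly the smallest possible OpenWhisk action, written in Python (hello.py is a hypothetical file name): OpenWhisk calls main() with the event payload as a dict, and the returned dict becomes the action's result.

    # hello.py - a minimal OpenWhisk action. The platform invokes main() with
    # the trigger's payload; the returned dict is the result of the invocation.
    def main(args):
        name = args.get("name", "stranger")
        return {"greeting": "Hello, " + name + "!"}

You would register and test it with the wsk CLI (wsk action create hello hello.py, then wsk action invoke hello --result --param name World), and a rule such as wsk rule create myRule myTrigger hello wires it to a trigger.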
Cloud Foundry vs. OpenWhisk
So the question remains: when should you use Cloud Foundry, or when should you use OpenWhisk?
In my limited experience using OpenWhisk, here are my thoughts. I like to think of OpenWhisk as an easily implementable, automatically scaling architecture that application developers can use without needing much prior knowledge of backend development. I think of Cloud Foundry as sitting at a lower level in the software stack, which might give you more customization, but will likely take more skill and knowledge to set up.
I would use Cloud Foundry if I...
Was a backend & application developer.
Had experience creating and connecting services together.
Needed functionality that just might not be possible using OpenWhisk.
I would use OpenWhisk if I...
Was an application developer.
Didn't want to worry about a server.
Didn't want to learn different programming languages, etc. to figure out how to set up my server.
Really wanted to focus on developing my application and have the backend just work.
Hope that helped.
Edit:
Here's a cool image that I found that illustrates this:
Cloud Foundry is a PaaS (Platform-as-a-Service), which means, in a nutshell, that it hosts the platform for your application to run on. Examples of a platform include Node.js or a JVM.
OpenWhisk is a serverless platform. The term FaaS (Function-as-a-service) seems to be emerging as well. You upload code, which is executed once an event happens. That event might be anything, ranging from a simple HTTP request to a change happening in your database.
The fundamental difference between the two is the mode of operation. PaaS means you're still running a server process: a long-running process that listens for events and executes your logic once an event happens. All the rest of the time, the process sits idle, still requiring CPU cycles and memory just to listen for events.
In serverless, the platform takes on the burden of "listening for events". Once an event happens, your code is instantiated and executed. The code is shut down afterwards, thus not requiring any resources anymore. That also explains why OpenWhisk actions have a time limit of 5 minutes: they are not meant to be long-running.
Disclaimer: both platforms support a lot more than I described here; I tried to keep it to the most substantial difference between the two.