Running zookeeper from within application - distributed-computing

I'd like to use zookeeper in one of my applications for distributed configuration management. The application is currently running in distributed environment and having to restart nodes for configuration files changes is a headache.
However, we want the zookeeper process to be started from within the application. The point is to reduced startup dependency and reduce operational cost. We've already have startup/shutdown scripts for the application and we need to reduce impact for operations team.
Has any one done something similar? Is this setup recommended or there are better solutions? Any tip or feedback is appreciated.

I have a blog post that describes how to embed Zookeeper in an application. The Zookeeper developers don't recommend it, though, and I would tend to agree now, though I had the same rationale for embedding it that you do - to reduce the number of moving parts.
You want to keep your ZK cluster stable but you will need to restart your app to do code updates, etc, impacting the ZK cluster stability.
Ultimately you will end up using your ZK cluster for multiple apps and those extra moving parts will be amortized over a number of projects.

Related

local development of microservices, methods and tools to work efficiently

I work with teams members to develop a microservices architecture but I have a problem with the way to work. Indeed, I have too many microservices and when I run them during my development, it consumes too memory even with a good workstation. So I use docker compose to build and execute my MSA but it takes a long time. One often hears about how technically build an MSA but never about the way to work efficiently to build it. How do you do in this case ? How do you work ? Do you use tools or any others to improve and facilitate your developments. I've heard about skaffold but I don't see what the difference is with docker compose or with a simple ci/cd in a cluster env for example. Feel free to give tips and your opinion. Thanks
I've had a fair amount of experience with microservices and local development and here's been some approaches I've seen:
Run all the things locally on docker or k8. If using k8, then a tool like skaffolding can make it easier to run and debug a service locally in the IDE but put it into your local k8 so that it can communicate with other k8 services. It works OK but running more than 4 or 5 full services locally in k8 or docker requires dedicating a substantial amount of CPU and memory.
Build mock versions of all your services. Use those locally and for integration tests. The mock services are intentionally much simpler and therefore easier to run lots of them locally. Obvious downside is that you have to build mock version of every service, and you can easily miss bugs that are caused by mock services not behaving like the real service. Record/replay tools like Hoveryfly can help in building mock services.
Give every developer their own Cloud environment. Run most services in the cloud but use a tool like Telepresence to swap locally running services in and out of the cloud cluster. This eliminates the problem of running too many services on a single machine but can be spendy to maintain separate cloud sandboxes for each developer. You also need a DevOps resource to help developers when their cloud sandbox gets out of whack.
Eliminate unnecessary microservice complexity and consolidate all your services into 1 or 2 monoliths. Enjoy being able to run everything locally as a single service. Accept the fact that a microservice architecture is overkill for most companies. Too many people choose a microservice architecture upfront before their needs demand it. Or they do it out of fear that they will need it in the future. Inevitably this leads to guessing how they should decompose the system into many microservices, and getting the boundaries and contracts wrong, which makes it just as hard or harder to fix in the future compared to a monolith. And they incur the costs of microservices years before they need to. Microservices make everything more costly and painful, from local development to deployment. For companies like Netflix and Amazon, it's necessary. For most of us, it's not.
I prefer option 4 if at all possible. Otherwise option 2 or 3 in that order. Option 1 should be avoided in my opinion but it is probably the option everyone tries first.
In GKE and assuming you have a private cluster. You can utilize port forwarding while hooked up to the GKE environment through the CLI. Create a script that forwards your local ports to the GKE environment. I believe on the services tab in your cluster is where you will find the "port-forwarding" button that will give you the CMD command. This way you can work on one microservice with all of its traffic being routed to the actual DEV cluster. This prevents you from having to run multiple projects at the same time.
I would say create a staging environment which will have all services running. This staging environment will specifically be curated for development. E.g. if it's deployed using k8s then you expose some ports using nodeport service if you need them for your specific microservice. And have a DevOps pipeline to always keep this environment up to date with the code.
This environment should always be built from master branch. If you have single repo for app or repo per service, it's fair assumption that the will always have most recent code when you create your dev/feature branch.
Then when you want to develop a feature or fix a bug you checkout your microservice. And if you are following the microservice pattern appropriately, that single microservice should be an executable and have it's own docker file and should be debuggable from your local IDE. Many enterprises follow this pattern, and enforce at the organization level that the master branch is always production ready and high quality.
Let's say, you discover a bug in some other microservice running in k8s cluster. You will very likely get tempted to find a way to debug that remote microservice. However, that should be written as a bug for the team that owns the microservice. If your team owns it then you fix it and then start working on your feature. If you really think you need to debug multiple microservices, then I think you have real tight coupling between the services or you don't really need the microservice architecture.

What are the benefits of building an Android application with Kubernetes/Containers

I will be building an Android application (not a game) soon. I heard of containerized development and Docker/Kubernetes but I'm not well-versed in its functions and use cases.
Why should I build my Android application with Kubernetes?
Your question can be split up into two parts:
1. Why should I containerize my deployment?
I hope by "deployment", you are referring to the backend services that serve your Android application; not the application itself (not sure how one would do that...). Here is a good article.
Containerization is a powerful abstraction that can help you manage both your code and environment. Setting up a container with the correct dependencies, utilities etc., and securing them is a lot of work, as is the case with any server setup. However, once you have packaged everything into a container, you can deploy said container multiple times and build on-top of it. The value of the grunt work that you have done in the past is therefore carried forward in your future deployments; conversely, so are the bugs... Additionally, you can also leverage the Docker ecosystem and build on various community contributions greatly accelerating your workflows.
A possible unintended advantage is also protection against configuration drift. Whenever services fail or your application crashes, you can simply restart your container, and a fresh version of the service will be created again. However, to support these operations, you need to ensure that your containerized service behaves nicely across restarts and fails gracefully. There are many other caveats and advantages that are not listed here; you can find more discussion on Google.
2. Why should I use Kubernetes for my container orchestration?
If you have many containers (think in the order of 100s), then using a single-node solution like Docker/docker-compose to manage them becomes tedious.
If only there was a tool to manage across multiple nodes, implement service discovery between your nodes, have fault tolerance (ie. automatic restarts, backoff policies), do health-checking of your services, manage storage assets, and conveniently expose your containers to the public. That tool is Kubernetes.
Here is a more in-depth intro.
Hope this helps!

Running an Elixir program in a Kubernetes swarm

I'm fairly new to Elixir and container orchestration technologies like Kubernetes. I know that Elixir runs on the BEAM, which I've heard makes it possible to easily run programs on a network of computers. From what I understand of Kubernetes, Kubernetes manages a swarm of containers in a virtual network.
My question: would it be a good idea to use Kubernetes to manage a scalable swarm of containers running an Elixir program? If so, are there any pre-configured Kubernetes setups that make this easy?
We just moved to kubernetes for container management at my startup. I have a couple opinions on this:
If the goal is simply to distribute an elixir application over a "swarm" of nodes and / or machines, Kubernetes adds quite a bit of complexity and plumbing towards that goal. You end up having not just the learning curve of BEAM distribution, but of the kubernetes ecosystem itself.
However, kubernetes and elixir applications can work really well together, especially if your application deployment is more complex than pure elixir projects. For example we have custom services run from javascript, python, elixir, and golang in our application deployment (not too mention a dozen random tools).
Kubernetes helps extend the ability to distribute and manage nodes for all of these services with one strategy.
To answer your question: it can be a good idea, but I'd recommend first learning to manage the elixir program distributed by itself. Its not quite as simple as everyone makes it out to be, although its better than most programming technologies for other languages IMO. Here's a good guide to getting started with distributed elixir/erlang: http://engineering.pivotal.io/post/how-to-set-up-an-elixir-cluster-on-amazon-ec2/.
You can isolate that learning curve, and once you feel comfortable with it move on to kubernetes. Here's a really good resource on getting started with swarming via kubernetes: http://bitwalker.org/posts/2016-08-04-clustering-in-kubernetes/
Oops, one final comment: the advantages of tackling both learning curves are that you end up with the power of distributed elixir, the rest of your application deployment can keep up, and the kubernetes echosystem enables a lot of very cool tech like auto-scaling, helping unlock you out of a cloud provider, etc.
Our team had used elixir builded binary and docker/consul for three years.
When the servers per service are less than three, just used containers to manage are fine.
Otherwise, it is painful to scale your services(servers are expensive and you can not figure out the pressure of your services).
I had experienced this and tried using Swarm. It works like a charm! You can easily scale your services and manage servers.
So, when you meeting many servers, it is a good idea to use Kubernetes to manage a scalable swarm of containers running an Elixir program.

What does Apache Mesos do that Kubernetes can't do and vice-versa?

What does Apache Mesos do that Kubernetes can't do or vice-versa?
Mesos is a Two level scheduler. Sure it grabs resource information from every machine and gives it to the top level scheduler such that frameworks like kubernetes can use to schedule containers across machines but Kubernetes can itself schedule containers across machines (No need for Mesos from this regard). so what are few things that Apache Mesos can do that Kubernetes cannot do or vice-versa?
Both Mesos and Kubernetes are n-th level containers orchestrators. This means you can achieve the same features but some kind of tasks could be done easier (read. better) on one of them. In fact, you can run Kubernetes on Mesos and vice verse.
Let's go through main differences that give some clue when you need to make a decision:
Architecture
As you pointed out Mesos is a Two-Level Scheduler and this is the main difference in architecture. This gives you the ability to create your custom scheduler (aka framework) to run your tasks. What's more, you can have more than one scheduler. All your schedulers compete for the resources that are fairly distributed using Dominant Resources Fairness algorithm (that could be replaced with custom allocator). You can also assign roles to the frameworks and tasks and assign weights to this roles to prioritize some schedulers. Roles are tightly connected with resources. Above features gives you the ability to create your own way of scheduling for different applications (e.g., Fenzo) with different heuristics based on a type of tasks you want to run. For example, when running batch tasks it's good to place them near data and time to start is not so important. On the other hand, running stateless services is independent of nodes and it's more critical to run them ASAP.
Kubernetes architecture is a single level scheduler. That means decisions where pod will be run are made in a single component. There is no such thing as resource offer. On the other hand, everything there is pluggable and built with a layered design.
Origin
Mesos was created at Twitter (formerly at Berkeley but the first production usage was at Twitter) to support their scale.
In March 2010, about a year into the Mesos project, Hindman and his Berkeley colleagues gave a talk at Twitter. At first, he was disappointed. Only about eight people showed up. But then Twitter's chief scientist told him that eight people was lot – about ten percent of the company's entire staff. And then, after the talk, three of those people approached him.
Soon, Hindman was consulting at Twitter, working hand-in-hand with those ex-Google engineers and others to expand the project. Then he joined the company as an intern. And, a year after that, he signed on as a full-time employee.
source
Kubernetes was created by Google to bring users to their cloud promising no lock-in experience. This is the same technique Amazon did with Kindle. You can read any book on it but using it with Amazon gives you the best experience. The same is true for Google. You can run Kubernetes on any cloud (public or private) but the best tooling, integration and support you'll get only on Google Cloud.
But Google and Microsoft are different. Microsoft wants to support everything on Azure, while Google wants Kubernetes everywhere. (In a sense, Microsoft is living up to the Borg name, assimilating all orchestrators, more than Google is.) And quite literally, Kubernetes is how Google is playing up to the on-premises cloud crowd giving it differentiation from AWS (which won’t sell its infrastructure as a stack with a license, although it says VMware is its private cloud partner) and Microsoft (which still doesn’t have its Azure Stack private cloud out the door). source
Community
Judging a project simply by its community size could be misleading. It's like you'd be saying that php is a great language because it has large community.
Mesos community is much smaller than Kubernetes. That's the fact. Kubernetes has financial support from many big companies including Google, Intel, Mirantis, RedHat and more while Mesos is developed mainly by Mesosphere with some support from Apple, Microsoft. Although Mesos is a mature project, its development is slow but stable. On the other hand, Kubernetes is much younger, but rapidly developed.
Meso contributors origin
The Kubernetes Community - Ian Lewis, Developer Advocate, Google
Scale
Mesos was targeted for big customers from the early beginning. It is used at Twitter, Apple, Verizon, Yelp, Netflix to run hundreds of thousands of containers on thousands of servers.
Kubernetes was started by Google to give developers Google Infrastructure experience (GIFFE). From the beginning, it was prepared for small scale up to hundreds of machines. This constraint is increased with every release but they started small to grow big. There are no public data about biggest Kubernetes installation.
Hype
Due to scale issues, Kuberntetes started to be popular among smaller companies (not cloud scale) while Mesos was targeted for enterprise users. Kubernetes is supported by Cloud Native Foundation while Mesos is Apache Foundation Project. These two foundations have different founding and sponsors. Generally, more money gives you better marketing and Kubernetes definitely did it right.
https://g.co/trends/RUuhA
Conclusion
It looks like Kubernetes already won the containers orchestrator war. But if you have some custom workloads and really big scale, Mesos could be a good choice.
The main difference is in the community size and the open source model : where DCOS is supported by Mesosphere and provide enterprise features in a commercial product only (because mesosphere isn't philanthropist), K8S has a larger community with strong contributions from different companies resulting in providing much more integrated enterprise features (multitenancy, RBAC, quota, preemption, gateways...) meaning they are easier to use, not necessarily they don't exist in DCOS.
I would globally say that :
DCOS is more battle tested for stateful and big data workloads but lacks of integration with other perimetric components including plug and play central monitoring and logging and enterprise features like security model, multi tenancy, auto updates... It was a very hard way to integrate everything for a production grade platform.
K8S is more battle tested for stateless apps and provides lots of plug and play tools like prometheus, EFK, helm... which makes the implementation of a production grade platform much easier. Next to that there is a big move on stateful workloads with statefulsets and the operator pattern which is comparable with mesos frameworks but again, K8S provides lots of tools to develop them with less costs because lots of functionalities are provided out of the box, it takes me 2 months to develop a MongoDB operator to provide MongoDB as a service in a multi tenant and secured way and I needed to learn Golang in the same time.
source
https://www.infoworld.com/article/3118345/cloud-computing/why-kubernetes-is-winning-the-container-war.html
https://www.theregister.co.uk/2017/10/17/docker_ee_kubernetes_support
https://www.techrepublic.com/article/these-two-vendors-are-most-likely-to-bring-kubernetes-containers-to-the-enterprise
https://www.cloudhealthtech.com/blog/container-wars-are-over-kubernetes-has-won
https://news.ycombinator.com/item?id=12462261

Should docker image be bundled with code?

We are building a SaaS application. I don't have (for now - for this app) high demands on availability. It's mostly going to be used in a specific time zone and for business purposes only, so scheduled restarting at 3 in the morning shouldn't be a problem at all.
It is an ASP.NET application running in mono with the fastcgi server. Each customer will have - due to security reasons - his own application deployed. This is going to be done using docker containers, with an Nginx server in the front, to distribute the requests based on URL. The possible ways how to deploy it are for me:
Create a docker image with the fcgi server only and run the code from a mount point
Create a docker image with the fcgi server and the code
pros for 1. would seem
It's easier to update the code, since the docker containers can keep running
Configuration can be bundled with the code
I could easily (if I ever wanted to) add minor changes for specific clients
pros for 2. would seem
everything is in an image, no need to mess around with additional files, just pull it and run it
cons for 1.
a lot of folders for a lot of customers additionally to the running containers
cons for 2.
Configuration can't be in the image (or can it? - should i create specific images per customer with their configuration?) => still additional files for each customer
Updating a container is harder since I need to restart it - but not a big deal, as stated in the beginning
For now - the first year - the number of customers will be low and when the demand is low, any solution is good enough. I'm looking rather at - what is going to work with >100 customers.
Also for future I want to set up CI for this project, so we wouldn't need to update all customers instances manually. Docker images can have automated builds but not sure that will be enough.
My concerns are basically - which solution is less messier, maybe easier to automate?
I couldn't find any best practices with docker which cover a similar scenario.
It's likely that your application's dependencies are going to be dependent on the code, so you'll still have to sometimes rebuild the images and restart the containers (whenever you add a new dependency).
This means you would have two upgrade workflows:
One where you update just the code (when there are no dependency changes)
One where you update the images too, and restart the containers (when there are dependency changes)
This is most likely undesirable, because it's difficult to automate.
So, I would recommend bundling the code on the image.
You should definitely make sure that your application's configuration can be stored somewhere else, though (e.g. on a volume, or accessed through through environment variables).
Ultimately, Docker is a platform to package, deploy and run applications, so packaging the application (i.e. bundling the code on the image) seems to be the better way to use it.