Why is enforcing system-reserved reservations in Kubernetes dangerous? - kubernetes

I'm reading the Kubernetes docs on Reserve Compute Resources for System Daemons, and it says "Be extra careful while enforcing system-reserved reservation since it can lead to critical system services being CPU starved, OOM killed, or unable to fork on the node."
I've seen this warning in a few places, and I'm having a hard time understanding the practical implication.
Can someone give me a scenario in which enforcing system-reserved reservation would lead to system services being starved, etc, that would NOT happen if I did not enforce it?

You probably have at least a few things running on the host nodes outside of Kubernetes' view, such as systemd, some hardware-related daemons, and maybe sshd. Minimal OSes like CoreOS definitely have a lot less, but if you're running a more stock OS image, you need to leave room for everything else that comes with it. Without RAM set aside, the kubelet will happily let pods use it all up, and then when you try to SSH in to debug why your node has become really slow and unstable, you won't be able to.
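To put numbers on that, here is a minimal sketch of the Node Allocatable arithmetic as the Kubernetes docs describe it: capacity minus kube-reserved, system-reserved, and the hard eviction threshold. The function name and the figures are made up for illustration, not recommendations.

```python
def node_allocatable_mib(capacity, kube_reserved, system_reserved, eviction_hard):
    """Memory (MiB) left for pods once the reservations are subtracted."""
    allocatable = capacity - kube_reserved - system_reserved - eviction_hard
    if allocatable < 0:
        raise ValueError("reservations exceed node capacity")
    return allocatable


# Hypothetical 8 GiB node; every figure here is an assumption.
print(node_allocatable_mib(capacity=8192, kube_reserved=512,
                           system_reserved=1024, eviction_hard=100))
```

As far as I understand the docs, reserving only shrinks what pods may use, while enforcing system-reserved also caps the system daemons' own cgroup at that figure. If the 1024 MiB above is an underestimate, enforcement is what turns it into OOM kills of the system services themselves.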

Related

Kubernetes priority of remove pods in 1.8.1

We have a small problem with our Kubernetes cluster.
One of our applications is so demanding that it sometimes consumes all of our resources, and eventually some pods are killed. The real problem starts when system pods like flannel or cache get removed.
Is there a recommended way to control what gets removed? How can we "save" system pods? Maybe someone has experience with this?
One idea is to change the QoS class of all pods/apps in kube-system to "Guaranteed". But I'm afraid this will not work well if we limit resources, even with a large margin.
By the way, where can I read about the (default) resource requirements of system services? How do I set them at cluster creation?
The second idea is to set an eviction policy and/or taints and tolerations, but we worry that our key application will be one of the first to be (re)moved. Unfortunately it currently runs as a single copy and initialization can take up to several minutes, so moving it between nodes is currently unacceptable.
The final idea is to use Priority and Preemption, but from what I see in the 1.8.1 documentation it is still in the "alpha" phase, and I have serious concerns about the stability of this solution.
Maybe there is something else I did not think about? I will be happy to hear other proposals.
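For reference on the QoS idea, here is a rough sketch of how a pod's QoS class follows from its CPU and memory requests and limits, as I understand the documented rules. The dict layout and the function name are assumptions for illustration; this is not the kubelet's code, and quantities are compared as plain strings.

```python
def qos_class(containers):
    """Classify a pod from a list of {"requests": {...}, "limits": {...}} dicts."""
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        if requests or limits:
            any_set = True
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # A missing request defaults to the limit.
            request = requests.get(resource, limit)
            # Simplification: "1" and "1000m" would compare as different here.
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


print(qos_class([{"requests": {"cpu": "100m", "memory": "64Mi"},
                  "limits": {"cpu": "100m", "memory": "64Mi"}}]))
```

The point for the question: "Guaranteed" requires limits equal to requests on every container, which is exactly the "limit resources" trade-off mentioned above.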

Optimized environment for mongo

I have a RHEL Linux server (VM) with a 4-core processor and 8 GB of RAM running the following applications:
- an Apache Karaf container
- an Apache tomcat server
- an ActiveMQ server
- and the mongod server (either primary or secondary).
Often I see that mongo consumes nearly 80% of the CPU. My CPU and memory usage are running high most of the time, and this has made me doubt whether my hardware config is too small for running this many components.
Please let me know if it is ok to run mongo like this on a shared server..
The question is too broad and the answer depends on too many variables, but I'll try to give you an overall sense of it.
Can you use all these services together on the same machine at a minimum load? - for sure. It's not clear where other shards reside though, but it will work either way. You didn't provide your HDD specs which is quite important for a DB server, but again it will work at a minimum load.
Can you use this setup under heavy load - not the best idea. Perhaps it's better to have separate servers handling these services.
Monitor overall server load: CPU, memory, IO. Check the mongo logs for slow queries. If your queries are supposed to run fast and they don't, you'll need more hardware.
Nobody can really tell you how much load a specific server configuration can handle. You need at least 512 MB of RAM and 1 CPU to get going these days, but you will hit the limits very soon. It all depends on how many users you have, what kinds of queries they run and how much data they cover.
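A minimal sketch of the CPU and memory part of that monitoring, using only the Python standard library and assuming a Linux host with `/proc/meminfo`. The function names are made up for illustration, and IO and the mongo slow-query log are not covered.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: value in kB}."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def memory_used_fraction(info):
    """Share of RAM that is not available for new work."""
    return 1.0 - info["MemAvailable"] / info["MemTotal"]


def snapshot(meminfo_path="/proc/meminfo"):
    """Read the live values; Linux/Unix only."""
    with open(meminfo_path) as f:
        info = parse_meminfo(f.read())
    load1, _, _ = os.getloadavg()
    return {"load_per_core": load1 / (os.cpu_count() or 1),
            "mem_used": memory_used_fraction(info)}


# Made-up sample roughly matching an 8 GB box under memory pressure.
sample = "MemTotal: 8000000 kB\nMemAvailable: 2000000 kB\n"
print(memory_used_fraction(parse_meminfo(sample)))
```

Calling `snapshot()` from a cron job and logging the result over a few days should show whether the box is short on CPU, on memory, or on both.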
Can you run MongoDB along other applications on a single server? Well it would appear that if you are having memory issues or CPU issues in your current configuration then you will likely need to address something. But "Can You?", well if it is not going to affect you then of course you can.
Should you do this? Most people would firmly agree that you should not, and that would also stand for most of the other applications you are running on the one machine.
There are various reasons (process isolation, resource allocation, security, and far too many others for a short response to go into) why you should not have this kind of configuration. And certainly, where it becomes a problem, you should address the issue by moving to a new configuration.
For Mongo alone, most people would not think twice about running their SQL database on dedicated hardware. The choice for Mongo should likely be no different.
I have also suggested this be moved to ServerFault, as it is not a programming question suited to Stack Overflow.

Programming considerations for virtualized applications

There are lots of questions on SO asking about the pros and cons of virtualization for both development and testing.
My question is subtly different - in a world in which virtualization is commonplace, what are the things a programmer should consider when it comes to writing software that may be deployed into a virtualized environment? Some of my initial thoughts are:
- Detecting if another instance of your application is running
- Communicating with hardware (physical/virtual)
- Resource throttling (app written for multi-core CPU running on single-CPU VM)
Anything else?
You have most of the basics covered with the three broad points. Watch out for:
- Hardware communication related issues. Disk access speeds are vastly different (and may have unusually high extremes: imagine a VM that is shut down for 3 days in the middle of a disk write). Network access may be interrupted, with unusual responses.
- Fancy pointer arithmetic. Try to avoid it.
- Heavy reliance on uncommon low-level/assembly instructions.
- Reliance on machine clocks. Remember that any calls you're making to the clock, and any time intervals, may regularly return unusual values when running on a VM.
- Single-CPU apps may find themselves running on multi-CPU machines that do funky things like work stealing.
- Corner cases and unusual failure modes are much more common. On a real machine you might not have to worry as much that the network card will disappear in the middle of your communication as you would on a virtual one.
- Manual management of resources (memory, disk, etc.). The more automated the work, the better the virtual environment is likely to be at handling it. For example, you might be better off using a memory-managed language/environment instead of writing an application in C.
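On the clock point, one common precaution is to measure intervals with a monotonic clock rather than the wall clock, since the wall clock can be stepped forwards or backwards underneath you. A minimal Python sketch follows; `timed` is a made-up helper name, and this does not claim to cover every way a hypervisor can distort time.

```python
import time


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    # time.monotonic() never goes backwards, unlike time.time(),
    # so the elapsed value cannot come out negative.
    start = time.monotonic()
    result = fn(*args)
    elapsed = time.monotonic() - start
    return result, elapsed


result, elapsed = timed(sum, [1, 2, 3])
print(result, elapsed >= 0)
```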
In my experience there are really only a couple of things you have to care about:
- Your application should not fail because of a CPU time shortage (i.e. by using timeouts that are too tight)
- Don't use low-priority always-running processes to perform tasks in the background
- The clock may run unevenly
- Don't trust what the OS says about system load
Almost any other issue should not be handled by the application but by the virtualizer, the host OS or your preferred sys-admin :-)
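For the first point, one way to avoid failing on a single tight timeout is to retry with a progressively longer one. This is only a sketch under assumptions: `call_with_retries`, its defaults, and the `operation(timeout_s)` convention are all invented for illustration.

```python
import time


def call_with_retries(operation, attempts=4, first_timeout_s=1.0,
                      backoff=2.0, pause_s=0.1, sleep=time.sleep):
    """Call operation(timeout_s), growing the timeout after each TimeoutError."""
    timeout_s = first_timeout_s
    for attempt in range(attempts):
        try:
            return operation(timeout_s)
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the failure
            sleep(pause_s)        # give a starved VM a moment to catch up
            timeout_s *= backoff  # be more patient next time


def flaky(timeout_s):
    # Stand-in for real work that only succeeds with a generous timeout.
    if timeout_s < 4:
        raise TimeoutError("too tight")
    return "ok"


print(call_with_retries(flaky, sleep=lambda s: None))
```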

Scala + Akka: How to develop a Multi-Machine Highly Available Cluster

We're developing a server system in Scala + Akka for a game that will serve clients in Android, iPhone, and Second Life. There are parts of this server that need to be highly available, running on multiple machines. If one of those servers dies (of, say, hardware failure), the system needs to keep running. I think I want the clients to have a list of machines they will try to connect with, similar to how Cassandra works.
The multi-node examples I've seen so far with Akka seem to me to be centered around the idea of scalability, rather than high availability (at least with regard to hardware). The multi-node examples seem to always have a single point of failure. For example there are load balancers, but if I need to reboot one of the machines that have load balancers, my system will suffer some downtime.
Are there any examples that show this type of hardware fault tolerance for Akka? Or, do you have any thoughts on good ways to make this happen?
So far, the best answer I've been able to come up with is to study the Erlang OTP docs, meditate on them, and try to figure out how to put my system together using the building blocks available in Akka.
But if there are resources, examples, or ideas on how to share state between multiple machines in a way that if one of them goes down things keep running, I'd sure appreciate them, because I'm concerned I might be re-inventing the wheel here. Maybe there is a multi-node STM container that automatically keeps the shared state in sync across multiple nodes? Or maybe this is so easy to make that the documentation doesn't bother showing examples of how to do it, or perhaps I haven't been thorough enough in my research and experimentation yet. Any thoughts or ideas will be appreciated.
HA and load management is a very important aspect of scalability and is available as a part of the AkkaSource commercial offering.
If you're listing multiple potential hosts in your clients already, then those can effectively become load balancers.
You could offer a host suggestion service that recommends to the client which machine it should connect to (based on current load, or whatever); the client can then pin to that host until the connection fails.
If the host suggestion service is not there, then the client can simply pick a random host from its internal list, trying them until it connects.
Ideally, on first-time startup, the client will connect to the host suggestion service and not only get directed to an appropriate host, but also receive a list of other potential hosts. This list can routinely be updated every time the client connects.
If the host suggestion service is down on the client's first attempt (unlikely, but...), then you can pre-deploy a list of hosts in the client install so it can start randomly selecting hosts from the very beginning if it has to.
Make sure that your list of hosts contains actual host names, not IPs; that gives you more flexibility long term (i.e. you'll "always have" host1.example.com, host2.example.com, etc., even if you move infrastructure and change IPs).
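The client-side logic described above could be sketched roughly like this. It is a Python sketch rather than Scala, the names `connect_with_failover`, `connect` and `suggested` are invented for illustration, and the suggestion service itself is left out.

```python
import random


def connect_with_failover(hosts, connect, suggested=None, rng=random):
    """Try the suggested host first, then the rest in random order."""
    candidates = list(hosts)
    rng.shuffle(candidates)
    if suggested is not None:
        candidates = [suggested] + [h for h in candidates if h != suggested]
    failed = []
    for host in candidates:
        try:
            return host, connect(host)
        except OSError:  # refused, unreachable, timed out, ...
            failed.append(host)
    raise ConnectionError(f"no host reachable, tried: {failed}")


def fake_connect(host):
    # Stand-in for a real connection attempt; only host3 is "up".
    if host != "host3.example.com":
        raise OSError("down")
    return "connection"


hosts = ["host1.example.com", "host2.example.com", "host3.example.com"]
print(connect_with_failover(hosts, fake_connect))
```

In a real client, `connect` would wrap something like `socket.create_connection((host, port), timeout=...)`, and the pre-deployed host list would be refreshed whenever the suggestion service answers.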
You could take a look at how RedDwarf and its fork DimDwarf are built. They are both horizontally scalable crash-only game app servers, and DimDwarf is partly written in Scala (new messaging functionality). Their approach and architecture should match your needs quite well :)
2 cents..
"how to share state between multiple machines in a way that if one of them goes down things keep running"
Don't share state between machines; instead, partition state across machines. I don't know your domain, so I don't know if this will work. But essentially, if you assign certain aggregates (in DDD terms) to certain nodes, you can keep those aggregates in memory (actor, agent, etc.) while they are being used. In order to do this you will need to use something like ZooKeeper to coordinate which nodes handle which aggregates. In the event of failure you can bring the aggregate up on a different node.
Furthermore, if you use an event sourcing model to build your aggregates, it becomes almost trivial to have real-time copies (slaves) of your aggregate on other nodes, by having those nodes listen for events and maintain their own copies.
By using Akka, we get remoting between nodes almost for free. This means that whichever node handles a request that needs to interact with an Aggregate/Entity on another node can do so with RemoteActors.
What I have outlined here is very general but gives an approach to distributed fault-tolerance with Akka and ZooKeeper. It may or may not help. I hope it does.
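To illustrate the two ideas (one owner node per aggregate, and copies kept in sync by replaying events), here is a small single-process sketch in Python rather than Scala/Akka. The hash-based `owner_node` is only a stand-in for the ZooKeeper coordination, and `Account` and the event names are made up.

```python
import hashlib


def owner_node(aggregate_id, nodes):
    """Pick the owning node by hashing the id (stand-in for ZooKeeper)."""
    digest = hashlib.sha256(aggregate_id.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


class Account:
    """Toy event-sourced aggregate: state changes only by applying events."""

    def __init__(self):
        self.balance = 0
        self.version = 0

    def apply(self, event):
        kind, amount = event
        if kind == "deposited":
            self.balance += amount
        elif kind == "withdrawn":
            self.balance -= amount
        else:
            raise ValueError(f"unknown event: {kind}")
        self.version += 1


nodes = ["node-a", "node-b", "node-c"]
print(owner_node("account-42", nodes))

primary, replica = Account(), Account()
for event in [("deposited", 100), ("withdrawn", 30)]:
    primary.apply(event)
    replica.apply(event)  # in a real system this arrives over the network
print(primary.balance, replica.balance)
```

If the owning node dies, the aggregate can be brought up on the node holding the replica, since both applied the same events in the same order.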
All the best,
Andy

Wastage of resources in Virtualization

I am not sure if this is the right place to ask this question, but I hope it is.
When looking for a VPS earlier today, I was trying to understand how each container works in the background. Given that the operating system itself uses a large share of the resources on a system, wouldn't having multiple operating systems on the same machine mean more wasted resources?
For instance, say I am running CentOS on a dedicated box with 20 background OS-level processes. Then I install a virtualization platform and 5 more CentOS virtual machines on the same system, exactly the same as the host operating system. Doesn't this mean those 20 processes are duplicated 6 times, so that internally the context switching happens between 120 processes instead of 20?
Firstly, your question seems to touch on two topics: full virtualization and paravirtualization. Most VPS providers offer a paravirtualized environment, which (to generalize very broadly) only virtualizes parts of the OS. It appears as a fully virtualized system to the user, but in terms of processes and I/O it can be very different, depending on the OS and the way this has been implemented.
When dealing with full guest virtualization, the main reason and benefit of Virtualization is reclaiming underutilized resources. Making use of that idle capacity.
For example, 5 machines running at an average resource utilization of 15% could be virtualized on a single server and use an average of 75% of its resources, still leaving 25% headroom to handle peak capacity.
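The arithmetic behind that example, as a tiny sketch. It assumes the machines are identical to the host and that their loads simply add up, and it ignores hypervisor overhead; the function name is made up.

```python
def consolidate(utilization_percent):
    """Return (total utilization %, remaining headroom %) on one host."""
    used = sum(utilization_percent)
    return used, 100 - used


# Five machines at 15% each, as in the example above.
used, headroom = consolidate([15] * 5)
print(used, headroom)
```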
If your processes can co-exist on the same system, all depend on the same libraries, configuration settings, etc., and can be brought up/down and restarted without affecting each other, then you may "waste" resources by virtualizing them.
However if you need to reboot/restart Server A without affecting Server B and they both have pretty low usage, or the two applications depend on different kernel versions for example - then that's a good candidate for virtualization.
When you move up to enterprise-level virtualization and start thinking about computing costs in cents per hour and dollars per gigabyte, this "overhead" is nothing compared to the savings and other benefits. You don't have half-empty disks, idling CPUs, wasted resources, or competition over who gets to configure what. Virtual machines can move between hosts depending on load, and you gain fault tolerance, high availability, and automated provisioning.