What is the difference between the “container_memory_working_set_bytes” and “container_memory_rss” metrics on a container - kubernetes

I need to monitor my container memory usage on a Kubernetes cluster. After reading some articles, there are two recommended metrics: "container_memory_rss" and "container_memory_working_set_bytes".
The definitions of both metrics (from the cAdvisor code) are:
"container_memory_rss" : The amount of anonymous and swap cache memory
"container_memory_working_set_bytes": The amount of working set memory, this includes recently accessed memory, dirty memory, and kernel memory
I think both metrics represent the number of bytes of physical memory that a process uses, but the two values differ on my Grafana dashboard.
My question is:
What is the difference between the two metrics?
Which metric is more appropriate for monitoring memory usage? Some posts say both, because when either of them reaches the limit, the container is OOM-killed.

You are right. I will try to address your questions in more detail.
What is the difference between the two metrics?
container_memory_rss equals the value of total_rss from the /sys/fs/cgroup/memory/memory.stat file:
// The amount of anonymous and swap cache memory (includes transparent
// hugepages).
// Units: Bytes.
RSS uint64 `json:"rss"`
This is the total amount of anonymous and swap cache memory (it includes transparent hugepages), and it equals the value of total_rss from the memory.stat file. This should not be confused with the true resident set size or the amount of physical memory used by the cgroup; rss + file_mapped will give you the resident set size of the cgroup. It does not include memory that is swapped out. It does include memory from shared libraries as long as the pages from those libraries are actually in memory, and it includes all stack and heap memory.
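As a rough illustration of the "rss + file_mapped" note above, the two corresponding cAdvisor gauges can be combined directly in PromQL. This is only a sketch: the namespace/pod/container label values are placeholders, and older cAdvisor versions expose pod_name/container_name instead of pod/container.
# Approximate resident set size of the cgroup, per the kernel docs quoted above
container_memory_rss{namespace="default", pod="my-pod", container="app"}
  + container_memory_mapped_file{namespace="default", pod="my-pod", container="app"}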
container_memory_working_set_bytes (as already mentioned by Olesya) is total usage minus the inactive file cache (usage - inactive_file). It is an estimate of how much memory cannot be evicted:
// The amount of working set memory, this includes recently accessed memory,
// dirty memory, and kernel memory. Working set is <= "usage".
// Units: Bytes.
WorkingSet uint64 `json:"working_set"`
Working Set is the current size, in bytes, of the Working Set of this process. The Working Set is the set of memory pages touched recently by the threads in the process.
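Because the working set is defined as usage minus the inactive file cache, the reclaimable page-cache portion of a container can be inspected with a simple PromQL difference (a sketch; label values are placeholders):
# usage - working_set ≈ inactive file cache that can be reclaimed under memory pressure
container_memory_usage_bytes{namespace="default", pod="my-pod", container="app"}
  - container_memory_working_set_bytes{namespace="default", pod="my-pod", container="app"}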
Which metric is more appropriate for monitoring memory usage? Some posts say both, because when either of them reaches the limit, the container is OOM-killed.
If you are limiting resource usage for your pods, then you should monitor both, as reaching the memory limit will result in an OOM kill.
I also recommend this article, which shows an example explaining the assertion below:
You might think that memory utilization is easily tracked with
container_memory_usage_bytes, however, this metric also includes
cached (think filesystem cache) items that can be evicted under memory
pressure. The better metric is container_memory_working_set_bytes as
this is what the OOM killer is watching for.
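For example, here is a hedged alerting sketch that compares the working set against the configured limit; it assumes kube-state-metrics is installed and that its labels line up with cAdvisor's (exact label names vary between versions):
# Fire when a container's working set exceeds 90% of its memory limit
max by (namespace, pod, container) (container_memory_working_set_bytes{container!=""})
  / on (namespace, pod, container)
    kube_pod_container_resource_limits{resource="memory"}
  > 0.9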
EDIT:
Adding some additional sources as a supplement:
A Deep Dive into Kubernetes Metrics — Part 3 Container Resource Metrics
#1744
Understanding Kubernetes Memory Metrics
Memory_working_set vs Memory_rss in Kubernetes, which one you should monitor?
Managing Resources for Containers
cAdvisor code

Related

Prometheus: discrepancy in sum vs avg

I have Micrometer Prometheus JVM metrics monitoring configured for my Spring Boot application, which is deployed in Kubernetes pods. There are 2 pods.
When I run the query avg(jvm_memory_max_bytes), I see the graph hovering mostly around a 400 MB value. When I run sum(jvm_memory_max_bytes), the graph jumps up to a 10 GB value.
Is this much variation normal?
The metric jvm_memory_max_bytes shows:
The maximum amount of memory in bytes that can be used for memory management.
So the value does not change with consumption; it reflects how much memory is available for the JVM to use.
If you are trying to see how much memory has been used, you need the metric jvm_memory_used_bytes.
You can find more information on this page, under 3. JVM Metrics.
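Note that jvm_memory_max_bytes is exported per memory pool (it carries area and id labels), so sum() adds every pool across both pods while avg() averages them, which explains the large gap. To chart actual consumption against the ceiling, a hedged sketch (the pod label is assumed to be added by your Prometheus scrape configuration):
# Heap used vs. heap ceiling, per pod
sum by (pod) (jvm_memory_used_bytes{area="heap"})
  / sum by (pod) (jvm_memory_max_bytes{area="heap"})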

How to determine resource limit for Openshift Pods required for my tomcat application?

I have a web application (SOAP service) running on a Tomcat 8 server in OpenShift. The payload size is relatively small, with 5-10 elements, and the traffic is also small (300 calls per day, 5-10 max threads at a time). I'm a little confused about the Pod resource restrictions. How do I come up with min and max CPU and memory limits for each pod if I'm going to use a minimum of 1 and a maximum of 3 pods for my application?
It's tricky to configure accurate limit values without a performance test, because we cannot know in advance how many resources your application needs to process each request. A good rule of thumb is to base the limits on the heaviest workload in your environment. A memory limit can trigger the OOM killer, so you should set a comfortable value based on your Tomcat heap plus static memory size.
A CPU limit, by contrast, will not kill your pod when it is reached; it only slows processing down.
My suggested starting points for each limit are as follows.
Memory: Tomcat (Java) memory size + a 30% buffer (see the measurement sketch after this list).
CPU: personally, I think a CPU limit does little to maximize processing performance and efficiency. Even when CPU is available and the pod could use the full CPU resources to process requests as quickly as possible, the limit setting can get in the way. But if you need to spread resource usage evenly to suppress an aggressive resource eater, you can consider a CPU limit.
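To put a number behind the "Tomcat memory + 30% buffer" starting point, one hedged approach is to observe what the container actually peaks at over a representative period and derive the limit from that (the pod name pattern is a placeholder):
# Peak working set of the Tomcat pods over the last 7 days;
# add ~30% to this value and use the result as the starting memory limit
max(max_over_time(container_memory_working_set_bytes{pod=~"my-tomcat-.*", container!=""}[7d]))
For instance, a measured peak of about 700 MiB would suggest a memory limit of roughly 700 x 1.3 ≈ 910 MiB.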
This answer might not be exactly what you wanted, but I hope it helps you with your capacity planning.

How to calculate the number of CPU, memory and storage that my Google Cloud SQL needs

My DB is reaching 100% CPU utilization, and increasing the number of CPUs is not helping anymore.
What kind of information should I consider when sizing my Google Cloud SQL instance? How do you decide on the DB configuration?
Info I have:
For 10-50 minutes a day I get 120 requests/second and the CPU reaches 100% utilization
Memory usage peaks at 2.5 GB during this critical period
Storage usage is currently around 1.3GB
Current configuration:
vCPUs: 10
Memory: 10 GB
SSD storage: 50 GB
Unfortunately, there is no magic formula for determining the correct database size. This is because queries have variable load - some are small and simple and take no time at all, others are complex or huge and take lots of resources to complete.
There are generally two strategies to dealing with high load: Reduce your load (use connection pooling, optimize your queries, cache results), or increase the size of your database (add additional CPUs, Storage, or Read replicas).
Usually, high CPU utilization means either the CPU is overloaded or there are too many database tables in the same instance. Here are some common issues and fixes from Google's documentation:
If CPU utilization is over 98% for 6 hours, your instance is not properly sized for your workload, and it is not covered by the SLA.
If you have 10,000 or more database tables on a single instance, it could result in the instance becoming unresponsive or unable to perform maintenance operations, and the instance is not covered by the SLA.
When the CPU is overloaded, it is recommended to use this documentation to view the percentage of available CPU your instance is using on the Instance details page in the Google Cloud Console.
To monitor your CPU usage and receive alerts at a specified threshold, set up a Stackdriver alert.
Increasing the number of CPUs for your instance should reduce the strain of your instance. Note that changing CPUs requires an instance restart. If your instance is already at the maximum number of CPUs, shard your database to multiple instances.
Google has this very interesting documentation about investigating high utilization and determining whether a system or user task is causing high CPU utilization. You could use it to troubleshoot your instance and find what's causing the high CPU utilization.

What is the use case of setting memory request less than limit in K8s

I understand the use case of setting a CPU request less than the limit - it allows for CPU bursts in each container if the instance has free CPU, resulting in maximum CPU utilization.
However, I cannot really find the use case for doing the same with memory. Most applications don't release memory after allocating it, so effectively applications will creep up to their 'limit' memory (which has the same effect as setting request = limit). The only exception is containers running on an instance that already has all of its memory allocated. I don't really see any pros in this, and the cons are more nondeterministic behaviour that is hard to monitor (one container having higher latencies than another due to heavy GC).
The only use case I can think of is a shared in-memory cache, where you want to allow for a spike in memory usage. But even in this case, one would be running the risk of one of the nodes underperforming.
Maybe not a real answer, but a point of view on the subject.
The difference between the CPU limit and the memory limit is what happens when the limit is reached. In the case of CPU, the container keeps running but its CPU usage is throttled. If the memory limit is reached, the container gets killed and restarted.
In my use case, I often set the memory request to the amount of memory my application uses on average, and the limit to +25%. This allows me to avoid the container being killed most of the time (which is good), but of course it exposes me to memory overallocation (and this could be a problem, as you mentioned).
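A hedged sketch of how that average can be derived from monitoring data: take the mean working set over a representative window as the request, then multiply by 1.25 for the limit (the container label value is a placeholder):
# Average usage over the last 7 days -> candidate memory request; request * 1.25 -> candidate limit
avg_over_time(container_memory_working_set_bytes{container="my-app"}[7d])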
Actually, the topic you mention is interesting and at the same time complex, just as Linux memory management is. As we know, when a process uses more memory than its limit, it quickly moves up the potential "to-kill" process "ladder". Going further, the purpose of the limit is to tell the kernel when it should consider the process a candidate for killing. Requests, on the other hand, are a direct statement that "my container will need this much memory", and beyond that they provide valuable information to the Scheduler about where the Pod can be scheduled (based on available Node resources).
If there is no memory request and a high limit, Kubernetes will default the request to the limit (this might result in a scheduling failure even though the pod's real requirements could be met).
If you set a request but no limit, the container will use the default limit for the namespace (if there is none, it will be able to use all available Node memory).
Setting a memory request that is lower than the limit gives your pods room for activity bursts, while also ensuring that the memory available for the Pod to consume during a burst is capped at a reasonable amount.
Setting memory limit == memory request is less desirable simply because activity spikes will put the pod on the fast track to being OOM-killed by the kernel. Memory, unlike CPU, cannot be throttled in Kubernetes, so under memory pressure a kill is the most probable scenario (let's also remember that there is no swap partition).
Quoting Will Tomlin and his interesting article on Requests vs Limits which I highly recommend:
You might be asking if there’s reason to set limits higher than
requests. If your component has a stable memory footprint, you
probably shouldn’t since when a container exceeds its requests, it’s
more likely to be evicted if the worker node encounters a low memory
condition.
To summarize - there is no straightforward answer. You have to determine your memory requirements, use monitoring and alerting tools to stay in control, and be ready to adjust the configuration to your needs.
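As a concrete starting point for that monitoring, here is a hedged sketch that tracks how far above its request each container's working set sits (it assumes kube-state-metrics is installed; label names vary between versions):
# Working set as a fraction of the memory request; values consistently above 1 mean the
# pod lives in its burst headroom and is a likelier eviction target under node pressure
max by (namespace, pod, container) (container_memory_working_set_bytes{container!=""})
  / on (namespace, pod, container)
    kube_pod_container_resource_requests{resource="memory"}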

Advice on setting pod memory request size

I have a question based on my experience trying to implement memory requests/limits correctly in an OpenShift OKD cluster. I started by setting no request, watched what the cluster metrics reported for memory use, and then set something close to that as a request. I ended up with high-memory-pressure nodes, thrashing, and OOM kills. I have found I need to set the requests to something closer to the VIRT size in 'top' (including the program binary size) to keep performance up. Does this make sense? I'm confused by the asymmetry between the request (and apparent need) and the reported use in metrics.
You always need to leave a bit of memory headroom for overhead and memory spikes. If for some reason the container exceeds its memory limit, whether because of your application, your binary, or some garbage collection activity, it will get killed. For example, this is common in Java apps, where you specify a heap size but also need extra headroom for the garbage collector and other things such as the following (a sketch after this list shows one way to estimate how large this overhead is):
Native JRE
Perm / metaspace
JIT bytecode
JNI
NIO
Threads
This blog explains some of them.
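One hedged way to see how large that non-heap overhead actually is in practice: subtract everything the JVM accounts for in its memory pools from what the container as a whole is holding. The sketch below assumes Micrometer metrics as in the earlier question and that your scrape configuration attaches a matching pod label; the container name is a placeholder.
# Container working set minus all JVM memory pools (heap + nonheap);
# the remainder approximates native overhead such as thread stacks,
# direct NIO buffers, and other allocations the JVM does not report in its pools
max by (pod) (container_memory_working_set_bytes{container="my-java-app"})
  - on (pod)
    sum by (pod) (jvm_memory_used_bytes)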