Heap size usage with two processes in a container - Kubernetes

We have two processes (p1 and p2) in a JVM container (in Docker) running on Kubernetes.
The resource limit (in the Helm chart) for the container is set to 1000 MiB.
We set -XX:MaxRAMPercentage to 50% (= 500 MiB). What will the heap distribution for each process look like?
Will p1 and p2 split it equally, so that each gets 250 MiB that cannot be exceeded?
Or will they share the whole heap of 500 MiB that cannot be exceeded?

The heap memory is only part of the memory consumed by the JVM - there is also stack and native memory; the runtime, the JIT and the garbage collector also need memory. So, a typical Java application run with -Xmx500m will need approximately 700-1000 MB of RAM (when using the full heap). The full memory usage heavily depends on what your application is doing and how it allocates and deallocates memory - some Java apps with 1 GB of heap can use 20 GB of RAM.
Back to your question: when you limit the container to 1000 MiB and run two same-sized, pretty standard Java web applications, I would size the JVMs with -Xmx300m (or, if you really want to use relative values, -XX:MaxRAMPercentage=30.0).
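Note that each JVM process applies -XX:MaxRAMPercentage to the container limit it sees and computes its own maximum heap; two JVMs in the same container do not share one heap. A quick way to double-check what each process actually got is a tiny check like the one below (the class name and output are purely illustrative, not part of any standard tooling):

// MaxHeapCheck.java - illustrative sketch: print the maximum heap this JVM
// computed for itself. Run it inside the container with the same flags as the
// real process, e.g. java -XX:MaxRAMPercentage=30.0 MaxHeapCheck
public class MaxHeapCheck {
    public static void main(String[] args) {
        long maxHeapMiB = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Effective max heap: " + maxHeapMiB + " MiB");
    }
}

Running this once per process makes it easy to see that the two JVMs size their heaps independently against the same 1000 MiB container limit.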
For more information: this answer gives a good overview of Java memory.

Related

What is the difference between “container_memory_working_set_bytes” and “container_memory_rss” metric on the container

I need to monitor my container memory usage running on a Kubernetes cluster. After reading some articles, there are two recommendations: "container_memory_rss" and "container_memory_working_set_bytes".
The definitions of both metrics (from the cAdvisor code) are:
"container_memory_rss" : The amount of anonymous and swap cache memory
"container_memory_working_set_bytes": The amount of working set memory, this includes recently accessed memory, dirty memory, and kernel memory
I think both metrics represent the number of bytes of physical memory that a process uses. But there are some differences between the two values on my Grafana dashboard.
My questions are:
What is the difference between the two metrics?
Which metric is more appropriate for monitoring memory usage? Some posts say both, because if one of those metrics reaches the limit, the container is OOM-killed.
You are right. I will try to address your questions in more detail.
What is the difference between the two metrics?
container_memory_rss equals the value of total_rss from the /sys/fs/cgroup/memory/memory.stat file:
// The amount of anonymous and swap cache memory (includes transparent
// hugepages).
// Units: Bytes.
RSS uint64 `json:"rss"`
The total amount of anonymous and swap cache memory (it includes transparent hugepages), and it equals the value of total_rss from the memory.stat file. This should not be confused with the true resident set size or the amount of physical memory used by the cgroup. rss + file_mapped will give you the resident set size of the cgroup. It does not include memory that is swapped out. It does include memory from shared libraries as long as the pages from those libraries are actually in memory. It does include all stack and heap memory.
container_memory_working_set_bytes (as already mentioned by Olesya) is the total usage - inactive file. It is an estimate of how much memory cannot be evicted:
// The amount of working set memory, this includes recently accessed memory,
// dirty memory, and kernel memory. Working set is <= "usage".
// Units: Bytes.
WorkingSet uint64 `json:"working_set"`
Working Set is the current size, in bytes, of the Working Set of this process. The Working Set is the set of memory pages touched recently by the threads in the process.
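To make the relationship concrete, here is a rough sketch of how both values can be derived from the cgroup files inside a container. This is my own illustration, not cAdvisor code, and it assumes cgroup v1 mounted at /sys/fs/cgroup/memory; the file names differ under cgroup v2.

// WorkingSetEstimate.java - illustrative sketch (assumes cgroup v1 paths).
// container_memory_rss corresponds to total_rss in memory.stat;
// container_memory_working_set_bytes is usage minus total_inactive_file.
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.stream.Collectors;

public class WorkingSetEstimate {
    public static void main(String[] args) throws Exception {
        Path cgroup = Path.of("/sys/fs/cgroup/memory");
        long usage = Long.parseLong(
                Files.readString(cgroup.resolve("memory.usage_in_bytes")).trim());

        // memory.stat contains "key value" lines such as total_rss and total_inactive_file
        Map<String, Long> stat = Files.readAllLines(cgroup.resolve("memory.stat")).stream()
                .map(line -> line.split(" "))
                .collect(Collectors.toMap(parts -> parts[0], parts -> Long.parseLong(parts[1])));

        long rss = stat.getOrDefault("total_rss", 0L);
        long workingSet = usage - stat.getOrDefault("total_inactive_file", 0L);
        System.out.printf("rss=%d bytes, working_set=%d bytes%n", rss, workingSet);
    }
}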
Which metric is more appropriate for monitoring memory usage? Some posts say both, because if one of those metrics reaches the limit, the container is OOM-killed.
If you are limiting the resource usage for your pods, then you should monitor both, as they will cause an oom-kill if they reach a particular resource limit.
I also recommend this article, which shows an example explaining the assertion below:
You might think that memory utilization is easily tracked with container_memory_usage_bytes; however, this metric also includes cached (think filesystem cache) items that can be evicted under memory pressure. The better metric is container_memory_working_set_bytes, as this is what the OOM killer is watching for.
EDIT:
Adding some additional sources as a supplement:
A Deep Dive into Kubernetes Metrics — Part 3 Container Resource Metrics
#1744
Understanding Kubernetes Memory Metrics
Memory_working_set vs Memory_rss in Kubernetes, which one you should monitor?
Managing Resources for Containers
cAdvisor code

Common heap behavior for Wildfly or application memory leak?

We're running our application in WildFly 14.0.1 with an -Xmx of 4096, on OpenJDK 11.0.2. I've been using VisualVM 1.4.2 to monitor our heap since we previously had OOM exceptions (because our -Xmx was only 512, which was far too small).
We are now well within our memory allocation, no more OOM exceptions are happening, and even with a good number of clients and a fair amount of processing we're nowhere near the -Xmx4096 (the servers have 16 GB, so memory isn't an issue). Still, I'm seeing some strange heap behavior that I can't trace to a source.
Using VisualVM, Eclipse MemoryAnalyzer, as well as heaphero.io, I get summaries like the following:
Total Bytes: 460,447,623
Total Classes: 35,708
Total Instances: 2,660,155
Classloaders: 1,087
GC Roots: 4,200
Number of Objects Pending for Finalization: 0
However, watching the Heap Monitor, I see the used heap increase by about 450 MB over a 4-minute period before the GC runs and drops it back down, only to spike again. (Heap monitor screenshot omitted.)
This is when no clients are connected and nothing is actively happening in our application. We do use Apache File IO to monitor remote directories, we have JMS topics, etc. so it's not like the application is completely idle, but there's zero logging and all that.
My largest objects are the well-known io.netty.buffer.PoolChunk instances, which in the heap dumps account for about 60% of my memory usage, but the total is still only around 460 MB, so I'm confused why the heap monitor keeps going from ~425 MB to ~900 MB. No matter when I take my snapshots, I can't see any large increase in object counts or memory usage.
I'm just seeing a disconnect between the heap monitor and the .hprof analysis, so there doesn't seem to be a way to tell what's causing the heap to hit that 900 MB peak.
My question is whether these heap spikes are totally expected when running within WildFly, or whether something within our application is spinning up a bunch of objects that then get GC'd. In a component report, objects in our application's package structure make up an extremely small part of the dump. That doesn't clear us, though - we could easily be calling things without closing them appropriately, etc.
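For what it's worth, a small logger like the one below (the class name and interval are made up for the example) could be run alongside the application to correlate the sawtooth with garbage-collection counts; if every drop in used heap coincides with a bump in the collection count, the spikes are just normal young-generation churn rather than a leak.

// HeapSawtoothLogger.java - illustrative sketch: periodically log used heap and
// the total GC count so heap spikes can be matched to collections.
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class HeapSawtoothLogger {
    public static void main(String[] args) throws InterruptedException {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        while (true) {
            long usedMiB = memory.getHeapMemoryUsage().getUsed() / (1024 * 1024);
            long gcCount = ManagementFactory.getGarbageCollectorMXBeans().stream()
                    .mapToLong(GarbageCollectorMXBean::getCollectionCount)
                    .sum();
            System.out.printf("used heap: %d MiB, total GC count: %d%n", usedMiB, gcCount);
            Thread.sleep(5_000); // sample every 5 seconds
        }
    }
}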

Advice on setting pod memory request size

I have a question based on my experience trying to implement memory requests/limits correctly in an OpenShift OKD cluster. I started by setting no request, then watched what the cluster metrics reported for memory use, then set something close to that as a request. I ended up with high-memory-pressure nodes, thrashing, and OOM kills. I have found I need to set the requests to something closer to the VIRT size in ‘top’ (including the program binary size) to keep performance up. Does this make sense? I'm confused by the asymmetry between the request (and apparent need) and the use reported in metrics.
You always need to leave a bit of memory headroom for overhead and memory spikes. If for some reason the container exceeds its memory limit, whether because of your application, your binary, or some garbage-collection overhead, it will get killed. For example, this is common in Java apps, where you specify a heap and need extra headroom for the garbage collector and other things such as:
Native JRE
Perm / metaspace
JIT bytecode
JNI
NIO
Threads
This blog explains some of them.
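As a rough way to see how much of a JVM's footprint sits outside the heap, the runtime can report its own non-heap pools (metaspace, code cache, and so on). The snippet below is only an illustration; the class name is made up and the pool names vary between JVM versions.

// NonHeapFootprint.java - illustrative sketch: print heap usage plus every
// non-heap memory pool the JVM exposes, to show why the container needs
// headroom beyond -Xmx.
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;

public class NonHeapFootprint {
    public static void main(String[] args) {
        long heapUsedMiB = ManagementFactory.getMemoryMXBean()
                .getHeapMemoryUsage().getUsed() / (1024 * 1024);
        System.out.printf("heap used: %d MiB%n", heapUsedMiB);
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.NON_HEAP) {
                System.out.printf("non-heap pool %-30s %d MiB%n",
                        pool.getName(), pool.getUsage().getUsed() / (1024 * 1024));
            }
        }
    }
}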

How to analyze unmanaged heap size of a .NET process

How can I analyze the unmanaged heap size of a .NET process with Windbg?
Which commands should be used in WinDbg?
!address -summary gives you an overview that does not focus on individual heaps.
Usage summary contains the following:
Free: free memory which can be allocated and used
Image: memory used by EXE and DLL files
MappedFile: memory used by memory mapped files
Heap / Heap32 / Heap64: memory allocated via the heap manager
Stack / Stack32 / Stack64: memory used by the stacks of threads
TEB / TEB32 / TEB64: memory used by thread environment blocks
PEB / PEB32 / PEB64: memory used by process environment blocks (e.g. command line and environment variables)
Type summary contains:
MEM_IMAGE: should roughly correspond to Image
MEM_MAPPED: should roughly correspond to MappedFile
MEM_PRIVATE: private memory which can only be used by your application and not be shared
State summary:
MEM_FREE: should roughly correspond to Free
MEM_COMMIT: memory in use
MEM_RESERVE: memory which might be used
Protect Summary should explain itself. If you're very new, it's probably not that interesting.
Largest Region by usage:
Especially important here is the free region. The largest free region determines how much memory you can get in one block. Look around for memory fragmentation to find out why this can be an issue.
!heap -s gives you a summary of the heaps, focusing on the individual heaps.
These are all native memory allocations done via the Windows heap manager. Direct allocations via VirtualAlloc() are not listed (e.g. MSXML and .NET).
Read more about native memory management on MSDN: Managing Heap Memory and MSDN: Managing Virtual Memory

JBoss memory usage pattern

Taking a look at one of my applications running in a production server, I noticed that there is a "normal but strange" behavior of memory usage.
Let me explain: watching my JBoss 4.2.2 run without any application deployed, I can see that it constantly grows and frees the used heap space, usually by a few megabytes on a development server. When I deploy my application, the pattern is the same, but using more memory on average.
Well, on the production server I can see that my JBoss, even without any workload, has a minimum memory usage of 1.5 GB. Still without any workload, the heap usage grows to 3.6 GB, at which point a minor GC runs and the heap usage comes back to 1.5 GB. Every 40 s my JBoss grows its heap usage from 1.5 GB to 3.6 GB, and this pattern repeats indefinitely. When the workload grows, the difference is that the period for growing the memory usage falls to 8 s.
So, my question is: is this normal?
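For comparison, even a trivial program that keeps allocating short-lived objects shows the same sawtooth in any heap monitor, because the collector only reclaims the young generation once it fills up. The snippet below is purely an illustration of that pattern, not taken from the JBoss setup.

// SawtoothDemo.java - illustrative sketch: allocate short-lived objects so a heap
// monitor shows used heap climbing steadily until a minor GC drops it back down.
public class SawtoothDemo {
    static byte[] latest; // keep one reference so the allocation is not optimized away

    public static void main(String[] args) throws InterruptedException {
        while (true) {
            latest = new byte[1024 * 1024]; // 1 MiB that becomes garbage on the next iteration
            Thread.sleep(10);               // slow the allocation rate down
        }
    }
}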