UseContainerSupport and direct memory - kubernetes

In a container-based environment such as Kubernetes, the UseContainerSupport JVM feature is handy as it allows configuring heap size as a percentage of container memory via options such as XX:MaxRAMPercentage instead of a static value via Xmx. This way you don't have to potentially adjust your JVM options every time the container memory limit changes, potentially allowing use of vertical autoscaling. The primary goal is hitting a Java OufOfMemoryError rather than running out of memory at the container (e.g. K8s OOMKilled).
That covers heap memory. In applications that use a significant amount of direct memory via NIO (e.g. gRPC/Netty), what are the options for this? The main option I could find is XX:MaxDirectMemorySize, but this takes in a static value similar to Xmx.

There's no similar switch for MaxDirectMemorySize as far as I know.
But by default (if you don't specify -XX:MaxDirectMemorySize) the limit is same as for MaxHeapSize.
That means, if you set -XX:MaxRAMPercentage then the same limit applies to MaxDirectMemory.
Note: that you cannot verify this simply via -XX:+PrintFlagsFinal because that prints 0:
java -XX:MaxRAMPercentage=1 -XX:+PrintFlagsFinal -version | grep 'Max.*Size'
...
uint64_t MaxDirectMemorySize = 0 {product} {default}
size_t MaxHeapSize = 343932928 {product} {ergonomic}
...
openjdk version "17.0.2" 2022-01-18
...
See also https://dzone.com/articles/default-hotspot-maximum-direct-memory-size and Replace access to sun.misc.VM for JDK 11
My own experiments here: https://github.com/jumarko/clojure-experiments/pull/32

Related

Memory leak (?) using IO::Socket::Async (on FreeBSD 13.1)

In processing a stream of logs (via UDP) in a raku (v2022.07) app, I'm
hitting what appears to be a memory leak using IO::Socket::Async.
I pulled the code out into a simpler program which I've included below
(~ identical to code at https://docs.raku.org/type/IO::Socket::Async):
#!/usr/bin/env raku
#
my $socket = IO::Socket::Async.bind-udp('localhost', 24225);
react {
whenever $socket.Supply -> $v {
print $v if $v.chars > 0;
};
};
It leaks substantial ram - I let it run about 12 hours and
when I checked -- still running (on a 1T ram machine) -- with
ps auwwx [pid]
it showed 314974456 and 20739784 for VSZ and RSS (so, roughly 300G v size and 20G resident).
[btw, the UDP traffic is fairly light - average of 350 (~100 byte) packets/sec (spikes to ~1000/sec)]
So .. I rewrote above in perl5 (after similar leaky results w/
a couple of raku variants) which stabilizes quickly at about 8M resident - that's fine/stable/etc. -
but I'd prefer this process to feed a raku channel (without separate perl process/file
tailing, etc.).
My environment: FreeBSD 13.1-RELEASE-p2 GENERIC amd64 and raku:
v2022.07 built on MoarVM 2022.07 (installed with rakubrew).
I'm guessing this is unique to raku on freebsd but not sure.
I did attempt to upgrade (rakubrew) to v2022.12 to see if problem resolved there -
but in rebuilding modules (zef), too many failed (some issue with
Digest/Digest::HMAC) - so I had to revert to 2022.07.
I'll sure be grateful for any suggestions for addressing the leak or alternative
methods to address reading from a UDP port.
Not exactly a solution to your problem, but you can monitor memory usage from within your Raku code using built-in feature:
use Telemetry;
say T{"max-rss"};
Also remember that Supply by default decodes unicode chars. If your protocol is binary you may add :bin to Socket params to avoid treating binary data as text.

Vertx program memory keeps growing

I used the top command to check on the Linux server, and found that the Memory of the deployed Vertx program has been increasing. I printed the Memory usage with Native Memory Tracking, and found that the Internal Memory has been increasing.I didn't manually allocate out of heap memory in the code.
Native Memory Tracking:
Total: reserved=7595MB, committed=6379MB
Java Heap (reserved=4096MB, committed=4096MB)
(mmap: reserved=4096MB, committed=4096MB)
Class (reserved=1101MB, committed=86MB)
(classes #12776)
(malloc=7MB #18858)
(mmap: reserved=1094MB, committed=79MB)
Thread (reserved=122MB, committed=122MB)
(thread #122)
(stack: reserved=121MB, committed=121MB)
Code (reserved=253MB, committed=52MB)
(malloc=9MB #12566)
(mmap: reserved=244MB, committed=43MB)
GC (reserved=155MB, committed=155MB)
(malloc=6MB #302)
(mmap: reserved=150MB, committed=150MB)
Internal (reserved=1847MB, committed=1847MB)
(malloc=1847MB #35973)
Symbol (reserved=17MB, committed=17MB)
(malloc=14MB #137575)
(arena=3MB #1)
Native Memory Tracking (reserved=4MB, committed=4MB)
(tracking overhead=3MB)
vertx version:3.9.8
Cluster:Hazelcast
startup script:su web -s /bin/bash -c "/usr/bin/nohup /usr/bin/java -XX:NativeMemoryTracking=detail -javaagent:../target/showbiz-server-game-1.0-SNAPSHOT-fat.jar -javaagent:../../quasar-core-0.8.0.jar=b -Dvertx.hazelcast.config=/data/appdata/webdata/Project/config/cluster.xml -jar -Xms4G -Xmx4G -XX:-OmitStackTraceInFastThrow ../target/server-1.0-SNAPSHOT-fat.jar start-Dvertx-id=server -conf application-conf.json -Dlog4j.configurationFile=log4j2_logstash.xml -cluster >nohup.out 2>&1 &"
If your producers are much faster than your consumers, and back pressure isn't handled properly, it's possible to have memory that keeps increasing.
Also, this could vary depending on how the code is written.
This similar reported issue could be of help and consider exploring writeStream too.

Unable to connect to the NetBeans Distribution because of Zero sized file

I recently reinstalled Netbeans IDE on my Windows 10 PC in order to restore some unrelated configurations. When I tried checking for new plugins in order to be able to download the Sakila sample database,
I get this error.
I've tested the connection on both No Proxy and Use Proxy Settings, and both connection tests seem to end succesfully.
I have allowed Netbeans through my firewall, but this has changed nothing either.
I haven't touched my proxy configuration, so it's on default (autodetect). Switching the autodetect off doesn't change anything, either, no matter what proxy config i have on Netbeans.
Here's part of my log file that might be helpful:
Compiler: HotSpot 64-Bit Tiered Compilers
Heap memory usage: initial 32,0MB maximum 910,5MB
Non heap memory usage: initial 2,4MB maximum -1b
Garbage collector: PS Scavenge (Collections=12 Total time spent=0s)
Garbage collector: PS MarkSweep (Collections=3 Total time spent=0s)
Classes: loaded=6377 total loaded=6377 unloaded 0
INFO [org.netbeans.core.ui.warmup.DiagnosticTask]: Total memory 17.130.041.344
INFO [org.netbeans.modules.autoupdate.updateprovider.DownloadListener]: Connection content length was 0 bytes (read 0bytes), expected file size can`t be that size - likely server with file at http://updates.netbeans.org/netbeans/updates/8.0.2/uc/final/distribution/catalog.xml.gz?unique=NB_CND_EXTIDE_GFMOD_GROOVY_JAVA_JC_MOB_PHP_WEBCOMMON_WEBEE0d55337f9-fc66-4755-adec-e290169de9d5_bf88d09e-bf9f-458e-b1c9-1ea89147b12b is temporary down
INFO [org.netbeans.modules.autoupdate.ui.Utilities]: Zero sized file reported at http://updates.netbeans.org/netbeans/updates/8.0.2/uc/final/distribution/catalog.xml.gz?unique=NB_CND_EXTIDE_GFMOD_GROOVY_JAVA_JC_MOB_PHP_WEBCOMMON_WEBEE0d55337f9-fc66-4755-adec-e290169de9d5_bf88d09e-bf9f-458e-b1c9-1ea89147b12b
java.io.IOException: Zero sized file reported at http://updates.netbeans.org/netbeans/updates/8.0.2/uc/final/distribution/catalog.xml.gz?unique=NB_CND_EXTIDE_GFMOD_GROOVY_JAVA_JC_MOB_PHP_WEBCOMMON_WEBEE0d55337f9-fc66-4755-adec-e290169de9d5_bf88d09e-bf9f-458e-b1c9-1ea89147b12b
at org.netbeans.modules.autoupdate.updateprovider.DownloadListener.doCopy(DownloadListener.java:155)
at org.netbeans.modules.autoupdate.updateprovider.DownloadListener.streamOpened(DownloadListener.java:78)
at org.netbeans.modules.autoupdate.updateprovider.NetworkAccess$Task$1.run(NetworkAccess.java:111)
Caused: java.io.IOException: Zero sized file reported at http://updates.netbeans.org/netbeans/updates/8.0.2/uc/final/distribution/catalog.xml.gz?unique=NB_CND_EXTIDE_GFMOD_GROOVY_JAVA_JC_MOB_PHP_WEBCOMMON_WEBEE0d55337f9-fc66-4755-adec-e290169de9d5_bf88d09e-bf9f-458e-b1c9-1ea89147b12b
at org.netbeans.modules.autoupdate.updateprovider.DownloadListener.notifyException(DownloadListener.java:103)
at org.netbeans.modules.autoupdate.updateprovider.AutoupdateCatalogCache.copy(AutoupdateCatalogCache.java:246)
at org.netbeans.modules.autoupdate.updateprovider.AutoupdateCatalogCache.writeCatalogToCache(AutoupdateCatalogCache.java:99)
at org.netbeans.modules.autoupdate.updateprovider.AutoupdateCatalogProvider.refresh(AutoupdateCatalogProvider.java:154)
at org.netbeans.modules.autoupdate.services.UpdateUnitProviderImpl.refresh(UpdateUnitProviderImpl.java:180)
at org.netbeans.api.autoupdate.UpdateUnitProvider.refresh(UpdateUnitProvider.java:196)
[catch] at org.netbeans.modules.autoupdate.ui.Utilities.tryRefreshProviders(Utilities.java:433)
at org.netbeans.modules.autoupdate.ui.Utilities.doRefreshProviders(Utilities.java:411)
at org.netbeans.modules.autoupdate.ui.Utilities.presentRefreshProviders(Utilities.java:405)
at org.netbeans.modules.autoupdate.ui.UnitTab$14.run(UnitTab.java:806)
at org.openide.util.RequestProcessor$Task.run(RequestProcessor.java:1423)
at org.openide.util.RequestProcessor$Processor.run(RequestProcessor.java:2033)
It might be that the update server is down just right now; i haven't been able to test this either. But it also might be something wrong with my configurations. I'm going crazy!!1!
Something that worked for me was changing the "http:" to "https:" in the update urls.
I.E. Change "http://updates.netbeans.org/netbeans/updates/8.0.2/uc/final/distribution/catalog.xml.gz"
to "https://updates.netbeans.org/netbeans/updates/8.0.2/uc/final/distribution/catalog.xml.gz"
No idea why that makes it work on my end. I'm running Linux Mint 19.1.

OrientDB 2.2.2 - Is there a way to suppress OAbstractProfiler$MemoryChecker Messages?

We are running on 32bit windows and since upgrading from 1.4.1 to 2.2.2, we are seeing the following memory in stdout (numbers not exact):
INFO: Database 'BLAH' uses 770MB/912MB of DISKCACHE memory, while Heap is not completely used (usedHeap=123MB maxHeap=512MB). To improve performance set maxHeap to 124MB and DISKCACHE to 1296MB
With 32bit, we can only set a max of Xmx + storage.diskCache.bufferSize ~= 1.4gb without getting OOM or performance issues. Any combination of different sizes of either of these two configurable variables results in a variant of the above message.
Is there a way to suppress the above profiler/memory checker messages?
You can disable the profiler with:
java ... -Dprofiler.enabled=false ...
Set that configuration in your server.sh or in the last section of config/orientdb-server-config.xml file.

akka custom fork-join-executor dispatcher behaves differently on OSX and RHEL

When I deploy a Play framework application, using the Akka framework to a production machine it behaves differently then on my development workstation.
This is a system that receives a batch of device IP addresses, it performs some processing on each device and aggregates the results after all devices in the batch have been processed. This processing isn't very CPU intensive.
I basically have 2 types of actors, A BatchActor, and a DeviceActor. For the devices, I've created a created an actor backed by a RoundRobinPool router, and a custom dispatcher. I'm attempting to process ~500 device at a time (in parallel).
This issue is that when I run this code on my OSX machine, it runs as I would except.
For instance if I submit a batch of 200 device IP addresses, the application running on my workstations all the devices in parallel.
However when I copy this application to the production machine, Red Hat Enterprise Linux (RHEL), and run it submitting the same list of devices, it only processes 1 to 2 devices at a time.
What do I need to do to fix this issue?
The relevant code is as follows:
object Application extends Controller {
...
val numberOfWorkers = 500
val workers = Akka.system.actorOf(Props[DeviceActor]
.withRouter(RoundRobinPool(nrOfInstances = numberOfWorkers))
.withDispatcher("my-dispatcher")
)
def batchActor(config:BatchConfig)
= Akka.system.actorOf(BatchActor.props(workers, config), s"batch-${config.batchId}")
...
def batch = Action(parse.json) { request =>
request.body.validate[BatchConfig] match {
case config:BatchConfig => {
...
val batch = batchActor(config)
batch ! BatchActorProtocol.Start
Ok(Json.toJson(status))
}
...
}
}
The application.conf configuration section looks like the following:
my-dispatcher {
# Dispatcher is the name of the event-based dispatcher
type = Dispatcher
# What kind of ExecutionService to use
executor = "fork-join-executor"
# Configuration for the fork join pool
fork-join-executor {
# Min number of threads to cap factor-based parallelism number to
parallelism-min = 1000
# Parallelism (threads) ... ceil(available processors * factor)
parallelism-factor = 100.0
# Max number of threads to cap factor-based parallelism number to
parallelism-max = 5000
}
# Throughput defines the maximum number of messages to be
# processed per actor before the thread jumps to the next actor.
# Set to 1 for as fair as possible.
throughput = 500
}
Inside the BatchActor I'm simply parsing the list of devices and feeding it to the
class BatchActor(val workers:ActorRef, val config:BatchConfig) extends Actor
...
def receive = {
case Start => start
...
}
private def start = {
...
devices.map { devices =>
results(devices.host) = None
workers ! DeviceWork(self, config, devices, steps)
}
...
}
after which the WorkerActor submits a result object back to the BatchActer.
My workstation: OS X - v10.9.3
java -version
java version "1.7.0_67"
Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
production machine: Red Hat Enterprise Linux Server release 6.5 (Santiago)
java -version
java version "1.7.0_65"
OpenJDK Runtime Environment (rhel-2.5.1.2.el6_5-x86_64 u65-b17)
OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)
Software:
Scala: v2.11.2
SBT: v0.13.6
Play: v2.3.5
Akka: v2.3.4
I'm using typesafe activator/sbt to start the application. The command is as follows:
cd <project dir>
./activator run -Dhttp.port=6600
Any help appreciated. I've been stuck on this issue for a couple of days now.
I believe you have too much parallelism in your code i.e., you are creating too many threads in your dispatcher. How many cores do you have on your Redhat box ? I've never seen such high value used. A lot of threads in FJ pool may be resulting in a large number of context switches. Try just using the default dispatcher and see if that fixes your issue or not. You can also change the values of min and max parallelism to 2 or 3 times number of cores you have.
fork-join-executor {
# Min number of threads to cap factor-based parallelism number to
parallelism-min = 1000
# Parallelism (threads) ... ceil(available processors * factor)
parallelism-factor = 100.0
# Max number of threads to cap factor-based parallelism number to
parallelism-max = 5000
}
Another thing to try is to create an uber jar using (sbt-assembly) and then deploy that instead of using activator to deploy it.
Finally, you can look inside your JVMs using something like VisualJVM or Yourkit.
After hours spent trying different things including:
doing research on different threading implementations on linux - pthreads vs NPTL
reading through all the VM documentation on threading
ulimits
trying various changes in the Play and Akka framework configurations
and finally a complete re-write of the thread management using scala futures, etc..
Nothing seemed to work. Then I did a detailed comparison and the only thing that was different was that I used the Oracle Hotspot implementation on my laptop, and the OpenJDK implementation on the production machine.
So I installed the Oracle VM on the production machine and that seemed to fix the issue. Even though I couldn't determine what the ultimate solution was, it seems that the default installation of OpenJDK on RHEL is complied or configured differently enough to not allow spawning of ~ 500 threads at a time.
I'm sure I'm missing something, but after ~ 3 days of searching I couldn't find it.