Memory leak (?) using IO::Socket::Async (on FreeBSD 13.1) - sockets

While processing a stream of logs (via UDP) in a Raku (v2022.07) app, I'm
hitting what appears to be a memory leak using IO::Socket::Async.
I pulled the code out into a simpler program, which I've included below
(nearly identical to the code at https://docs.raku.org/type/IO::Socket::Async):
#!/usr/bin/env raku
#
my $socket = IO::Socket::Async.bind-udp('localhost', 24225);
react {
    whenever $socket.Supply -> $v {
        print $v if $v.chars > 0;
    };
};
It leaks a substantial amount of RAM. I let it run for about 12 hours, and
when I checked (still running, on a machine with 1 TB of RAM) with
ps auwwx [pid]
it showed 314974456 and 20739784 for VSZ and RSS (so roughly 300 GB virtual size and 20 GB resident).
[By the way, the UDP traffic is fairly light: an average of 350 (~100 byte) packets/sec, with spikes to ~1000/sec.]
So I rewrote the above in Perl 5 (after similarly leaky results with
a couple of Raku variants). That version stabilizes quickly at about 8 MB resident, which is fine,
but I'd prefer this process to feed a Raku channel (without a separate Perl process, file
tailing, etc.).
My environment: FreeBSD 13.1-RELEASE-p2 GENERIC amd64 and Raku
v2022.07 built on MoarVM 2022.07 (installed with rakubrew).
I'm guessing this is unique to Raku on FreeBSD, but I'm not sure.
I did attempt to upgrade (via rakubrew) to v2022.12 to see whether the problem is resolved there,
but when rebuilding modules (with zef) too many failed (some issue with
Digest/Digest::HMAC), so I had to revert to 2022.07.
I'd be grateful for any suggestions for addressing the leak, or for alternative
ways of reading from a UDP port.

Not exactly a solution to your problem, but you can monitor memory usage from within your Raku code using the built-in Telemetry module:
use Telemetry;
say T{"max-rss"};
Also remember that Supply decodes the incoming data as Unicode text by default. If your protocol is binary, you can pass :bin to the Supply call to avoid treating binary data as text.


How to count cache-misses in mmap-ed memory (using eBPF)?

I would like to get timeseries
t0, misses
...
tN, misses
where tN is a timestamp (second resolution) and misses is the number of times the kernel performed disk I/O for my PID to load a missing page of the mmap()-ed memory region when the process accessed that memory. OK, maybe the connection between disk I/O and memory access is hard to track, so let's assume my program cannot do any disk I/O for any reason other than accessing missing mmapped memory. I think I need to track something called node-load-misses in the perf world.
Any ideas how eBPF can be used to collect such data? What probes should I use?
I tried to use perf record for a similar purpose, but I dislike how much data perf records. As I recall, the attempt looked like this (I also don't remember how I parsed the output.data file):
perf record -p $PID -a -F 10 -e node-loads -e node-load-misses -o output.data
I thought eBPF could provide a way to implement this with less overhead.
Loading mmapped pages that are not present in memory is not a hardware event like perf's cache-misses, node-loads, or node-load-misses. When your program accesses a memory address that is not present, the hardware raises a page fault exception, which is handled in software by the Linux kernel. On first access to anonymous memory, a physical page is allocated and mapped for that virtual address; on access to an mmapped file, disk I/O is initiated (unless the page is already in the page cache). There are two kinds of page faults in Linux, minor and major, and a fault that requires disk I/O is a major page fault.
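As a rough illustration of the minor/major distinction, the sketch below reads the fault counters of the current process via getrusage() before and after touching every page of a memory-mapped file. It is written in Python for brevity and only works on Unix-like systems; the helper names (fault_counts, touch_mapped_file, demo) and the temporary file are made up for this example. Whether any major faults show up depends on whether the file's pages are already in the page cache, so a freshly written file will most likely produce only minor faults.

```python
import mmap
import os
import resource
import tempfile


def fault_counts():
    """Return (minor, major) page fault counters for the current process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_minflt, usage.ru_majflt


def touch_mapped_file(path, page_size=mmap.PAGESIZE):
    """Read one byte from every page of a memory-mapped file.

    Returns the (minor, major) fault deltas observed while doing so.
    """
    before = fault_counts()
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        checksum = 0
        for offset in range(0, len(m), page_size):
            checksum += m[offset]  # touching the page may trigger a fault
    after = fault_counts()
    return after[0] - before[0], after[1] - before[1]


def demo(pages=16):
    """Hypothetical demo: map a throwaway file and report the fault deltas."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * (mmap.PAGESIZE * pages))
        path = tmp.name
    try:
        return touch_mapped_file(path)
    finally:
        os.unlink(path)
```

Calling demo() returns a pair of non-negative integers; to see major faults you would need a file whose pages have been evicted from the page cache.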
You should try trace-cmd, ftrace, or perf trace. Support for page fault tracing in the perf tool was planned in 2012, and patches were proposed in https://lwn.net/Articles/602658/
There is a tracepoint for page faults from userspace code, and this command prints events with the memory address of each page fault:
echo 2^123456%2 | perf trace -e 'exceptions:page_fault_user' bc
With a recent perf tool (https://mirrors.edge.kernel.org/pub/linux/kernel/tools/perf/) there is perf trace record, which can record both mmap syscalls and page_fault_user events into perf.data; perf script will then print all events, and they can be counted by an awk or Python script.
Some useful links on perf and tracing: http://www.brendangregg.com/perf.html http://www.brendangregg.com/ebpf.html https://github.com/iovisor/bpftrace/blob/master/INSTALL.md
And some bcc tools may be used to trace disk I/O, like https://github.com/iovisor/bcc/blob/master/examples/tracing/disksnoop.py or https://github.com/brendangregg/perf-tools/blob/master/examples/iosnoop_example.txt
And for a simple time-series stat you can use the perf stat -I 1000 command with the appropriate software events:
perf stat -e cpu-clock,page-faults,minor-faults,major-faults -I 1000 ./program
...
#           time             counts unit events
     1.000112251             413.59 msec cpu-clock        #    0.414 CPUs utilized
     1.000112251              5,361      page-faults      #    0.013 M/sec
     1.000112251              5,301      minor-faults     #    0.013 M/sec
     1.000112251                 60      major-faults     #    0.145 K/sec
     2.000490561              16.32 msec cpu-clock        #    0.016 CPUs utilized
     2.000490561                  1      page-faults      #    0.005 K/sec
     2.000490561                  1      minor-faults     #    0.005 K/sec
     2.000490561                  0      major-faults     #    0.000 K/sec
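To turn interval output like the above into the "tN, misses" series asked for in the question, a small parser is enough. The following is a minimal sketch in Python, assuming the line layout shown above (timestamp, count, event name, with thousands separators in the counts); parse_event_series is a hypothetical helper name, and other perf versions or locales may format the lines differently.

```python
def parse_event_series(lines, event="major-faults"):
    """Extract (second, count) pairs for one event from `perf stat -I` output.

    Assumes each data line looks like: <time> <count> <event> # <comment>.
    Lines with a unit column (e.g. "msec cpu-clock"), header lines and
    anything that does not parse are skipped.
    """
    series = []
    for line in lines:
        if line.lstrip().startswith("#"):
            continue  # header line
        fields = line.split()
        if len(fields) < 3 or fields[2] != event:
            continue
        try:
            timestamp = float(fields[0])
            count = int(fields[1].replace(",", ""))
        except ValueError:
            continue  # e.g. "<not counted>"
        series.append((int(timestamp), count))
    return series
```

The function accepts any iterable of lines, so it can be fed a saved log file or a pipe. Note that perf stat normally prints its counters to stderr rather than stdout.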

Unable to connect to the NetBeans Distribution because of Zero sized file

I recently reinstalled NetBeans IDE on my Windows 10 PC in order to restore some unrelated configurations. When I tried checking for new plugins in order to download the Sakila sample database,
I got this error.
I've tested the connection with both No Proxy and Use Proxy Settings, and both connection tests seem to end successfully.
I have allowed NetBeans through my firewall, but this has changed nothing either.
I haven't touched my proxy configuration, so it's on the default (autodetect). Switching autodetect off doesn't change anything either, no matter what proxy config I have in NetBeans.
Here's part of my log file that might be helpful:
Compiler: HotSpot 64-Bit Tiered Compilers
Heap memory usage: initial 32,0MB maximum 910,5MB
Non heap memory usage: initial 2,4MB maximum -1b
Garbage collector: PS Scavenge (Collections=12 Total time spent=0s)
Garbage collector: PS MarkSweep (Collections=3 Total time spent=0s)
Classes: loaded=6377 total loaded=6377 unloaded 0
INFO [org.netbeans.core.ui.warmup.DiagnosticTask]: Total memory 17.130.041.344
INFO [org.netbeans.modules.autoupdate.updateprovider.DownloadListener]: Connection content length was 0 bytes (read 0bytes), expected file size can`t be that size - likely server with file at http://updates.netbeans.org/netbeans/updates/8.0.2/uc/final/distribution/catalog.xml.gz?unique=NB_CND_EXTIDE_GFMOD_GROOVY_JAVA_JC_MOB_PHP_WEBCOMMON_WEBEE0d55337f9-fc66-4755-adec-e290169de9d5_bf88d09e-bf9f-458e-b1c9-1ea89147b12b is temporary down
INFO [org.netbeans.modules.autoupdate.ui.Utilities]: Zero sized file reported at http://updates.netbeans.org/netbeans/updates/8.0.2/uc/final/distribution/catalog.xml.gz?unique=NB_CND_EXTIDE_GFMOD_GROOVY_JAVA_JC_MOB_PHP_WEBCOMMON_WEBEE0d55337f9-fc66-4755-adec-e290169de9d5_bf88d09e-bf9f-458e-b1c9-1ea89147b12b
java.io.IOException: Zero sized file reported at http://updates.netbeans.org/netbeans/updates/8.0.2/uc/final/distribution/catalog.xml.gz?unique=NB_CND_EXTIDE_GFMOD_GROOVY_JAVA_JC_MOB_PHP_WEBCOMMON_WEBEE0d55337f9-fc66-4755-adec-e290169de9d5_bf88d09e-bf9f-458e-b1c9-1ea89147b12b
at org.netbeans.modules.autoupdate.updateprovider.DownloadListener.doCopy(DownloadListener.java:155)
at org.netbeans.modules.autoupdate.updateprovider.DownloadListener.streamOpened(DownloadListener.java:78)
at org.netbeans.modules.autoupdate.updateprovider.NetworkAccess$Task$1.run(NetworkAccess.java:111)
Caused: java.io.IOException: Zero sized file reported at http://updates.netbeans.org/netbeans/updates/8.0.2/uc/final/distribution/catalog.xml.gz?unique=NB_CND_EXTIDE_GFMOD_GROOVY_JAVA_JC_MOB_PHP_WEBCOMMON_WEBEE0d55337f9-fc66-4755-adec-e290169de9d5_bf88d09e-bf9f-458e-b1c9-1ea89147b12b
at org.netbeans.modules.autoupdate.updateprovider.DownloadListener.notifyException(DownloadListener.java:103)
at org.netbeans.modules.autoupdate.updateprovider.AutoupdateCatalogCache.copy(AutoupdateCatalogCache.java:246)
at org.netbeans.modules.autoupdate.updateprovider.AutoupdateCatalogCache.writeCatalogToCache(AutoupdateCatalogCache.java:99)
at org.netbeans.modules.autoupdate.updateprovider.AutoupdateCatalogProvider.refresh(AutoupdateCatalogProvider.java:154)
at org.netbeans.modules.autoupdate.services.UpdateUnitProviderImpl.refresh(UpdateUnitProviderImpl.java:180)
at org.netbeans.api.autoupdate.UpdateUnitProvider.refresh(UpdateUnitProvider.java:196)
[catch] at org.netbeans.modules.autoupdate.ui.Utilities.tryRefreshProviders(Utilities.java:433)
at org.netbeans.modules.autoupdate.ui.Utilities.doRefreshProviders(Utilities.java:411)
at org.netbeans.modules.autoupdate.ui.Utilities.presentRefreshProviders(Utilities.java:405)
at org.netbeans.modules.autoupdate.ui.UnitTab$14.run(UnitTab.java:806)
at org.openide.util.RequestProcessor$Task.run(RequestProcessor.java:1423)
at org.openide.util.RequestProcessor$Processor.run(RequestProcessor.java:2033)
It might be that the update server is down right now; I haven't been able to test this either. But there also might be something wrong with my configuration. I'm going crazy!
Something that worked for me was changing the "http:" to "https:" in the update URLs.
I.e., change "http://updates.netbeans.org/netbeans/updates/8.0.2/uc/final/distribution/catalog.xml.gz"
to "https://updates.netbeans.org/netbeans/updates/8.0.2/uc/final/distribution/catalog.xml.gz"
No idea why that makes it work on my end. I'm running Linux Mint 19.1.

short driver from Linux Device Drivers book

I am trying to run short from the Linux Device Drivers book, a driver which by default uses the parallel interface of a PC at I/O base address 0x378. I am specifically using the /dev/short0 device.
Quoting from the book
/dev/short0 writes to and reads from the 8-bit port
located at the I/O address base (0x378 unless changed at load time).
The write operation (in its default behavior) essentially does this:
while (count--) {
    outb(*(ptr++), port);
    wmb();
}
The ptr variable holds a pointer to the data the user has requested to be written to the device. Only the last byte of course survives, as preceding bytes get overwritten. The read operation works similarly, by using inb instead of outb.
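To make the "only the last byte survives" point concrete, here is a toy simulation in Python. It does no real port I/O and is not the driver's code: LatchedPort and short_write are hypothetical names that model an idealized output latch and the write loop above.

```python
class LatchedPort:
    """Toy model of an 8-bit output port that remembers the last byte written."""

    def __init__(self):
        self.value = 0  # arbitrary initial state for the simulation

    def outb(self, byte):
        self.value = byte & 0xFF  # each write overwrites the previous one

    def inb(self):
        return self.value


def short_write(port, data):
    """Mimic the loop above: push every byte of data to the same port."""
    for byte in data:
        port.outb(byte)
    return len(data)
```

With this model, writing b"abc" and then reading gives 0x63 ('c'), and writing b"a" gives 0x61, which is the value the experiment below is expected to return on an ideal latch.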
Quoting also from the book
If you choose to read from an output
port, you most likely get back the last value written to the port (this applies to the parallel interface and to most other digital I/O circuits in common use)
So when I do
$ echo -n "a" > /dev/short0
$ dd if=/dev/short0 bs=1 count=1 | od -t x1
as suggested in the book, I expect to get back the ASCII code for 'a' in hex, but what I get is 0xff:
1+0 records in
1+0 records out
1 byte (1 B) copied, 0,000155485 s, 6,4 kB/s
0000000 ff
0000001
I have verified, by adding some printks and using dmesg, that the relevant driver code actually gets executed, and beyond that I'm stuck. What are some possible reasons for this not working? Or where should I look next to find out why it is not working?
For what it's worth, the I/O address range 0x378-0x37a is initially claimed by the parport module, so I rmmod it, along with a few other modules that use parport, before I load the short module. Finally, on my system uname -a gives
Linux Crete 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:31:42 UTC 2014 i686 i686 i686 GNU/Linux

OrientDB 2.2.2 - Is there a way to suppress OAbstractProfiler$MemoryChecker Messages?

We are running on 32-bit Windows, and since upgrading from 1.4.1 to 2.2.2 we are seeing the following memory message in stdout (numbers not exact):
INFO: Database 'BLAH' uses 770MB/912MB of DISKCACHE memory, while Heap is not completely used (usedHeap=123MB maxHeap=512MB). To improve performance set maxHeap to 124MB and DISKCACHE to 1296MB
On 32-bit, we can only set Xmx + storage.diskCache.bufferSize to a maximum of about 1.4 GB without getting OOM errors or performance issues. Any combination of sizes for these two settings results in a variant of the above message.
Is there a way to suppress the above profiler/memory checker messages?
You can disable the profiler with:
java ... -Dprofiler.enabled=false ...
Set that configuration in your server.sh or in the last section of the config/orientdb-server-config.xml file.

JVM crashes frequently

The JVM crashes unexpectedly and frequently in our production environment, which takes JBoss (EAP 6.3) down. We have Java 7u72 installed.
The crash logs all show the same output, where the current thread is:
Current thread (0x00000000d1d99000): JavaThread "Lucene Merge Thread #0" daemon [_thread_in_Java, id=1144, stack(0x00000000f6a00000,0x00000000f6b00000)]
and the rest of the log is full of:
JavaThread "elasticsearch[Node BD852E44][search][T#68]" daemon [_thread_blocked, id=14396, stack(0x00000000f7b30000,0x00000000f7c30000)]
As far as I understand, Elasticsearch is related to indexing and uses Lucene under the hood, but we have a number of applications deployed. How can we investigate this? Can someone please help? The complete crash logs are at: http://pastebin.com/845LU9iK
It looks like the JVM didn't manage to record stack traces for the affected thread.
If that's the same for all crashes, then it doesn't seem to match known Lucene or JBoss bugs.
# guarantee(result == EXCEPTION_CONTINUE_EXECUTION) failed: Unexpected result from topLevelExceptionFilter
AIUI this indicates an error in native exception handling, so it's one error masking another, probably making this crash log fairly useless.
So I can only provide really generic advice:
You're using an older JVM version. Update to the latest Java 7, Java 8, or possibly even a Java 9 dev build and see if the crashes go away. Even if those versions still crash, they might provide different or more useful error reports.
To diagnose potential compiler bugs, you can try running with the following flags:
-XX:-TieredCompilation [1] should disable the C1 compiler
-XX:+TieredCompilation -XX:TieredStopAtLevel=1 should disable the C2 compiler
-Xint disables all JIT, very slow
Ask on the hotspot-dev mailing list for further guidance.
[1] Tiered compilation is a new Java 7 feature. It basically combines the interpreter and the C1 and C2 JIT compilers (which formerly were used separately in the client and server VMs) into different optimizing stages.
Each of them can have optimization bugs. Turning off individual stages helps isolate them as a potential cause.
Edit: The new crash report is more useful since it at least has Java frames. The interesting part is the following:
J 1559 sun.misc.Unsafe.getByte(J)B (0 bytes) # 0x000000000178e99b [0x000000000178e960+0x3b]
j java.nio.DirectByteBuffer.get()B+11
j org.apache.lucene.store.ByteBufferIndexInput.readByte()B+4
J 9447 C2 org.apache.lucene.store.DataInput.readVInt()I (114 bytes) # 0x000000000348cc00 [0x000000000348cbc0+0x40]
DataInput.readVInt seems to be an ongoing source of grief, see this SO answer for possible solutions