JVM crashes frequently - jboss

JVM crashes surprizingly and frequently on our prod environment and results in Jboss (EAP6.3) going down. We have java7 U72 installed
Crash logs has same output where current thread is:
Current thread (0x00000000d1d99000): JavaThread "Lucene Merge Thread #0" daemon [_thread_in_Java, id=1144, stack(0x00000000f6a00000,0x00000000f6b00000)]
and all the log is full of :
JavaThread "elasticsearch[Node BD852E44][search][T#68]" daemon [_thread_blocked, id=14396, stack(0x00000000f7b30000,0x00000000f7c30000)]
elasticsearch is some were related to indexing and it uses Lucene in hood as far as I understand but we have number or application deployed how to check on this can someone please help. complete crash logs are at : http://pastebin.com/845LU9iK

Looks like it didn't manage to record stack traces for the affected thread.
If that's the same for all crashes then it doesn't seem to match known lucene or jboss bugs.
# guarantee(result == EXCEPTION_CONTINUE_EXECUTION) failed: Unexpected result from topLevelExceptionFilter
AIUI this indicates an error in native exception handling, so it's one error masking another, probably making this crash log fairly useless.
So I can only provide really generic advice:
you're using an older JVM version, update to the latest java 7, java 8 or possibly even a java 9 dev build and see if it goes away. Even if they still crash they might provide different/more useful error reports
to diagnose potential compiler bugs you can try running with the following flags
-XX:-TieredCompilation 1 should disable the C1 compiler
-XX:+TieredCompilation -XX:TieredStopAtLevel=1 should disable the C2 compiler
-Xint disables all JIT, very slow
ask on the hotspot-dev mailing list for further guidance
1: Tiered compilation is a new java 7 feature, it basically combines the interpreter, C1 and C2 JIT compilers (which formerly were used separately in the client and server VMs) into different optimizing stages.
Each of them can have optimization bugs. Turning off individual stages helps isolating them as potential cause.
Edit: The new crash report is more useful since it at least has java frames, the interesting part is the following:
J 1559 sun.misc.Unsafe.getByte(J)B (0 bytes) # 0x000000000178e99b [0x000000000178e960+0x3b]
j java.nio.DirectByteBuffer.get()B+11
j org.apache.lucene.store.ByteBufferIndexInput.readByte()B+4
J 9447 C2 org.apache.lucene.store.DataInput.readVInt()I (114 bytes) # 0x000000000348cc00 [0x000000000348cbc0+0x40]
DataInput.readVInt seems to be an ongoing source of grief, see this SO answer for possible solutions

Related

Unable to connect to the NetBeans Distribution because of Zero sized file

I recently reinstalled Netbeans IDE on my Windows 10 PC in order to restore some unrelated configurations. When I tried checking for new plugins in order to be able to download the Sakila sample database,
I get this error.
I've tested the connection on both No Proxy and Use Proxy Settings, and both connection tests seem to end succesfully.
I have allowed Netbeans through my firewall, but this has changed nothing either.
I haven't touched my proxy configuration, so it's on default (autodetect). Switching the autodetect off doesn't change anything, either, no matter what proxy config i have on Netbeans.
Here's part of my log file that might be helpful:
Compiler: HotSpot 64-Bit Tiered Compilers
Heap memory usage: initial 32,0MB maximum 910,5MB
Non heap memory usage: initial 2,4MB maximum -1b
Garbage collector: PS Scavenge (Collections=12 Total time spent=0s)
Garbage collector: PS MarkSweep (Collections=3 Total time spent=0s)
Classes: loaded=6377 total loaded=6377 unloaded 0
INFO [org.netbeans.core.ui.warmup.DiagnosticTask]: Total memory 17.130.041.344
INFO [org.netbeans.modules.autoupdate.updateprovider.DownloadListener]: Connection content length was 0 bytes (read 0bytes), expected file size can`t be that size - likely server with file at http://updates.netbeans.org/netbeans/updates/8.0.2/uc/final/distribution/catalog.xml.gz?unique=NB_CND_EXTIDE_GFMOD_GROOVY_JAVA_JC_MOB_PHP_WEBCOMMON_WEBEE0d55337f9-fc66-4755-adec-e290169de9d5_bf88d09e-bf9f-458e-b1c9-1ea89147b12b is temporary down
INFO [org.netbeans.modules.autoupdate.ui.Utilities]: Zero sized file reported at http://updates.netbeans.org/netbeans/updates/8.0.2/uc/final/distribution/catalog.xml.gz?unique=NB_CND_EXTIDE_GFMOD_GROOVY_JAVA_JC_MOB_PHP_WEBCOMMON_WEBEE0d55337f9-fc66-4755-adec-e290169de9d5_bf88d09e-bf9f-458e-b1c9-1ea89147b12b
java.io.IOException: Zero sized file reported at http://updates.netbeans.org/netbeans/updates/8.0.2/uc/final/distribution/catalog.xml.gz?unique=NB_CND_EXTIDE_GFMOD_GROOVY_JAVA_JC_MOB_PHP_WEBCOMMON_WEBEE0d55337f9-fc66-4755-adec-e290169de9d5_bf88d09e-bf9f-458e-b1c9-1ea89147b12b
at org.netbeans.modules.autoupdate.updateprovider.DownloadListener.doCopy(DownloadListener.java:155)
at org.netbeans.modules.autoupdate.updateprovider.DownloadListener.streamOpened(DownloadListener.java:78)
at org.netbeans.modules.autoupdate.updateprovider.NetworkAccess$Task$1.run(NetworkAccess.java:111)
Caused: java.io.IOException: Zero sized file reported at http://updates.netbeans.org/netbeans/updates/8.0.2/uc/final/distribution/catalog.xml.gz?unique=NB_CND_EXTIDE_GFMOD_GROOVY_JAVA_JC_MOB_PHP_WEBCOMMON_WEBEE0d55337f9-fc66-4755-adec-e290169de9d5_bf88d09e-bf9f-458e-b1c9-1ea89147b12b
at org.netbeans.modules.autoupdate.updateprovider.DownloadListener.notifyException(DownloadListener.java:103)
at org.netbeans.modules.autoupdate.updateprovider.AutoupdateCatalogCache.copy(AutoupdateCatalogCache.java:246)
at org.netbeans.modules.autoupdate.updateprovider.AutoupdateCatalogCache.writeCatalogToCache(AutoupdateCatalogCache.java:99)
at org.netbeans.modules.autoupdate.updateprovider.AutoupdateCatalogProvider.refresh(AutoupdateCatalogProvider.java:154)
at org.netbeans.modules.autoupdate.services.UpdateUnitProviderImpl.refresh(UpdateUnitProviderImpl.java:180)
at org.netbeans.api.autoupdate.UpdateUnitProvider.refresh(UpdateUnitProvider.java:196)
[catch] at org.netbeans.modules.autoupdate.ui.Utilities.tryRefreshProviders(Utilities.java:433)
at org.netbeans.modules.autoupdate.ui.Utilities.doRefreshProviders(Utilities.java:411)
at org.netbeans.modules.autoupdate.ui.Utilities.presentRefreshProviders(Utilities.java:405)
at org.netbeans.modules.autoupdate.ui.UnitTab$14.run(UnitTab.java:806)
at org.openide.util.RequestProcessor$Task.run(RequestProcessor.java:1423)
at org.openide.util.RequestProcessor$Processor.run(RequestProcessor.java:2033)
It might be that the update server is down just right now; i haven't been able to test this either. But it also might be something wrong with my configurations. I'm going crazy!!1!
Something that worked for me was changing the "http:" to "https:" in the update urls.
I.E. Change "http://updates.netbeans.org/netbeans/updates/8.0.2/uc/final/distribution/catalog.xml.gz"
to "https://updates.netbeans.org/netbeans/updates/8.0.2/uc/final/distribution/catalog.xml.gz"
No idea why that makes it work on my end. I'm running Linux Mint 19.1.

OrientDB 2.2.2 - Is there a way to suppress OAbstractProfiler$MemoryChecker Messages?

We are running on 32bit windows and since upgrading from 1.4.1 to 2.2.2, we are seeing the following memory in stdout (numbers not exact):
INFO: Database 'BLAH' uses 770MB/912MB of DISKCACHE memory, while Heap is not completely used (usedHeap=123MB maxHeap=512MB). To improve performance set maxHeap to 124MB and DISKCACHE to 1296MB
With 32bit, we can only set a max of Xmx + storage.diskCache.bufferSize ~= 1.4gb without getting OOM or performance issues. Any combination of different sizes of either of these two configurable variables results in a variant of the above message.
Is there a way to suppress the above profiler/memory checker messages?
You can disable the profiler with:
java ... -Dprofiler.enabled=false ...
Set that configuration in your server.sh or in the last section of config/orientdb-server-config.xml file.

DDS 9th topic causes a crash

I am using DDS (more specifically RTI DDS) for a java application. I am creating each topic for my DDS implementation one by one in code so thus I can test each one with a DDS spy after the code is written. When I wrote the 8th topic everything worked fine. However when I then wrote the 9th topic, nothing seemed to happen as the program seemed to stop somewhere. I then debugged and after a lot of stepping into code, got this printed to council.
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x01349a58, pid=16109, tid=2429123440
#
# JRE version: Java(TM) SE Runtime Environment (7.0_65-b17) (build 1.7.0_65-b17)
# Java VM: Java HotSpot(TM) Server VM (24.65-b04 mixed mode linux-x86 )
# Problematic frame:
# V [libjvm.so+0x48aa58] java_lang_String::utf8_length(oopDesc*)+0x58
#
# Core dump written. Default location: /home/foo/core or core.16109
#
# An error report file with more information is saved as:
#
# /home/foo/corehs_err_pid16109.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.sun.com/bugreport/crash.jsp
#
[D0000|ENABLE]COMMENDSrReaderService_new:!create worker-specific object
[D0000|ENABLE]PRESPsService_enable:!create srr (strict reliable reader)
[D0000|ENABLE]DDS_DomainParticipantService_enable:!enable publish/subscribe service
[D0000|ENABLE]DDS_DomainParticipant_enableI:!enable service
I am not sure why this has happened all of a sudden when I created my 9th topic, yet if I only have 8 it works great. I have tried to increase my resourcelimits as well and get an Immutable QOS Policy error. Does anyone know why this error is occurring in terms of why my 9th topic causes a failure and how to fix the problem? I am running my application on 32 bit RHEL 6.6.
I found on this is because of the max objects per thread by default is set to 8 by the qos. To change this setting, before your first topic is created you must do the following.
DomainParticipantFactoryQos factoryQos =
new DomainParticipantFactoryQos();
DomainParticipantFactory.TheParticipantFactory.get_qos(factoryQos);
factoryQos.resource_limits.max_objects_per_thread = 2048;
DomainParticipantFactory.TheParticipantFactory.set_qos(factoryQos);
This then sets the size before the DDS starts and is thus editable and not immutable at that point.

High tomcat thread count caused by threads waiting on MultiThreadedHttpConnectionManager ConnectionPool

We've recently started seeing spikes in the thread counts on our tomcat servers (peaking at over 1000 when normally at around 100). We performed a thread dump on one of the tomcat servers whilst it's thread count was high and found that a large number of the threads were waiting on MultiThreadedHttpConnectionManager$ConnectionPool, stack trace as follows:
"TP-Processor21700" daemon prio=10 tid=0x4a0b3400 nid=0x2091 in Object.wait() [0x399f3000..0x399f4004]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x58ee5030> (a org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool)
at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.doGetConnection(MultiThreadedHttpConnectionManager.java:518)
- locked <0x58ee5030> (a org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool)
at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.getConnectionWithTimeout(MultiThreadedHttpConnectionManager.java:416)
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:153)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
...
There are 3 points in our code where httpClient.executeMethod() is called (to obtain info via an http request to another tomcat server). In each case the GetMethod object passed to it has had its socket timeout value set (i.e. via getMethod.getParams().setSoTimeout();) before hand, and the MultiThreadedConnectionManager is configured in spring to have a connectionTimeout value of 10 seconds. One thing I have noticed is that only 2 of the 3 httpClient.executeMethod() invocations are followed by a call to getMethod.releaseConnection(), so I'm wondering if this may be the cause of the problem (i.e. connections not being explicitly released). However what's strange is that
the problem has only started occurring in the last few days and the source code has not been modified for over a year, plus the fact that there has been no recent surge in requests coming through to the tomcat servers. One change that did occur a couple of days before the problem started to occur was that we upgraded the JVM used by the tomcat server from Java 5 (1.5 update 14) to Java 6 (1.6 update 25). We have tried temporarily reverting the JVM version to Java 5 to see if the problem stopped occurring but it did not. Another point to note is that in most cases the tomcat server eventually recovers and
the thread count drops back to normal - we've only had one instance where a tomcat process appears to have crashed because of the thread count increase.
We are running Tomcat 5.5 with commons-httpclient-3.1.jar running against a Java 1.6 update 25 on a Red Hat linux environment.
Please let me know if you can suggest any ideas as to what may be the cause of this issue.
Thanks.
The problem was indeed caused by the fact that only 2 of the 3 httpClient.executeMethod(getMethod) invocations were followed by a call to getMethod.releaseConnection(). Ensuring all 3 httpClient.executeMethod(getMethod) invocations were inside a try/catch block followed by a finally block containing a call to getMethod.releaseConnection() prevented the high thread counts from occurring. Although this code had been in our live system for over a year, it appears that the reason the high thread count issue only recently started occurring was because various search engine crawlers had started hitting the site with lots of URL requests that caused the code where the connection was being used but not subsequently released to execute. Problem solved.

GTK applications fail to start - xfs restart needed Options

Sorry, not really programming question, but I am not sure where else I could find some help.
After a recent update (Xorg was updated among other things), GTK apps stopped running in my kde4. I have a Debian unstable, updated around 22 April. When I try to run them I get the following error:
ga#grzes:~$ iceweasel
The program 'firefox-bin' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadName (named color or font does not exist)'.
(Details: serial 888 error_code 15 request_code 45 minor_code 0)
(Note to programmers: normally, X errors are reported asynchronously;
that is, you will receive the error a while after causing it.
To debug your program, run it with the --sync command line
option to change this behavior. You can then get a meaningful
backtrace from your debugger if you break on the gdk_x_error() function.)
ga#grzes:~$ gimp The program 'gimp' received an X Window
System error.
This probably reflects a bug in the program.
The error was 'BadName (named color or font does not exist)'.
(Details: serial 6955 error_code 15 request_code 45 minor_code 0)
(Note to programmers: normally, X errors are reported asynchronously;
that is, you will receive the error a while after causing it.
To debug your program, run it with the --sync command line
option to change this behavior. You can then get a meaningful
backtrace from your debugger if you break on the gdk_x_error() function.)
(script-fu:4643): LibGimpBase-WARNING **: script-fu: gimp_wire_read():
error
I have to restart the font server manually to have it fixed:
ga#grzes:~$ su
Password:
grzes:/home/ga# /etc/init.d/xfs restart
Stopping X font server: xfs.
Setting up X font server socket directory /tmp/.font-unix...done.
Starting X font server: xfs.
Any ideas what could be wrong? Is it a configuration issue? My system has been updated for the last 7 years, so I can have some old settings.
Debian unstable is very... unstable now, since a release was made a short time ago. Major changes and packages migrations are happening. Xorg (and all X related stuff) being one of the critical packages in that process. My advice is to perform a new update/upgrade in order to catch a new version that may resolve this problem.
It's very frequent that after an update some thing will get broken in inexplicable ways, simply because the developers are uploading new, and not much tested, version of the applications
I finally figured this out: seems like xfs is not compatible with the other components currently and luckily removing it form the system completely solves the problem.