JBoss hangs for a long time after starting PersistenceUnitDeployment - jboss

I see the following output on the JBoss console when I start it. The reason seems to be garbage collection. This was not happening before, and I haven't changed any configuration files or any source code. Any ideas how to resolve this?
As you can see, I'm waiting nearly 16 minutes (12:37 - 12:21) after it starts the persistence unit.
12:21:53,438 INFO [PersistenceUnitDeployment] Starting persistence unit persistence.units:ear=ikarus.ear,unitName=ikarus
36.473: [GC 74716K->29865K(241856K), 0.0153986 secs]
37.818: [GC 75689K->35101K(240896K), 0.0124849 secs]
40.876: [GC 80925K->37018K(242304K), 0.0124359 secs]
41.176: [GC 84186K->38778K(241792K), 0.0096731 secs]
41.481: [GC 85946K->40591K(241152K), 0.0166358 secs]
41.621: [GC 86863K->43877K(241600K), 0.0127246 secs]
93.771: [GC 90149K->46121K(241856K), 0.0080522 secs]
324.787: [GC 92777K->46313K(241728K), 0.0025572 secs]
534.417: [GC 92804K->46457K(241920K), 0.0012326 secs]
788.777: [GC 93241K->46677K(241792K), 0.0017520 secs]
907.338: [GC 72305K->46805K(242688K), 0.0030763 secs]
907.342: [Full GC 46805K->46781K(242688K), 0.1523979 secs]
12:37:02,786 INFO [JmxKernelAbstraction] creating wrapper delegate for: org.jboss.ejb3.entity.PersistenceUnitDeployment

Take some thread dumps in between to see what the VM is doing and where it is hanging.
The garbage collection is not necessarily the reason for the slowdown; it may just be a symptom of another operation that is taking too long and triggering the GC. Check the thread dumps to analyze the activity of the VM during that interval.
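For example, a dump can be taken externally with jstack <pid> (or kill -3 <pid> on Unix) against the JBoss process. As a minimal in-process sketch (the object name is illustrative and not part of JBoss; shown here in Scala, the equivalent Java calls are identical), the same information can be printed programmatically:
import scala.collection.JavaConverters._

object ThreadDumper {
  // Print a thread dump of the current JVM to stdout: thread name, state,
  // and stack frames, similar to what jstack shows.
  def dump(): Unit =
    for ((thread, frames) <- Thread.getAllStackTraces.asScala) {
      println("Thread \"" + thread.getName + "\" state=" + thread.getState)
      frames.foreach(frame => println("    at " + frame))
    }
}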

Related

HikariCP with SpringBoot

I have configured the maximum connection pool size of my Spring Boot application to 1 by using the following property:
spring.datasource.hikari.maximum-pool-size=1
Is there any way to verify and confirm this change? I want to check that it is working for my application.
You can see the pool values in the log if you enable debug logging for HikariCP.
HikariCP's housekeeper thread logs pool information at a fixed interval.
Just set the com.zaxxer.hikari logging level to debug.
In logback.xml you can do it like this:
<logger name="com.zaxxer.hikari" level="debug"/>
Or you can do it in application.properties
logging.level.com.zaxxer.hikari=debug
In your console or log file you will find something similar to this:
DEBUG [HikariPool-1 housekeeper] com.zaxxer.hikari.pool.HikariPool: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
The total value should not exceed the maximum-pool-size value.
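As a complementary check, the configured and live pool values can also be read directly from HikariCP's API (HikariDataSource and its HikariPoolMXBean). A minimal sketch, written here in Scala against that API, with an illustrative helper name and assuming the injected DataSource really is backed by Hikari:
import javax.sql.DataSource
import com.zaxxer.hikari.HikariDataSource

// Pass in the DataSource that Spring Boot injected into your bean.
def printPoolInfo(ds: DataSource): Unit = ds match {
  case hikari: HikariDataSource =>
    // Configured limit (getMaximumPoolSize is inherited from HikariConfig).
    println("maximumPoolSize = " + hikari.getMaximumPoolSize)
    // Live statistics exposed by the pool's MXBean.
    val pool = hikari.getHikariPoolMXBean
    println("total=" + pool.getTotalConnections +
      ", active=" + pool.getActiveConnections +
      ", idle=" + pool.getIdleConnections +
      ", waiting=" + pool.getThreadsAwaitingConnection)
  case other =>
    println("not a HikariDataSource: " + other.getClass.getName)
}
With maximum-pool-size=1, maximumPoolSize should print 1 and the total should never exceed 1.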

Blocked Thread on Ignite Cluster

I am running Ignite inside a Kubernetes environment, but after a while, when the load increases, I get the following blocked thread in the thread dump:
qtp102185114-154-acceptor-2#6275fad4-ServerConnector#6a3aa688{HTTP/1.1}{0.0.0.0:8080} - priority:5 - threadId:0x00007f78a6e1c000 - nativeId:0x84 - state:BLOCKED
stackTrace:
java.lang.Thread.State: BLOCKED (on object monitor)
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:234)
- waiting to lock <0x00000000a049f430> (a java.lang.Object)
at org.eclipse.jetty.server.ServerConnector.accept(ServerConnector.java:377)
at org.eclipse.jetty.server.AbstractConnector$Acceptor.run(AbstractConnector.java:500)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
at java.lang.Thread.run(Thread.java:748)
Locked ownable synchronizers:
- None
Can anyone explain what's going wrong here? The blocked thread appears on every node of the cluster.
It looks like Jetty (which Ignite uses internally for its REST API) doesn't have enough threads in the pool to accept all incoming connections when the load increases. I would suggest checking CPU utilization first; if it is not at 100%, you can increase the thread count in the Jetty XML configuration.
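A minimal sketch of pointing Ignite's internal Jetty at a custom Jetty XML file (the path is an assumption, not something Ignite ships with; inside that file you would raise the maxThreads of the org.eclipse.jetty.util.thread.QueuedThreadPool):
import org.apache.ignite.Ignition
import org.apache.ignite.configuration.{ConnectorConfiguration, IgniteConfiguration}

// Point the REST connector at a custom Jetty XML where the thread pool
// size can be raised; "config/rest-jetty.xml" is an assumed path.
val connectorCfg = new ConnectorConfiguration()
connectorCfg.setJettyPath("config/rest-jetty.xml")

val igniteCfg = new IgniteConfiguration()
igniteCfg.setConnectorConfiguration(connectorCfg)

Ignition.start(igniteCfg)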

Spark fails on big shuffle jobs with java.io.IOException: Filesystem closed

I often find Spark fails on large jobs with a rather unhelpful, meaningless exception. The worker logs look normal, with no errors, but they end up in the "KILLED" state. This is extremely common for large shuffles, i.e. operations like .distinct.
The question is, how do I diagnose what's going wrong, and ideally, how do I fix it?
Given that a lot of these operations are monoidal, I've been working around the problem by splitting the data into, say, 10 chunks, running the app on each chunk, and then running the app on all of the resulting outputs. In other words: meta-map-reduce.
14/06/04 12:56:09 ERROR client.AppClient$ClientActor: Master removed our application: FAILED; stopping client
14/06/04 12:56:09 WARN cluster.SparkDeploySchedulerBackend: Disconnected from Spark cluster! Waiting for reconnection...
14/06/04 12:56:09 WARN scheduler.TaskSetManager: Loss was due to java.io.IOException
java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:703)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:779)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:840)
at java.io.DataInputStream.read(DataInputStream.java:149)
at org.apache.hadoop.io.compress.DecompressorStream.getCompressedData(DecompressorStream.java:159)
at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:143)
at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
at java.io.InputStream.read(InputStream.java:101)
at org.apache.hadoop.util.LineReader.fillBuffer(LineReader.java:180)
at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:209)
at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:47)
at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:164)
at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:149)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:27)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:176)
at scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:45)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toList(TraversableOnce.scala:257)
at scala.collection.AbstractIterator.toList(Iterator.scala:1157)
at $line5.$read$$iwC$$iwC$$iwC$$iwC$$anonfun$2.apply(<console>:13)
at $line5.$read$$iwC$$iwC$$iwC$$iwC$$anonfun$2.apply(<console>:13)
at org.apache.spark.rdd.RDD$$anonfun$1.apply(RDD.scala:450)
at org.apache.spark.rdd.RDD$$anonfun$1.apply(RDD.scala:450)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:34)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:34)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
at org.apache.spark.scheduler.Task.run(Task.scala:53)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:213)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:42)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:41)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:41)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
As of September 1st, 2014, this is an "open improvement" in Spark; see https://issues.apache.org/jira/browse/SPARK-3052. As syrza pointed out in that link, the shutdown hooks are likely run in the wrong order when an executor fails, which results in this message. You will have to do a little more investigation to figure out the main cause of the problem (i.e. why your executor failed). If it is a large shuffle, it might be an out-of-memory error which causes the executor to fail, which in turn causes the Hadoop FileSystem to be closed in its shutdown hook. The RecordReaders in the running tasks of that executor then throw the "java.io.IOException: Filesystem closed" exception. I guess it will be fixed in a subsequent release, and then you will get a more helpful error message. :)
Something calls DFSClient.close() or DFSClient.abort(), closing the client. The next file operation then results in the above exception.
I would try to figure out what calls close()/abort(). You could use a breakpoint in your debugger, or modify the Hadoop source code to throw an exception in these methods, so you would get a stack trace.
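A different workaround that is sometimes applied to this exact error (not taken from the answers here, so treat it as an assumption to verify for your setup): disable Hadoop's HDFS FileSystem cache so that one task closing its client cannot invalidate the instance other tasks share. A minimal sketch:
import org.apache.spark.{SparkConf, SparkContext}

// spark.hadoop.* properties are copied into the Hadoop Configuration;
// fs.hdfs.impl.disable.cache gives each consumer its own HDFS client
// instead of the shared, cached one.
val conf = new SparkConf()
  .setAppName("big-shuffle-job") // illustrative name
  .set("spark.hadoop.fs.hdfs.impl.disable.cache", "true")

val sc = new SparkContext(conf)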
The "Filesystem closed" exception can be solved if the Spark job is running on a cluster. You can set properties like spark.executor.cores, spark.driver.cores, and spark.akka.threads to the maximum values with respect to your resource availability. I had the same problem when my dataset was pretty large, about 20 million records of JSON data. I fixed it with the above properties and it ran like a charm. In my case, I set those properties to 25, 25, and 20 respectively. Hope it helps!
Reference Link:
http://spark.apache.org/docs/latest/configuration.html
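A minimal sketch of setting those properties programmatically via SparkConf (the application name is illustrative, and the values are the ones quoted above; tune them to your own resources, or pass them with spark-submit --conf instead):
import org.apache.spark.{SparkConf, SparkContext}

// Illustrative values taken from the answer above; adjust to your cluster.
val conf = new SparkConf()
  .setAppName("big-shuffle-job")
  .set("spark.executor.cores", "25")
  .set("spark.driver.cores", "25")
  .set("spark.akka.threads", "20")

val sc = new SparkContext(conf)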

Accessing Shark tables (Hive) from Scala (shark-shell)

I have shark-0.8.0, which runs on hive-0.9.0. I am able to work with Hive by invoking shark. I created a few tables and loaded them with data.
Now, I am trying to access the data from these tables using Scala. I invoked the Scala shell using shark-shell. But when I try to select, I get an error that the table is not present.
scala> val artists = sc.sql2rdd("select artist from default.lastfm")
Hive history file=/tmp/hduser2/hive_job_log_hduser2_201405091617_1513149542.txt
151.738: [GC 317312K->83626K(1005568K), 0.0975990 secs]
151.836: [Full GC 83626K->76005K(1005568K), 0.4523880 secs]
152.313: [GC 80536K->76140K(1005568K), 0.0030990 secs]
152.316: [Full GC 76140K->62214K(1005568K), 0.1716240 secs]
FAILED: Error in semantic analysis: Line 1:19 Table not found 'lastfm'
shark.api.QueryExecutionException: FAILED: Error in semantic analysis: Line 1:19 Table not found 'lastfm'
at shark.SharkDriver.tableRdd(SharkDriver.scala:149)
at shark.SharkContext.sql2rdd(SharkContext.scala:100)
at <init>(<console>:17)
at <init>(<console>:22)
at <init>(<console>:24)
at <init>(<console>:26)
at <init>(<console>:28)
at <init>(<console>:30)
at <init>(<console>:32)
at .<init>(<console>:36)
at .<clinit>(<console>)
at .<init>(<console>:11)
at .<clinit>(<console>)
at $export(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:629)
at org.apache.spark.repl.SparkIMain$Request$$anonfun$10.apply(SparkIMain.scala:890)
at scala.tools.nsc.interpreter.Line$$anonfun$1.apply$mcV$sp(Line.scala:43)
at scala.tools.nsc.io.package$$anon$2.run(package.scala:25)
at java.lang.Thread.run(Thread.java:744)
According to the documentation (https://github.com/amplab/shark/wiki/Shark-User-Guide), these steps should be enough to get Shark up and running and to select data using Scala. Or am I missing something? Is there some configuration file that needs to be modified to enable access to Shark from shark-shell?
Have you updated your shark-hive directory configuration to properly reflect the hive metastore jdbc connection info?
You will need to copy hive-default.xml to hive-site.xml, then ensure the metastore properties are set.
Here is the basic info in hive-site.xml:
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://myhost/metastore</value>
  <description>the URL of the MySQL database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>mypassword</value>
</property>
You can get more details here: configuring hive metastore

I'm thrilled that this scala snippet uses all of my processors to find the (correct) answer faster but... why does it do that?

So I was messing around with some easy problems to get better at Scala, and I wrote the following program to calculate primes using the sieve of Eratosthenes. When I bump up the number of primes to find, I notice that my CPU maxes out during the calculation. I have no clue why it's using more than one core, and I was afraid it would muck up the answer, but it appears to be correct across multiple runs, so it must not. I'm not using .par anywhere, and almost all of my logic is in for-comprehensions.
Edit: I'm using scala 2.9.1
object Main {
  val MAX_PRIME = 10000000
  def main(args: Array[String]) {
    println("Generating array")
    val primeChecks = scala.collection.mutable.ArrayBuffer.fill(MAX_PRIME + 1)(true)
    primeChecks(0) = false
    println("Finding primes")
    for (
      i ← 2 to MAX_PRIME if primeChecks(i);
      j ← i * 2 to MAX_PRIME by i
    ) primeChecks(j) = false
    println("Filtering primes")
    val primes = for { (status, num) ← primeChecks.zipWithIndex if status } yield num
    println("Found %d prime numbers!".format(primes.length))
    println("Saving the primes")
    val formatter = new java.util.Formatter("primes.txt", "UTF-8")
    try {
      for (prime ← primes)
        formatter.format("%d%n", prime.asInstanceOf[Object])
    }
    finally {
      try { formatter.close } catch { case _ ⇒ }
    }
  }
}
Edit 2: You can reproduce the multi-threading behavior with the following snippet in a REPL, so it has to be because of the for-comprehension (at least in Scala 2.9.1).
val max = 10000000
val t = scala.collection.mutable.ArrayBuffer.fill(max + 1)(true)
for (
  i <- 2 to max if t(i);
  j <- i * 2 to max by i
) t(j) = false
It's not your code that's using multiple threads, it's the JVM. What you are seeing is the GC kicking in. If I increase MAX_PRIME to 1000000000 and give it 6Gb of Java heap to play with, I can see a steady state of 100% of one CPU and about 4Gb of memory. Every so often the GC kicks in, and it then uses 2 CPUs. The following Java stack trace (pruned for clarity) shows what's running inside the JVM:
"Attach Listener" daemon prio=3 tid=0x0000000000d13800 nid=0xf waiting on condition [0x0000000000000000]
"Low Memory Detector" daemon prio=3 tid=0x0000000000a15000 nid=0xd runnable [0x0000000000000000]
"C2 CompilerThread1" daemon prio=3 tid=0x0000000000a11800 nid=0xc waiting on condition [0x0000000000000000]
"C2 CompilerThread0" daemon prio=3 tid=0x0000000000a0e800 nid=0xb waiting on condition [0x0000000000000000]
"Signal Dispatcher" daemon prio=3 tid=0x0000000000a0d000 nid=0xa runnable [0x0000000000000000]
"Finalizer" daemon prio=3 tid=0x00000000009e7000 nid=0x9 in Object.wait() [0xffffdd7fff6dd000]
"Reference Handler" daemon prio=3 tid=0x00000000009e5800 nid=0x8 in Object.wait() [0xffffdd7fff7de000]
"main" prio=3 tid=0x0000000000428800 nid=0x2 runnable [0xffffdd7fffa3d000]
java.lang.Thread.State: RUNNABLE
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:76)
"VM Thread" prio=3 tid=0x00000000009df800 nid=0x7 runnable
"GC task thread#0 (ParallelGC)" prio=3 tid=0x0000000000438800 nid=0x3 runnable
"GC task thread#1 (ParallelGC)" prio=3 tid=0x000000000043c000 nid=0x4 runnable
"GC task thread#2 (ParallelGC)" prio=3 tid=0x000000000043d800 nid=0x5 runnable
"GC task thread#3 (ParallelGC)" prio=3 tid=0x000000000043f800 nid=0x6 runnable
"VM Periodic Task Thread" prio=3 tid=0x0000000000a2f800 nid=0xe waiting on condition
There's only one thread (main) running Scala code; all the others are internal JVM ones. Note in particular that there are 4 GC threads in this case. That's because I'm running this on a 4-way machine, and by default the JVM will allocate one GC thread per core; the exact setup will depend on the particular mix of platform, JVM, and command-line flags used.
If you want to understand the details (It's complicated!), the following links should get you started:
Java SE 6 Performance White Paper
Memory Management in the JavaHotSpot™ Virtual Machine
Update: further testing with the provided jar leads to multi-core usage on OS X, Java 1.6.0_26, HotSpot Server VM, Scala 2.9.1.
If you're on a *nix-based system, this will report 90% and really only be using one core. It will report 230% for 100% of two cores and 30% of another, or any variation thereof.
For this code on my machine, the CPU usage bounces between 99% and 130%, with the 130% occurring when the garbage collector is running in the background.