Spark - JVM Insufficient memory error while using Spark SQL - scala

I am trying to run a Spark job to process some JSON data using Spark SQL. When I submit the job, I see the following error in the logs:
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00007f29b96d5000, 12288, 0) failed; error='Cannot allocate memory' (errno=12)
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 12288 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /tmp/hs_err_pid5716.log
I am using the following code in the application,
val url = "foo://fooLink"
val rawData = sqlContext.read.option("multiline", true).json(url)
val pwp = new PrintWriter(new File("/tmp/file"))
rawData.collect.foreach(pwp.println)
pwp.close()
Command used to submit the job:
spark-submit --spark-conf spark.driver.userClassPathFirst=true --region us-east-1 --classname someClass somePackage-1.0-super.jar
It works for smaller data, but for some reason the job does not create "/tmp/file" on the cluster and throws the above error in the driver logs. Is there a way I can work around this? Any ideas would be greatly appreciated. Thanks :)

You will have to tweak some JVM flags: -XX:MaxDirectMemorySize and -Xmx.
Edit your spark-defaults.conf and modify the spark.executor.extraJavaOptions option to set the flags.
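For illustration, a minimal sketch of what those spark-defaults.conf entries might look like (the sizes are placeholders to tune for your cluster, not recommendations; note that Spark rejects -Xmx inside extraJavaOptions, so heap size is normally set through spark.executor.memory and spark.driver.memory instead):
spark.executor.extraJavaOptions  -XX:MaxDirectMemorySize=512m
spark.driver.extraJavaOptions    -XX:MaxDirectMemorySize=512m
spark.executor.memory            4g
spark.driver.memory              4g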

Related

Unable to start the daemon process. This problem might be caused by incorrect configuration of the daemon

I have tried almost every solution I found on the Internet, but my problem is still not solved.
I opened gradle.properties and added the code below, but it doesn't help:
org.gradle.jvmargs=-Xmx1024m -XX:MaxPermSize=512m
I also deleted the .gradle directory and then tried building again, but the problem still exists. Please, I would like some help.
FAILURE: Build failed with an exception.
* What went wrong:
Unable to start the daemon process.
This problem might be caused by incorrect configuration of the daemon.
For example, an unrecognized jvm option is used.
Please refer to the User Manual chapter on the daemon at https://docs.gradle.org/6.7/userguide/gradle_daemon.html
Process command line: C:\Program Files\Java\jdk1.8.0_291\bin\java.exe -Xmx1024M -Dfile.encoding=windows-1252 -Duser.country=US -Duser.language=en -Duser.variant -cp C:\Users\likec\.gradle\wrapper\dists\gradle-6.7-all\cuy9mc7upwgwgeb72wkcrupxe\gradle-6.7\lib\gradle-launcher-6.7.jar org.gradle.launcher.daemon.bootstrap.GradleDaemon 6.7
Please read the following process output to find out more:
-----------------------
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 1048576 bytes for AllocateHeap
# An error report file with more information is saved as:
# C:\Users\likec\.gradle\daemon\6.7\hs_err_pid14492.log

ExecutionSetupException: One or more nodes lost connectivity during query

While running a query on Dremio 4.6.1 installed on Kubernetes, we are getting the following error message from the Dremio UI:
ExecutionSetupException: One or more nodes lost connectivity during query. Identified nodes were [dremio-executor-2.dremio-cluster-pod.dremio.svc.cluster.local:0].
The dremio-env config has the following settings:
DREMIO_MAX_DIRECT_MEMORY_SIZE_MB=13384
DREMIO_MAX_HEAP_MEMORY_SIZE_MB is not set
We are using workers with 16 GB / 8 cores (10 workers in total)
1 master coordinator with the same config
ZooKeeper with 1 GB / 1 core
Any idea what's causing this behavior?
Tailing the live logs just before a worker crashes, here is what we see:
An irrecoverable stack overflow has occurred.
Please check if any of your loaded .so files has enabled executable stack (see man page execstack(8))
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f41cdac4fa8, pid=1, tid=0x00007f41dc2ed700
#
# JRE version: OpenJDK Runtime Environment (8.0_262-b10) (build 1.8.0_262-b10)
# Java VM: OpenJDK 64-Bit Server VM (25.262-b10 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C 0x00007f41cdac4fa8
#
# Core dump written. Default location: /opt/dremio/core or core.1
#
# An error report file with more information is saved as:
# /tmp/hs_err_pid1.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
[error occurred during error reporting , id 0xb]

Scala Spark : (org.apache.spark.repl.ExecutorClassLoader) Failed to check existence of class org on REPL class server at path

Running a basic df.show() after installing spark-notebook
I am getting the following error when running Scala Spark code on spark-notebook. Any idea when this occurs and how to avoid it?
[org.apache.spark.repl.ExecutorClassLoader] Failed to check existence of class org.apache.spark.sql.catalyst.expressions.Object on REPL class server at spark://192.168.10.194:50935/classes
[org.apache.spark.util.Utils] Aborting task
[org.apache.spark.repl.ExecutorClassLoader] Failed to check existence of class org on REPL class server at spark://192.168.10.194:50935/classes
[org.apache.spark.util.Utils] Aborting task
[org.apache.spark.repl.ExecutorClassLoader] Failed to check existence of class
I installed Spark locally, and when I was using the following code it was giving me the same error.
spark.read.format("json").load("Downloads/test.json")
I think the issue was that it was trying to find a master node and picking up some random or default IP. I specified the master and provided the IP as 127.0.0.1, and that resolved my issue.
Solution
Run Spark using a local master:
/usr/local/bin/spark-shell --master "local[4]" --conf spark.driver.host=127.0.0.1
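If you build the session from application code rather than spark-shell, the same settings can be applied when constructing the SparkSession; a minimal sketch, assuming the same local JSON file as above (the app name is just a placeholder):
import org.apache.spark.sql.SparkSession

// Use a local master and pin the driver host so the REPL class server
// does not bind to a random or unreachable IP.
val spark = SparkSession.builder()
  .appName("local-json-test")                 // placeholder name
  .master("local[4]")
  .config("spark.driver.host", "127.0.0.1")
  .getOrCreate()

val df = spark.read.format("json").load("Downloads/test.json")
df.show()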

Mappers fail for pig to insert data into MongoDB

I am trying to import a file from HDFS into MongoDB using MongoInsertStorage with Pig. The files are large, around 5 GB. The script runs fine when I run it in local mode with
pig -x local example.pig
However, if I run it in MapReduce mode, most of the mappers fail with the following error:
Error: com.mongodb.ConnectionString.getReadConcern()Lcom/mongodb/ReadConcern;
Container killed by the ApplicationMaster.
Container killed on request.
Exit code is 143 Container exited with a non-zero exit code 143
Can someone help me solve this issue? I also increased the memory allocated to the YARN containers, but that hasn't helped.
Some mappers are also timing out after 300 seconds.
The Pig script is as follows:
REGISTER mongo-java-driver-3.2.2.jar
REGISTER mongo-hadoop-core-1.4.0.jar
REGISTER mongo-hadoop-pig-1.4.0.jar
REGISTER mongodb-driver-3.2.2.jar
DEFINE MongoInsertStorage com.mongodb.hadoop.pig.MongoInsertStorage();
SET mapreduce.reduce.speculative true
BIG_DATA = LOAD 'hdfs://example.com:8020/user/someuser/sample.csv' using PigStorage(',') As (a:chararray,b:chararray,c:chararray);
STORE BIG_DATA INTO 'mongodb://insert.some.ip.here:27017/test.samplecollection' USING MongoInsertStorage('', '');
I found a solution.
For the error:
Error: com.mongodb.ConnectionString.getReadConcern()Lcom/mongodb/ReadConcern;
Container killed by the ApplicationMaster.
Container killed on request.
Exit code is 143 Container exited with a non-zero exit code 143
I changed the JAR versions: mongo-hadoop-core and mongo-hadoop-pig from 1.4.0 to 2.0.2, and the MongoDB Java driver from 3.2.2 to 3.4.2. This eliminated the ReadConcern error on the mappers!
For the timeout, I added this after registering the jars:
SET mapreduce.task.timeout 1800000
I had been using SET mapred.task.timeout, which didn't work.
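Put together, the updated header of the script might look like this; the JAR file names are assumptions based on the versions mentioned above, so adjust them to the artifacts you actually have:
REGISTER mongo-java-driver-3.4.2.jar
REGISTER mongo-hadoop-core-2.0.2.jar
REGISTER mongo-hadoop-pig-2.0.2.jar
DEFINE MongoInsertStorage com.mongodb.hadoop.pig.MongoInsertStorage();
SET mapreduce.task.timeout 1800000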
Hope this helps anyone who has a similar issue!

DDS 9th topic causes a crash

I am using DDS (more specifically RTI DDS) for a Java application. I am creating each topic for my DDS implementation one by one in code, so that I can test each one with a DDS spy after the code is written. When I wrote the 8th topic everything worked fine. However, when I then wrote the 9th topic, nothing seemed to happen and the program appeared to stop somewhere. I debugged and, after a lot of stepping through code, got this printed to the console.
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x01349a58, pid=16109, tid=2429123440
#
# JRE version: Java(TM) SE Runtime Environment (7.0_65-b17) (build 1.7.0_65-b17)
# Java VM: Java HotSpot(TM) Server VM (24.65-b04 mixed mode linux-x86 )
# Problematic frame:
# V [libjvm.so+0x48aa58] java_lang_String::utf8_length(oopDesc*)+0x58
#
# Core dump written. Default location: /home/foo/core or core.16109
#
# An error report file with more information is saved as:
#
# /home/foo/corehs_err_pid16109.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.sun.com/bugreport/crash.jsp
#
[D0000|ENABLE]COMMENDSrReaderService_new:!create worker-specific object
[D0000|ENABLE]PRESPsService_enable:!create srr (strict reliable reader)
[D0000|ENABLE]DDS_DomainParticipantService_enable:!enable publish/subscribe service
[D0000|ENABLE]DDS_DomainParticipant_enableI:!enable service
I am not sure why this happens all of a sudden when I create my 9th topic, yet if I only have 8 it works great. I have also tried to increase my resource limits and get an immutable QoS policy error. Does anyone know why this error is occurring, in terms of why my 9th topic causes a failure, and how to fix the problem? I am running my application on 32-bit RHEL 6.6.
I found that this is because the max objects per thread is, by default, set to 8 by the QoS. To change this setting, do the following before your first topic is created:
// Read the current factory QoS, raise the per-thread object limit,
// and write it back before any participant or topic is created.
DomainParticipantFactoryQos factoryQos = new DomainParticipantFactoryQos();
DomainParticipantFactory.TheParticipantFactory.get_qos(factoryQos);
factoryQos.resource_limits.max_objects_per_thread = 2048;
DomainParticipantFactory.TheParticipantFactory.set_qos(factoryQos);
This sets the limit before DDS starts, while the QoS is still editable and not yet immutable.