Spark on Yarn: Max number of executor failures reached - scala

When I am running spark job on cluster mode I am facing following issue:
6/05/25 12:42:55 INFO Client: Application report for application_1464166348026_0025 (state: RUNNING)
16/05/25 12:42:56 INFO Client: Application report for application_1464166348026_0025 (state: FINISHED)
16/05/25 12:42:56 INFO Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 10.255.8.181
ApplicationMaster RPC port: 0
queue: root.pimuser
start time: 1464172925289
final status: FAILED
tracking URL: http://test-hadoop-001.localdomain:8088/proxy/application_1464166348026_0025/history/application_1464166348026_0025/2
user: pimuser
Exception in thread "main" org.apache.spark.SparkException: Application application_1464166348026_0025 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:927)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:973)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:672)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/05/25 12:42:56 INFO ShutdownHookManager: Shutdown hook called
Following Command I am using to run the job.
spark-submit --driver-java-options -XX:MaxPermSize=2048m --driver-memory 4g --deploy-mode cluster --master yarn --files cluster.xls --class com.app.test.Matching target/test-0.0.1-SNAPSHOT-jar-with-dependencies.jar
Even I tried --master yarn-cluster also but I got same error.
I am using cloudera 5.5 ,Hadoop 2.6.0-cdh5.5.1 and Spark 1.5 versions.

Related

Spark application erroring out because driver outofmemory even though lot free memory still available

Can someone please help me to figure out my simple spark application is requiring huge driver memory? Even though I allocated about 112GB, my application fails at about 67GB.
Thanks in advance
The spark driver is using huge memory for running simple application
Allocated about 112G of memory for running my application when do spark-submit
At start of the job, I see below message in the logs
1019 [main] INFO org.apache.spark.storage.memory.MemoryStore - MemoryStore started with capacity 67.0 GiB
My application fails with this error message
java.lang.IllegalStateException: dag-scheduler-event-loop has already been stopped accidentally.
at org.apache.spark.util.EventLoop.post(EventLoop.scala:107)
at org.apache.spark.scheduler.DAGScheduler.taskStarted(DAGScheduler.scala:283)
at org.apache.spark.scheduler.TaskSetManager.prepareLaunchingTask(TaskSetManager.scala:539)
at org.apache.spark.scheduler.TaskSetManager.$anonfun$resourceOffer$2(TaskSetManager.scala:478)
at scala.Option.map(Option.scala:230)
at org.apache.spark.scheduler.TaskSetManager.resourceOffer(TaskSetManager.scala:455)
at org.apache.spark.scheduler.TaskSchedulerImpl.$anonfun$resourceOfferSingleTaskSet$2(TaskSchedulerImpl.scala:395)
at org.apache.spark.scheduler.TaskSchedulerImpl.$anonfun$resourceOfferSingleTaskSet$2$adapted(TaskSchedulerImpl.scala:390)
at scala.Option.foreach(Option.scala:407)
at org.apache.spark.scheduler.TaskSchedulerImpl.$anonfun$resourceOfferSingleTaskSet$1(TaskSchedulerImpl.scala:390)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158)
at org.apache.spark.scheduler.TaskSchedulerImpl.resourceOfferSingleTaskSet(TaskSchedulerImpl.scala:381)
at org.apache.spark.scheduler.TaskSchedulerImpl.$anonfun$resourceOffers$20(TaskSchedulerImpl.scala:587)
at org.apache.spark.scheduler.TaskSchedulerImpl.$anonfun$resourceOffers$20$adapted(TaskSchedulerImpl.scala:582)
at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
at org.apache.spark.scheduler.TaskSchedulerImpl.$anonfun$resourceOffers$16(TaskSchedulerImpl.scala:582)
at org.apache.spark.scheduler.TaskSchedulerImpl.$anonfun$resourceOffers$16$adapted(TaskSchedulerImpl.scala:555)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at org.apache.spark.scheduler.TaskSchedulerImpl.resourceOffers(TaskSchedulerImpl.scala:555)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint.$anonfun$makeOffers$5(CoarseGrainedSchedulerBackend.scala:359)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.org$apache$spark$scheduler$cluster$CoarseGrainedSchedulerBackend$$withLock(CoarseGrainedSchedulerBackend.scala:955)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint.org$apache$spark$scheduler$cluster$CoarseGrainedSchedulerBackend$DriverEndpoint$$makeOffers(CoarseGrainedSchedulerBackend.scala:351)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint$$anonfun$receive$1.applyOrElse(CoarseGrainedSchedulerBackend.scala:162)
at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:115)
at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:213)
at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
at org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.scala:75)
at org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:41)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
59376410 [dispatcher-CoarseGrainedScheduler] INFO org.apache.spark.scheduler.TaskSchedulerImpl - Cancelling stage 1
59376410 [dispatcher-CoarseGrainedScheduler] INFO org.apache.spark.scheduler.TaskSchedulerImpl - Killing all running tasks in stage 1: Stage cancelled
59376415 [dispatcher-CoarseGrainedScheduler] ERROR org.apache.spark.scheduler.DAGSchedulerEventProcessLoop - DAGSchedulerEventProcessLoop failed; shutting down SparkContext
scala code snippet
val df = spark.read.parquet(data_path)
df.rdd.foreachPartition(p => {
// code to process the code...
})
Job Submit
'''
spark-submit --master "spark://x.x.x.x:7077" --driver-cores=4 --driver-memory=112G --conf spark.driver.maxResultSize=0 --conf spark.rpc.message.maxSize=2047 --conf spark.driver.host=x.x.x.x --class myclass.processor --packages "..,org.apache.hadoop:hadoop-azure:3.3.1" --deploy-mode client
'''

spark-submit failed due to Exception in thread "main" java.lang.NoSuchMethodException

I got the error information below when i tried to submit my spark job for testing purpose.
jianrui#spark:~$ sudo $SPARK_HOME/bin/spark-submit --class com.test.spark.FirstScalaExample --master spark://spark.sparkstreaming.i10.internal.cloudapp.net:7077 /opt/spark/FirstScalaExample-0.0.1.jar
Exception in thread "main" java.lang.NoSuchMethodException: com.test.spark.FirstScalaExample.main([Ljava.lang.String;)
at java.lang.Class.getMethod(Class.java:1786)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:42)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2018-04-06 13:13:00 INFO ShutdownHookManager:54 - Shutdown hook called
2018-04-06 13:13:00 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-7f47cab1-f8b3-4731-bd67-e0d0ad013617
[Scala version - 2.11.6]
[Hadoop version - 2.7.5]
[Spark version - 2.3.0]
Note:
To be informed that i specified the hadoop native lib in this way "export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native" in the "spark-env.sh", and i only extract the spark but not installed.
I don't know what exactly the problem is.

Spark in cluster mode throws error if a SparkContext is not started

I have a Spark job that initializes the spark context only if it is really necessary:
val conf = new SparkConf()
val jobs: List[Job] = ??? //get some jobs
if(jobs.nonEmpty) {
val sc = new SparkContext(conf)
sc.parallelize(jobs).foreach(....)
} else {
//do nothing
}
It worked fine on Yarn if deploy-mode is 'client'
spark-submit --master yarn --deploy-mode client
Then I switched deploy mode to 'cluster' and it started to crash in case of jobs.isEmpty
spark-submit --master yarn --deploy-mode cluster
Below is the error text:
INFO yarn.Client: Application report for
application_1509613523426_0017 (state: ACCEPTED)
17/11/02 11:37:17
INFO yarn.Client: Application report for
application_1509613523426_0017 (state: FAILED) 17/11/02 11:37:17
INFO yarn.Client: client token: N/A diagnostics: Application
application_1509613523426_0017 failed 2 times due to AM Container for
appattempt_1509613523426_0017_000002 exited with exitCode: -1000 For
more detailed output, check application tracking
page:http://xxxxxx.com:8088/cluster/app/application_1509613523426_0017Then,
click on links to logs of each attempt. Diagnostics: File does not
exist:
hdfs://xxxxxxx/.sparkStaging/application_1509613523426_0017/__spark_libs__997458388067724499.zip
java.io.FileNotFoundException: File does not exist:
hdfs://xxxxxxx/.sparkStaging/application_1509613523426_0017/__spark_libs__997458388067724499.zip
at
org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)
at
org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
at
org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
at java.security.AccessController.doPrivileged(Native Method) at
javax.security.auth.Subject.doAs(Subject.java:422) at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
at java.util.concurrent.FutureTask.run(FutureTask.java:266) at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266) at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Failing this attempt. Failing the application. ApplicationMaster
host: N/A ApplicationMaster RPC port: -1 queue: dev start time:
1509622629354 final status: FAILED tracking URL:
http://xxxxxx.com:8088/cluster/app/application_1509613523426_0017 user: xxx Exception in thread "main"
org.apache.spark.SparkException: Application
application_1509613523426_0017 finished with failed status at
org.apache.spark.deploy.yarn.Client.run(Client.scala:1104) at
org.apache.spark.deploy.yarn.Client$.main(Client.scala:1150) at
org.apache.spark.deploy.yarn.Client.main(Client.scala) at
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498) at
org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
at
org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/11/02 11:37:17 INFO util.ShutdownHookManager: Shutdown hook called
17/11/02 11:37:17 INFO util.ShutdownHookManager: Deleting directory
/tmp/spark-a5b20def-0218-4b0c-b9f8-fdf8a1802e95
Is it a bug in Yarn support or I'm missing something?
SparkContext is the one who is responsible for communication with cluster manager. If application is submitted to the cluster, but context is never created, YARN cannot determine the state of the application - this is why you get an error.

Spark ClassNotFoundException when run on yarn-cluster

my code:
import org.apache.spark.{SparkConf, SparkContext}
object Run extends App {
val conf = new SparkConf().setMaster("yarn-cluster").setAppName("t666")
sc.addJar("hdfs://10.1.11.99:8020/user/spark/share/scalaj-http_2.10-2.3.0.jar")
val sc = new SparkContext(conf)
val b = scalaj.http.Base64.encodeString("刘")
val a = Array[String](b)
sc.parallelize(a).saveAsTextFile("hdfs://10.1.11.99:8020/testdata/t2/")
}
and my submit commend is:
spark-submit --master yarn-cluster --class start.Run run.jar
the log on yarn show:
16/11/04 13:50:01 INFO cluster.YarnClusterScheduler: YarnClusterScheduler.postStartHook done
16/11/04 13:50:01 INFO spark.SparkContext: Added JAR hdfs://10.1.11.99:8020/user/spark/share/scalaj-http_2.10-2.3.0.jar at hdfs://10.1.11.99:8020/user/spark/share/scalaj-http_2.10-2.3.0.jar with timestamp 1478238601256
16/11/04 13:50:01 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(spark://YarnAM#192.168.3.49:53976)
16/11/04 13:50:01 ERROR yarn.ApplicationMaster: User class threw exception: java.lang.NoClassDefFoundError: scalaj/http/Base64
java.lang.NoClassDefFoundError: scalaj/http/Base64
at start.Run$delayedInit$body.apply(Run.scala:31)
at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:71)
at scala.App$$anonfun$main$1.apply(App.scala:71)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
at scala.App$class.main(App.scala:71)
at start.Run$.main(Run.scala:9)
at start.Run.main(Run.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
Caused by: java.lang.ClassNotFoundException: scalaj.http.Base64
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 15 more
16/11/04 13:50:01 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.lang.NoClassDefFoundError: scalaj/http/Base64)
16/11/04 13:50:01 INFO client.RMProxy: Connecting to ResourceManager at slave3/192.168.3.48:8030
16/11/04 13:50:01 INFO yarn.YarnRMClient: Registering the ApplicationMaster
16/11/04 13:50:01 INFO yarn.ApplicationMaster: Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals
16/11/04 13:50:01 INFO spark.SparkContext: Invoking stop() from shutdown hook
the 2nd line show:
INFO spark.SparkContext: Added JAR hdfs://10.1.11.99:8020/user/spark/share/scalaj-http_2.10-2.3.0.jar at hdfs://10.1.11.99:8020/user/spark/share/scalaj-http_2.10-2.3.0.jar with timestamp 1478238601256
it seems already add the jar file into my classpath,but this exception i can't explain.
anyone's answer will be help me a lot!
I believe SparkContext.addJar only adds the JAR to the classpath of the workers, and not the driver. Try adding the JAR using the --jars option in the spark-submit command:
spark-submit --master yarn \
--deploy-mode cluster \
--jars hdfs://10.1.11.99:8020/user/spark/share/scalaj-http_2.10-2.3.0.jar \
--class start.Run run.jar

Spark Job failing with exit status 15

I am trying to run simple word count job in spark but I am getting exception while running job.
For more detailed output, check application tracking page:http://quickstart.cloudera:8088/proxy/application_1446699275562_0006/Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1446699275562_0006_02_000001
Exit code: 15
Stack trace: ExitCodeException exitCode=15:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 15
Failing this attempt. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: root.cloudera
start time: 1446910483956
final status: FAILED
tracking URL: http://quickstart.cloudera:8088/cluster/app/application_1446699275562_0006
user: cloudera
Exception in thread "main" org.apache.spark.SparkException: Application finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:626)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:651)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
I checked log from following command
yarn logs -applicationId application_1446699275562_0006
Here is log
15/11/07 07:35:09 ERROR yarn.ApplicationMaster: User class threw exception: Output directory hdfs://quickstart.cloudera:8020/user/cloudera/WordCountOutput already exists
org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://quickstart.cloudera:8020/user/cloudera/WordCountOutput already exists
at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:132)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:1053)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:954)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:863)
at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1290)
at org.com.td.sparkdemo.spark.WordCount$.main(WordCount.scala:23)
at org.com.td.sparkdemo.spark.WordCount.main(WordCount.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:480)
15/11/07 07:35:09 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: Output directory hdfs://quickstart.cloudera:8020/user/cloudera/WordCountOutput already exists)
15/11/07 07:35:14 ERROR yarn.ApplicationMaster: SparkContext did not initialize after waiting for 100000 ms. Please check earlier log output for errors. Failing the application.
Exception clearly indicates that WordCountOutput directory already exists but I made sure that directory is not there before running job.
Why I am getting this error even though directory was not there before running my job?
I came across same issue and fixed it by adding below highlighted part.
SparkConf sparkConf = new SparkConf().setAppName("Sentiment Scoring").set("spark.hadoop.validateOutputSpecs", "true");
Thanks,
Sathish.
For us this was caused by "running beyond physical memory limits". After increasing executer memory, the issue was fixed.
This Error mostly occurs while you submitting the wrong spark parameter in the spark-submit command. Please check the configuration parameters. In my case, I failed to submit masternamenode address where I need to read resources from HDFS.
Before : Where yarn threw Error- 15
dfs://masternamenode/TESTCGNATDATA/
After: Able to run the application
hdfs://masternamenode/TESTCGNATDATA/