Apache Spark DataFrame NullPointer Exception - scala

I'm doing some operations between two DataFrames:
val resultRdd = df1.join(df2, df1("column")===df2("column") &&
df1("column2").contains(df2("column")), "left_outer").rdd
resultRdd.map { t => ... }
But I get this error every time:
edited*
Job aborted due to stage failure: Task 114 in stage 40.0 failed 4 times, most recent failure: Lost task 114.3 in stage 40.0 (TID 908, 10.10.10.51, executor 1): java.lang.NullPointerException
at org.apache.spark.unsafe.types.UTF8String.contains(UTF8String.java:284)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$2.hasNext(WholeStageCodegenExec.scala:396)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
at scala.collection.AbstractIterator.to(Iterator.scala:1336)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1336)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1336)
at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(RDD.scala:935)
at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(RDD.scala:935)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Because of this error, I try to print
I'm researching and I read in others questions that it could be possible that the executors cannot access to the DataFrame instead of the driver.
NullPointerException in Spark RDD map when submitted as a spark job
I've tried both coalesce() and collect() but doesn't work for me.
I don't know how to deal with this issue, any help?
I'm using Spark 2.1.0
edited2***************************
After debugging I think I found some clues:
I have my application running in a Server and if I execute once It works, but if I execute again It fails. If I restart the server and SparkContext is created as new It works again.
The log error is:
17/01/25 16:10:40 INFO TaskSetManager: Starting task 86.0 in stage 126.0 (TID 5249, localhost, executor driver, partition 86, ANY, 6678 bytes)
17/01/25 16:10:40 INFO TaskSetManager: Finished task 43.0 in stage 126.0 (TID 5248) in 6 ms on localhost (executor driver) (197/200)
17/01/25 16:10:40 INFO Executor: Running task 86.0 in stage 126.0 (TID 5249)
17/01/25 16:10:40 INFO ShuffleBlockFetcherIterator: Getting 96 non-empty blocks out of 200 blocks
17/01/25 16:10:40 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
17/01/25 16:10:40 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 4 blocks
17/01/25 16:10:40 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
17/01/25 16:10:40 ERROR Executor: Exception in task 86.0 in stage 126.0 (TID 5249)
java.lang.NullPointerException
at org.apache.spark.unsafe.types.UTF8String.contains(UTF8String.java:284)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$2.hasNext(WholeStageCodegenExec.scala:396)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:369)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:369)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:369)
at scala.collection.Iterator$class.foreach(Iterator.scala:750)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1202)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:295)
at scala.collection.AbstractIterator.to(Iterator.scala:1202)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:287)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1202)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:274)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1202)
at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(RDD.scala:935)
at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(RDD.scala:935)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
17/01/25 16:10:40 INFO TaskSetManager: Starting task 114.0 in stage 126.0 (TID 5250, localhost, executor driver, partition 114, ANY, 6678 bytes)
17/01/25 16:10:40 INFO Executor: Running task 114.0 in stage 126.0 (TID 5250)
17/01/25 16:10:40 WARN TaskSetManager: Lost task 86.0 in stage 126.0 (TID 5249, localhost, executor driver): java.lang.NullPointerException
at org.apache.spark.unsafe.types.UTF8String.contains(UTF8String.java:284)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$2.hasNext(WholeStageCodegenExec.scala:396)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:369)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:369)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:369)
at scala.collection.Iterator$class.foreach(Iterator.scala:750)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1202)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:295)
at scala.collection.AbstractIterator.to(Iterator.scala:1202)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:287)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1202)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:274)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1202)
at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(RDD.scala:935)
at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(RDD.scala:935)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
17/01/25 16:10:40 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks out of 200 blocks
17/01/25 16:10:40 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
17/01/25 16:10:40 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 4 blocks
17/01/25 16:10:40 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
17/01/25 16:10:40 ERROR TaskSetManager: Task 86 in stage 126.0 failed 1 times; aborting job
17/01/25 16:10:40 INFO Executor: Finished task 114.0 in stage 126.0 (TID 5250). 3800 bytes result sent to driver
17/01/25 16:10:40 INFO TaskSetManager: Finished task 114.0 in stage 126.0 (TID 5250) in 15 ms on localhost (executor driver) (198/200)
17/01/25 16:10:40 INFO TaskSchedulerImpl: Removed TaskSet 126.0, whose tasks have all completed, from pool
17/01/25 16:10:40 INFO TaskSchedulerImpl: Cancelling stage 126
17/01/25 16:10:40 INFO DAGScheduler: ResultStage 126 (collect at CategorizationSystem.scala:123) failed in 1.664 s due to Job aborted due to stage failure: Task 86 in stage 126.0 failed 1 times, most recent failure: Lost task 86.0 in stage 126.0 (TID 5249, localhost, executor driver): java.lang.NullPointerException
at org.apache.spark.unsafe.types.UTF8String.contains(UTF8String.java:284)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$2.hasNext(WholeStageCodegenExec.scala:396)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:369)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:369)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:369)
at scala.collection.Iterator$class.foreach(Iterator.scala:750)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1202)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:295)
at scala.collection.AbstractIterator.to(Iterator.scala:1202)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:287)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1202)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:274)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1202)
at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(RDD.scala:935)
at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(RDD.scala:935)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
17/01/25 16:10:40 INFO DAGScheduler: Job 19 failed: collect at CategorizationSystem.scala:123, took 8.517397 s
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 116937
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 116938
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 116939
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 116940
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 116941
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 116942
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 116943
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 116944
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 116945
17/01/25 16:10:41 INFO ContextCleaner: Cleaned shuffle 16
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 117330
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 117331
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 117332
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 117333
17/01/25 16:10:41 INFO BlockManagerInfo: Removed broadcast_36_piece0 on 192.168.80.136:35004 in memory (size: 20.6 KB, free: 613.8 MB)
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 117334
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 117335
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 117336
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 117337
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 117338
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 117339
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 117340
17/01/25 16:10:41 INFO ContextCleaner: Cleaned shuffle 17
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 117341
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 117342
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 117343
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 117344
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 117345
17/01/25 16:10:41 INFO ContextCleaner: Cleaned shuffle 18
17/01/25 16:10:41 INFO BlockManagerInfo: Removed broadcast_39_piece0 on 192.168.80.136:35004 in memory (size: 23.0 KB, free: 613.8 MB)
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 102054
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 102055
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 102056
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 102057
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 102058
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 102059
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 102060
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 102061
17/01/25 16:10:41 INFO ContextCleaner: Cleaned shuffle 11
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 102062
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 102063
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 102064
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 102065
17/01/25 16:10:41 INFO ContextCleaner: Cleaned accumulator 102066
17/01/25 16:10:41 INFO ContextCleaner: Cleaned shuffle 12
17/01/25 16:10:41 INFO ContextCleaner: Cleaned shuffle 15
Any help?

I solved the problem removing a df1.cache that I had in some part of the code. I don't know why this solve the problem but it works.

Related

Spark unable to read kafka Topic and gives error " unable to connect to zookeeper server within timeout 6000"

I'm trying to Execute the Example program given in Spark Directory on HDP cluster "/spark2/examples/src/main/python/streaming/kafka_wordcount.py" which tries to read kafka topic but gives Zookeeper server timeout error.
Spark is installed on HDP Cluster and Kafka is running on HDF Cluster, both are running on different cluster and are in same VPC on AWS
Command executed to run spark example on HDP cluster is "bin/spark-submit --jars spark-streaming-kafka-0-8-assembly_2.11-2.3.0.jar examples/src/main/python/streaming/kafka_wordcount.py HDF-cluster-ip-address:2181 topic"
Error Image :
enter image description here
-------------------------------------------
Time: 2018-06-20 07:51:56
-------------------------------------------
18/06/20 07:51:56 INFO JobScheduler: Finished job streaming job 1529481116000 ms.0 from job set of time 1529481116000 ms
18/06/20 07:51:56 INFO JobScheduler: Total delay: 0.171 s for time 1529481116000 ms (execution: 0.145 s)
18/06/20 07:51:56 INFO PythonRDD: Removing RDD 94 from persistence list
18/06/20 07:51:56 INFO BlockManager: Removing RDD 94
18/06/20 07:51:56 INFO BlockRDD: Removing RDD 89 from persistence list
18/06/20 07:51:56 INFO BlockManager: Removing RDD 89
18/06/20 07:51:56 INFO KafkaInputDStream: Removing blocks of RDD BlockRDD[89] at createStream at NativeMethodAccessorImpl.java:0 of time 1529481116000 ms
18/06/20 07:51:56 INFO ReceivedBlockTracker: Deleting batches: 1529481114000 ms
18/06/20 07:51:56 INFO InputInfoTracker: remove old batch metadata: 1529481114000 ms
18/06/20 07:51:57 INFO JobScheduler: Added jobs for time 1529481117000 ms
18/06/20 07:51:57 INFO JobScheduler: Starting job streaming job 1529481117000 ms.0 from job set of time 1529481117000 ms
18/06/20 07:51:57 INFO SparkContext: Starting job: runJob at PythonRDD.scala:141
18/06/20 07:51:57 INFO DAGScheduler: Registering RDD 107 (call at /usr/hdp/2.6.5.0-292/spark2/python/lib/py4j-0.10.6-src.zip/py4j/java_gateway.py:2257)
18/06/20 07:51:57 INFO DAGScheduler: Got job 27 (runJob at PythonRDD.scala:141) with 1 output partitions
18/06/20 07:51:57 INFO DAGScheduler: Final stage: ResultStage 54 (runJob at PythonRDD.scala:141)
18/06/20 07:51:57 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 53)
18/06/20 07:51:57 INFO DAGScheduler: Missing parents: List()
18/06/20 07:51:57 INFO DAGScheduler: Submitting ResultStage 54 (PythonRDD[111] at RDD at PythonRDD.scala:48), which has no missing parents
18/06/20 07:51:57 INFO MemoryStore: Block broadcast_27 stored as values in memory (estimated size 7.0 KB, free 366.0 MB)
18/06/20 07:51:57 INFO MemoryStore: Block broadcast_27_piece0 stored as bytes in memory (estimated size 4.1 KB, free 366.0 MB)
18/06/20 07:51:57 INFO BlockManagerInfo: Added broadcast_27_piece0 in memory on ip-10-29-3-74.ec2.internal:46231 (size: 4.1 KB, free: 366.2 MB)
18/06/20 07:51:57 INFO SparkContext: Created broadcast 27 from broadcast at DAGScheduler.scala:1039
18/06/20 07:51:57 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 54 (PythonRDD[111] at RDD at PythonRDD.scala:48) (first 15 tasks are for partitions Vector(0))
18/06/20 07:51:57 INFO TaskSchedulerImpl: Adding task set 54.0 with 1 tasks
18/06/20 07:51:57 INFO TaskSetManager: Starting task 0.0 in stage 54.0 (TID 53, localhost, executor driver, partition 0, PROCESS_LOCAL, 7649 bytes)
18/06/20 07:51:57 INFO Executor: Running task 0.0 in stage 54.0 (TID 53)
18/06/20 07:51:57 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks out of 0 blocks
18/06/20 07:51:57 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
18/06/20 07:51:57 INFO PythonRunner: Times: total = 40, boot = -881, init = 921, finish = 0
18/06/20 07:51:57 INFO PythonRunner: Times: total = 41, boot = -881, init = 922, finish = 0
18/06/20 07:51:57 INFO Executor: Finished task 0.0 in stage 54.0 (TID 53). 1493 bytes result sent to driver
18/06/20 07:51:57 INFO TaskSetManager: Finished task 0.0 in stage 54.0 (TID 53) in 48 ms on localhost (executor driver) (1/1)
18/06/20 07:51:57 INFO TaskSchedulerImpl: Removed TaskSet 54.0, whose tasks have all completed, from pool
18/06/20 07:51:57 INFO DAGScheduler: ResultStage 54 (runJob at PythonRDD.scala:141) finished in 0.055 s
18/06/20 07:51:57 INFO DAGScheduler: Job 27 finished: runJob at PythonRDD.scala:141, took 0.058062 s
18/06/20 07:51:57 INFO ZooKeeper: Session: 0x0 closed
18/06/20 07:51:57 INFO SparkContext: Starting job: runJob at PythonRDD.scala:141
18/06/20 07:51:57 INFO DAGScheduler: Got job 28 (runJob at PythonRDD.scala:141) with 3 output partitions
18/06/20 07:51:57 INFO DAGScheduler: Final stage: ResultStage 56 (runJob at PythonRDD.scala:141)
18/06/20 07:51:57 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 55)
18/06/20 07:51:57 INFO DAGScheduler: Missing parents: List()
18/06/20 07:51:57 INFO DAGScheduler: Submitting ResultStage 56 (PythonRDD[112] at RDD at PythonRDD.scala:48), which has no missing parents
18/06/20 07:51:57 INFO ReceiverSupervisorImpl: Stopping receiver with message: Error starting receiver 0: org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 10000
18/06/20 07:51:57 INFO ReceiverSupervisorImpl: Called receiver onStop
18/06/20 07:51:57 INFO ReceiverSupervisorImpl: Deregistering receiver 0
18/06/20 07:51:57 INFO MemoryStore: Block broadcast_28 stored as values in memory (estimated size 7.0 KB, free 365.9 MB)
18/06/20 07:51:57 INFO MemoryStore: Block broadcast_28_piece0 stored as bytes in memory (estimated size 4.1 KB, free 365.9 MB)
18/06/20 07:51:57 INFO ClientCnxn: EventThread shut down
18/06/20 07:51:57 INFO BlockManagerInfo: Added broadcast_28_piece0 in memory on ip-10-29-3-74.ec2.internal:46231 (size: 4.1 KB, free: 366.2 MB)
18/06/20 07:51:57 INFO SparkContext: Created broadcast 28 from broadcast at DAGScheduler.scala:1039
18/06/20 07:51:57 INFO DAGScheduler: Submitting 3 missing tasks from ResultStage 56 (PythonRDD[112] at RDD at PythonRDD.scala:48) (first 15 tasks are for partitions Vector(1, 2, 3))
18/06/20 07:51:57 INFO TaskSchedulerImpl: Adding task set 56.0 with 3 tasks
18/06/20 07:51:57 INFO TaskSetManager: Starting task 0.0 in stage 56.0 (TID 54, localhost, executor driver, partition 1, PROCESS_LOCAL, 7649 bytes)
18/06/20 07:51:57 INFO TaskSetManager: Starting task 1.0 in stage 56.0 (TID 55, localhost, executor driver, partition 2, PROCESS_LOCAL, 7649 bytes)
18/06/20 07:51:57 INFO TaskSetManager: Starting task 2.0 in stage 56.0 (TID 56, localhost, executor driver, partition 3, PROCESS_LOCAL, 7649 bytes)
18/06/20 07:51:57 INFO Executor: Running task 1.0 in stage 56.0 (TID 55)
18/06/20 07:51:57 INFO Executor: Running task 2.0 in stage 56.0 (TID 56)
18/06/20 07:51:57 INFO Executor: Running task 0.0 in stage 56.0 (TID 54)
18/06/20 07:51:57 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks out of 0 blocks
18/06/20 07:51:57 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks out of 0 blocks
18/06/20 07:51:57 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
18/06/20 07:51:57 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks out of 0 blocks
18/06/20 07:51:57 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
18/06/20 07:51:57 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
18/06/20 07:51:57 ERROR ReceiverTracker: Deregistered receiver for stream 0: Error starting receiver 0 - org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 10000
at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:880)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:98)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:84)
at kafka.consumer.ZookeeperConsumerConnector.connectZk(ZookeeperConsumerConnector.scala:171)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:126)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:143)
at kafka.consumer.Consumer$.create(ConsumerConnector.scala:94)
at org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:100)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:149)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:131)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:600)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:590)
at org.apache.spark.SparkContext$$anonfun$34.apply(SparkContext.scala:2185)
at org.apache.spark.SparkContext$$anonfun$34.apply(SparkContext.scala:2185)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
18/06/20 07:51:57 INFO ReceiverSupervisorImpl: Stopped receiver 0
18/06/20 07:51:57 INFO BlockGenerator: Stopping BlockGenerator
18/06/20 07:51:57 INFO PythonRunner: Times: total = 40, boot = -947, init = 987, finish = 0
18/06/20 07:51:57 INFO PythonRunner: Times: total = 40, boot = -947, init = 987, finish = 0
18/06/20 07:51:57 INFO PythonRunner: Times: total = 41, boot = -944, init = 985, finish = 0
18/06/20 07:51:57 INFO Executor: Finished task 1.0 in stage 56.0 (TID 55). 1536 bytes result sent to driver
18/06/20 07:51:57 INFO TaskSetManager: Finished task 1.0 in stage 56.0 (TID 55) in 52 ms on localhost (executor driver) (1/3)
18/06/20 07:51:57 INFO PythonRunner: Times: total = 45, boot = -944, init = 989, finish = 0
18/06/20 07:51:57 INFO PythonRunner: Times: total = 40, boot = -32, init = 72, finish = 0
18/06/20 07:51:57 INFO Executor: Finished task 0.0 in stage 56.0 (TID 54). 1536 bytes result sent to driver
18/06/20 07:51:57 INFO TaskSetManager: Finished task 0.0 in stage 56.0 (TID 54) in 56 ms on localhost (executor driver) (2/3)
18/06/20 07:51:57 INFO PythonRunner: Times: total = 40, boot = -33, init = 73, finish = 0
18/06/20 07:51:57 INFO Executor: Finished task 2.0 in stage 56.0 (TID 56). 1536 bytes result sent to driver
18/06/20 07:51:57 INFO TaskSetManager: Finished task 2.0 in stage 56.0 (TID 56) in 58 ms on localhost (executor driver) (3/3)
18/06/20 07:51:57 INFO TaskSchedulerImpl: Removed TaskSet 56.0, whose tasks have all completed, from pool
18/06/20 07:51:57 INFO DAGScheduler: ResultStage 56 (runJob at PythonRDD.scala:141) finished in 0.063 s
18/06/20 07:51:57 INFO DAGScheduler: Job 28 finished: runJob at PythonRDD.scala:141, took 0.065728 s
-------------------------------------------
Time: 2018-06-20 07:51:57
-------------------------------------------
18/06/20 07:51:57 INFO JobScheduler: Finished job streaming job 1529481117000 ms.0 from job set of time 1529481117000 ms
18/06/20 07:51:57 INFO JobScheduler: Total delay: 0.169 s for time 1529481117000 ms (execution: 0.149 s)
18/06/20 07:51:57 INFO PythonRDD: Removing RDD 102 from persistence list
18/06/20 07:51:57 INFO BlockManager: Removing RDD 102
18/06/20 07:51:57 INFO BlockRDD: Removing RDD 97 from persistence list
18/06/20 07:51:57 INFO KafkaInputDStream: Removing blocks of RDD BlockRDD[97] at createStream at NativeMethodAccessorImpl.java:0 of time 1529481117000 ms
18/06/20 07:51:57 INFO BlockManager: Removing RDD 97
18/06/20 07:51:57 INFO ReceivedBlockTracker: Deleting batches: 1529481115000 ms
18/06/20 07:51:57 INFO InputInfoTracker: remove old batch metadata: 1529481115000 ms
18/06/20 07:51:57 INFO RecurringTimer: Stopped timer for BlockGenerator after time 1529481117400
18/06/20 07:51:57 INFO BlockGenerator: Waiting for block pushing thread to terminate
18/06/20 07:51:57 INFO BlockGenerator: Pushing out the last 0 blocks
18/06/20 07:51:57 INFO BlockGenerator: Stopped block pushing thread
18/06/20 07:51:57 INFO BlockGenerator: Stopped BlockGenerator
18/06/20 07:51:57 INFO ReceiverSupervisorImpl: Waiting for receiver to be stopped
18/06/20 07:51:57 ERROR ReceiverSupervisorImpl: Stopped receiver with error: org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 10000
18/06/20 07:51:57 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 10000
at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:880)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:98)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:84)
at kafka.consumer.ZookeeperConsumerConnector.connectZk(ZookeeperConsumerConnector.scala:171)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:126)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:143)
at kafka.consumer.Consumer$.create(ConsumerConnector.scala:94)
at org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:100)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:149)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:131)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:600)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:590)
at org.apache.spark.SparkContext$$anonfun$34.apply(SparkContext.scala:2185)
at org.apache.spark.SparkContext$$anonfun$34.apply(SparkContext.scala:2185)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
18/06/20 07:51:57 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 10000
at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:880)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:98)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:84)
at kafka.consumer.ZookeeperConsumerConnector.connectZk(ZookeeperConsumerConnector.scala:171)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:126)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:143)
at kafka.consumer.Consumer$.create(ConsumerConnector.scala:94)
at org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:100)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:149)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:131)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:600)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:590)
at org.apache.spark.SparkContext$$anonfun$34.apply(SparkContext.scala:2185)
at org.apache.spark.SparkContext$$anonfun$34.apply(SparkContext.scala:2185)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
18/06/20 07:51:57 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job
18/06/20 07:51:57 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
18/06/20 07:51:57 INFO TaskSchedulerImpl: Cancelling stage 0
18/06/20 07:51:57 INFO DAGScheduler: ResultStage 0 (start at NativeMethodAccessorImpl.java:0) failed in 13.256 s due to Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 10000
at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:880)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:98)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:84)
at kafka.consumer.ZookeeperConsumerConnector.connectZk(ZookeeperConsumerConnector.scala:171)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:126)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:143)
at kafka.consumer.Consumer$.create(ConsumerConnector.scala:94)
at org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:100)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:149)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:131)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:600)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:590)
at org.apache.spark.SparkContext$$anonfun$34.apply(SparkContext.scala:2185)
at org.apache.spark.SparkContext$$anonfun$34.apply(SparkContext.scala:2185)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Driver stacktrace:
18/06/20 07:51:57 ERROR ReceiverTracker: Receiver has been stopped. Try to restart it.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 10000
at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:880)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:98)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:84)
at kafka.consumer.ZookeeperConsumerConnector.connectZk(ZookeeperConsumerConnector.scala:171)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:126)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:143)
at kafka.consumer.Consumer$.create(ConsumerConnector.scala:94)
at org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:100)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:149)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:131)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:600)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:590)
at org.apache.spark.SparkContext$$anonfun$34.apply(SparkContext.scala:2185)
at org.apache.spark.SparkContext$$anonfun$34.apply(SparkContext.scala:2185)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1599)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1587)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1586)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1586)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:831)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1820)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1769)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1758)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
Caused by: org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 10000
at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:880)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:98)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:84)
at kafka.consumer.ZookeeperConsumerConnector.connectZk(ZookeeperConsumerConnector.scala:171)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:126)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:143)
at kafka.consumer.Consumer$.create(ConsumerConnector.scala:94)
at org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:100)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:149)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:131)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:600)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:590)
at org.apache.spark.SparkContext$$anonfun$34.apply(SparkContext.scala:2185)
at org.apache.spark.SparkContext$$anonfun$34.apply(SparkContext.scala:2185)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Even on same VPC check for security groups of the two systems. If they have different security groups you probably need to allow inbound and outbound ports. Another way of verifying it is try to telnet and ping both systems from one another.

Spark Submit not able to pick classpath from jar

i have created a spark job which will get data from one cassandra table and insert into another table, i am using gradle to build the jar file some how i am able to create a jar with all dependencies , i am using the below command to trigger spark job
spark-submit --class DataMigration OrderAnalytics.jar
All required jars are present inside OrderAnalytics.jar i.e lib/** still i am getting NoClassDefFoundError as below
17/08/19 22:56:38 ERROR Executor: Exception in task 1.0 in stage 0.0 (TID 1)
java.lang.NoClassDefFoundError: com/twitter/jsr166e/LongAdder
at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsSupport$class.$init$(OutputMetricsUpdater.scala:107)
And META-INF looks like this
Manifest-Version: 1.0
Main-Class: DataMigration
Class-Path: lib/spark-sql_2.11-2.2.0.jar lib/spark-cassandra-connector
_2.11-2.0.3.jar lib/univocity-parsers-2.2.1.jar lib/spark-sketch_2.11
-2.2.0.jar lib/spark-core_2.11-2.2.0.jar lib/spark-catalyst_2.11-2.2.
0.jar lib/spark-tags_2.11-2.2.0.jar lib/parquet-column-1.8.2.jar lib/
parquet-hadoop-1.8.2.jar lib/jackson-databind-2.6.5.jar lib/xbean-asm
5-shaded-4.4.jar lib/unused-1.0.0.jar lib/jsr166e-1.1.0.jar lib/commo
ns-beanutils-1.9.3.jar lib/joda-time-2.3.jar lib/joda-convert-1.2.jar
lib/scala-reflect-2.11.8.jar lib/avro-1.7.7.jar lib/avro-mapred-1.7.
7-hadoop2.jar lib/chill_2.11-0.8.0.jar lib/chill-java-0.8.0.jar lib/h
adoop-client-2.6.5.jar lib/spark-launcher_2.11-2.2.0.jar lib/spark-ne
twork-common_2.11-2.2.0.jar lib/spark-network-shuffle_2.11-2.2.0.jar
lib/spark-unsafe_2.11-2.2.0.jar lib/jets3t-0.9.3.jar lib/curator-reci
pes-2.6.0.jar lib/javax.servlet-api-3.1.0.jar lib/commons-lang3-3.5.j
ar lib/commons-math3-3.4.1.jar lib/jsr305-1.3.9.jar lib/jul-to-slf4j-
1.7.16.jar lib/jcl-over-slf4j-1.7.16.jar lib/log4j-1.2.17.jar lib/slf
4j-log4j12-1.7.16.jar lib/compress-lzf-1.0.3.jar lib/snappy-java-1.1.
2.6.jar lib/lz4-1.3.0.jar lib/RoaringBitmap-0.5.11.jar lib/json4s-jac
kson_2.11-3.2.11.jar lib/jersey-client-2.22.2.jar lib/jersey-common-2
.22.2.jar lib/jersey-server-2.22.2.jar lib/jersey-container-servlet-2
.22.2.jar lib/jersey-container-servlet-core-2.22.2.jar lib/netty-3.9.
9.Final.jar lib/stream-2.7.0.jar lib/metrics-core-3.1.2.jar lib/metri
cs-jvm-3.1.2.jar lib/metrics-json-3.1.2.jar lib/metrics-graphite-3.1.
2.jar lib/jackson-module-scala_2.11-2.6.5.jar lib/ivy-2.4.0.jar lib/o
ro-2.0.8.jar lib/pyrolite-4.13.jar lib/py4j-0.10.4.jar lib/commons-cr
ypto-1.0.0.jar lib/janino-3.0.0.jar lib/commons-compiler-3.0.0.jar li
b/antlr4-runtime-4.5.3.jar lib/commons-codec-1.10.jar lib/parquet-com
mon-1.8.2.jar lib/parquet-encoding-1.8.2.jar lib/parquet-format-2.3.1
.jar lib/parquet-jackson-1.8.2.jar lib/jackson-core-2.6.5.jar lib/com
mons-collections-3.2.2.jar lib/commons-compress-1.4.1.jar lib/avro-ip
c-1.7.7.jar lib/avro-ipc-1.7.7-tests.jar lib/kryo-shaded-3.0.3.jar li
b/hadoop-common-2.6.5.jar lib/hadoop-hdfs-2.6.5.jar lib/hadoop-mapred
uce-client-app-2.6.5.jar lib/hadoop-yarn-api-2.6.5.jar lib/hadoop-map
reduce-client-core-2.6.5.jar lib/hadoop-mapreduce-client-jobclient-2.
6.5.jar lib/hadoop-annotations-2.6.5.jar lib/leveldbjni-all-1.8.jar l
ib/httpcore-4.3.3.jar lib/httpclient-4.3.6.jar lib/activation-1.1.1.j
ar lib/mx4j-3.0.2.jar lib/mail-1.4.7.jar lib/bcprov-jdk15on-1.51.jar
lib/java-xmlbuilder-1.0.jar lib/curator-framework-2.6.0.jar lib/zooke
eper-3.4.6.jar lib/guava-16.0.1.jar lib/json4s-core_2.11-3.2.11.jar l
ib/javax.ws.rs-api-2.0.1.jar lib/hk2-api-2.4.0-b34.jar lib/javax.inje
ct-2.4.0-b34.jar lib/hk2-locator-2.4.0-b34.jar lib/javax.annotation-a
pi-1.2.jar lib/jersey-guava-2.22.2.jar lib/osgi-resource-locator-1.0.
1.jar lib/jersey-media-jaxb-2.22.2.jar lib/validation-api-1.1.0.Final
.jar lib/jackson-module-paranamer-2.6.5.jar lib/xz-1.0.jar lib/minlog
-1.3.0.jar lib/objenesis-2.1.jar lib/commons-cli-1.2.jar lib/xmlenc-0
.52.jar lib/commons-httpclient-3.1.jar lib/commons-io-2.4.jar lib/com
mons-lang-2.6.jar lib/commons-configuration-1.6.jar lib/protobuf-java
-2.5.0.jar lib/gson-2.2.4.jar lib/hadoop-auth-2.6.5.jar lib/curator-c
lient-2.6.0.jar lib/htrace-core-3.0.4.jar lib/jetty-util-6.1.26.jar l
ib/xercesImpl-2.9.1.jar lib/hadoop-mapreduce-client-common-2.6.5.jar
lib/hadoop-mapreduce-client-shuffle-2.6.5.jar lib/hadoop-yarn-common-
2.6.5.jar lib/base64-2.3.8.jar lib/json4s-ast_2.11-3.2.11.jar lib/sca
lap-2.11.0.jar lib/hk2-utils-2.4.0-b34.jar lib/aopalliance-repackaged
-2.4.0-b34.jar lib/javassist-3.18.1-GA.jar lib/commons-digester-1.8.j
ar lib/commons-beanutils-core-1.8.0.jar lib/apacheds-kerberos-codec-2
.0.0-M15.jar lib/xml-apis-1.3.04.jar lib/hadoop-yarn-client-2.6.5.jar
lib/hadoop-yarn-server-common-2.6.5.jar lib/hadoop-yarn-server-nodem
anager-2.6.5.jar lib/jaxb-api-2.2.2.jar lib/jackson-jaxrs-1.9.13.jar
lib/jackson-xc-1.9.13.jar lib/guice-3.0.jar lib/scala-compiler-2.11.0
.jar lib/javax.inject-1.jar lib/jline-0.9.94.jar lib/apacheds-i18n-2.
0.0-M15.jar lib/api-asn1-api-1.0.0-M20.jar lib/api-util-1.0.0-M20.jar
lib/jettison-1.1.jar lib/stax-api-1.0-2.jar lib/aopalliance-1.0.jar
lib/cglib-2.2.1-v20090111.jar lib/scala-xml_2.11-1.0.1.jar lib/scala-
parser-combinators_2.11-1.0.1.jar lib/scala-library-2.11.8.jar lib/sl
f4j-api-1.7.16.jar lib/netty-all-4.0.43.Final.jar lib/jackson-core-as
l-1.9.13.jar lib/jackson-mapper-asl-1.9.13.jar lib/jackson-annotation
s-2.6.5.jar lib/commons-net-3.1.jar lib/paranamer-2.6.jar
UPDATE
As few of the comments and answer by Allison Berman suggested i have tried as below
C:\Dev-Tra\OrderAnalytics\build\libs>spark-submit --jars OrderAnalytics.jar \ --class example.DataMigration
Error: Cannot load main class from JAR file:/C:/
Run with --help for usage help or --verbose for debug output
C:\Dev-Tra\OrderAnalytics\build\libs>spark-submit --jars OrderAnalytics.jar --class example.DataMigration
Exception in thread "main" java.lang.IllegalArgumentException: Missing application resource.
at org.apache.spark.launcher.CommandBuilderUtils.checkArgument(CommandBuilderUtils.java:241)
at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildSparkSubmitArgs(SparkSubmitCommandBuilder.java:160)
at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildSparkSubmitCommand(SparkSubmitCommandBuilder.java:274)
at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildCommand(SparkSubmitCommandBuilder.java:151)
at org.apache.spark.launcher.Main.main(Main.java:86)
but according to Spark Documentation it should be as below in which it's able to start job but not able to get all dependent jars
C:\Dev-Tra\OrderAnalytics\build\libs>spark-submit --class example.DataMigration OrderAnalytics.jar
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
17/08/21 21:59:36 INFO SparkContext: Running Spark version 2.2.0
17/08/21 21:59:36 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/08/21 21:59:36 INFO SparkContext: Submitted application: DataMigration
17/08/21 21:59:36 INFO SecurityManager: Changing view acls to: ram
17/08/21 21:59:36 INFO SecurityManager: Changing modify acls to: ram
17/08/21 21:59:36 INFO SecurityManager: Changing view acls groups to:
17/08/21 21:59:36 INFO SecurityManager: Changing modify acls groups to:
17/08/21 21:59:36 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ram); groups with vi
ew permissions: Set(); users with modify permissions: Set(ram); groups with modify permissions: Set()
17/08/21 21:59:37 INFO Utils: Successfully started service 'sparkDriver' on port 62239.
17/08/21 21:59:37 INFO SparkEnv: Registering MapOutputTracker
17/08/21 21:59:37 INFO SparkEnv: Registering BlockManagerMaster
17/08/21 21:59:37 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
17/08/21 21:59:37 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
17/08/21 21:59:37 INFO DiskBlockManager: Created local directory at C:\Users\ram\AppData\Local\Temp\blockmgr-38ef35e6-219e-450c-b7da-c8075464a232
17/08/21 21:59:37 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
17/08/21 21:59:37 INFO SparkEnv: Registering OutputCommitCoordinator
17/08/21 21:59:37 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/08/21 21:59:37 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.1.101:4040
17/08/21 21:59:38 INFO SparkContext: Added JAR file:/C:/Dev-Tra/OrderAnalytics/build/libs/OrderAnalytics.jar at spark://192.168.1.101:62239/jars/OrderAnalytics
.jar with timestamp 1503332978023
17/08/21 21:59:38 INFO Executor: Starting executor ID driver on host localhost
17/08/21 21:59:38 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 62248.
17/08/21 21:59:38 INFO NettyBlockTransferService: Server created on 192.168.1.101:62248
17/08/21 21:59:38 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
17/08/21 21:59:38 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.1.101, 62248, None)
17/08/21 21:59:38 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.1.101:62248 with 366.3 MB RAM, BlockManagerId(driver, 192.168.1.101, 62248
, None)
17/08/21 21:59:38 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.1.101, 62248, None)
17/08/21 21:59:38 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.1.101, 62248, None)
17/08/21 21:59:40 INFO Native: Could not load JNR C Library, native system calls through this library will not be available (set this logger level to DEBUG to
see the full stack trace).
17/08/21 21:59:40 INFO ClockFactory: Using java.lang.System clock to generate timestamps.
17/08/21 21:59:41 WARN NettyUtil: Found Netty's native epoll transport, but not running on linux-based operating system. Using NIO instead.
17/08/21 21:59:41 INFO Cluster: New Cassandra host localhost/127.0.0.1:9042 added
17/08/21 21:59:41 INFO CassandraConnector: Connected to Cassandra cluster: Test Cluster
17/08/21 21:59:42 INFO SparkContext: Starting job: runJob at RDDFunctions.scala:36
17/08/21 21:59:42 INFO DAGScheduler: Got job 0 (runJob at RDDFunctions.scala:36) with 4 output partitions
17/08/21 21:59:42 INFO DAGScheduler: Final stage: ResultStage 0 (runJob at RDDFunctions.scala:36)
17/08/21 21:59:42 INFO DAGScheduler: Parents of final stage: List()
17/08/21 21:59:42 INFO DAGScheduler: Missing parents: List()
17/08/21 21:59:42 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map at DataMigration.scala:18), which has no missing parents
17/08/21 21:59:42 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 12.3 KB, free 366.3 MB)
17/08/21 21:59:42 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 5.8 KB, free 366.3 MB)
17/08/21 21:59:42 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.1.101:62248 (size: 5.8 KB, free: 366.3 MB)
17/08/21 21:59:42 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1006
17/08/21 21:59:42 INFO DAGScheduler: Submitting 4 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at DataMigration.scala:18) (first 15 tasks are f
or partitions Vector(0, 1, 2, 3))
17/08/21 21:59:42 INFO TaskSchedulerImpl: Adding task set 0.0 with 4 tasks
17/08/21 21:59:42 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, NODE_LOCAL, 17002 bytes)
17/08/21 21:59:42 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
17/08/21 21:59:42 INFO Executor: Fetching spark://192.168.1.101:62239/jars/OrderAnalytics.jar with timestamp 1503332978023
17/08/21 21:59:42 INFO TransportClientFactory: Successfully created connection to /192.168.1.101:62239 after 18 ms (0 ms spent in bootstraps)
17/08/21 21:59:42 INFO Utils: Fetching spark://192.168.1.101:62239/jars/OrderAnalytics.jar to C:\Users\ram\AppData\Local\Temp\spark-73cbbbe8-9e06-4a11-976
a-a766305d4148\userFiles-3e4c9dea-6273-4d9e-a17b-c807aa0e3da5\fetchFileTemp7196411614488839489.tmp
17/08/21 21:59:43 INFO Executor: Adding file:/C:/Users/ram/AppData/Local/Temp/spark-73cbbbe8-9e06-4a11-976a-a766305d4148/userFiles-3e4c9dea-6273-4d9e-a17b
-c807aa0e3da5/OrderAnalytics.jar to class loader
17/08/21 21:59:44 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.NoClassDefFoundError: com/twitter/jsr166e/LongAdder
at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsSupport$class.$init$(OutputMetricsUpdater.scala:107)
at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsUpdater.<init>(OutputMetricsUpdater.scala:152)
at org.apache.spark.metrics.OutputMetricsUpdater$.apply(OutputMetricsUpdater.scala:75)
at com.datastax.spark.connector.writer.TableWriter.writeInternal(TableWriter.scala:174)
at com.datastax.spark.connector.writer.TableWriter.insert(TableWriter.scala:162)
at com.datastax.spark.connector.writer.TableWriter.write(TableWriter.scala:149)
at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: com.twitter.jsr166e.LongAdder
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 14 more
17/08/21 21:59:44 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, NODE_LOCAL, 15334 bytes)
17/08/21 21:59:44 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
17/08/21 21:59:44 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): java.lang.NoClassDefFoundError: com/twitter/jsr166e/Long
Adder
at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsSupport$class.$init$(OutputMetricsUpdater.scala:107)
at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsUpdater.<init>(OutputMetricsUpdater.scala:152)
at org.apache.spark.metrics.OutputMetricsUpdater$.apply(OutputMetricsUpdater.scala:75)
at com.datastax.spark.connector.writer.TableWriter.writeInternal(TableWriter.scala:174)
at com.datastax.spark.connector.writer.TableWriter.insert(TableWriter.scala:162)
at com.datastax.spark.connector.writer.TableWriter.write(TableWriter.scala:149)
at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: com.twitter.jsr166e.LongAdder
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 14 more
17/08/21 21:59:44 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job
17/08/21 21:59:44 ERROR Executor: Exception in task 1.0 in stage 0.0 (TID 1)
java.lang.NoClassDefFoundError: com/twitter/jsr166e/LongAdder
at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsSupport$class.$init$(OutputMetricsUpdater.scala:107)
at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsUpdater.<init>(OutputMetricsUpdater.scala:152)
at org.apache.spark.metrics.OutputMetricsUpdater$.apply(OutputMetricsUpdater.scala:75)
at com.datastax.spark.connector.writer.TableWriter.writeInternal(TableWriter.scala:174)
at com.datastax.spark.connector.writer.TableWriter.insert(TableWriter.scala:162)
at com.datastax.spark.connector.writer.TableWriter.write(TableWriter.scala:149)
at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
17/08/21 21:59:44 INFO TaskSchedulerImpl: Cancelling stage 0
17/08/21 21:59:44 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/08/21 21:59:44 INFO TaskSchedulerImpl: Stage 0 was cancelled
17/08/21 21:59:44 INFO TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1) on localhost, executor driver: java.lang.NoClassDefFoundError (com/twitter/jsr166e/Lo
ngAdder) [duplicate 1]
17/08/21 21:59:44 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/08/21 21:59:44 INFO DAGScheduler: ResultStage 0 (runJob at RDDFunctions.scala:36) failed in 1.894 s due to Job aborted due to stage failure: Task 0 in stage
0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): java.lang.NoClassDefFoundError: com/twitter/jsr166e/L
ongAdder
at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsSupport$class.$init$(OutputMetricsUpdater.scala:107)
at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsUpdater.<init>(OutputMetricsUpdater.scala:152)
at org.apache.spark.metrics.OutputMetricsUpdater$.apply(OutputMetricsUpdater.scala:75)
at com.datastax.spark.connector.writer.TableWriter.writeInternal(TableWriter.scala:174)
at com.datastax.spark.connector.writer.TableWriter.insert(TableWriter.scala:162)
at com.datastax.spark.connector.writer.TableWriter.write(TableWriter.scala:149)
at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: com.twitter.jsr166e.LongAdder
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 14 more
Driver stacktrace:
17/08/21 21:59:44 INFO DAGScheduler: Job 0 failed: runJob at RDDFunctions.scala:36, took 2.152376 s
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost tas
k 0.0 in stage 0.0 (TID 0, localhost, executor driver): java.lang.NoClassDefFoundError: com/twitter/jsr166e/LongAdder
at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsSupport$class.$init$(OutputMetricsUpdater.scala:107)
at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsUpdater.<init>(OutputMetricsUpdater.scala:152)
at org.apache.spark.metrics.OutputMetricsUpdater$.apply(OutputMetricsUpdater.scala:75)
at com.datastax.spark.connector.writer.TableWriter.writeInternal(TableWriter.scala:174)
at com.datastax.spark.connector.writer.TableWriter.insert(TableWriter.scala:162)
at com.datastax.spark.connector.writer.TableWriter.write(TableWriter.scala:149)
at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: com.twitter.jsr166e.LongAdder
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 14 more
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1499)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1487)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1486)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1486)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:814)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1714)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1669)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1658)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:630)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2022)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2043)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2075)
at com.datastax.spark.connector.RDDFunctions.saveToCassandra(RDDFunctions.scala:36)
at example.DataMigration$.main(DataMigration.scala:20)
at example.DataMigration.main(DataMigration.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.NoClassDefFoundError: com/twitter/jsr166e/LongAdder
at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsSupport$class.$init$(OutputMetricsUpdater.scala:107)
at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsUpdater.<init>(OutputMetricsUpdater.scala:152)
at org.apache.spark.metrics.OutputMetricsUpdater$.apply(OutputMetricsUpdater.scala:75)
at com.datastax.spark.connector.writer.TableWriter.writeInternal(TableWriter.scala:174)
at com.datastax.spark.connector.writer.TableWriter.insert(TableWriter.scala:162)
at com.datastax.spark.connector.writer.TableWriter.write(TableWriter.scala:149)
at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: com.twitter.jsr166e.LongAdder
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 14 more
17/08/21 21:59:51 INFO CassandraConnector: Disconnected from Cassandra cluster: Test Cluster
17/08/21 21:59:52 INFO SerialShutdownHooks: Successfully executed shutdown hook: Clearing session cache for C* connector
17/08/21 21:59:52 INFO SparkContext: Invoking stop() from shutdown hook
17/08/21 21:59:52 INFO SparkUI: Stopped Spark web UI at http://192.168.1.101:4040
17/08/21 21:59:52 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/08/21 21:59:52 INFO MemoryStore: MemoryStore cleared
17/08/21 21:59:52 INFO BlockManager: BlockManager stopped
17/08/21 21:59:52 INFO BlockManagerMaster: BlockManagerMaster stopped
17/08/21 21:59:52 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/08/21 21:59:52 INFO SparkContext: Successfully stopped SparkContext
17/08/21 21:59:52 INFO ShutdownHookManager: Shutdown hook called
17/08/21 21:59:52 INFO ShutdownHookManager: Deleting directory C:\Users\ram\AppData\Local\Temp\spark-73cbbbe8-9e06-4a11-976a-a766305d4148
can any one please tell me why spark not able to pick class path of jar or how to resolve this problem ?
Thanks
Indrajit is correct, you need to include the package. I had similar issues when I left my files in the default package. Make your folder structure the same as this http://www.scala-sbt.org/0.13/docs/Directories.html
Add a new folder YOUR_PACKAGE in src/main/scala or src/main/java and put DataMigration in YOUR_PACKAGE. Make sure the first line of DataMigration is:
package YOUR_PACKAGE
Your spark-submit will then be:
spark-submit --jars OrderAnalytics.jar \
--class YOUR_PACKAGE.DataMigration

Apache Spark: using spark-submit to transfer files from windows to cluster

Here's what I'm trying to do:
using spark-submit to submit a packaged / compiled (using sbt 0.13.12) scala programm to my virtualized "cluster" running hdp 2.4 (Spark 1.6.0, Scala 2.10.5) using virtual box
using the --files option to copy a text file "foo.txt" (which is located in the project root) from the "submitting" Windows machine (which is also running Spark 1.6.0 and Scala 2.10.5) to the working directories of executors (as described by spark-submit -h)
passing the textfile as first argument to my application
finally: reading in the file and counting the lines
The command for submitting is
spark-submit ^
--class boern.spark.SparkMeApp ^
--master "spark://127.0.0.1:7077" ^
--files "foo.txt" ^
target/scala-2.11/sparkme-project_2.11-1.0.jar foo.txt
The interesting part of code is
val fileName = args(0)
println(s"argument 0 is $fileName")
val lines = sc.textFile(fileName).cache
val c = lines.count /** line 37 */
The error (short version) I'm getting is:
INFO DAGScheduler: Job 0 failed: count at SparkMeApp.scala:37, Exception, Job aborted: java
.io.FileNotFoundException: File file:/E:/myProject/foo.txt does not exist
After two days of a combination "bruteforcing" and reading documentation I am still lost... Am I wrong, that sc.textFile(fileName).cache is executed on the workers and everything which is not preceeded by sc on master? Is using SparkFiles the way to go?
Stacktrace
E:\myProject\>spark-submit --verbose --class boern.spark.SparkMeApp --master "spark://127.0.0.1:7077" --files "foo.txt" target/scala-2.11/sparkme-project_2.11-1.0.jar foo.txt
Using properties file: null
Parsed arguments:
master spark://127.0.0.1:7077
deployMode null
executorMemory null
executorCores null
totalExecutorCores null
propertiesFile null
driverMemory null
driverCores null
driverExtraClassPath null
driverExtraLibraryPath null
driverExtraJavaOptions null
supervise false
queue null
numExecutors null
files file:/E:/myProject/foo.txt
pyFiles null
archives null
mainClass boern.spark.SparkMeApp
primaryResource file:/E:/myProject/target/scala-2.11/sparkme-project_2.11-1.0.jar
name boern.spark.SparkMeApp
childArgs [foo.txt]
jars null
packages null
packagesExclusions null
repositories null
verbose true
Spark properties used, including those specified through
--conf and those from the properties file null:
Main class:
boern.spark.SparkMeApp
Arguments:
foo.txt
System properties:
SPARK_SUBMIT -> true
spark.files -> file:/E:/myProject/foo.txt
spark.app.name -> boern.spark.SparkMeApp
spark.jars -> file:/E:/myProject/target/scala-2.11/sparkme-project_2.11-1.0.jar
spark.submit.deployMode -> client
spark.master -> spark://127.0.0.1:7077
Classpath elements:
file:/E:/myProject/target/scala-2.11/sparkme-project_2.11-1.0.jar
Working directory is E:\myProject\sbtmanual
Files:
\CONF.ENI
\mw.csv
\mw_out.csv
\pagefile.sys
\temp.rds
args:
foo.txt
config set.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/09/15 14:36:21 INFO SparkContext: Running Spark version 1.6.0
16/09/15 14:36:22 INFO SecurityManager: Changing view acls to: Boern
16/09/15 14:36:22 INFO SecurityManager: Changing modify acls to: Boern
16/09/15 14:36:22 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(Boern); users with modify permissions: Set(Boern)
16/09/15 14:36:22 INFO Utils: Successfully started service 'sparkDriver' on port 59716.
16/09/15 14:36:23 INFO Slf4jLogger: Slf4jLogger started
16/09/15 14:36:23 INFO Remoting: Starting remoting
16/09/15 14:36:23 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem#192.168.56.1:59729]
16/09/15 14:36:23 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 59729.
16/09/15 14:36:23 INFO SparkEnv: Registering MapOutputTracker
16/09/15 14:36:23 INFO SparkEnv: Registering BlockManagerMaster
16/09/15 14:36:23 INFO DiskBlockManager: Created local directory at C:\Users\Boern\AppData\Local\Temp\blockmgr-c7ee2dab-ea00-4ae5-9f06-c6ab74f135e5
16/09/15 14:36:23 INFO MemoryStore: MemoryStore started with capacity 511.1 MB
16/09/15 14:36:23 INFO SparkEnv: Registering OutputCommitCoordinator
16/09/15 14:36:23 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
16/09/15 14:36:23 INFO Utils: Successfully started service 'SparkUI' on port 4041.
16/09/15 14:36:23 INFO SparkUI: Started SparkUI at http://192.168.56.1:4041
16/09/15 14:36:23 INFO HttpFileServer: HTTP File server directory is C:\Users\Boern\AppData\Local\Temp\spark-2736b20a-fc90-40e8-a7ad-2d8cac8001f2\httpd-14abb177-9801-403c-9df9-84afb2e87d70
16/09/15 14:36:23 INFO HttpServer: Starting HTTP Server
16/09/15 14:36:23 INFO Utils: Successfully started service 'HTTP file server' on port 59746.
16/09/15 14:36:23 INFO SparkContext: Added JAR file:/E:/myProject/target/scala-2.11/sparkme-project_2.11-1.0.jar at http://192.168.56.1:59746/jars/sparkme-project_2.11-1.0.jar with timestamp 1473942983631
16/09/15 14:36:23 INFO Utils: Copying E:\myProject\sbtmanual\foo.txt to C:\Users\Boern\AppData\Local\Temp\spark-2736b20a-fc90-40e8-a7ad-2d8cac8001f2\userFiles-7849db02-01ff-40ea-9250-62b87d854f4c\foo.txt
16/09/15 14:36:23 INFO SparkContext: Added file file:/E:/myProject/foo.txt at http://192.168.56.1:59746/files/foo.txt with timestamp 1473942983695
16/09/15 14:36:23 INFO AppClient$ClientEndpoint: Connecting to master spark://127.0.0.1:7077...
16/09/15 14:36:34 INFO SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20160915123633-0015
16/09/15 14:36:34 INFO AppClient$ClientEndpoint: Executor added: app-20160915123633-0015/0 on worker-20160915105800-10.0.2.15-44537 (10.0.2.15:44537) with 4 cores
16/09/15 14:36:34 INFO SparkDeploySchedulerBackend: Granted executor ID app-20160915123633-0015/0 on hostPort 10.0.2.15:44537 with 4 cores, 1024.0 MB RAM
16/09/15 14:36:34 INFO AppClient$ClientEndpoint: Executor updated: app-20160915123633-0015/0 is now RUNNING
16/09/15 14:36:34 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 59781.
16/09/15 14:36:34 INFO NettyBlockTransferService: Server created on 59781
16/09/15 14:36:34 INFO BlockManagerMaster: Trying to register BlockManager
16/09/15 14:36:34 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.56.1:59781 with 511.1 MB RAM, BlockManagerId(driver, 192.168.56.1, 59781)
16/09/15 14:36:34 INFO BlockManagerMaster: Registered BlockManager
16/09/15 14:36:34 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
sc set.
argument 0 is foo.txt
16/09/15 14:36:34 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 208.5 KB, free 208.5 KB)
16/09/15 14:36:34 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 19.3 KB, free 227.8 KB)
16/09/15 14:36:34 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.56.1:59781 (size: 19.3 KB, free: 511.1 MB)
16/09/15 14:36:34 INFO SparkContext: Created broadcast 0 from textFile at SparkMeApp.scala:39
16/09/15 14:36:34 INFO FileInputFormat: Total input paths to process : 1
16/09/15 14:36:34 INFO SparkContext: Starting job: count at SparkMeApp.scala:41
16/09/15 14:36:34 INFO DAGScheduler: Got job 0 (count at SparkMeApp.scala:41) with 2 output partitions
16/09/15 14:36:34 INFO DAGScheduler: Final stage: ResultStage 0 (count at SparkMeApp.scala:41)
16/09/15 14:36:34 INFO DAGScheduler: Parents of final stage: List()
16/09/15 14:36:34 INFO DAGScheduler: Missing parents: List()
16/09/15 14:36:34 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at textFile at SparkMeApp.scala:39), which has no missing parents
16/09/15 14:36:34 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 2.9 KB, free 230.7 KB)
16/09/15 14:36:34 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1752.0 B, free 232.4 KB)
16/09/15 14:36:34 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.56.1:59781 (size: 1752.0 B, free: 511.1 MB)
16/09/15 14:36:34 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
16/09/15 14:36:34 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at textFile at SparkMeApp.scala:39)
16/09/15 14:36:34 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
16/09/15 14:36:37 INFO SparkDeploySchedulerBackend: Registered executor NettyRpcEndpointRef(null) (BoernsPC:59783) with ID 0
16/09/15 14:36:37 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, BoernsPC, partition 0,PROCESS_LOCAL, 2286 bytes)
16/09/15 14:36:37 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, BoernsPC, partition 1,PROCESS_LOCAL, 2286 bytes)
16/09/15 14:36:47 INFO BlockManagerMasterEndpoint: Registering block manager BoernsPC:48448 with 511.5 MB RAM, BlockManagerId(0, BoernsPC, 48448)
16/09/15 14:36:48 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on BoernsPC:48448 (size: 1752.0 B, free: 511.5 MB)
16/09/15 14:36:48 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on BoernsPC:48448 (size: 19.3 KB, free: 511.5 MB)
16/09/15 14:36:49 WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, BoernsPC): java.io.FileNotFoundException: File file:/E:/myProject/foo.txt does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:609)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:822)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:599)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:140)
at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:341)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:767)
at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:109)
at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:237)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:208)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:268)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
16/09/15 14:36:49 INFO TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0) on executor BoernsPC: java.io.FileNotFoundException (File file:/E:/myProject/foo.txt does not exist) [duplicate 1]
16/09/15 14:36:49 INFO TaskSetManager: Starting task 0.1 in stage 0.0 (TID 2, BoernsPC, partition 0,PROCESS_LOCAL, 2286 bytes)
16/09/15 14:36:49 INFO TaskSetManager: Starting task 1.1 in stage 0.0 (TID 3, BoernsPC, partition 1,PROCESS_LOCAL, 2286 bytes)
16/09/15 14:36:49 INFO TaskSetManager: Lost task 1.1 in stage 0.0 (TID 3) on executor BoernsPC: java.io.FileNotFoundException (File file:/E:/myProject/foo.txt does not exist) [duplicate 2]
16/09/15 14:36:49 INFO TaskSetManager: Starting task 1.2 in stage 0.0 (TID 4, BoernsPC, partition 1,PROCESS_LOCAL, 2286 bytes)
16/09/15 14:36:49 INFO TaskSetManager: Lost task 0.1 in stage 0.0 (TID 2) on executor BoernsPC: java.io.FileNotFoundException (File file:/E:/myProject/foo.txt does not exist) [duplicate 3]
16/09/15 14:36:49 INFO TaskSetManager: Starting task 0.2 in stage 0.0 (TID 5, BoernsPC, partition 0,PROCESS_LOCAL, 2286 bytes)
16/09/15 14:36:49 INFO TaskSetManager: Lost task 0.2 in stage 0.0 (TID 5) on executor BoernsPC: java.io.FileNotFoundException (File file:/E:/myProject/foo.txt does not exist) [duplicate 4]
16/09/15 14:36:49 INFO TaskSetManager: Starting task 0.3 in stage 0.0 (TID 6, BoernsPC, partition 0,PROCESS_LOCAL, 2286 bytes)
16/09/15 14:36:49 INFO TaskSetManager: Lost task 1.2 in stage 0.0 (TID 4) on executor BoernsPC: java.io.FileNotFoundException (File file:/E:/myProject/foo.txt does not exist) [duplicate 5]
16/09/15 14:36:49 INFO TaskSetManager: Starting task 1.3 in stage 0.0 (TID 7, BoernsPC, partition 1,PROCESS_LOCAL, 2286 bytes)
16/09/15 14:36:49 INFO TaskSetManager: Lost task 1.3 in stage 0.0 (TID 7) on executor BoernsPC: java.io.FileNotFoundException (File file:/E:/myProject/foo.txt does not exist) [duplicate 6]
16/09/15 14:36:49 ERROR TaskSetManager: Task 1 in stage 0.0 failed 4 times; aborting job
16/09/15 14:36:49 INFO TaskSchedulerImpl: Cancelling stage 0
16/09/15 14:36:49 INFO TaskSchedulerImpl: Stage 0 was cancelled
16/09/15 14:36:49 INFO DAGScheduler: ResultStage 0 (count at SparkMeApp.scala:41) failed in 14,616 s
16/09/15 14:36:49 INFO DAGScheduler: Job 0 failed: count at SparkMeApp.scala:41, took 14,694943 s
16/09/15 14:36:49 INFO TaskSetManager: Lost task 0.3 in stage 0.0 (TID 6) on executor BoernsPC: java.io.FileNotFoundException (File file:/E:/myProject/foo.txt does not exist) [duplicate 7]
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 7, BoernsPC): java.io.FileNotFoundException: File file:/E:/myProject/foo.txt does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:609)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:822)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:599)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:140)
at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:341)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:767)
at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:109)
at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:237)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:208)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:268)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
at org.apache.spark.rdd.RDD.count(RDD.scala:1143)
at boern.spark.SparkMeApp$.main(SparkMeApp.scala:41)
at boern.spark.SparkMeApp.main(SparkMeApp.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.FileNotFoundException: File file:/E:/myProject/foo.txt does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:609)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:822)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:599)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:140)
at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:341)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:767)
at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:109)
at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:237)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:208)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:268)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
16/09/15 14:36:49 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
16/09/15 14:36:49 INFO SparkContext: Invoking stop() from shutdown hook
16/09/15 14:36:49 INFO SparkUI: Stopped Spark web UI at http://192.168.56.1:4041
16/09/15 14:36:49 INFO SparkDeploySchedulerBackend: Shutting down all executors
16/09/15 14:36:49 INFO SparkDeploySchedulerBackend: Asking each executor to shut down
16/09/15 14:36:49 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/09/15 14:36:49 INFO MemoryStore: MemoryStore cleared
16/09/15 14:36:49 INFO BlockManager: BlockManager stopped
16/09/15 14:36:49 INFO BlockManagerMaster: BlockManagerMaster stopped
16/09/15 14:36:49 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/09/15 14:36:49 INFO SparkContext: Successfully stopped SparkContext
16/09/15 14:36:49 INFO ShutdownHookManager: Shutdown hook called
16/09/15 14:36:49 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/09/15 14:36:49 INFO ShutdownHookManager: Deleting directory C:\Users\Boern\AppData\Local\Temp\spark-2736b20a-fc90-40e8-a7ad-2d8cac8001f2
16/09/15 14:36:49 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/09/15 14:36:49 INFO ShutdownHookManager: Deleting directory C:\Users\Boern\AppData\Local\Temp\spark-2736b20a-fc90-40e8-a7ad-2d8cac8001f2\httpd-14abb177-9801-403c-9df9-84afb2e87d70

Subclassing SparkException doesn't allow for message passing between Master and Workers

I need to build an application where a master node distributes a large dataset to a number of worker nodes for parallel processing. I'm running this application on a single machine and JVM, therefore I've called setMaster("local[4]") on my SparkConf object. I'm using Spark 1.5.2 and Scala 2.10.5 through IntelliJ.
If a certain condition occurs in the portions of the dataset handled by the executors, I need the master node to be notified and perform some action. In addition to that, I need the other executors to die. To that end, I looked around the Scala Spark API and realized that SparkException allows me to do the first portion of what I'm looking for, by propagating the exception (which is Serializable, by the way) to the driver. I have verified this experimentally, as follows:
def main(args:Array[String]) = {
val conf = new SparkConf().setAppName("Spark Exceptions").setMaster("local[4]")
val sc = new SparkContext(conf)
val l = Range(1, 5000)
val parl = sc.parallelize(l, 8);
val mappedRDD = parl.map(func)
try {
val res = mappedRDD.collect()
println(res)
} catch {
case s:SparkException => println("A worker threw an exception.")
case t:Throwable => throw(t)
}
}
def func(i:Int) = {
if(i == 1 || i == 4000)
throw new SparkException("Bad number detected.")
else
Math.pow(i, 2)
}
If you look closely at the example above, you will note that since the original Range contains both 1 and 4000, two failures are guaranteed in the worker nodes. Indeed, I see two executors failing in stderr, while my stdout is populated with:
A worker threw an exception.
Process finished with exit code 0
Unfortunately, the SparkException thrown does not kill the other executors, since, as mentioned before, I can see both executors failing in stderr, while two other executors complete their tasks successfully. So my first question is: is there any way I can immediately kill the other executors once this exception is caught by the driver program?
My second question is a little bit more subtle: I'd like some information to be exchanged from the executors to the worker node about what piece of information caused the error. Sure, I could write to and read from a file, particularly since I'm on the same filesystem, but I'd like a faster and more elegant solution. So I thought I'd subclass SparkException in order to add a field that described what piece of data caused the error:
import org.apache.spark.SparkException
class WorkerViolation(msg:String, data:Any) extends SparkException(msg) {
override def toString = "A worker violation occurred: " + msg
def getData = data
def this(dat:Any) = this("Error at worker.", dat)
}
The goal is to be able to use the getData accessor to retrieve some information. To that end, I tried modifying the program above, as follows:
...
catch {
case w:WorkerViolation => println("A worker threw an exception, with data: " + w.getData)
case t:Throwable => throw(t)
}
}
def func(i:Int) = {
if(i == 1 || i == 4000)
throw new WorkerViolation("Bad number detected.", i)
else
Math.pow(i, 2)
}
Note that this time I'm both throwing and catching WorkerViolations. Unfortunately, this particular exception seems to be killing the driver node as well. The full trace is of course gigantic, yet copied for consistency:
15/12/07 18:31:17 WARN util.Utils: Your hostname, debian resolves to a loopback address: 127.0.1.1; using 192.168.2.222 instead (on interface eth0)
15/12/07 18:31:17 WARN util.Utils: Set SPARK_LOCAL_IP if you need to bind to another address
15/12/07 18:31:17 INFO spark.SecurityManager: Changing view acls to: jason
15/12/07 18:31:17 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(jason)
15/12/07 18:31:17 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/12/07 18:31:17 INFO Remoting: Starting remoting
15/12/07 18:31:17 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://spark#192.168.2.222:33572]
15/12/07 18:31:17 INFO Remoting: Remoting now listens on addresses: [akka.tcp://spark#192.168.2.222:33572]
15/12/07 18:31:17 INFO spark.SparkEnv: Registering MapOutputTracker
15/12/07 18:31:17 INFO spark.SparkEnv: Registering BlockManagerMaster
15/12/07 18:31:17 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-local-20151207183117-4300
15/12/07 18:31:17 INFO storage.MemoryStore: MemoryStore started with capacity 2.1 GB.
15/12/07 18:31:17 INFO network.ConnectionManager: Bound socket to port 34704 with id = ConnectionManagerId(192.168.2.222,34704)
15/12/07 18:31:17 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/12/07 18:31:17 INFO storage.BlockManagerInfo: Registering block manager 192.168.2.222:34704 with 2.1 GB RAM
15/12/07 18:31:17 INFO storage.BlockManagerMaster: Registered BlockManager
15/12/07 18:31:17 INFO spark.HttpServer: Starting HTTP Server
15/12/07 18:31:17 INFO server.Server: jetty-8.1.14.v20131031
15/12/07 18:31:17 INFO server.AbstractConnector: Started SocketConnector#0.0.0.0:42426
15/12/07 18:31:17 INFO broadcast.HttpBroadcast: Broadcast server started at http://192.168.2.222:42426
15/12/07 18:31:17 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-0ae72587-14c5-4bfe-a151-2bcafc889ee8
15/12/07 18:31:17 INFO spark.HttpServer: Starting HTTP Server
15/12/07 18:31:17 INFO server.Server: jetty-8.1.14.v20131031
15/12/07 18:31:17 INFO server.AbstractConnector: Started SocketConnector#0.0.0.0:55556
15/12/07 18:31:17 INFO server.Server: jetty-8.1.14.v20131031
15/12/07 18:31:17 INFO server.AbstractConnector: Started SelectChannelConnector#0.0.0.0:4040
15/12/07 18:31:17 INFO ui.SparkUI: Started SparkUI at http://192.168.2.222:4040
15/12/07 18:31:18 INFO spark.SparkContext: Starting job: collect at SparkExceptions.scala:16
15/12/07 18:31:18 INFO scheduler.DAGScheduler: Got job 0 (collect at SparkExceptions.scala:16) with 8 output partitions (allowLocal=false)
15/12/07 18:31:18 INFO scheduler.DAGScheduler: Final stage: Stage 0(collect at SparkExceptions.scala:16)
15/12/07 18:31:18 INFO scheduler.DAGScheduler: Parents of final stage: List()
15/12/07 18:31:18 INFO scheduler.DAGScheduler: Missing parents: List()
15/12/07 18:31:18 INFO scheduler.DAGScheduler: Submitting Stage 0 (MappedRDD[1] at map at SparkExceptions.scala:14), which has no missing parents
15/12/07 18:31:18 INFO scheduler.DAGScheduler: Submitting 8 missing tasks from Stage 0 (MappedRDD[1] at map at SparkExceptions.scala:14)
15/12/07 18:31:18 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 8 tasks
15/12/07 18:31:18 INFO scheduler.TaskSetManager: Starting task 0.0:0 as TID 0 on executor localhost: localhost (PROCESS_LOCAL)
15/12/07 18:31:18 INFO scheduler.TaskSetManager: Serialized task 0.0:0 as 1350 bytes in 4 ms
15/12/07 18:31:18 INFO scheduler.TaskSetManager: Starting task 0.0:1 as TID 1 on executor localhost: localhost (PROCESS_LOCAL)
15/12/07 18:31:18 INFO scheduler.TaskSetManager: Serialized task 0.0:1 as 1350 bytes in 0 ms
15/12/07 18:31:18 INFO scheduler.TaskSetManager: Starting task 0.0:2 as TID 2 on executor localhost: localhost (PROCESS_LOCAL)
15/12/07 18:31:18 INFO scheduler.TaskSetManager: Serialized task 0.0:2 as 1350 bytes in 0 ms
15/12/07 18:31:18 INFO scheduler.TaskSetManager: Starting task 0.0:3 as TID 3 on executor localhost: localhost (PROCESS_LOCAL)
15/12/07 18:31:18 INFO scheduler.TaskSetManager: Serialized task 0.0:3 as 1350 bytes in 1 ms
15/12/07 18:31:18 INFO executor.Executor: Running task ID 3
15/12/07 18:31:18 INFO executor.Executor: Running task ID 1
15/12/07 18:31:18 INFO executor.Executor: Running task ID 0
15/12/07 18:31:18 INFO executor.Executor: Running task ID 2
15/12/07 18:31:18 ERROR executor.Executor: Exception in task ID 0
A worker violation occurred: Bad number detected.
at SparkExceptions$.func(SparkExceptions.scala:26)
at SparkExceptions$$anonfun$1.apply$mcDI$sp(SparkExceptions.scala:14)
at SparkExceptions$$anonfun$1.apply(SparkExceptions.scala:14)
at SparkExceptions$$anonfun$1.apply(SparkExceptions.scala:14)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:717)
at org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:717)
at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1083)
at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1083)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
at org.apache.spark.scheduler.Task.run(Task.scala:51)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
15/12/07 18:31:18 INFO executor.Executor: Serialized size of result for 2 is 5565
15/12/07 18:31:18 INFO executor.Executor: Serialized size of result for 1 is 5565
15/12/07 18:31:18 INFO executor.Executor: Sending result for 2 directly to driver
15/12/07 18:31:18 INFO executor.Executor: Sending result for 1 directly to driver
15/12/07 18:31:18 INFO executor.Executor: Serialized size of result for 3 is 5565
15/12/07 18:31:18 INFO executor.Executor: Finished task ID 2
15/12/07 18:31:18 INFO executor.Executor: Finished task ID 1
15/12/07 18:31:18 INFO executor.Executor: Sending result for 3 directly to driver
15/12/07 18:31:18 INFO executor.Executor: Finished task ID 3
15/12/07 18:31:18 INFO scheduler.TaskSetManager: Starting task 0.0:4 as TID 4 on executor localhost: localhost (PROCESS_LOCAL)
15/12/07 18:31:18 INFO scheduler.TaskSetManager: Serialized task 0.0:4 as 1350 bytes in 0 ms
15/12/07 18:31:18 INFO executor.Executor: Running task ID 4
15/12/07 18:31:18 INFO scheduler.TaskSetManager: Starting task 0.0:5 as TID 5 on executor localhost: localhost (PROCESS_LOCAL)
15/12/07 18:31:18 INFO scheduler.TaskSetManager: Serialized task 0.0:5 as 1350 bytes in 1 ms
15/12/07 18:31:18 INFO executor.Executor: Running task ID 5
15/12/07 18:31:18 WARN scheduler.TaskSetManager: Lost TID 0 (task 0.0:0)
15/12/07 18:31:18 INFO executor.Executor: Serialized size of result for 4 is 5565
15/12/07 18:31:18 INFO executor.Executor: Sending result for 4 directly to driver
15/12/07 18:31:18 INFO executor.Executor: Finished task ID 4
15/12/07 18:31:18 WARN scheduler.TaskSetManager: Loss was due to helpers.WorkerViolation
A worker violation occurred: Bad number detected.
at SparkExceptions$.func(SparkExceptions.scala:26)
at SparkExceptions$$anonfun$1.apply$mcDI$sp(SparkExceptions.scala:14)
at SparkExceptions$$anonfun$1.apply(SparkExceptions.scala:14)
at SparkExceptions$$anonfun$1.apply(SparkExceptions.scala:14)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:717)
at org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:717)
at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1083)
at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1083)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
at org.apache.spark.scheduler.Task.run(Task.scala:51)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
15/12/07 18:31:18 INFO executor.Executor: Serialized size of result for 5 is 5565
15/12/07 18:31:18 INFO executor.Executor: Sending result for 5 directly to driver
15/12/07 18:31:18 INFO executor.Executor: Finished task ID 5
15/12/07 18:31:18 ERROR scheduler.TaskSetManager: Task 0.0:0 failed 1 times; aborting job
15/12/07 18:31:18 INFO scheduler.TaskSetManager: Finished TID 2 in 27 ms on localhost (progress: 1/8)
15/12/07 18:31:18 INFO scheduler.TaskSetManager: Finished TID 1 in 30 ms on localhost (progress: 2/8)
15/12/07 18:31:18 INFO scheduler.TaskSchedulerImpl: Cancelling stage 0
15/12/07 18:31:18 INFO scheduler.TaskSchedulerImpl: Stage 0 was cancelled
15/12/07 18:31:18 INFO scheduler.TaskSetManager: Finished TID 4 in 11 ms on localhost (progress: 3/8)
15/12/07 18:31:18 INFO scheduler.DAGScheduler: Failed to run collect at SparkExceptions.scala:16
15/12/07 18:31:18 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0:0 failed 1 times, most recent failure: Exception failure in TID 0 on host localhost: A worker violation occurred: Bad number detected.
SparkExceptions$.func(SparkExceptions.scala:26)
SparkExceptions$$anonfun$1.apply$mcDI$sp(SparkExceptions.scala:14)
SparkExceptions$$anonfun$1.apply(SparkExceptions.scala:14)
SparkExceptions$$anonfun$1.apply(SparkExceptions.scala:14)
scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
scala.collection.Iterator$class.foreach(Iterator.scala:727)
scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
scala.collection.AbstractIterator.to(Iterator.scala:1157)
scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:717)
org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:717)
org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1083)
org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1083)
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
org.apache.spark.scheduler.Task.run(Task.scala:51)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1044)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1028)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1026)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1026)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:634)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:634)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:634)
at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1229)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
15/12/07 18:31:18 INFO scheduler.TaskSetManager: Finished TID 5 in 11 ms on localhost (progress: 4/8)
15/12/07 18:31:18 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
15/12/07 18:31:18 INFO scheduler.TaskSetManager: Finished TID 3 in 34 ms on localhost (progress: 5/8)
15/12/07 18:31:18 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
Process finished with exit code 1
So my second question would then be: Why does throwing an exception of a class derived from SparkException kill the driver program as well? Is there a different strategy I can use for executor-driver communication?
FWIW, I have decided that in order to allow for a higher degree of message-passing between nodes, going down to the level of akka actors is the preferred way to go.

Spark - Can't read files from Google Cloud Storage when configuring gcs connector manually

I have a Spark Cluster deployed using bdutil for Google Cloud.
I installed a GUI on my driver instance to be able to run IntelliJ from it, so that I can try to run my Spark processes in interactive mode.
The first issue I faced was that the spark-env.sh and core-site.xml were not used at all when running from IntelliJ. I finally managed to set the configuration manually in Scala by copying values from the configuration files. Is there a way to avoid that ?
The last thing which is not working is that even if the gcs connector seems to "see" the folder I set as source, each time it tries to read the actual files in that folder, I get a java.io.EOFException.
Here's my code for my tests :
object SparkBasicTest {
def main(args: Array[String]) {
val conf = new SparkConf().setAppName("Simple Application")
conf.setMaster("spark://research-m:7077")
conf.set("spark.akka.frameSize", "512")
conf.set("spark.driver.maxResultSize", "1631m")
conf.set("spark.yarn.executor.memoryOverhead", "384")
conf.set("spark.driver.memory", "3263m")
conf.set("spark.executor.memory", "10444m")
conf.set("spark.driver.extraClassPath", ":/home/hadoop/hadoop-install/lib/gcs-connector-1.4.0-hadoop1.jar")
val path = "STAGE/out/scored"
val sc = new SparkContext(conf)
sc.hadoopConfiguration.set("fs.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem")
sc.hadoopConfiguration.set("fs.AbstractFileSystem.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS")
sc.hadoopConfiguration.set("fs.gs.project.id", "xxxxx")
sc.hadoopConfiguration.set("fs.gs.system.bucket", "yyyyy")
sc.hadoopConfiguration.set("fs.gs.metadata.cache.directory", "/hadoop_gcs_connector_metadata_cache")
sc.hadoopConfiguration.set("fs.gs.metadata.cache.enable", "true")
sc.hadoopConfiguration.set("fs.gs.metadata.cache.type", "FILESYSTEM_BACKED")
sc.hadoopConfiguration.set("fs.gs.working.dir", "/")
sc.hadoopConfiguration.set("fs.default.name", "gs://yyyyyy/")
sc.hadoopConfiguration.set("fs.defaultFS", "gs://yyyyyy/")
sc.hadoopConfiguration.set("hadoop.tmp.dir", "/hadoop/tmp")
sc.hadoopConfiguration.set("dfs.datanode.data.dir.perm", "755")
val lines = sc.textFile(path)
val result = lines.count()
}
}
And the output I get after running it :
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/07/27 12:00:47 INFO SparkContext: Running Spark version 1.4.0
15/07/27 12:00:48 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/07/27 12:00:48 INFO SecurityManager: Changing view acls to: antvoice
15/07/27 12:00:48 INFO SecurityManager: Changing modify acls to: antvoice
15/07/27 12:00:48 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(antvoice); users with modify permissions: Set(antvoice)
15/07/27 12:00:49 INFO Slf4jLogger: Slf4jLogger started
15/07/27 12:00:49 INFO Remoting: Starting remoting
15/07/27 12:00:50 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver#10.240.63.109:45952]
15/07/27 12:00:50 INFO Utils: Successfully started service 'sparkDriver' on port 45952.
15/07/27 12:00:50 INFO SparkEnv: Registering MapOutputTracker
15/07/27 12:00:50 INFO SparkEnv: Registering BlockManagerMaster
15/07/27 12:00:50 INFO DiskBlockManager: Created local directory at /mnt/pd1/hadoop/spark/tmp/spark-dbaf72cb-599b-40c9-a9f8-ad9ede2b0654/blockmgr-24fd090a-b9df-4754-8022-ccaf8800ca2a
15/07/27 12:00:50 INFO MemoryStore: MemoryStore started with capacity 1566.8 MB
15/07/27 12:00:50 INFO HttpFileServer: HTTP File server directory is /mnt/pd1/hadoop/spark/tmp/spark-dbaf72cb-599b-40c9-a9f8-ad9ede2b0654/httpd-27e69b24-ad3d-4019-9bf7-37649c2ebc8e
15/07/27 12:00:50 INFO HttpServer: Starting HTTP Server
15/07/27 12:00:50 INFO Utils: Successfully started service 'HTTP file server' on port 57505.
15/07/27 12:00:50 INFO SparkEnv: Registering OutputCommitCoordinator
15/07/27 12:00:56 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/07/27 12:00:56 INFO SparkUI: Started SparkUI at http://10.240.63.109:4040
15/07/27 12:00:56 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster#research-m:7077/user/Master...
15/07/27 12:00:57 INFO SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20150727120057-0000
15/07/27 12:00:57 INFO AppClient$ClientActor: Executor added: app-20150727120057-0000/0 on worker-20150727114108-10.240.205.199-50284 (10.240.205.199:50284) with 2 cores
15/07/27 12:00:57 INFO SparkDeploySchedulerBackend: Granted executor ID app-20150727120057-0000/0 on hostPort 10.240.205.199:50284 with 2 cores, 10.2 GB RAM
15/07/27 12:00:57 INFO AppClient$ClientActor: Executor updated: app-20150727120057-0000/0 is now RUNNING
15/07/27 12:00:57 INFO AppClient$ClientActor: Executor updated: app-20150727120057-0000/0 is now LOADING
15/07/27 12:00:57 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 38947.
15/07/27 12:00:57 INFO NettyBlockTransferService: Server created on 38947
15/07/27 12:00:57 INFO BlockManagerMaster: Trying to register BlockManager
15/07/27 12:00:57 INFO BlockManagerMasterEndpoint: Registering block manager 10.240.63.109:38947 with 1566.8 MB RAM, BlockManagerId(driver, 10.240.63.109, 38947)
15/07/27 12:00:57 INFO BlockManagerMaster: Registered BlockManager
15/07/27 12:00:57 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
15/07/27 12:00:57 INFO deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
15/07/27 12:00:58 INFO MemoryStore: ensureFreeSpace(112832) called with curMem=0, maxMem=1642919362
15/07/27 12:00:58 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 110.2 KB, free 1566.7 MB)
15/07/27 12:00:58 INFO MemoryStore: ensureFreeSpace(10627) called with curMem=112832, maxMem=1642919362
15/07/27 12:00:58 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 10.4 KB, free 1566.7 MB)
15/07/27 12:00:58 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.240.63.109:38947 (size: 10.4 KB, free: 1566.8 MB)
15/07/27 12:00:58 INFO SparkContext: Created broadcast 0 from textFile at SparkBasicTest.scala:36
15/07/27 12:00:58 INFO GoogleHadoopFileSystemBase: GHFS version: 1.4.0-hadoop1
15/07/27 12:01:00 INFO SparkDeploySchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor#10.240.205.199:54716/user/Executor#396919943]) with ID 0
15/07/27 12:01:00 INFO BlockManagerMasterEndpoint: Registering block manager 10.240.205.199:36835 with 5.3 GB RAM, BlockManagerId(0, 10.240.205.199, 36835)
15/07/27 12:01:02 INFO FileInputFormat: Total input paths to process : 47
15/07/27 12:01:02 INFO SparkContext: Starting job: count at SparkBasicTest.scala:37
15/07/27 12:01:02 INFO DAGScheduler: Got job 0 (count at SparkBasicTest.scala:37) with 47 output partitions (allowLocal=false)
15/07/27 12:01:02 INFO DAGScheduler: Final stage: ResultStage 0(count at SparkBasicTest.scala:37)
15/07/27 12:01:02 INFO DAGScheduler: Parents of final stage: List()
15/07/27 12:01:02 INFO DAGScheduler: Missing parents: List()
15/07/27 12:01:02 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at textFile at SparkBasicTest.scala:36), which has no missing parents
15/07/27 12:01:02 INFO MemoryStore: ensureFreeSpace(2968) called with curMem=123459, maxMem=1642919362
15/07/27 12:01:02 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 2.9 KB, free 1566.7 MB)
15/07/27 12:01:02 INFO MemoryStore: ensureFreeSpace(1752) called with curMem=126427, maxMem=1642919362
15/07/27 12:01:02 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1752.0 B, free 1566.7 MB)
15/07/27 12:01:02 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 10.240.63.109:38947 (size: 1752.0 B, free: 1566.8 MB)
15/07/27 12:01:02 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:874
15/07/27 12:01:02 INFO DAGScheduler: Submitting 47 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at textFile at SparkBasicTest.scala:36)
15/07/27 12:01:02 INFO TaskSchedulerImpl: Adding task set 0.0 with 47 tasks
15/07/27 12:01:03 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, 10.240.205.199, PROCESS_LOCAL, 1416 bytes)
15/07/27 12:01:03 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, 10.240.205.199, PROCESS_LOCAL, 1416 bytes)
15/07/27 12:01:03 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, 10.240.205.199, PROCESS_LOCAL, 1416 bytes)
15/07/27 12:01:03 WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, 10.240.205.199): java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.readFully(ObjectInputStream.java:2744)
at java.io.ObjectInputStream.readFully(ObjectInputStream.java:1032)
at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63)
at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101)
at org.apache.hadoop.io.UTF8.readChars(UTF8.java:216)
at org.apache.hadoop.io.UTF8.readString(UTF8.java:208)
at org.apache.hadoop.mapred.FileSplit.readFields(FileSplit.java:87)
at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:237)
at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:66)
at org.apache.spark.SerializableWritable$$anonfun$readObject$1.apply$mcV$sp(SerializableWritable.scala:45)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1239)
at org.apache.spark.SerializableWritable.readObject(SerializableWritable.scala:41)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:95)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:194)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/07/27 12:01:03 INFO TaskSetManager: Starting task 1.1 in stage 0.0 (TID 3, 10.240.205.199, PROCESS_LOCAL, 1416 bytes)
15/07/27 12:01:03 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 4, 10.240.205.199, PROCESS_LOCAL, 1416 bytes)
15/07/27 12:01:03 INFO TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0) on executor 10.240.205.199: java.io.EOFException (null) [duplicate 1]
15/07/27 12:01:03 INFO TaskSetManager: Lost task 2.0 in stage 0.0 (TID 2) on executor 10.240.205.199: java.io.EOFException (null) [duplicate 2]
15/07/27 12:01:03 INFO TaskSetManager: Starting task 2.1 in stage 0.0 (TID 5, 10.240.205.199, PROCESS_LOCAL, 1416 bytes)
15/07/27 12:01:03 INFO TaskSetManager: Starting task 0.1 in stage 0.0 (TID 6, 10.240.205.199, PROCESS_LOCAL, 1416 bytes)
15/07/27 12:01:03 INFO TaskSetManager: Lost task 3.0 in stage 0.0 (TID 4) on executor 10.240.205.199: java.io.EOFException (null) [duplicate 3]
15/07/27 12:01:03 INFO TaskSetManager: Lost task 1.1 in stage 0.0 (TID 3) on executor 10.240.205.199: java.io.EOFException (null) [duplicate 4]
15/07/27 12:01:03 INFO TaskSetManager: Starting task 1.2 in stage 0.0 (TID 7, 10.240.205.199, PROCESS_LOCAL, 1416 bytes)
15/07/27 12:01:03 INFO TaskSetManager: Lost task 2.1 in stage 0.0 (TID 5) on executor 10.240.205.199: java.io.EOFException (null) [duplicate 5]
15/07/27 12:01:03 INFO TaskSetManager: Starting task 2.2 in stage 0.0 (TID 8, 10.240.205.199, PROCESS_LOCAL, 1416 bytes)
15/07/27 12:01:03 INFO TaskSetManager: Lost task 0.1 in stage 0.0 (TID 6) on executor 10.240.205.199: java.io.EOFException (null) [duplicate 6]
15/07/27 12:01:03 INFO TaskSetManager: Starting task 0.2 in stage 0.0 (TID 9, 10.240.205.199, PROCESS_LOCAL, 1416 bytes)
15/07/27 12:01:03 INFO TaskSetManager: Lost task 1.2 in stage 0.0 (TID 7) on executor 10.240.205.199: java.io.EOFException (null) [duplicate 7]
15/07/27 12:01:03 INFO TaskSetManager: Starting task 1.3 in stage 0.0 (TID 10, 10.240.205.199, PROCESS_LOCAL, 1416 bytes)
15/07/27 12:01:03 INFO TaskSetManager: Lost task 2.2 in stage 0.0 (TID 8) on executor 10.240.205.199: java.io.EOFException (null) [duplicate 8]
15/07/27 12:01:03 INFO TaskSetManager: Starting task 2.3 in stage 0.0 (TID 11, 10.240.205.199, PROCESS_LOCAL, 1416 bytes)
15/07/27 12:01:03 INFO TaskSetManager: Lost task 0.2 in stage 0.0 (TID 9) on executor 10.240.205.199: java.io.EOFException (null) [duplicate 9]
15/07/27 12:01:03 INFO TaskSetManager: Lost task 1.3 in stage 0.0 (TID 10) on executor 10.240.205.199: java.io.EOFException (null) [duplicate 10]
15/07/27 12:01:03 ERROR TaskSetManager: Task 1 in stage 0.0 failed 4 times; aborting job
15/07/27 12:01:03 INFO TaskSetManager: Lost task 2.3 in stage 0.0 (TID 11) on executor 10.240.205.199: java.io.EOFException (null) [duplicate 11]
15/07/27 12:01:03 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
15/07/27 12:01:03 INFO TaskSchedulerImpl: Cancelling stage 0
15/07/27 12:01:03 INFO DAGScheduler: ResultStage 0 (count at SparkBasicTest.scala:37) failed in 0.319 s
15/07/27 12:01:03 INFO DAGScheduler: Job 0 failed: count at SparkBasicTest.scala:37, took 0.437413 s
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 10, 10.240.205.199): java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.readFully(ObjectInputStream.java:2744)
at java.io.ObjectInputStream.readFully(ObjectInputStream.java:1032)
at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63)
at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101)
at org.apache.hadoop.io.UTF8.readChars(UTF8.java:216)
at org.apache.hadoop.io.UTF8.readString(UTF8.java:208)
at org.apache.hadoop.mapred.FileSplit.readFields(FileSplit.java:87)
at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:237)
at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:66)
at org.apache.spark.SerializableWritable$$anonfun$readObject$1.apply$mcV$sp(SerializableWritable.scala:45)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1239)
at org.apache.spark.SerializableWritable.readObject(SerializableWritable.scala:41)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:95)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:194)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1266)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1257)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1256)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1256)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:730)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1450)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1411)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
15/07/27 12:01:03 INFO SparkContext: Invoking stop() from shutdown hook
15/07/27 12:01:03 INFO SparkUI: Stopped Spark web UI at http://10.240.63.109:4040
15/07/27 12:01:03 INFO DAGScheduler: Stopping DAGScheduler
15/07/27 12:01:03 INFO SparkDeploySchedulerBackend: Shutting down all executors
15/07/27 12:01:03 INFO SparkDeploySchedulerBackend: Asking each executor to shut down
15/07/27 12:01:03 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
15/07/27 12:01:03 INFO Utils: path = /mnt/pd1/hadoop/spark/tmp/spark-dbaf72cb-599b-40c9-a9f8-ad9ede2b0654/blockmgr-24fd090a-b9df-4754-8022-ccaf8800ca2a, already present as root for deletion.
15/07/27 12:01:03 INFO MemoryStore: MemoryStore cleared
15/07/27 12:01:03 INFO BlockManager: BlockManager stopped
15/07/27 12:01:03 INFO BlockManagerMaster: BlockManagerMaster stopped
15/07/27 12:01:03 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
15/07/27 12:01:03 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/07/27 12:01:03 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/07/27 12:01:03 INFO SparkContext: Successfully stopped SparkContext
15/07/27 12:01:03 INFO Utils: Shutdown hook called
15/07/27 12:01:03 INFO Utils: Deleting directory /mnt/pd1/hadoop/spark/tmp/spark-dbaf72cb-599b-40c9-a9f8-ad9ede2b0654
Process finished with exit code 1
What am I missing?
Thanks in advance for any help !
One possibility is that you somehow have a mismatch of Hadoop versions on your classpaths. In particular, if you use Spark's prebuilt tarball that was built for Hadoop 2 but run it on a cluster that has Hadoop 1 installed, you may hit the error you encountered. Note that the stack trace indicates errors when trying to "readObject", which means trying to deserialize a class; if class definitions differ between classloaders, this can happen.
I set up a few different IntelliJ installations on different bdutil-deployed Spark clusters, and encountered the same stack trace you saw when I tried running from cluster that has spark-1.4.0-bin-hadoop2.6.tgz for the IntelliJ library and driver, but submitting to another node which is using spark-1.4.0-bin-hadoop1.tgz. Here's a related stack overflow question running on EC2 and here's another manifestation that wasn't a mismatch, but a requirement to add the hadoop-client library to the classpath.