spark yarn cluster Stack trace: ExitCodeException exitCode=13 - scala

I am trying to run my program in yarn-cluster mode. I am 100% sure the class exists in the fat jar built by sbt.
I have no clue why Spark always throws the Stack trace: ExitCodeException exitCode=13 error.
Following the tracking page, I see java.lang.ClassNotFoundException: org.air.ebds.organize.geotrellisETLtoa.test.
The Spark Pi example runs fine in yarn-cluster mode, but my program still fails in yarn-client/local mode with the same error: java.lang.ClassNotFoundException: org.air.ebds.organize.geotrellisETLtoa.test
P.S. The Spark conf in the program looks like this:
object test {
  val masterUrl = "local[*]"
  var sparkConf = new SparkConf()
    .setAppName("TiffDN2TOA")
    // .setIfMissing("spark.master", masterUrl)
    .set("spark.executor.memory", "10g")
    .set("spark.kryoserializer.buffer.max", "1024")
  implicit val sc = new SparkContext(sparkConf)

  def main(args: Array[String]): Unit = {
    HadoopLandsatDN2ToaMethods.scenesDn2Toa(args(0), args(1))
  }
}
spark2-submit \
> --master yarn \
> --deploy-mode cluster \
> --class org.air.ebds.organize.geotrellisETLtoa.LandsatDN2Toa \
> --num-executors 4 \
> --executor-cores 4 \
> --executor-memory 10G \
> --driver-memory 12g \
> --conf "spark.kryoserializer.buffer.max=1024m spark.kryoserializer.buffer=1024m" \
> /root/Desktop/toa.jar \
> /root/Desktop/ebds_landsat8/LC08/122/031/LC08_L1TP_122031_20140727,/root/Desktop/ebds_landsat8/LC08/122/031/LC08_L1TP_122031_20140913,LC08_L1TP_122031_20141116 \
> file:///
19/07/10 16:40:21 INFO client.RMProxy: Connecting to ResourceManager at bigdataone/192.168.1.151:8032
19/07/10 16:40:21 INFO yarn.Client: Requesting a new application from cluster with 3 NodeManagers
19/07/10 16:40:21 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (65536 MB per container)
19/07/10 16:40:21 INFO yarn.Client: Will allocate AM container, with 13516 MB memory including 1228 MB overhead
19/07/10 16:40:21 INFO yarn.Client: Setting up container launch context for our AM
19/07/10 16:40:21 INFO yarn.Client: Setting up the launch environment for our AM container
19/07/10 16:40:21 INFO yarn.Client: Preparing resources for our AM container
19/07/10 16:40:21 INFO yarn.Client: Uploading resource file:/root/Desktop/toa.jar -> hdfs://bigdataone:8020/user/root/.sparkStaging/application_1561542066113_0061/toa.jar
19/07/10 16:40:33 INFO yarn.Client: Uploading resource file:/tmp/spark-0ccc5b92-4ef5-4f5e-944b-386abcbb5938/__spark_conf__3393474382108225503.zip -> hdfs://bigdataone:8020/user/root/.sparkStaging/application_1561542066113_0061/__spark_conf__.zip
19/07/10 16:40:33 INFO spark.SecurityManager: Changing view acls to: root
19/07/10 16:40:33 INFO spark.SecurityManager: Changing modify acls to: root
19/07/10 16:40:33 INFO spark.SecurityManager: Changing view acls groups to:
19/07/10 16:40:33 INFO spark.SecurityManager: Changing modify acls groups to:
19/07/10 16:40:33 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
19/07/10 16:40:34 INFO yarn.Client: Submitting application application_1561542066113_0061 to ResourceManager
19/07/10 16:40:34 INFO impl.YarnClientImpl: Submitted application application_1561542066113_0061
19/07/10 16:40:35 INFO yarn.Client: Application report for application_1561542066113_0061 (state: ACCEPTED)
19/07/10 16:40:35 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: root.users.root
start time: 1562748034023
final status: UNDEFINED
tracking URL: http://bigdataone:8088/proxy/application_1561542066113_0061/
user: root
19/07/10 16:40:36 INFO yarn.Client: Application report for application_1561542066113_0061 (state: ACCEPTED)
19/07/10 16:40:37 INFO yarn.Client: Application report for application_1561542066113_0061 (state: ACCEPTED)
19/07/10 16:40:38 INFO yarn.Client: Application report for application_1561542066113_0061 (state: ACCEPTED)
19/07/10 16:40:39 INFO yarn.Client: Application report for application_1561542066113_0061 (state: ACCEPTED)
19/07/10 16:40:40 INFO yarn.Client: Application report for application_1561542066113_0061 (state: ACCEPTED)
19/07/10 16:40:41 INFO yarn.Client: Application report for application_1561542066113_0061 (state: ACCEPTED)
19/07/10 16:40:42 INFO yarn.Client: Application report for application_1561542066113_0061 (state: ACCEPTED)
19/07/10 16:40:43 INFO yarn.Client: Application report for application_1561542066113_0061 (state: ACCEPTED)
19/07/10 16:40:44 INFO yarn.Client: Application report for application_1561542066113_0061 (state: ACCEPTED)
19/07/10 16:40:45 INFO yarn.Client: Application report for application_1561542066113_0061 (state: ACCEPTED)
19/07/10 16:40:46 INFO yarn.Client: Application report for application_1561542066113_0061 (state: ACCEPTED)
19/07/10 16:40:47 INFO yarn.Client: Application report for application_1561542066113_0061 (state: ACCEPTED)
19/07/10 16:40:48 INFO yarn.Client: Application report for application_1561542066113_0061 (state: ACCEPTED)
19/07/10 16:40:49 INFO yarn.Client: Application report for application_1561542066113_0061 (state: ACCEPTED)
19/07/10 16:40:50 INFO yarn.Client: Application report for application_1561542066113_0061 (state: ACCEPTED)
19/07/10 16:40:51 INFO yarn.Client: Application report for application_1561542066113_0061 (state: FAILED)
19/07/10 16:40:51 INFO yarn.Client:
client token: N/A
diagnostics: Application application_1561542066113_0061 failed 2 times due to AM Container for appattempt_1561542066113_0061_000002 exited with exitCode: 13
For more detailed output, check application tracking page:http://bigdataone:8088/proxy/application_1561542066113_0061/Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1561542066113_0061_02_000001
Exit code: 13
Stack trace: ExitCodeException exitCode=13:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:604)
at org.apache.hadoop.util.Shell.run(Shell.java:507)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:789)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:213)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Container exited with a non-zero exit code 13
Failing this attempt. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: root.users.root
start time: 1562748034023
final status: FAILED
tracking URL: http://bigdataone:8088/cluster/app/application_1561542066113_0061
user: root
Exception in thread "main" org.apache.spark.SparkException: Application application_1561542066113_0061 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1153)
at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1568)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:892)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
19/07/10 16:40:51 INFO util.ShutdownHookManager: Shutdown hook called
19/07/10 16:40:51 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-5e6eb641-9f2a-4351-947d-a3b4cf578f6d
19/07/10 16:40:51 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-0ccc5b92-4ef5-4f5e-944b-386abcbb5938
The application tracking page (http://bigdataone:8088/proxy/application_1561542066113_0061) shows:
19/07/10 15:14:10 INFO util.SignalUtils: Registered signal handler for TERM
19/07/10 15:14:10 INFO util.SignalUtils: Registered signal handler for HUP
19/07/10 15:14:10 INFO util.SignalUtils: Registered signal handler for INT
19/07/10 15:14:10 INFO spark.SecurityManager: Changing view acls to: yarn,root
19/07/10 15:14:10 INFO spark.SecurityManager: Changing modify acls to: yarn,root
19/07/10 15:14:10 INFO spark.SecurityManager: Changing view acls groups to:
19/07/10 15:14:10 INFO spark.SecurityManager: Changing modify acls groups to:
19/07/10 15:14:10 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, root); groups with view permissions: Set(); users with modify permissions: Set(yarn, root); groups with modify permissions: Set()
19/07/10 15:14:10 INFO yarn.ApplicationMaster: Preparing Local resources
19/07/10 15:14:11 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1561542066113_0055_000002
19/07/10 15:14:11 INFO yarn.ApplicationMaster: Starting the user application in a separate Thread
19/07/10 15:14:11 ERROR yarn.ApplicationMaster: Uncaught exception:
java.lang.ClassNotFoundException: org.air.ebds.organize.geotrellisETLtoa.test
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.spark.deploy.yarn.ApplicationMaster.startUserApplication(ApplicationMaster.scala:682)
at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:448)
at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$runImpl(ApplicationMaster.scala:301)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply$mcV$sp(ApplicationMaster.scala:241)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply(ApplicationMaster.scala:241)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply(ApplicationMaster.scala:241)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:782)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at org.apache.spark.deploy.yarn.ApplicationMaster.doAsUser(ApplicationMaster.scala:781)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:240)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:806)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
19/07/10 15:14:11 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 13, (reason: Uncaught exception: java.lang.ClassNotFoundException: org.air.ebds.organize.geotrellisETLtoa.test)
19/07/10 15:14:11 INFO yarn.ApplicationMaster: Deleting staging directory hdfs://bigdataone:8020/user/root/.sparkStaging/application_1561542066113_0055
19/07/10 15:14:11 INFO util.ShutdownHookManager: Shutdown hook called
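No answer is recorded here, but two details stand out. The AM log (from an earlier attempt, application_1561542066113_0055) fails to load org.air.ebds.organize.geotrellisETLtoa.test, so the fully qualified name passed to --class has to match the object exactly. Also, spark-submit takes one property per --conf flag, so the quoted "spark.kryoserializer.buffer.max=1024m spark.kryoserializer.buffer=1024m" is parsed as a single malformed property. As a hedged sketch only (not a confirmed fix), the entry point could build the SparkContext inside main instead of in the object body, where it runs during class initialization:

import org.apache.spark.{SparkConf, SparkContext}

object test {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf()
      .setAppName("TiffDN2TOA")
      // No setMaster here: spark-submit --master yarn supplies it.
      .set("spark.executor.memory", "10g")
      .set("spark.kryoserializer.buffer.max", "1024m") // explicit size unit
    implicit val sc = new SparkContext(sparkConf)
    HadoopLandsatDN2ToaMethods.scenesDn2Toa(args(0), args(1))
  }
}

This would be submitted with --class org.air.ebds.organize.geotrellisETLtoa.test (or the object renamed and the flag updated to match).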

Related

hdfs : File does not exist

I read data from an S3 bucket, do the computation in Spark, and write the output back to an S3 bucket. This process completes successfully, but at the EMR step level I see the job failed. The log says "File does not exist".
Please see the log below.
19/01/09 08:40:37 INFO RMProxy: Connecting to ResourceManager at ip-172-30-0-84.ap-northeast-1.compute.internal/172.30.0.84:8032
19/01/09 08:40:37 INFO Client: Requesting a new application from cluster with 2 NodeManagers
19/01/09 08:40:37 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (106496 MB per container)
19/01/09 08:40:37 INFO Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
19/01/09 08:40:37 INFO Client: Setting up container launch context for our AM
19/01/09 08:40:37 INFO Client: Setting up the launch environment for our AM container
19/01/09 08:40:37 INFO Client: Preparing resources for our AM container
19/01/09 08:40:39 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
19/01/09 08:40:43 INFO Client: Uploading resource file:/mnt/tmp/spark-e0c6fbd3-14b0-4fcd-bbd2-c78658fdefd0/__spark_libs__8470659354947187213.zip -> hdfs://ip-172-30-0-84.ap-northeast-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547023042733_0001/__spark_libs__8470659354947187213.zip
19/01/09 08:40:47 INFO Client: Uploading resource s3://dev-system/SparkApps/jar/rxsicheck.jar -> hdfs://ip-172-30-0-84.ap-northeast-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547023042733_0001/rxsicheck.jar
19/01/09 08:40:47 INFO S3NativeFileSystem: Opening 's3://dev-system/SparkApps/jar/rxsicheck.jar' for reading
19/01/09 08:40:47 INFO Client: Uploading resource file:/mnt/tmp/spark-e0c6fbd3-14b0-4fcd-bbd2-c78658fdefd0/__spark_conf__4575598882972227909.zip -> hdfs://ip-172-30-0-84.ap-northeast-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547023042733_0001/__spark_conf__.zip
19/01/09 08:40:47 INFO SecurityManager: Changing view acls to: hadoop
19/01/09 08:40:47 INFO SecurityManager: Changing modify acls to: hadoop
19/01/09 08:40:47 INFO SecurityManager: Changing view acls groups to:
19/01/09 08:40:47 INFO SecurityManager: Changing modify acls groups to:
19/01/09 08:40:47 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); groups with view permissions: Set(); users with modify permissions: Set(hadoop); groups with modify permissions: Set()
19/01/09 08:40:47 INFO Client: Submitting application application_1547023042733_0001 to ResourceManager
19/01/09 08:40:48 INFO YarnClientImpl: Submitted application application_1547023042733_0001
19/01/09 08:40:49 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:40:49 INFO Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1547023248110
final status: UNDEFINED
tracking URL: http://ip-172-30-0-84.ap-northeast-1.compute.internal:20888/proxy/application_1547023042733_0001/
user: hadoop
19/01/09 08:40:50 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:40:51 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:40:52 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:40:53 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:40:54 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:40:55 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:40:56 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:40:57 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:40:58 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:40:59 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:41:00 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:41:01 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:41:02 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:41:03 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:41:04 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:41:05 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:41:06 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:41:07 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:41:08 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:41:09 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:41:10 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:41:11 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:41:12 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:41:13 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:41:14 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:41:15 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:41:16 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:41:17 INFO Client: Application report for application_1547023042733_0001 (state: ACCEPTED)
19/01/09 08:41:18 INFO Client: Application report for application_1547023042733_0001 (state: FAILED)
19/01/09 08:41:18 INFO Client:
client token: N/A
diagnostics: Application application_1547023042733_0001 failed 2 times due to AM Container for appattempt_1547023042733_0001_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://ip-172-30-0-84.ap-northeast-1.compute.internal:8088/cluster/app/application_1547023042733_0001Then, click on links to logs of each attempt.
Diagnostics: File does not exist: hdfs://ip-172-30-0-84.ap-northeast-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547023042733_0001/__spark_libs__8470659354947187213.zip
java.io.FileNotFoundException: File does not exist: hdfs://ip-172-30-0-84.ap-northeast-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547023042733_0001/__spark_libs__8470659354947187213.zip
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:359)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Failing this attempt. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1547023248110
final status: FAILED
tracking URL: http://ip-172-30-0-84.ap-northeast-1.compute.internal:8088/cluster/app/application_1547023042733_0001
user: hadoop
Exception in thread "main" org.apache.spark.SparkException: Application application_1547023042733_0001 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1122)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1168)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:775)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
19/01/09 08:41:18 INFO ShutdownHookManager: Shutdown hook called
19/01/09 08:41:18 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-e0c6fbd3-14b0-4fcd-bbd2-c78658fdefd0
Command exiting with ret '1'
I can see my expected output result, but the job shows as failed. Am I missing anything?
Here is my code:
package Spark_package

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

object SampleFile {
  def main(args: Array[String]) {
    val spark = SparkSession.builder.master("local[*]").appName("SampleFile").getOrCreate()
    val sc = spark.sparkContext
    val conf = new SparkConf().setAppName("SampleFile")
    val sqlContext = spark.sqlContext

    val df = spark.read.format("csv").option("header", "true").option("inferSchema", "true").load("s3a://test-system/Checktool/Zipdata/*.gz")
    df.createOrReplaceTempView("data")
    val res = spark.sql("select count(*) from data")
    res.coalesce(1).write.format("csv").option("header", "true").mode("Overwrite").save("s3a://dev-system/Checktool/bkup/")
    spark.stop()
  }
}
Kindly help me solve this issue.
Removing master("local[*]") and running it on the respective cluster works.
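A minimal sketch of the same job with the hard-coded master removed, so that spark-submit (or the EMR step) decides where the driver runs:

import org.apache.spark.sql.SparkSession

object SampleFile {
  def main(args: Array[String]): Unit = {
    // No .master("local[*]"): the cluster manager comes from spark-submit.
    val spark = SparkSession.builder.appName("SampleFile").getOrCreate()
    val df = spark.read.format("csv")
      .option("header", "true")
      .option("inferSchema", "true")
      .load("s3a://test-system/Checktool/Zipdata/*.gz")
    df.createOrReplaceTempView("data")
    spark.sql("select count(*) from data")
      .coalesce(1)
      .write.format("csv")
      .option("header", "true")
      .mode("Overwrite")
      .save("s3a://dev-system/Checktool/bkup/")
    spark.stop()
  }
}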

Deploy Spark Scala on Cloudera or EMR

I created a sample app (Snippet 1) and deployed it on Cloudera (built with sbt assembly); it works.
import org.apache.spark.{SparkConf, SparkContext}
import scala.math.random

object Main {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Spark Pi")
    val spark = new SparkContext(conf)
    val slices = if (args.length > 0) args(0).toInt else 2
    val n = math.min(100000L * slices, Int.MaxValue).toInt // avoid overflow
    val count = spark
      .parallelize(1 until n, slices)
      .map { i =>
        val x = random * 2 - 1
        val y = random * 2 - 1
        if (x * x + y * y < 1) 1 else 0
      }
      .reduce(_ + _)
    println("Pi is roughly " + 4.0 * count / n)
    spark.stop()
  }
}
So I want to read a JSON file and print the schema, so obviously I created a new SparkSession:
import org.apache.spark.sql.SparkSession

object Main {
  def main(args: Array[String]) {
    val spark = SparkSession
      .builder()
      .master("local[*]")
      .appName("Test")
      .getOrCreate
    spark.read.json(getClass.getResource("data.json").getPath)
      .printSchema()
    spark.stop()
  }
}
This snippet won't work, because I think I shouldn't call sparkSession.builder(). But read.json is only available on SparkSession.
This is my command to run the jar:
spark-submit --master yarn-cluster path_to_jar
Stack trace:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/flume-ng/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/parquet/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/07/20 04:15:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/07/20 04:15:36 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/07/20 04:15:36 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
17/07/20 04:15:36 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
17/07/20 04:15:36 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
17/07/20 04:15:36 INFO yarn.Client: Setting up container launch context for our AM
17/07/20 04:15:36 INFO yarn.Client: Setting up the launch environment for our AM container
17/07/20 04:15:36 INFO yarn.Client: Preparing resources for our AM container
17/07/20 04:15:38 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
17/07/20 04:15:38 INFO yarn.Client: Uploading resource file:/usr/lib/spark/lib/spark-assembly-1.6.0-cdh5.10.0-hadoop2.6.0-cdh5.10.0.jar -> hdfs://quickstart.cloudera:8020/user/cloudera/.sparkStaging/application_1500545966176_0008/spark-assembly-1.6.0-cdh5.10.0-hadoop2.6.0-cdh5.10.0.jar
17/07/20 04:15:39 INFO yarn.Client: Uploading resource file:/media/sf_SparkS3AirFlow/target/scala-2.11/spark-emr-test.jar -> hdfs://quickstart.cloudera:8020/user/cloudera/.sparkStaging/application_1500545966176_0008/spark-emr-test.jar
17/07/20 04:15:41 INFO yarn.Client: Uploading resource file:/tmp/spark-3a2fec6b-6148-49a9-9db7-7a7cca99c586/__spark_conf__2098882718740574102.zip -> hdfs://quickstart.cloudera:8020/user/cloudera/.sparkStaging/application_1500545966176_0008/__spark_conf__2098882718740574102.zip
17/07/20 04:15:41 INFO spark.SecurityManager: Changing view acls to: cloudera
17/07/20 04:15:41 INFO spark.SecurityManager: Changing modify acls to: cloudera
17/07/20 04:15:41 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(cloudera); users with modify permissions: Set(cloudera)
17/07/20 04:15:41 INFO yarn.Client: Submitting application 8 to ResourceManager
17/07/20 04:15:41 INFO impl.YarnClientImpl: Submitted application application_1500545966176_0008
17/07/20 04:15:42 INFO yarn.Client: Application report for application_1500545966176_0008 (state: ACCEPTED)
17/07/20 04:15:42 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: root.cloudera
start time: 1500549341177
final status: UNDEFINED
tracking URL: http://quickstart.cloudera:8088/proxy/application_1500545966176_0008/
user: cloudera
17/07/20 04:15:43 INFO yarn.Client: Application report for application_1500545966176_0008 (state: ACCEPTED)
17/07/20 04:15:44 INFO yarn.Client: Application report for application_1500545966176_0008 (state: ACCEPTED)
17/07/20 04:15:45 INFO yarn.Client: Application report for application_1500545966176_0008 (state: ACCEPTED)
17/07/20 04:15:46 INFO yarn.Client: Application report for application_1500545966176_0008 (state: ACCEPTED)
17/07/20 04:15:47 INFO yarn.Client: Application report for application_1500545966176_0008 (state: ACCEPTED)
17/07/20 04:15:48 INFO yarn.Client: Application report for application_1500545966176_0008 (state: ACCEPTED)
17/07/20 04:15:49 INFO yarn.Client: Application report for application_1500545966176_0008 (state: ACCEPTED)
17/07/20 04:15:50 INFO yarn.Client: Application report for application_1500545966176_0008 (state: ACCEPTED)
17/07/20 04:15:51 INFO yarn.Client: Application report for application_1500545966176_0008 (state: ACCEPTED)
17/07/20 04:15:52 INFO yarn.Client: Application report for application_1500545966176_0008 (state: ACCEPTED)
17/07/20 04:15:53 INFO yarn.Client: Application report for application_1500545966176_0008 (state: ACCEPTED)
17/07/20 04:15:54 INFO yarn.Client: Application report for application_1500545966176_0008 (state: ACCEPTED)
17/07/20 04:15:55 INFO yarn.Client: Application report for application_1500545966176_0008 (state: FAILED)
17/07/20 04:15:55 INFO yarn.Client:
client token: N/A
diagnostics: Application application_1500545966176_0008 failed 2 times due to AM Container for appattempt_1500545966176_0008_000002 exited with exitCode: 15
For more detailed output, check application tracking page:http://quickstart.cloudera:8088/proxy/application_1500545966176_0008/Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1500545966176_0008_02_000001
Exit code: 15
Stack trace: ExitCodeException exitCode=15:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:601)
at org.apache.hadoop.util.Shell.run(Shell.java:504)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:786)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:213)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 15
Failing this attempt. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: root.cloudera
start time: 1500549341177
final status: FAILED
tracking URL: http://quickstart.cloudera:8088/cluster/app/application_1500545966176_0008
user: cloudera
Exception in thread "main" org.apache.spark.SparkException: Application application_1500545966176_0008 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1030)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1077)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/07/20 04:15:55 INFO util.ShutdownHookManager: Shutdown hook called
17/07/20 04:15:55 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-3a2fec6b-6148-49a9-9db7-7a7cca99c586
[cloudera@quickstart ~]$ spark-submit --master yarn-cluster /media/sf_SparkS3AirFlow/target/scala-2.11/spark-emr-test.jar
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/flume-ng/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/parquet/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/07/20 04:50:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/07/20 04:50:04 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/07/20 04:50:04 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
17/07/20 04:50:04 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
17/07/20 04:50:04 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
17/07/20 04:50:04 INFO yarn.Client: Setting up container launch context for our AM
17/07/20 04:50:04 INFO yarn.Client: Setting up the launch environment for our AM container
17/07/20 04:50:04 INFO yarn.Client: Preparing resources for our AM container
17/07/20 04:50:06 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
17/07/20 04:50:06 INFO yarn.Client: Uploading resource file:/usr/lib/spark/lib/spark-assembly-1.6.0-cdh5.10.0-hadoop2.6.0-cdh5.10.0.jar -> hdfs://quickstart.cloudera:8020/user/cloudera/.sparkStaging/application_1500545966176_0009/spark-assembly-1.6.0-cdh5.10.0-hadoop2.6.0-cdh5.10.0.jar
17/07/20 04:50:07 INFO yarn.Client: Uploading resource file:/media/sf_SparkS3AirFlow/target/scala-2.11/spark-emr-test.jar -> hdfs://quickstart.cloudera:8020/user/cloudera/.sparkStaging/application_1500545966176_0009/spark-emr-test.jar
17/07/20 04:50:09 INFO yarn.Client: Uploading resource file:/tmp/spark-2dbf7a93-4377-4ad1-8e78-56330ee03b7f/__spark_conf__8411937016766228982.zip -> hdfs://quickstart.cloudera:8020/user/cloudera/.sparkStaging/application_1500545966176_0009/__spark_conf__8411937016766228982.zip
17/07/20 04:50:09 WARN hdfs.DFSClient: Caught exception
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1281)
at java.lang.Thread.join(Thread.java:1355)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:951)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:689)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:878)
17/07/20 04:50:09 INFO spark.SecurityManager: Changing view acls to: cloudera
17/07/20 04:50:09 INFO spark.SecurityManager: Changing modify acls to: cloudera
17/07/20 04:50:09 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(cloudera); users with modify permissions: Set(cloudera)
17/07/20 04:50:09 INFO yarn.Client: Submitting application 9 to ResourceManager
17/07/20 04:50:09 INFO impl.YarnClientImpl: Submitted application application_1500545966176_0009
17/07/20 04:50:10 INFO yarn.Client: Application report for application_1500545966176_0009 (state: ACCEPTED)
17/07/20 04:50:10 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: root.cloudera
start time: 1500551409216
final status: UNDEFINED
tracking URL: http://quickstart.cloudera:8088/proxy/application_1500545966176_0009/
user: cloudera
17/07/20 04:50:11 INFO yarn.Client: Application report for application_1500545966176_0009 (state: ACCEPTED)
17/07/20 04:50:12 INFO yarn.Client: Application report for application_1500545966176_0009 (state: ACCEPTED)
17/07/20 04:50:13 INFO yarn.Client: Application report for application_1500545966176_0009 (state: ACCEPTED)
17/07/20 04:50:14 INFO yarn.Client: Application report for application_1500545966176_0009 (state: ACCEPTED)
17/07/20 04:50:15 INFO yarn.Client: Application report for application_1500545966176_0009 (state: ACCEPTED)
17/07/20 04:50:16 INFO yarn.Client: Application report for application_1500545966176_0009 (state: ACCEPTED)
17/07/20 04:50:17 INFO yarn.Client: Application report for application_1500545966176_0009 (state: ACCEPTED)
17/07/20 04:50:18 INFO yarn.Client: Application report for application_1500545966176_0009 (state: ACCEPTED)
17/07/20 04:50:19 INFO yarn.Client: Application report for application_1500545966176_0009 (state: ACCEPTED)
17/07/20 04:50:20 INFO yarn.Client: Application report for application_1500545966176_0009 (state: ACCEPTED)
17/07/20 04:50:21 INFO yarn.Client: Application report for application_1500545966176_0009 (state: ACCEPTED)
17/07/20 04:50:22 INFO yarn.Client: Application report for application_1500545966176_0009 (state: FAILED)
17/07/20 04:50:22 INFO yarn.Client:
client token: N/A
diagnostics: Application application_1500545966176_0009 failed 2 times due to AM Container for appattempt_1500545966176_0009_000002 exited with exitCode: 15
For more detailed output, check application tracking page:http://quickstart.cloudera:8088/proxy/application_1500545966176_0009/Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1500545966176_0009_02_000001
Exit code: 15
Stack trace: ExitCodeException exitCode=15:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:601)
at org.apache.hadoop.util.Shell.run(Shell.java:504)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:786)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:213)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 15
Failing this attempt. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: root.cloudera
start time: 1500551409216
final status: FAILED
tracking URL: http://quickstart.cloudera:8088/cluster/app/application_1500545966176_0009
user: cloudera
Exception in thread "main" org.apache.spark.SparkException: Application application_1500545966176_0009 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1030)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1077)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/07/20 04:50:22 INFO util.ShutdownHookManager: Shutdown hook called
17/07/20 04:50:22 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-2dbf7a93-4377-4ad1-8e78-56330ee03b7f
[cloudera@quickstart ~]$
Yarn log of the app_id :
Container: container_1500545966176_0011_02_000001 on quickstart.cloudera_44248
================================================================================
LogType:stderr
Log Upload Time:Thu Jul 20 05:02:55 -0700 2017
LogLength:4104
Log Contents:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/flume-ng/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/parquet/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/07/20 05:02:45 INFO yarn.ApplicationMaster: Registered signal handlers for [TERM, HUP, INT]
17/07/20 05:02:49 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1500545966176_0011_000002
17/07/20 05:02:53 INFO spark.SecurityManager: Changing view acls to: yarn,cloudera
17/07/20 05:02:53 INFO spark.SecurityManager: Changing modify acls to: yarn,cloudera
17/07/20 05:02:53 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, cloudera); users with modify permissions: Set(yarn, cloudera)
17/07/20 05:02:53 INFO yarn.ApplicationMaster: Starting the user application in a separate Thread
17/07/20 05:02:53 INFO yarn.ApplicationMaster: Waiting for spark context initialization...
17/07/20 05:02:53 ERROR yarn.ApplicationMaster: User class threw exception: java.lang.NoSuchMethodError: scala.Predef$.ArrowAssoc(Ljava/lang/Object;)Ljava/lang/Object;
java.lang.NoSuchMethodError: scala.Predef$.ArrowAssoc(Ljava/lang/Object;)Ljava/lang/Object;
at org.apache.spark.sql.SparkSession$Builder.config(SparkSession.scala:780)
at org.apache.spark.sql.SparkSession$Builder.master(SparkSession.scala:833)
at Main$.main(Main.scala:8)
at Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:552)
17/07/20 05:02:53 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.lang.NoSuchMethodError: scala.Predef$.ArrowAssoc(Ljava/lang/Object;)Ljava/lang/Object;)
17/07/20 05:02:53 ERROR yarn.ApplicationMaster: Uncaught exception:
java.util.concurrent.ExecutionException: Boxed Error
at scala.concurrent.impl.Promise$.resolver(Promise.scala:55)
at scala.concurrent.impl.Promise$.scala$concurrent$impl$Promise$$resolveTry(Promise.scala:47)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:244)
at scala.concurrent.Promise$class.tryFailure(Promise.scala:112)
at scala.concurrent.impl.Promise$DefaultPromise.tryFailure(Promise.scala:153)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:570)
Caused by: java.lang.NoSuchMethodError: scala.Predef$.ArrowAssoc(Ljava/lang/Object;)Ljava/lang/Object;
at org.apache.spark.sql.SparkSession$Builder.config(SparkSession.scala:780)
at org.apache.spark.sql.SparkSession$Builder.master(SparkSession.scala:833)
at Main$.main(Main.scala:8)
at Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:552)
17/07/20 05:02:53 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: User class threw exception: java.lang.NoSuchMethodError: scala.Predef$.ArrowAssoc(Ljava/lang/Object;)Ljava/lang/Object;)
17/07/20 05:02:53 INFO yarn.ApplicationMaster: Deleting staging directory .sparkStaging/application_1500545966176_0011
17/07/20 05:02:53 INFO util.ShutdownHookManager: Shutdown hook called
LogType:stdout
Log Upload Time:Thu Jul 20 05:02:55 -0700 2017
LogLength:0
Log Contents:
Can you please tell me what's wrong with this and how to fix it?
Thank you.
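No answer is recorded here, but a java.lang.NoSuchMethodError on scala.Predef$.ArrowAssoc usually signals a binary mismatch: the jar is built under target/scala-2.11 and bundles Spark 2.x's SparkSession, while the cluster runs spark-assembly-1.6.0-cdh5.10.0, which predates SparkSession and is typically built against Scala 2.10. A hedged sketch of a Spark 1.6-compatible variant, assuming the jar is rebuilt against the cluster's Scala version and the JSON path is passed as an argument (a getClass.getResource path inside the jar is generally not readable by the DataFrame reader in yarn-cluster mode):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object Main {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("Test") // no setMaster: yarn-cluster supplies it
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    sqlContext.read.json(args(0)).printSchema() // args(0): an HDFS/S3/local path to data.json
    sc.stop()
  }
}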

How to run Spark Scala code on Amazon EMR

I am trying to run the following piece of Spark code written in Scala on Amazon EMR:
import org.apache.spark.{SparkConf, SparkContext}

object TestRunner {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("Hello World")
    val sc = new SparkContext(conf)
    val words = sc.parallelize(Seq("a", "b", "c", "d", "e"))
    val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
    println(wordCounts)
  }
}
This is the script I am using to deploy the above code into EMR:
#!/usr/bin/env bash
set -euxo pipefail
cluster_id='j-XXXXXXXXXX'
app_name="HelloWorld"
main_class="TestRunner"
jar_name="HelloWorld-assembly-0.0.1-SNAPSHOT.jar"
jar_path="target/scala-2.11/${jar_name}"
s3_jar_dir="s3://jars/"
s3_jar_path="${s3_jar_dir}${jar_name}"
###################################################
sbt assembly
aws s3 cp ${jar_path} ${s3_jar_dir}
aws emr add-steps --cluster-id ${cluster_id} --steps Type=spark,Name=${app_name},Args=[--deploy-mode,cluster,--master,yarn-cluster,--class,${main_class},${s3_jar_path}],ActionOnFailure=CONTINUE
But this exits after a few minutes, producing no output at all in AWS!
Here's my controller's output:
2016-10-20T21:03:17.043Z INFO Ensure step 3 jar file command-runner.jar
2016-10-20T21:03:17.043Z INFO StepRunner: Created Runner for step 3
INFO startExec 'hadoop jar /var/lib/aws/emr/step-runner/hadoop-jars/command-runner.jar spark-submit --deploy-mode cluster --class TestRunner s3://jars/mscheiber/HelloWorld-assembly-0.0.1-SNAPSHOT.jar'
INFO Environment:
PATH=/sbin:/usr/sbin:/bin:/usr/bin:/usr/local/sbin:/opt/aws/bin
LESS_TERMCAP_md=[01;38;5;208m
LESS_TERMCAP_me=[0m
HISTCONTROL=ignoredups
LESS_TERMCAP_mb=[01;31m
AWS_AUTO_SCALING_HOME=/opt/aws/apitools/as
UPSTART_JOB=rc
LESS_TERMCAP_se=[0m
HISTSIZE=1000
HADOOP_ROOT_LOGGER=INFO,DRFA
JAVA_HOME=/etc/alternatives/jre
AWS_DEFAULT_REGION=us-east-1
AWS_ELB_HOME=/opt/aws/apitools/elb
LESS_TERMCAP_us=[04;38;5;111m
EC2_HOME=/opt/aws/apitools/ec2
TERM=linux
XFILESEARCHPATH=/usr/dt/app-defaults/%L/Dt
runlevel=3
LANG=en_US.UTF-8
AWS_CLOUDWATCH_HOME=/opt/aws/apitools/mon
MAIL=/var/spool/mail/hadoop
LESS_TERMCAP_ue=[0m
LOGNAME=hadoop
PWD=/
LANGSH_SOURCED=1
HADOOP_CLIENT_OPTS=-Djava.io.tmpdir=/mnt/var/lib/hadoop/steps/s-3UAS8JQ0KEOV3/tmp
_=/etc/alternatives/jre/bin/java
CONSOLETYPE=serial
RUNLEVEL=3
LESSOPEN=||/usr/bin/lesspipe.sh %s
previous=N
UPSTART_EVENTS=runlevel
AWS_PATH=/opt/aws
USER=hadoop
UPSTART_INSTANCE=
PREVLEVEL=N
HADOOP_LOGFILE=syslog
HOSTNAME=ip-10-17-186-102
NLSPATH=/usr/dt/lib/nls/msg/%L/%N.cat
HADOOP_LOG_DIR=/mnt/var/log/hadoop/steps/s-3UAS8JQ0KEOV3
EC2_AMITOOL_HOME=/opt/aws/amitools/ec2
SHLVL=5
HOME=/home/hadoop
HADOOP_IDENT_STRING=hadoop
INFO redirectOutput to /mnt/var/log/hadoop/steps/s-3UAS8JQ0KEOV3/stdout
INFO redirectError to /mnt/var/log/hadoop/steps/s-3UAS8JQ0KEOV3/stderr
INFO Working dir /mnt/var/lib/hadoop/steps/s-3UAS8JQ0KEOV3
INFO ProcessRunner started child process 24549 :
hadoop 24549 4780 0 21:03 ? 00:00:00 bash /usr/lib/hadoop/bin/hadoop jar /var/lib/aws/emr/step-runner/hadoop-jars/command-runner.jar spark-submit --deploy-mode cluster --class TestRunner s3://jars/TestRunner-assembly-0.0.1-SNAPSHOT.jar
2016-10-20T21:03:21.050Z INFO HadoopJarStepRunner.Runner: startRun() called for s-3UAS8JQ0KEOV3 Child Pid: 24549
INFO Synchronously wait child process to complete : hadoop jar /var/lib/aws/emr/step-runner/hadoop-...
INFO waitProcessCompletion ended with exit code 0 : hadoop jar /var/lib/aws/emr/step-runner/hadoop-...
INFO total process run time: 44 seconds
2016-10-20T21:04:03.102Z INFO Step created jobs:
2016-10-20T21:04:03.103Z INFO Step succeeded with exitCode 0 and took 44 seconds
The syslog and stdout are empty, and this is in my stderr:
16/10/20 21:03:20 INFO RMProxy: Connecting to ResourceManager at ip-10-17-186-102.ec2.internal/10.17.186.102:8032
16/10/20 21:03:21 INFO Client: Requesting a new application from cluster with 2 NodeManagers
16/10/20 21:03:21 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (53248 MB per container)
16/10/20 21:03:21 INFO Client: Will allocate AM container, with 53247 MB memory including 4840 MB overhead
16/10/20 21:03:21 INFO Client: Setting up container launch context for our AM
16/10/20 21:03:21 INFO Client: Setting up the launch environment for our AM container
16/10/20 21:03:21 INFO Client: Preparing resources for our AM container
16/10/20 21:03:21 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
16/10/20 21:03:22 INFO Client: Uploading resource file:/mnt/tmp/spark-6fceeedf-0ad5-4df1-a63e-c1d7eb1b95b4/__spark_libs__5484581201997889110.zip -> hdfs://ip-10-17-186-102.ec2.internal:8020/user/hadoop/.sparkStaging/application_1476995377469_0002/__spark_libs__5484581201997889110.zip
16/10/20 21:03:24 INFO Client: Uploading resource s3://jars/HelloWorld-assembly-0.0.1-SNAPSHOT.jar -> hdfs://ip-10-17-186-102.ec2.internal:8020/user/hadoop/.sparkStaging/application_1476995377469_0002/DataScience-assembly-0.0.1-SNAPSHOT.jar
16/10/20 21:03:24 INFO S3NativeFileSystem: Opening 's3://jars/HelloWorld-assembly-0.0.1-SNAPSHOT.jar' for reading
16/10/20 21:03:26 INFO Client: Uploading resource file:/mnt/tmp/spark-6fceeedf-0ad5-4df1-a63e-c1d7eb1b95b4/__spark_conf__5724047842379101980.zip -> hdfs://ip-10-17-186-102.ec2.internal:8020/user/hadoop/.sparkStaging/application_1476995377469_0002/__spark_conf__.zip
16/10/20 21:03:26 INFO SecurityManager: Changing view acls to: hadoop
16/10/20 21:03:26 INFO SecurityManager: Changing modify acls to: hadoop
16/10/20 21:03:26 INFO SecurityManager: Changing view acls groups to:
16/10/20 21:03:26 INFO SecurityManager: Changing modify acls groups to:
16/10/20 21:03:26 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); groups with view permissions: Set(); users with modify permissions: Set(hadoop); groups with modify permissions: Set()
16/10/20 21:03:26 INFO Client: Submitting application application_1476995377469_0002 to ResourceManager
16/10/20 21:03:26 INFO YarnClientImpl: Submitted application application_1476995377469_0002
16/10/20 21:03:27 INFO Client: Application report for application_1476995377469_0002 (state: ACCEPTED)
16/10/20 21:03:27 INFO Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1476997406896
final status: UNDEFINED
tracking URL: http://ip-10-17-186-102.ec2.internal:20888/proxy/application_1476995377469_0002/
user: hadoop
16/10/20 21:03:28 INFO Client: Application report for application_1476995377469_0002 (state: ACCEPTED)
16/10/20 21:03:29 INFO Client: Application report for application_1476995377469_0002 (state: ACCEPTED)
16/10/20 21:03:30 INFO Client: Application report for application_1476995377469_0002 (state: ACCEPTED)
16/10/20 21:03:31 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:31 INFO Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 10.17.181.184
ApplicationMaster RPC port: 0
queue: default
start time: 1476997406896
final status: UNDEFINED
tracking URL: http://ip-10-17-186-102.ec2.internal:20888/proxy/application_1476995377469_0002/
user: hadoop
16/10/20 21:03:32 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:33 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:34 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:35 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:36 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:37 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:38 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:39 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:40 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:41 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:42 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:43 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:44 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:45 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:46 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:47 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:48 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:49 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:50 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:51 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:52 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:53 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:54 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:55 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:56 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:57 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:58 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:03:59 INFO Client: Application report for application_1476995377469_0002 (state: RUNNING)
16/10/20 21:04:00 INFO Client: Application report for application_1476995377469_0002 (state: FINISHED)
16/10/20 21:04:00 INFO Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 10.17.181.184
ApplicationMaster RPC port: 0
queue: default
start time: 1476997406896
final status: SUCCEEDED
tracking URL: http://ip-10-17-186-102.ec2.internal:20888/proxy/application_1476995377469_0002/
user: hadoop
16/10/20 21:04:00 INFO Client: Deleting staging directory hdfs://ip-10-17-186-102.ec2.internal:8020/user/hadoop/.sparkStaging/application_1476995377469_0002
16/10/20 21:04:00 INFO ShutdownHookManager: Shutdown hook called
16/10/20 21:04:00 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-6fceeedf-0ad5-4df1-a63e-c1d7eb1b95b4
Command exiting with ret '0'
What am I missing?
Looks like your application succeeded just fine. However, there are two reasons why you don't see any output in the step's stdout logs.
1) You ran the application in yarn-cluster mode, which means that the driver runs on a random cluster node rather than on the master node. If you specified an S3 log URI when creating the cluster, you should see the logs for this application in the containers directory of your S3 bucket. The logs for the driver will be in container #0's logs.
2) You did not call anything like collect() to bring data from the Spark executors back to the driver, so your println() at the end is not printing the data anyway, but rather a toString() representation of the RDD. You probably want to do something like .collect().foreach(println) instead.
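Following point 2, a sketch of the same job with the counts collected back to the driver before printing:

import org.apache.spark.{SparkConf, SparkContext}

object TestRunner {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("Hello World")
    val sc = new SparkContext(conf)
    val words = sc.parallelize(Seq("a", "b", "c", "d", "e"))
    val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
    wordCounts.collect().foreach(println) // collect() materializes the pairs on the driver
    sc.stop()
  }
}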

spark-submit on yarn did not distribute jars to nm-local-dir

1、version
spark:2.0.0
scala:2.11.8
java:1.8.0_91
hadoop:2.7.2
2、question:
When I submit a Scala program to Spark on YARN, it throws an exception:
Caused by: java.lang.IllegalStateException: Library directory '/opt/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1471514504287_0021/container_1471514504287_0021_01_000002/assembly/target/scala-2.11/jars' does not exist; make sure Spark is built.
3、command
spark-submit --master yarn --deploy-mode cluster --class org.apache.spark.mllib.learning.recommend.CollaborativeFilteringSpark collaborativeFilteringSpark.jar
4、all logs:
16/08/19 11:07:35 INFO SparkContext: Running Spark version 2.0.0
16/08/19 11:07:35 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/08/19 11:07:36 INFO SecurityManager: Changing view acls to: hadoop
16/08/19 11:07:36 INFO SecurityManager: Changing modify acls to: hadoop
16/08/19 11:07:36 INFO SecurityManager: Changing view acls groups to:
16/08/19 11:07:36 INFO SecurityManager: Changing modify acls groups to:
16/08/19 11:07:36 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); groups with view permissions: Set(); users with modify permissions: Set(hadoop); groups with modify permissions: Set()
16/08/19 11:07:36 INFO Utils: Successfully started service 'sparkDriver' on port 43981.
16/08/19 11:07:36 INFO SparkEnv: Registering MapOutputTracker
16/08/19 11:07:36 INFO SparkEnv: Registering BlockManagerMaster
16/08/19 11:07:36 INFO DiskBlockManager: Created local directory at /opt/spark/blockmgr-57cf9a28-536c-4f03-83cc-c6a59cdeb825
16/08/19 11:07:36 INFO MemoryStore: MemoryStore started with capacity 413.9 MB
16/08/19 11:07:36 INFO SparkEnv: Registering OutputCommitCoordinator
16/08/19 11:07:37 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/08/19 11:07:37 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.137.101:4040
16/08/19 11:07:37 INFO SparkContext: Added JAR file:/home/hadoop/spark_program/scala/collaborativeFilteringSpark.jar at spark://192.168.137.101:43981/jars/collaborativeFilteringSpark.jar with timestamp 1471576057423
16/08/19 11:07:38 INFO RMProxy: Connecting to ResourceManager at dev-01/192.168.137.101:8032
16/08/19 11:07:38 INFO Client: Requesting a new application from cluster with 1 NodeManagers
16/08/19 11:07:38 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
16/08/19 11:07:38 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
16/08/19 11:07:38 INFO Client: Setting up container launch context for our AM
16/08/19 11:07:38 INFO Client: Setting up the launch environment for our AM container
16/08/19 11:07:38 INFO Client: Preparing resources for our AM container
16/08/19 11:07:39 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
16/08/19 11:07:40 INFO Client: Uploading resource file:/opt/spark/spark-e7da4489-d07e-4c42-aa50-be789ad1943e/__spark_libs__7265506257548877328.zip -> hdfs://dev-01:9000/user/hadoop/.sparkStaging/application_1471514504287_0021/__spark_libs__7265506257548877328.zip
16/08/19 11:07:44 INFO Client: Uploading resource file:/opt/spark/spark-e7da4489-d07e-4c42-aa50-be789ad1943e/__spark_conf__3473502575984181564.zip -> hdfs://dev-01:9000/user/hadoop/.sparkStaging/application_1471514504287_0021/__spark_conf__.zip
16/08/19 11:07:44 INFO SecurityManager: Changing view acls to: hadoop
16/08/19 11:07:44 INFO SecurityManager: Changing modify acls to: hadoop
16/08/19 11:07:44 INFO SecurityManager: Changing view acls groups to:
16/08/19 11:07:44 INFO SecurityManager: Changing modify acls groups to:
16/08/19 11:07:44 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); groups with view permissions: Set(); users with modify permissions: Set(hadoop); groups with modify permissions: Set()
16/08/19 11:07:44 INFO Client: Submitting application application_1471514504287_0021 to ResourceManager
16/08/19 11:07:44 INFO YarnClientImpl: Submitted application application_1471514504287_0021
16/08/19 11:07:44 INFO SchedulerExtensionServices: Starting Yarn extension services with app application_1471514504287_0021 and attemptId None
16/08/19 11:07:45 INFO Client: Application report for application_1471514504287_0021 (state: ACCEPTED)
16/08/19 11:07:45 INFO Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1471576064764
final status: UNDEFINED
tracking URL: http://dev-01:8088/proxy/application_1471514504287_0021/
user: hadoop
16/08/19 11:07:46 INFO Client: Application report for application_1471514504287_0021 (state: ACCEPTED)
[... identical reports repeated while state remains ACCEPTED ...]
16/08/19 11:07:55 INFO YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(null)
16/08/19 11:07:55 INFO YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> dev-01, PROXY_URI_BASES -> http://dev-01:8088/proxy/application_1471514504287_0021), /proxy/application_1471514504287_0021
16/08/19 11:07:55 INFO JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/08/19 11:07:55 INFO Client: Application report for application_1471514504287_0021 (state: ACCEPTED)
16/08/19 11:07:56 INFO Client: Application report for application_1471514504287_0021 (state: RUNNING)
16/08/19 11:07:56 INFO Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 192.168.137.102
ApplicationMaster RPC port: 0
queue: default
start time: 1471576064764
final status: UNDEFINED
tracking URL: http://dev-01:8088/proxy/application_1471514504287_0021/
user: hadoop
16/08/19 11:07:56 INFO YarnClientSchedulerBackend: Application application_1471514504287_0021 has started running.
16/08/19 11:07:56 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 46171.
16/08/19 11:07:56 INFO NettyBlockTransferService: Server created on 192.168.137.101:46171
16/08/19 11:07:56 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.137.101, 46171)
16/08/19 11:07:56 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.137.101:46171 with 413.9 MB RAM, BlockManagerId(driver, 192.168.137.101, 46171)
16/08/19 11:07:56 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.137.101, 46171)
16/08/19 11:08:03 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(null) (192.168.137.102:42406) with ID 1
16/08/19 11:08:03 INFO BlockManagerMasterEndpoint: Registering block manager dev-02:35791 with 413.9 MB RAM, BlockManagerId(1, dev-02, 35791)
16/08/19 11:08:05 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(null) (192.168.137.102:42410) with ID 2
16/08/19 11:08:05 INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
16/08/19 11:08:05 INFO BlockManagerMasterEndpoint: Registering block manager dev-02:37169 with 413.9 MB RAM, BlockManagerId(2, dev-02, 37169)
16/08/19 11:08:06 INFO SparkContext: Starting job: foreach at CollaborativeFilteringSpark.scala:62
16/08/19 11:08:06 INFO DAGScheduler: Got job 0 (foreach at CollaborativeFilteringSpark.scala:62) with 2 output partitions
16/08/19 11:08:06 INFO DAGScheduler: Final stage: ResultStage 0 (foreach at CollaborativeFilteringSpark.scala:62)
16/08/19 11:08:06 INFO DAGScheduler: Parents of final stage: List()
16/08/19 11:08:06 INFO DAGScheduler: Missing parents: List()
16/08/19 11:08:06 INFO DAGScheduler: Submitting ResultStage 0 (ParallelCollectionRDD[0] at parallelize at CollaborativeFilteringSpark.scala:18), which has no missing parents
16/08/19 11:08:06 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1432.0 B, free 413.9 MB)
16/08/19 11:08:06 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1035.0 B, free 413.9 MB)
16/08/19 11:08:06 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.137.101:46171 (size: 1035.0 B, free: 413.9 MB)
16/08/19 11:08:06 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1012
16/08/19 11:08:06 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 0 (ParallelCollectionRDD[0] at parallelize at CollaborativeFilteringSpark.scala:18)
16/08/19 11:08:06 INFO YarnScheduler: Adding task set 0.0 with 2 tasks
16/08/19 11:08:06 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, dev-02, partition 0, PROCESS_LOCAL, 5417 bytes)
16/08/19 11:08:06 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, dev-02, partition 1, PROCESS_LOCAL, 5423 bytes)
16/08/19 11:08:06 INFO YarnSchedulerBackend$YarnDriverEndpoint: Launching task 0 on executor id: 2 hostname: dev-02.
16/08/19 11:08:06 INFO YarnSchedulerBackend$YarnDriverEndpoint: Launching task 1 on executor id: 1 hostname: dev-02.
16/08/19 11:08:07 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on dev-02:37169 (size: 1035.0 B, free: 413.9 MB)
16/08/19 11:08:07 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on dev-02:35791 (size: 1035.0 B, free: 413.9 MB)
16/08/19 11:08:13 WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, dev-02): java.lang.ExceptionInInitializerError
at org.apache.spark.mllib.learning.recommend.CollaborativeFilteringSpark$$anonfun$main$1.apply(CollaborativeFilteringSpark.scala:64)
at org.apache.spark.mllib.learning.recommend.CollaborativeFilteringSpark$$anonfun$main$1.apply(CollaborativeFilteringSpark.scala:62)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$27.apply(RDD.scala:875)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$27.apply(RDD.scala:875)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1897)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1897)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
at org.apache.spark.scheduler.Task.run(Task.scala:85)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: Library directory '/opt/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1471514504287_0021/container_1471514504287_0021_01_000002/assembly/target/scala-2.11/jars' does not exist; make sure Spark is built.
at org.apache.spark.launcher.CommandBuilderUtils.checkState(CommandBuilderUtils.java:248)
at org.apache.spark.launcher.CommandBuilderUtils.findJarsDir(CommandBuilderUtils.java:368)
at org.apache.spark.launcher.YarnCommandBuilderUtils$.findJarsDir(YarnCommandBuilderUtils.scala:38)
at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:500)
at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:834)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:167)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:149)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:500)
at org.apache.spark.mllib.learning.recommend.CollaborativeFilteringSpark$.<init>(CollaborativeFilteringSpark.scala:16)
at org.apache.spark.mllib.learning.recommend.CollaborativeFilteringSpark$.<clinit>(CollaborativeFilteringSpark.scala)
... 14 more
16/08/19 11:08:13 INFO TaskSetManager: Starting task 1.1 in stage 0.0 (TID 2, dev-02, partition 1, PROCESS_LOCAL, 5423 bytes)
16/08/19 11:08:13 INFO YarnSchedulerBackend$YarnDriverEndpoint: Launching task 2 on executor id: 1 hostname: dev-02.
16/08/19 11:08:13 INFO TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0) on executor dev-02: java.lang.ExceptionInInitializerError (null) [duplicate 1]
16/08/19 11:08:13 INFO TaskSetManager: Starting task 0.1 in stage 0.0 (TID 3, dev-02, partition 0, PROCESS_LOCAL, 5417 bytes)
16/08/19 11:08:13 INFO YarnSchedulerBackend$YarnDriverEndpoint: Launching task 3 on executor id: 2 hostname: dev-02.
16/08/19 11:08:14 WARN TransportChannelHandler: Exception in connection from /192.168.137.102:42406
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
16/08/19 11:08:14 INFO YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 1.
16/08/19 11:08:14 INFO DAGScheduler: Executor lost: 1 (epoch 0)
16/08/19 11:08:14 INFO BlockManagerMasterEndpoint: Trying to remove executor 1 from BlockManagerMaster.
16/08/19 11:08:14 INFO BlockManagerMasterEndpoint: Removing block manager BlockManagerId(1, dev-02, 35791)
16/08/19 11:08:14 INFO BlockManagerMaster: Removed 1 successfully in removeExecutor
16/08/19 11:08:14 WARN TransportChannelHandler: Exception in connection from /192.168.137.102:42410
java.io.IOException: Connection reset by peer (same stack trace as above)
16/08/19 11:08:14 INFO YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 2.
16/08/19 11:08:14 INFO DAGScheduler: Executor lost: 2 (epoch 1)
16/08/19 11:08:14 INFO BlockManagerMasterEndpoint: Trying to remove executor 2 from BlockManagerMaster.
16/08/19 11:08:14 INFO BlockManagerMasterEndpoint: Removing block manager BlockManagerId(2, dev-02, 37169)
16/08/19 11:08:14 INFO BlockManagerMaster: Removed 2 successfully in removeExecutor
16/08/19 11:08:14 WARN YarnSchedulerBackend$YarnSchedulerEndpoint: Container marked as failed: container_1471514504287_0021_01_000002 on host: dev-02. Exit status: 50. Diagnostics: Exception from container-launch.
Container id: container_1471514504287_0021_01_000002
Exit code: 50
Stack trace: ExitCodeException exitCode=50:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 50
16/08/19 11:08:14 ERROR YarnScheduler: Lost executor 1 on dev-02: Container marked as failed: container_1471514504287_0021_01_000002 on host: dev-02. Exit status: 50. Diagnostics: Exception from container-launch.
(same container diagnostics and exitCode=50 stack trace as above)
Make sure the SPARK_HOME environment variable is set (exported) properly on every node of your cluster. This error happens when Spark tries to locate its libraries, but because SPARK_HOME is not set it can't find them.
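As a minimal sketch (the install path and HDFS URI below are assumptions based on the logs above, not verified values): either export SPARK_HOME on every node, or publish the Spark jars to HDFS via spark.yarn.jars so YARN containers don't need a local Spark install at all:
# conf/spark-env.sh on every node (the install path is an assumption):
export SPARK_HOME=/opt/spark
# Or upload the jars once and point spark.yarn.jars at them, which also
# silences the "Neither spark.yarn.jars nor spark.yarn.archive is set" warning:
hdfs dfs -mkdir -p /spark/jars
hdfs dfs -put $SPARK_HOME/jars/*.jar /spark/jars/
# conf/spark-defaults.conf:
spark.yarn.jars hdfs://dev-01:9000/spark/jars/*.jar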

Unable to run a jar or Spark application on AWS EMR

I have a very simple app that I'm trying to run on AWS EMR. The jar was built using assembly, with Spark as a provided dependency. It resides on S3 along with a test text file that I want to process.
In the EMR UI I select "Add step" and fill in the details, telling it the location of the jar and passing the file location as an argument.
It runs but always fails with an error. I then set up a new cluster (sanity checking) and ran it again, only to get the same result. Any help is appreciated.
Thank you
The error from the log:
16/03/18 11:40:56 INFO client.RMProxy: Connecting to ResourceManager at ip-10-1-1-234.ec2.internal/10.1.1.234:8032
16/03/18 11:40:56 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
16/03/18 11:40:56 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (11520 MB per container)
16/03/18 11:40:56 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
16/03/18 11:40:56 INFO yarn.Client: Setting up container launch context for our AM
16/03/18 11:40:56 INFO yarn.Client: Setting up the launch environment for our AM container
16/03/18 11:40:56 INFO yarn.Client: Preparing resources for our AM container
16/03/18 11:40:57 INFO yarn.Client: Uploading resource file:/usr/lib/spark/lib/spark-assembly-1.6.0-hadoop2.7.1-amzn-1.jar -> hdfs://ip-10-1-1-234.ec2.internal:8020/user/hadoop/.sparkStaging/application_1458297951763_0003/spark-assembly-1.6.0-hadoop2.7.1-amzn-1.jar
16/03/18 11:40:57 INFO metrics.MetricsSaver: MetricsConfigRecord disabledInCluster: false instanceEngineCycleSec: 60 clusterEngineCycleSec: 60 disableClusterEngine: false maxMemoryMb: 3072 maxInstanceCount: 500 lastModified: 1458297958626
16/03/18 11:40:57 INFO metrics.MetricsSaver: Created MetricsSaver j-DKMA93DFZ456:i-91bff215:SparkSubmit:20036 period:60 /mnt/var/em/raw/i-91bff215_20160318_SparkSubmit_20036_raw.bin
16/03/18 11:40:58 INFO metrics.MetricsSaver: 1 aggregated HDFSWriteDelay 590 raw values into 1 aggregated values, total 1
16/03/18 11:40:59 INFO fs.EmrFileSystem: Consistency disabled, using com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem as filesystem implementation
16/03/18 11:41:00 INFO metrics.MetricsSaver: Thread 1 created MetricsLockFreeSaver 1
16/03/18 11:41:00 INFO yarn.Client: Uploading resource file:/mnt/tmp/spark-030f9d29-f7ca-42fa-9caf-64ea103a2bb1/__spark_conf__7615049662154628286.zip -> hdfs://ip-10-1-1-234.ec2.internal:8020/user/hadoop/.sparkStaging/application_1458297951763_0003/__spark_conf__7615049662154628286.zip
16/03/18 11:41:00 INFO spark.SecurityManager: Changing view acls to: hadoop
16/03/18 11:41:00 INFO spark.SecurityManager: Changing modify acls to: hadoop
16/03/18 11:41:00 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); users with modify permissions: Set(hadoop)
16/03/18 11:41:01 INFO yarn.Client: Submitting application 3 to ResourceManager
16/03/18 11:41:01 INFO impl.YarnClientImpl: Submitted application application_1458297951763_0003
16/03/18 11:41:02 INFO yarn.Client: Application report for application_1458297951763_0003 (state: ACCEPTED)
16/03/18 11:41:02 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1458301261052
final status: UNDEFINED
tracking URL: http://ip-10-1-1-234.ec2.internal:20888/proxy/application_1458297951763_0003/
user: hadoop
16/03/18 11:41:03 INFO yarn.Client: Application report for application_1458297951763_0003 (state: ACCEPTED)
[... identical reports repeated while state remains ACCEPTED ...]
16/03/18 11:41:32 INFO yarn.Client: Application report for application_1458297951763_0003 (state: FAILED)
16/03/18 11:41:32 INFO yarn.Client:
client token: N/A
diagnostics: Application application_1458297951763_0003 failed 2 times due to AM Container for appattempt_1458297951763_0003_000002 exited with exitCode: 15
For more detailed output, check application tracking page:http://ip-10-1-1-234.ec2.internal:8088/cluster/app/application_1458297951763_0003Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1458297951763_0003_02_000001
Exit code: 15
Stack trace: ExitCodeException exitCode=15:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 15
Failing this attempt. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1458301261052
final status: FAILED
tracking URL: http://ip-10-1-1-234.ec2.internal:8088/cluster/app/application_1458297951763_0003
user: hadoop
Exception in thread "main" org.apache.spark.SparkException: Application application_1458297951763_0003 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1029)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1076)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/03/18 11:41:32 INFO util.ShutdownHookManager: Shutdown hook called
16/03/18 11:41:32 INFO util.ShutdownHookManager: Deleting directory /mnt/tmp/spark-030f9d29-f7ca-42fa-9caf-64ea103a2bb1
Command exiting with ret '1'
Referring to the issue Running Spark Job on Yarn Cluster:
It can mean a lot of things. For us, we got a similar error message because of an unsupported Java class version, and we fixed the problem by removing the offending Java class from our project.
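If you suspect a class-version mismatch yourself, one option (a sketch, assuming an sbt build like the fat-jar setup used elsewhere in this document; the Java 7 target is an assumed example, not the asker's actual cluster JVM) is to pin the bytecode target to the JVM running on the cluster and rebuild the assembly:
// build.sbt -- pin compilation to the cluster's JVM version (Java 7 assumed here)
javacOptions ++= Seq("-source", "1.7", "-target", "1.7")
scalacOptions += "-target:jvm-1.7"
Rebuild the assembly afterwards so the jar on S3 actually contains the re-targeted classes.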
Use this command to see the detailed error message:
yarn logs -applicationId application_1458297951763_0003