Spark Job failing with exit status 15 - scala

I am trying to run a simple word count job in Spark, but I am getting an exception while the job runs:
For more detailed output, check application tracking page:http://quickstart.cloudera:8088/proxy/application_1446699275562_0006/Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1446699275562_0006_02_000001
Exit code: 15
Stack trace: ExitCodeException exitCode=15:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 15
Failing this attempt. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: root.cloudera
start time: 1446910483956
final status: FAILED
tracking URL: http://quickstart.cloudera:8088/cluster/app/application_1446699275562_0006
user: cloudera
Exception in thread "main" org.apache.spark.SparkException: Application finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:626)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:651)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
I checked the logs with the following command:
yarn logs -applicationId application_1446699275562_0006
Here is the log:
15/11/07 07:35:09 ERROR yarn.ApplicationMaster: User class threw exception: Output directory hdfs://quickstart.cloudera:8020/user/cloudera/WordCountOutput already exists
org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://quickstart.cloudera:8020/user/cloudera/WordCountOutput already exists
at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:132)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:1053)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:954)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:863)
at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1290)
at org.com.td.sparkdemo.spark.WordCount$.main(WordCount.scala:23)
at org.com.td.sparkdemo.spark.WordCount.main(WordCount.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:480)
15/11/07 07:35:09 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: Output directory hdfs://quickstart.cloudera:8020/user/cloudera/WordCountOutput already exists)
15/11/07 07:35:14 ERROR yarn.ApplicationMaster: SparkContext did not initialize after waiting for 100000 ms. Please check earlier log output for errors. Failing the application.
The exception clearly indicates that the WordCountOutput directory already exists, but I made sure the directory was not there before running the job.
Why am I getting this error even though the directory was not there before I ran my job?

I came across the same issue and fixed it by adding the setting below. Note that spark.hadoop.validateOutputSpecs has to be set to "false" to skip the existing-output check; "true" is the default and leaves the check enabled.
SparkConf sparkConf = new SparkConf().setAppName("Sentiment Scoring").set("spark.hadoop.validateOutputSpecs", "false");
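Alternatively, you can delete the stale directory from the driver before writing. A minimal sketch using the Hadoop FileSystem API; here sc is your SparkContext, counts stands in for the RDD you save, and the path is the one from the error above:
import org.apache.hadoop.fs.Path

val outputPath = new Path("hdfs://quickstart.cloudera:8020/user/cloudera/WordCountOutput")
val fs = outputPath.getFileSystem(sc.hadoopConfiguration) // resolves the HDFS filesystem for this path
if (fs.exists(outputPath)) {
  fs.delete(outputPath, true) // recursive delete of the previous run's output
}
counts.saveAsTextFile(outputPath.toString)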
Thanks,
Sathish.

For us this was caused by "running beyond physical memory limits". After increasing the executor memory, the issue was fixed.
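For example, with spark-submit (the values are illustrative and must be tuned to your cluster; spark.yarn.executor.memoryOverhead is the Spark 1.x/2.x-on-YARN name for the overhead setting, and the class and jar names are placeholders):
spark-submit --master yarn --deploy-mode cluster \
  --executor-memory 4g \
  --conf spark.yarn.executor.memoryOverhead=1024 \
  --class your.main.Class your-app.jar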

This error mostly occurs when you pass a wrong Spark parameter in the spark-submit command, so check the configuration parameters. In my case, I mistyped the scheme of the masternamenode address from which I needed to read resources on HDFS.
Before (YARN threw exit code 15):
dfs://masternamenode/TESTCGNATDATA/
After (the application ran):
hdfs://masternamenode/TESTCGNATDATA/

Related

Getting OutOfMemoryError while executing a job through an Azure pipeline

I am getting the below error while running a job through an Azure pipeline. I have tried to increase MaxMetaspaceSize using the below command, but it is still not working.
JAVA_OPTS="-Xms512m -Xmx1024m -XX:MetaspaceSize=512M -XX:MaxMetaspaceSize=1024m -Djava.net.preferIPv4Stack=true"
Error:
at org.gradle.api.internal.tasks.testing.worker.TestWorker.execute(TestWorker.java:60)
at org.gradle.process.internal.worker.child.ActionExecutionWorker.execute(ActionExecutionWorker.java:56)
at org.gradle.process.internal.worker.child.SystemApplicationClassLoaderWorker.call(SystemApplicationClassLoaderWorker.java:113)
at org.gradle.process.internal.worker.child.SystemApplicationClassLoaderWorker.call(SystemApplicationClassLoaderWorker.java:65)
at app//worker.org.gradle.process.internal.worker.GradleWorkerMain.run(GradleWorkerMain.java:69)
at app//worker.org.gradle.process.internal.worker.GradleWorkerMain.main(GradleWorkerMain.java:74)
Caused by:
org.junit.platform.commons.JUnitException: Error executing tests for engine junit-jupiter
at app//org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:57)
at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:107)
... 30 more
Caused by:
java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError
at java.base/java.util.concurrent.ForkJoinTask.get(ForkJoinTask.java:1006)
at org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:54)
... 31 more
Caused by:
java.lang.OutOfMemoryError
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
at java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:603)
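One thing worth checking: the OutOfMemoryError is raised inside a Gradle test worker, which is a forked JVM that does not inherit JAVA_OPTS. If that is the case here, the memory settings likely need to go on the Gradle test task instead; a sketch for build.gradle (Groovy DSL, values illustrative):
test {
    maxHeapSize = "1024m"
    jvmArgs "-XX:MaxMetaspaceSize=1024m"
}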

Spark submit - AM Container for appattempt_1512628475693_0641_000001 exited with exitCode: 15

I am running a Spark job in YARN cluster mode, and it is failing with the following error. Could anyone please suggest the reason or cause for this?
Exception in thread "main" org.apache.spark.SparkException: Application application_1512628475693_0641 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1261)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1307)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:751)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
The diagnostics from Ambari are as follows:
AM Container for appattempt_1512628475693_0641_000001 exited with exitCode: 15
Diagnostics: Exception from container-launch.
Container id: container_e9342_1512628475693_0641_01_000001
Exit code: 15
Stack trace: org.apache.hadoop.yarn.server.nodemanager.containermanager.runtime.ContainerExecutionException: Launch container failed
Any help would be greatly appreciated.
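As in the first question above, exit code 15 generally means the user class in the ApplicationMaster threw an exception, so the real cause is in the AM logs rather than in this client-side stack trace. You can pull them with:
yarn logs -applicationId application_1512628475693_0641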

Spark in cluster mode throws error if a SparkContext is not started

I have a Spark job that initializes the Spark context only if it is really necessary:
val conf = new SparkConf()
val jobs: List[Job] = ??? // get some jobs
if (jobs.nonEmpty) {
  val sc = new SparkContext(conf)
  sc.parallelize(jobs).foreach(....)
} else {
  // do nothing
}
It worked fine on YARN while the deploy mode was 'client':
spark-submit --master yarn --deploy-mode client
Then I switched the deploy mode to 'cluster', and it started to crash whenever jobs.isEmpty:
spark-submit --master yarn --deploy-mode cluster
Below is the error text:
INFO yarn.Client: Application report for application_1509613523426_0017 (state: ACCEPTED)
17/11/02 11:37:17 INFO yarn.Client: Application report for application_1509613523426_0017 (state: FAILED)
17/11/02 11:37:17 INFO yarn.Client:
     client token: N/A
     diagnostics: Application application_1509613523426_0017 failed 2 times due to AM Container for appattempt_1509613523426_0017_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://xxxxxx.com:8088/cluster/app/application_1509613523426_0017Then, click on links to logs of each attempt.
Diagnostics: File does not exist: hdfs://xxxxxxx/.sparkStaging/application_1509613523426_0017/__spark_libs__997458388067724499.zip
java.io.FileNotFoundException: File does not exist: hdfs://xxxxxxx/.sparkStaging/application_1509613523426_0017/__spark_libs__997458388067724499.zip
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Failing this attempt. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: dev
start time: 1509622629354
final status: FAILED
tracking URL: http://xxxxxx.com:8088/cluster/app/application_1509613523426_0017
user: xxx
Exception in thread "main" org.apache.spark.SparkException: Application application_1509613523426_0017 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1104)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1150)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/11/02 11:37:17 INFO util.ShutdownHookManager: Shutdown hook called
17/11/02 11:37:17 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-a5b20def-0218-4b0c-b9f8-fdf8a1802e95
Is this a bug in the YARN support, or am I missing something?
SparkContext is responsible for communication with the cluster manager. If the application is submitted to the cluster but the context is never created, YARN cannot determine the state of the application; that is why you get an error.
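A minimal sketch of the workaround, assuming you can afford to start a context even when there is nothing to do: create the SparkContext unconditionally so the ApplicationMaster can register with YARN, and make the empty case a no-op (Job, ??? and run(...) are the question's own placeholders):
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
val sc = new SparkContext(conf) // always created, so the AM can register with YARN
try {
  val jobs: List[Job] = ??? // get some jobs, as in the question
  if (jobs.nonEmpty) {
    sc.parallelize(jobs).foreach(job => run(job)) // run(...) stands in for the real per-job work
  }
  // else: nothing to do, but the application still finishes with SUCCEEDED
} finally {
  sc.stop()
}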

How to restart Spark Streaming job from checkpoint on Dataproc?

This is a follow-up to Spark streaming on dataproc throws FileNotFoundException.
Over the past few weeks (not sure since exactly when), restarting a Spark streaming job, even with the "kill dataproc.agent" trick, has been throwing this exception:
17/05/16 17:39:02 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at stream-event-processor-m/10.138.0.3:8032
17/05/16 17:39:03 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl: Submitted application application_1494955637459_0006
17/05/16 17:39:04 ERROR org.apache.spark.SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:85)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:149)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:497)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2258)
at org.apache.spark.streaming.StreamingContext.<init>(StreamingContext.scala:140)
at org.apache.spark.streaming.StreamingContext$$anonfun$getOrCreate$1.apply(StreamingContext.scala:826)
at org.apache.spark.streaming.StreamingContext$$anonfun$getOrCreate$1.apply(StreamingContext.scala:826)
at scala.Option.map(Option.scala:146)
at org.apache.spark.streaming.StreamingContext$.getOrCreate(StreamingContext.scala:826)
at com.thumbtack.common.model.SparkStream$class.main(SparkStream.scala:73)
at com.thumbtack.skyfall.StreamEventProcessor$.main(StreamEventProcessor.scala:19)
at com.thumbtack.skyfall.StreamEventProcessor.main(StreamEventProcessor.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:736)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/05/16 17:39:04 INFO org.spark_project.jetty.server.ServerConnector: Stopped ServerConnector#5555ffcf{HTTP/1.1}{0.0.0.0:4479}
17/05/16 17:39:04 WARN org.apache.spark.scheduler.cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to request executors before the AM has registered!
17/05/16 17:39:04 ERROR org.apache.spark.util.Utils: Uncaught exception in thread main
java.lang.NullPointerException
at org.apache.spark.network.shuffle.ExternalShuffleClient.close(ExternalShuffleClient.java:152)
at org.apache.spark.storage.BlockManager.stop(BlockManager.scala:1360)
at org.apache.spark.SparkEnv.stop(SparkEnv.scala:87)
at org.apache.spark.SparkContext$$anonfun$stop$11.apply$mcV$sp(SparkContext.scala:1797)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1290)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1796)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:565)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2258)
at org.apache.spark.streaming.StreamingContext.<init>(StreamingContext.scala:140)
at org.apache.spark.streaming.StreamingContext$$anonfun$getOrCreate$1.apply(StreamingContext.scala:826)
at org.apache.spark.streaming.StreamingContext$$anonfun$getOrCreate$1.apply(StreamingContext.scala:826)
at scala.Option.map(Option.scala:146)
at org.apache.spark.streaming.StreamingContext$.getOrCreate(StreamingContext.scala:826)
at com.thumbtack.common.model.SparkStream$class.main(SparkStream.scala:73)
at com.thumbtack.skyfall.StreamEventProcessor$.main(StreamEventProcessor.scala:19)
at com.thumbtack.skyfall.StreamEventProcessor.main(StreamEventProcessor.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:736)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Exception in thread "main" org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:85)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:149)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:497)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2258)
at org.apache.spark.streaming.StreamingContext.<init>(StreamingContext.scala:140)
at org.apache.spark.streaming.StreamingContext$$anonfun$getOrCreate$1.apply(StreamingContext.scala:826)
at org.apache.spark.streaming.StreamingContext$$anonfun$getOrCreate$1.apply(StreamingContext.scala:826)
at scala.Option.map(Option.scala:146)
at org.apache.spark.streaming.StreamingContext$.getOrCreate(StreamingContext.scala:826)
at com.thumbtack.common.model.SparkStream$class.main(SparkStream.scala:73)
at com.thumbtack.skyfall.StreamEventProcessor$.main(StreamEventProcessor.scala:19)
at com.thumbtack.skyfall.StreamEventProcessor.main(StreamEventProcessor.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:736)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Job output is complete
How can I restart a Spark streaming job from its checkpoint on a Dataproc cluster?
We've recently added auto-restart capabilities to Dataproc jobs (available in the gcloud beta track and in the v1 API).
To take advantage of auto-restart, a job must be able to recover and clean up after itself, so it will not work for most jobs without modification. However, it does work out of the box with Spark Streaming and checkpoint files.
The restart-dataproc-agent trick should no longer be necessary: auto-restart is resilient against job crashes, Dataproc agent failures, and VM restart-on-migration events.
Example:
gcloud beta dataproc jobs submit spark ... --max-failures-per-hour 1
See:
https://cloud.google.com/dataproc/docs/concepts/restartable-jobs
If you want to test recovery, you can simulate a VM migration by restarting the master VM [1]. After this you should be able to describe the job [2] and see an ATTEMPT_FAILURE entry in statusHistory.
[1] gcloud compute instances reset <cluster-name>-m
[2] gcloud dataproc jobs describe
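For reference, checkpoint-based recovery hinges on StreamingContext.getOrCreate, which rebuilds the context (and the streaming computation) from the checkpoint directory on restart. A minimal sketch; the app name, checkpoint path, and batch interval are illustrative:
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val checkpointDir = "hdfs:///user/spark/checkpoints/stream-event-processor" // illustrative

def createContext(): StreamingContext = {
  val conf = new SparkConf().setAppName("stream-event-processor")
  val ssc = new StreamingContext(conf, Seconds(10))
  ssc.checkpoint(checkpointDir)
  // ... define the streaming computation here ...
  ssc
}

// First run: calls createContext(). On restart: rebuilds the context from
// the checkpoint data instead of calling createContext() again.
val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
ssc.start()
ssc.awaitTermination()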

HDFS command line put throwing an exception

I am trying to put a file into HDFS over SSH from my client PC to the NameNode server. There are two machines: one NameNode and one DataNode. Here is the command I am trying:
$ bin/hadoop fs -fs hdfs://MY_IP:MY_PORT -put example.txt example.txt
But it throws an exception. The logs say:
2013-05-23 09:25:31,808 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:_my_user_name_ cause:java.io.IOException: Unknown protocol to DataNode: org.apache.hadoop.hdfs.protocol.ClientProtocol
2013-05-23 09:25:31,808 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on _my_port_, call getProtocolVersion(org.apache.hadoop.hdfs.protocol.ClientProtocol, 61) from _ip_:_port_: error: java.io.IOException: Unknown protocol to DataNode: org.apache.hadoop.hdfs.protocol.ClientProtocol
java.io.IOException: Unknown protocol to DataNode: org.apache.hadoop.hdfs.protocol.ClientProtocol
at org.apache.hadoop.hdfs.server.datanode.DataNode.getProtocolVersion(DataNode.java:1759)
at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
What could be the problem? Thanks a lot.
Check your core-site.xml file and verify the HDFS URL. The "Unknown protocol to DataNode: ClientProtocol" message means a client is speaking the NameNode's ClientProtocol to a DataNode, so MY_IP:MY_PORT most likely points at a DataNode (or the wrong port) rather than at the NameNode's RPC port.
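If so, retargeting the command at the NameNode's RPC port should be enough; a sketch, assuming the common default port 8020 (9000 is also frequently used) and that MY_IP is the NameNode's address:
bin/hadoop fs -fs hdfs://MY_IP:8020 -put example.txt example.txt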
Check the version of your client's Hadoop against the cluster's Hadoop; my guess is that your client version is lower than the cluster version:
java.io.IOException: Unknown protocol to DataNode: org.apache.hadoop.hdfs.protocol.ClientProtocol
at org.apache.hadoop.hdfs.server.datanode.DataNode.getProtocolVersion(DataNode.java:1759)