Spark submit runs successfully but when submitted through oozie it fails to connect to hive - scala

I am using CDH 5.9.0, Spark 1.6 and Scala 2.10.0. I have created a scala and spark program to create a table and load data from a file to hive. When I run it using spark submit, it completes. But the same program when submitted through oozie, it throws the below exception.
Below is the exception.
Log Type: stdout
Log Upload Time: Fri Oct 27 10:08:28 -0400 2017
Log Length: 172584
2017-10-27 10:08:20,652 INFO [main] yarn.ApplicationMaster (SignalLogger.scala:register(47)) - Registered signal handlers for [TERM, HUP, INT]
2017-10-27 10:08:21,306 INFO [main] yarn.ApplicationMaster (Logging.scala:logInfo(58)) - ApplicationAttemptId: appattempt_1507999204018_0292_000001
2017-10-27 10:08:21,952 INFO [main] spark.SecurityManager (Logging.scala:logInfo(58)) - Changing view acls to: username
2017-10-27 10:08:21,953 INFO [main] spark.SecurityManager (Logging.scala:logInfo(58)) - Changing modify acls to: username
2017-10-27 10:08:21,956 INFO [main] spark.SecurityManager (Logging.scala:logInfo(58)) - SecurityManager: authentication enabled; ui acls disabled; users with view permissions: Set(username); users with modify permissions: Set(username)
2017-10-27 10:08:21,970 INFO [main] yarn.ApplicationMaster (Logging.scala:logInfo(58)) - Starting the user application in a separate Thread
2017-10-27 10:08:21,997 INFO [main] yarn.ApplicationMaster (Logging.scala:logInfo(58)) - Waiting for spark context initialization
2017-10-27 10:08:21,998 INFO [main] yarn.ApplicationMaster (Logging.scala:logInfo(58)) - Waiting for spark context initialization ...
2017-10-27 10:08:22,308 WARN [Driver] security.UserGroupInformation (UserGroupInformation.java:doAs(1701)) - PriviledgedActionException as:username (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2017-10-27 10:08:22,309 WARN [Driver] ipc.Client (Client.java:run(682)) - Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2017-10-27 10:08:22,310 WARN [Driver] security.UserGroupInformation (UserGroupInformation.java:doAs(1701)) - PriviledgedActionException as:username (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2017-10-27 10:08:22,391 INFO [Driver] spark.SparkContext (Logging.scala:logInfo(58)) - Running Spark version 1.6.0
2017-10-27 10:08:22,417 INFO [Driver] spark.SecurityManager (Logging.scala:logInfo(58)) - Changing view acls to: username
2017-10-27 10:08:22,418 INFO [Driver] spark.SecurityManager (Logging.scala:logInfo(58)) - Changing modify acls to: username
2017-10-27 10:08:22,418 INFO [Driver] spark.SecurityManager (Logging.scala:logInfo(58)) - SecurityManager: authentication enabled; ui acls disabled; users with view permissions: Set(username); users with modify permissions: Set(username)
2017-10-27 10:08:22,572 INFO [Driver] util.Utils (Logging.scala:logInfo(58)) - Successfully started service 'sparkDriver' on port 44049.
2017-10-27 10:08:22,901 INFO [sparkDriverActorSystem-akka.actor.default-dispatcher-4] slf4j.Slf4jLogger (Slf4jLogger.scala:applyOrElse(80)) - Slf4jLogger started
2017-10-27 10:08:22,936 INFO [sparkDriverActorSystem-akka.actor.default-dispatcher-4] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Starting remoting
2017-10-27 10:08:23,062 INFO [sparkDriverActorSystem-akka.actor.default-dispatcher-4] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem#a.b.c.d:38305]
2017-10-27 10:08:23,064 INFO [sparkDriverActorSystem-akka.actor.default-dispatcher-4] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Remoting now listens on addresses: [akka.tcp://sparkDriverActorSystem#a.b.c.d:38305]
2017-10-27 10:08:23,174 INFO [Driver] util.Utils (Logging.scala:logInfo(58)) - Successfully started service 'sparkDriverActorSystem' on port 38305.
2017-10-27 10:08:23,195 INFO [Driver] spark.SparkEnv (Logging.scala:logInfo(58)) - Registering MapOutputTracker
2017-10-27 10:08:23,207 INFO [Driver] spark.SparkEnv (Logging.scala:logInfo(58)) - Registering BlockManagerMaster
2017-10-27 10:08:23,216 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/01/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-ba42749b-3498-4c1d-ba8b-dc6720e815a0
2017-10-27 10:08:23,217 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/02/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-d9375d30-699d-4e40-8b42-559f79f27f85
2017-10-27 10:08:23,217 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/03/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-fc2caf3b-3fa0-4f1e-be01-b33b6f6d52d5
2017-10-27 10:08:23,217 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/04/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-450319a4-2d4f-4159-a633-3dd2a71bafe1
2017-10-27 10:08:23,217 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/05/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-c3dbf9b3-cb95-4104-b4bf-9e7b1987e210
2017-10-27 10:08:23,217 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/06/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-5d9c58a6-29bb-4e8e-a8fb-3720db0004d4
2017-10-27 10:08:23,218 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/07/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-999eecaf-f183-4ede-8845-eeb57a87276b
2017-10-27 10:08:23,218 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/08/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-216d2449-14b1-45aa-b6c6-d6271815f485
2017-10-27 10:08:23,221 INFO [Driver] storage.MemoryStore (Logging.scala:logInfo(58)) - MemoryStore started with capacity 491.7 MB
2017-10-27 10:08:23,283 INFO [Driver] spark.SparkEnv (Logging.scala:logInfo(58)) - Registering OutputCommitCoordinator
2017-10-27 10:08:23,394 INFO [Driver] ui.JettyUtils (Logging.scala:logInfo(58)) - Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2017-10-27 10:08:23,413 INFO [Driver] server.Server (Server.java:doStart(272)) - jetty-8.y.z-SNAPSHOT
2017-10-27 10:08:23,448 INFO [Driver] server.AbstractConnector (AbstractConnector.java:doStart(338)) - Started SelectChannelConnector#0.0.0.0:36123
2017-10-27 10:08:23,448 INFO [Driver] util.Utils (Logging.scala:logInfo(58)) - Successfully started service 'SparkUI' on port 36123.
2017-10-27 10:08:23,449 INFO [Driver] ui.SparkUI (Logging.scala:logInfo(58)) - Started SparkUI at http://a.b.c.d:36123
2017-10-27 10:08:23,498 INFO [Driver] cluster.YarnClusterScheduler (Logging.scala:logInfo(58)) - Created YarnClusterScheduler
2017-10-27 10:08:23,524 INFO [Driver] util.Utils (Logging.scala:logInfo(58)) - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 44418.
2017-10-27 10:08:23,525 INFO [Driver] netty.NettyBlockTransferService (Logging.scala:logInfo(58)) - Server created on 44418
2017-10-27 10:08:23,527 INFO [Driver] storage.BlockManager (Logging.scala:logInfo(58)) - external shuffle service port = 7337
2017-10-27 10:08:23,527 INFO [Driver] storage.BlockManagerMaster (Logging.scala:logInfo(58)) - Trying to register BlockManager
2017-10-27 10:08:23,530 INFO [dispatcher-event-loop-11] storage.BlockManagerMasterEndpoint (Logging.scala:logInfo(58)) - Registering block manager a.b.c.d:44418 with 491.7 MB RAM, BlockManagerId(driver, a.b.c.d, 44418)
2017-10-27 10:08:23,533 INFO [Driver] storage.BlockManagerMaster (Logging.scala:logInfo(58)) - Registered BlockManager
2017-10-27 10:08:24,106 INFO [Driver] scheduler.EventLoggingListener (Logging.scala:logInfo(58)) - Logging events to hdfs://.../user/spark/applicationHistory/application_1507999204018_0292_1
2017-10-27 10:08:24,133 INFO [Driver] cluster.YarnClusterSchedulerBackend (Logging.scala:logInfo(58)) - SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
2017-10-27 10:08:24,133 INFO [Driver] cluster.YarnClusterScheduler (Logging.scala:logInfo(58)) - YarnClusterScheduler.postStartHook done
2017-10-27 10:08:24,140 INFO [dispatcher-event-loop-13] cluster.YarnSchedulerBackend$YarnSchedulerEndpoint (Logging.scala:logInfo(58)) - ApplicationMaster registered as NettyRpcEndpointRef(spark://YarnAM#a.b.c.d:44049)
2017-10-27 10:08:24,191 INFO [main] yarn.YarnRMClient (Logging.scala:logInfo(58)) - Registering the ApplicationMaster
2017-10-27 10:08:24,295 INFO [main] yarn.ApplicationMaster (Logging.scala:logInfo(58)) - Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals
2017-10-27 10:08:25,107 INFO [Driver] hive.HiveContext (Logging.scala:logInfo(58)) - Initializing execution hive, version 1.1.0
2017-10-27 10:08:25,146 INFO [Driver] client.ClientWrapper (Logging.scala:logInfo(58)) - Inspected Hadoop version: 2.6.0-cdh5.9.0
2017-10-27 10:08:25,147 INFO [Driver] client.ClientWrapper (Logging.scala:logInfo(58)) - Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0-cdh5.9.0
2017-10-27 10:08:25,582 INFO [Driver] metastore.HiveMetaStore (HiveMetaStore.java:newRawStore(644)) - 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
2017-10-27 10:08:25,600 INFO [Driver] metastore.ObjectStore (ObjectStore.java:initialize(333)) - ObjectStore, initialize called
2017-10-27 10:08:25,671 WARN [Driver] DataNucleus.General (Log4JLogger.java:warn(96)) - Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/data/03/yarn/nm/usercache/username/appcache/application_1507999204018_0292/container_e69_1507999204018_0292_01_000001/datanucleus-core-3.2.2.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/data/05/yarn/nm/filecache/507/datanucleus-core-3.2.2.jar."
2017-10-27 10:08:25,687 WARN [Driver] DataNucleus.General (Log4JLogger.java:warn(96)) - Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/data/03/yarn/nm/usercache/username/appcache/application_1507999204018_0292/container_e69_1507999204018_0292_01_000001/datanucleus-api-jdo-3.2.1.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/data/07/yarn/nm/filecache/582/datanucleus-api-jdo-3.2.1.jar."
2017-10-27 10:08:25,688 WARN [Driver] DataNucleus.General (Log4JLogger.java:warn(96)) - Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/data/08/yarn/nm/filecache/554/datanucleus-rdbms-3.2.1.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/data/03/yarn/nm/usercache/username/appcache/application_1507999204018_0292/container_e69_1507999204018_0292_01_000001/datanucleus-rdbms-3.2.1.jar."
2017-10-27 10:08:25,709 INFO [Driver] DataNucleus.Persistence (Log4JLogger.java:info(77)) - Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
2017-10-27 10:08:25,710 INFO [Driver] DataNucleus.Persistence (Log4JLogger.java:info(77)) - Property datanucleus.cache.level2 unknown - will be ignored
2017-10-27 10:08:26,178 WARN [Driver] bonecp.BoneCPConfig (BoneCPConfig.java:sanitize(1537)) - Max Connections < 1. Setting to 20
2017-10-27 10:08:26,180 ERROR [Driver] Datastore.Schema (Log4JLogger.java:error(125)) - Failed initialising database.
Unable to open a test connection to the given database. JDBC url = jdbc:derby:;databaseName=/data/03/yarn/nm/usercache/username/appcache/application_1507999204018_0292/container_e69_1507999204018_0292_01_000001/tmp/spark-633fb1f8-1f38-44ac-a54e-81465354bedc/metastore;create=true, username = APP. Terminating connection pool. Original Exception: ------
java.sql.SQLException: No suitable driver found for jdbc:derby:;databaseName=/data/03/yarn/nm/usercache/username/appcache/application_1507999204018_0292/container_e69_1507999204018_0292_01_000001/tmp/spark-633fb1f8-1f38-44ac-a54e-81465354bedc/metastore;create=true
at java.sql.DriverManager.getConnection(DriverManager.java:689)
at java.sql.DriverManager.getConnection(DriverManager.java:208)
at com.jolbox.bonecp.BoneCP.obtainRawInternalConnection(BoneCP.java:254)
at com.jolbox.bonecp.BoneCP.<init>(BoneCP.java:305)
at com.jolbox.bonecp.BoneCPDataSource.maybeInit(BoneCPDataSource.java:150)
at com.jolbox.bonecp.BoneCPDataSource.getConnection(BoneCPDataSource.java:112)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:479)
at org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:304)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
at org.datanucleus.NucleusContext.createStoreManagerForProperties(NucleusContext.java:1069)
at org.datanucleus.NucleusContext.initialise(NucleusContext.java:359)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:768)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:326)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:195)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at javax.jdo.JDOHelper$16.run(JDOHelper.java:1965)
at java.security.AccessController.doPrivileged(Native Method)
at javax.jdo.JDOHelper.invoke(JDOHelper.java:1960)
at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:411)
at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:440)
at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:335)
at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:291)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:57)
at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:66)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:648)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:626)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:675)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:484)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:78)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:84)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5999)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:203)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1528)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:67)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:82)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3037)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3056)
at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3281)
at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:217)
at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:201)
at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:324)
at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:285)
at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:260)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:514)
at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:194)
at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:238)
at org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveContext.scala:220)
at org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala:210)
at org.apache.spark.sql.hive.HiveContext.functionRegistry$lzycompute(HiveContext.scala:464)
at org.apache.spark.sql.hive.HiveContext.functionRegistry(HiveContext.scala:463)
at org.apache.spark.sql.UDFRegistration.<init>(UDFRegistration.scala:40)
at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:330)
at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:90)
at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:101)
at prfrx.externaltableerror$.main(externaltableerror.scala:28)
at prfrx.externaltableerror.main(externaltableerror.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
at com.jolbox.bonecp.BoneCP.<init>(BoneCP.java:312)
at com.jolbox.bonecp.BoneCPDataSource.maybeInit(BoneCPDataSource.java:150)
at com.jolbox.bonecp.BoneCPDataSource.getConnection(BoneCPDataSource.java:112)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:479)
at org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:304)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
at org.datanucleus.NucleusContext.createStoreManagerForProperties(NucleusContext.java:1069)
at org.datanucleus.NucleusContext.initialise(NucleusContext.java:359)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:768)
... 62 more
Caused by: java.sql.SQLException: No suitable driver found for jdbc:derby:;databaseName=/data/03/yarn/nm/usercache/username/appcache/application_1507999204018_0292/container_e69_1507999204018_0292_01_000001/tmp/spark-633fb1f8-1f38-44ac-a54e-81465354bedc/metastore;create=true
at java.sql.DriverManager.getConnection(DriverManager.java:689)
at java.sql.DriverManager.getConnection(DriverManager.java:208)
at com.jolbox.bonecp.BoneCP.obtainRawInternalConnection(BoneCP.java:254)
at com.jolbox.bonecp.BoneCP.<init>(BoneCP.java:305)
Below is the code I am using.
object externaltableerror {
def main(args: Array[String]) {
val conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://...")
conf.addResource("hdfs://.../core-site.xml");
conf.addResource("hdfs://.../hdfs-site.xml");
conf.addResource("hdfs://.../hive-site.xml");
val fs = FileSystem.get(conf)
val os = fs.create(new Path("/.../Error.txt"))
try {
//System.setProperty("hive.metastore.uris", "thrift://...");
val sc = new SparkContext(new SparkConf().setAppName("withhive"))
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
val files = sc.textFile("hdfs://.../Example.txt").first()
val rdd = sc.parallelize(List(files))
val fm = rdd.flatMap(line => line.split("\t")).map(x => x.concat(" string"))
val alternative = fm.reduce((s1, s2) => s1 + "," + s2)
val ddl = "Create external table table_name(" + alternative + ") ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION 'hdfs://.../' tblproperties (\"skip.header.line.count\"=\"1\")"
hiveContext.sql(ddl)
sc.stop()
} catch{
// case e : Exception => new PrintWriter("hdfs://.../Error.txt") { write(e.getStackTrace.mkString("\n")); close }
// println("H" + e.getStackTrace)
case e : Exception => os.write(e.getStackTrace.mkString("\n").getBytes)
}
}
}
Any suggestions on how to get the job running with oozie will be of great help. Thanks!

I had the same issue - I fixed it by using the parameter --files /etc/hive/conf/hive-site.xml in my spark-submit job. (first I tried it in the shell and then in oozie, because I launched a .sh file that contains the spark-submit sencence)

Related

ERROR SparkContext: Error initializing SparkContext locally

I'm new to everything related to java/scala/maven/sbt/spark, so please bear with me.
Managed to get everything up and running, or at least so it seems when running the spark-shell locally. The SparkContext gets initialized properly and I can create RDDs.
However, when I call spark-submit locally, I get a SparkContext error.
%SPARK_HOME%\bin\spark-submit --master local --class example.bd.MyApp target\scala-2.12\sparkapp_2.12-1.5.5.jar
This is my code.
package example.bd
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
object MyApp {
def main(args : Array[String]) {
val conf = new SparkConf().setAppName("My first Spark application")
val sc = new SparkContext(conf)
println("hi everyone")
}
}
These are my error logs:
21/10/24 18:38:45 WARN Shell: Did not find winutils.exe: {}
java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. -see https://wiki.apache.org/hadoop/WindowsProblems
at org.apache.hadoop.util.Shell.fileNotFoundException(Shell.java:548)
at org.apache.hadoop.util.Shell.getHadoopHomeDir(Shell.java:569)
at org.apache.hadoop.util.Shell.getQualifiedBin(Shell.java:592)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:689)
at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:78)
at org.apache.hadoop.conf.Configuration.getTimeDurationHelper(Configuration.java:1814)
at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1791)
at org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183)
at org.apache.hadoop.util.ShutdownHookManager$HookEntry.<init>(ShutdownHookManager.java:207)
at org.apache.hadoop.util.ShutdownHookManager.addShutdownHook(ShutdownHookManager.java:302)
at org.apache.spark.util.SparkShutdownHookManager.install(ShutdownHookManager.scala:181)
at org.apache.spark.util.ShutdownHookManager$.shutdownHooks$lzycompute(ShutdownHookManager.scala:50)
at org.apache.spark.util.ShutdownHookManager$.shutdownHooks(ShutdownHookManager.scala:48)
at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:153)
at org.apache.spark.util.ShutdownHookManager$.<init>(ShutdownHookManager.scala:58)
at org.apache.spark.util.ShutdownHookManager$.<clinit>(ShutdownHookManager.scala)
at org.apache.spark.util.Utils$.createTempDir(Utils.scala:326)
at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:343)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset.
at org.apache.hadoop.util.Shell.checkHadoopHomeInner(Shell.java:468)
at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:439)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:516)
... 21 more
21/10/24 18:38:45 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
21/10/24 18:38:45 INFO SparkContext: Running Spark version 3.1.2
21/10/24 18:38:45 INFO ResourceUtils: ==============================================================
21/10/24 18:38:45 INFO ResourceUtils: No custom resources configured for spark.driver.
21/10/24 18:38:45 INFO ResourceUtils: ==============================================================
21/10/24 18:38:45 INFO SparkContext: Submitted application: My first Spark application
21/10/24 18:38:45 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
21/10/24 18:38:45 INFO ResourceProfile: Limiting resource is cpu
21/10/24 18:38:45 INFO ResourceProfileManager: Added ResourceProfile id: 0
21/10/24 18:38:45 INFO SecurityManager: Changing view acls to: User
21/10/24 18:38:45 INFO SecurityManager: Changing modify acls to: User
21/10/24 18:38:45 INFO SecurityManager: Changing view acls groups to:
21/10/24 18:38:45 INFO SecurityManager: Changing modify acls groups to:
21/10/24 18:38:45 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(User); groups with view permissions: Set(); users with modify permissions: Set(User); groups with modify permissions: Set()
21/10/24 18:38:46 INFO Utils: Successfully started service 'sparkDriver' on port 56899.
21/10/24 18:38:46 INFO SparkEnv: Registering MapOutputTracker
21/10/24 18:38:46 INFO SparkEnv: Registering BlockManagerMaster
21/10/24 18:38:46 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
21/10/24 18:38:46 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
21/10/24 18:38:46 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
21/10/24 18:38:46 INFO DiskBlockManager: Created local directory at C:\Users\User\AppData\Local\Temp\blockmgr-8df572ec-4206-48ae-b517-bd56242fca4c
21/10/24 18:38:46 INFO MemoryStore: MemoryStore started with capacity 366.3 MiB
21/10/24 18:38:46 INFO SparkEnv: Registering OutputCommitCoordinator
21/10/24 18:38:46 INFO Utils: Successfully started service 'SparkUI' on port 4040.
21/10/24 18:38:46 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://LAPTOP-RKJJV5E4:4040
21/10/24 18:38:46 INFO SparkContext: Added JAR file:/C:/[path]/sparkapp/target/scala-2.12/sparkapp_2.12-1.5.5.jar at spark://LAPTOP-RKJJV5E4:56899/jars/sparkapp_2.12-1.5.5.jar with timestamp 1635093525663
21/10/24 18:38:46 INFO Executor: Starting executor ID driver on host LAPTOP-RKJJV5E4
21/10/24 18:38:46 INFO Executor: Fetching spark://LAPTOP-RKJJV5E4:56899/jars/sparkapp_2.12-1.5.5.jar with timestamp 1635093525663
21/10/24 18:38:46 INFO TransportClientFactory: Successfully created connection to LAPTOP-RKJJV5E4/192.168.0.175:56899 after 30 ms (0 ms spent in bootstraps)
21/10/24 18:38:46 INFO Utils: Fetching spark://LAPTOP-RKJJV5E4:56899/jars/sparkapp_2.12-1.5.5.jar to C:\Users\User\AppData\Local\Temp\spark-9c13f31f-92a7-4fc5-a87d-a0ae6410e6d2\userFiles-5ac95e31-3656-4a4d-a205-e0750c041bcb\fetchFileTemp7994667570718611461.tmp
21/10/24 18:38:46 ERROR SparkContext: Error initializing SparkContext.
java.lang.RuntimeException: java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. -see https://wiki.apache.org/hadoop/WindowsProblems
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:736)
at org.apache.hadoop.util.Shell.getSetPermissionCommand(Shell.java:271)
at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:1120)
at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:1106)
at org.apache.spark.util.Utils$.fetchFile(Utils.scala:563)
at org.apache.spark.executor.Executor.$anonfun$updateDependencies$13(Executor.scala:953)
at org.apache.spark.executor.Executor.$anonfun$updateDependencies$13$adapted(Executor.scala:945)
at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:877)
at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)
at scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)
at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:44)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:149)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:876)
at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:945)
at org.apache.spark.executor.Executor.<init>(Executor.scala:247)
at org.apache.spark.scheduler.local.LocalEndpoint.<init>(LocalSchedulerBackend.scala:64)
at org.apache.spark.scheduler.local.LocalSchedulerBackend.start(LocalSchedulerBackend.scala:132)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:220)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:579)
at example.bd.MyApp$.main(MyApp.scala:10)
at example.bd.MyApp.main(MyApp.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. -see https://wiki.apache.org/hadoop/WindowsProblems
at org.apache.hadoop.util.Shell.fileNotFoundException(Shell.java:548)
at org.apache.hadoop.util.Shell.getHadoopHomeDir(Shell.java:569)
at org.apache.hadoop.util.Shell.getQualifiedBin(Shell.java:592)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:689)
at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:78)
at org.apache.hadoop.conf.Configuration.getTimeDurationHelper(Configuration.java:1814)
at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1791)
at org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183)
at org.apache.hadoop.util.ShutdownHookManager$HookEntry.<init>(ShutdownHookManager.java:207)
at org.apache.hadoop.util.ShutdownHookManager.addShutdownHook(ShutdownHookManager.java:302)
at org.apache.spark.util.SparkShutdownHookManager.install(ShutdownHookManager.scala:181)
at org.apache.spark.util.ShutdownHookManager$.shutdownHooks$lzycompute(ShutdownHookManager.scala:50)
at org.apache.spark.util.ShutdownHookManager$.shutdownHooks(ShutdownHookManager.scala:48)
at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:153)
at org.apache.spark.util.ShutdownHookManager$.<init>(ShutdownHookManager.scala:58)
at org.apache.spark.util.ShutdownHookManager$.<clinit>(ShutdownHookManager.scala)
at org.apache.spark.util.Utils$.createTempDir(Utils.scala:326)
at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:343)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
... 6 more
Caused by: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset.
at org.apache.hadoop.util.Shell.checkHadoopHomeInner(Shell.java:468)
at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:439)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:516)
... 21 more
21/10/24 18:38:46 INFO SparkUI: Stopped Spark web UI at http://LAPTOP-RKJJV5E4:4040
21/10/24 18:38:46 ERROR Utils: Uncaught exception in thread main
java.lang.NullPointerException
at org.apache.spark.scheduler.local.LocalSchedulerBackend.org$apache$spark$scheduler$local$LocalSchedulerBackend$$stop(LocalSchedulerBackend.scala:173)
at org.apache.spark.scheduler.local.LocalSchedulerBackend.stop(LocalSchedulerBackend.scala:144)
at org.apache.spark.scheduler.TaskSchedulerImpl.stop(TaskSchedulerImpl.scala:881)
at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:2370)
at org.apache.spark.SparkContext.$anonfun$stop$12(SparkContext.scala:2069)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1419)
at org.apache.spark.SparkContext.stop(SparkContext.scala:2069)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:671)
at example.bd.MyApp$.main(MyApp.scala:10)
at example.bd.MyApp.main(MyApp.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
21/10/24 18:38:46 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
21/10/24 18:38:46 INFO MemoryStore: MemoryStore cleared
21/10/24 18:38:46 INFO BlockManager: BlockManager stopped
21/10/24 18:38:46 INFO BlockManagerMaster: BlockManagerMaster stopped
21/10/24 18:38:46 WARN MetricsSystem: Stopping a MetricsSystem that is not running
21/10/24 18:38:46 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
21/10/24 18:38:46 INFO SparkContext: Successfully stopped SparkContext
Exception in thread "main" java.lang.RuntimeException: java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. -see https://wiki.apache.org/hadoop/WindowsProblems
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:736)
at org.apache.hadoop.util.Shell.getSetPermissionCommand(Shell.java:271)
at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:1120)
at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:1106)
at org.apache.spark.util.Utils$.fetchFile(Utils.scala:563)
at org.apache.spark.executor.Executor.$anonfun$updateDependencies$13(Executor.scala:953)
at org.apache.spark.executor.Executor.$anonfun$updateDependencies$13$adapted(Executor.scala:945)
at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:877)
at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)
at scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)
at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:44)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:149)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:876)
at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:945)
at org.apache.spark.executor.Executor.<init>(Executor.scala:247)
at org.apache.spark.scheduler.local.LocalEndpoint.<init>(LocalSchedulerBackend.scala:64)
at org.apache.spark.scheduler.local.LocalSchedulerBackend.start(LocalSchedulerBackend.scala:132)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:220)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:579)
at example.bd.MyApp$.main(MyApp.scala:10)
at example.bd.MyApp.main(MyApp.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. -see https://wiki.apache.org/hadoop/WindowsProblems
at org.apache.hadoop.util.Shell.fileNotFoundException(Shell.java:548)
at org.apache.hadoop.util.Shell.getHadoopHomeDir(Shell.java:569)
at org.apache.hadoop.util.Shell.getQualifiedBin(Shell.java:592)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:689)
at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:78)
at org.apache.hadoop.conf.Configuration.getTimeDurationHelper(Configuration.java:1814)
at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1791)
at org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183)
at org.apache.hadoop.util.ShutdownHookManager$HookEntry.<init>(ShutdownHookManager.java:207)
at org.apache.hadoop.util.ShutdownHookManager.addShutdownHook(ShutdownHookManager.java:302)
at org.apache.spark.util.SparkShutdownHookManager.install(ShutdownHookManager.scala:181)
at org.apache.spark.util.ShutdownHookManager$.shutdownHooks$lzycompute(ShutdownHookManager.scala:50)
at org.apache.spark.util.ShutdownHookManager$.shutdownHooks(ShutdownHookManager.scala:48)
at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:153)
at org.apache.spark.util.ShutdownHookManager$.<init>(ShutdownHookManager.scala:58)
at org.apache.spark.util.ShutdownHookManager$.<clinit>(ShutdownHookManager.scala)
at org.apache.spark.util.Utils$.createTempDir(Utils.scala:326)
at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:343)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
... 6 more
Caused by: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset.
at org.apache.hadoop.util.Shell.checkHadoopHomeInner(Shell.java:468)
at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:439)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:516)
... 21 more
21/10/24 18:38:47 ERROR Utils: Uncaught exception in thread shutdown-hook-0
java.lang.NullPointerException
at org.apache.spark.executor.Executor.$anonfun$stop$3(Executor.scala:332)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.Utils$.withContextClassLoader(Utils.scala:222)
at org.apache.spark.executor.Executor.stop(Executor.scala:332)
at org.apache.spark.executor.Executor.$anonfun$new$2(Executor.scala:76)
at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:214)
at org.apache.spark.util.SparkShutdownHookManager.$anonfun$runAll$2(ShutdownHookManager.scala:188)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1996)
at org.apache.spark.util.SparkShutdownHookManager.$anonfun$runAll$1(ShutdownHookManager.scala:188)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at scala.util.Try$.apply(Try.scala:213)
at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
21/10/24 18:38:47 INFO ShutdownHookManager: Shutdown hook called
21/10/24 18:38:47 INFO ShutdownHookManager: Deleting directory C:\Users\User\AppData\Local\Temp\spark-ac076276-e6ab-4cfb-a153-56ad35c46d56
21/10/24 18:38:47 INFO ShutdownHookManager: Deleting directory C:\Users\User\AppData\Local\Temp\spark-9c13f31f-92a7-4fc5-a87d-a0ae6410e6d2
As far as I can tell the errors are directly related to Hadoop not being set up, but that shouldn't be an issue given that I am trying to run it locally, no?
Any help will be greatly appreciated.

Scala-Ignite fetching data not working - Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/ignite/spark/IgniteDataFrameSettings$

I am trying to access data from ignite table using scala.
I have used pyspark to set the data in the ignite table and I can confirm that it is working and has been set.
This is the scala code that I am using
import org.apache.ignite.cache.query.SqlFieldsQuery
import org.apache.ignite.configuration.CacheConfiguration
import org.apache.ignite.{Ignite, Ignition}
import org.apache.log4j.{Level, Logger}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.ignite.spark.IgniteDataFrameSettings._
import org.apache.ignite.spark.{IgniteContext, IgniteRDD}
object Main {
def main(args: Array[String]) =
{
val spark = SparkSession.builder.appName("App").config("spark.master", "local").getOrCreate()
val df = spark.read.format(FORMAT_IGNITE).option(OPTION_TABLE, "KEY" ).option(OPTION_CONFIG_FILE, "/opt/ignite/apache-ignite/examples/config/example-ignite.xml")
}
}
I am using sbt to package it as a jar and then a spark-submit command to run the application.
Here is the complete StackTrace:
$ /usr/share/spark-2.3.0-bin-hadoop2.6/bin/spark-submit target/scala-2.11/testing_2.11-0.1.0-SNAPSHOT.jar
2020-05-28 07:57:46 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2020-05-28 07:57:47 INFO SparkContext:54 - Running Spark version 2.3.0
2020-05-28 07:57:47 INFO SparkContext:54 - Submitted application: PyStage
2020-05-28 07:57:47 INFO SecurityManager:54 - Changing view acls to: root
2020-05-28 07:57:47 INFO SecurityManager:54 - Changing modify acls to: root
2020-05-28 07:57:47 INFO SecurityManager:54 - Changing view acls groups to:
2020-05-28 07:57:47 INFO SecurityManager:54 - Changing modify acls groups to:
2020-05-28 07:57:47 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissi
ons: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
2020-05-28 07:57:48 INFO Utils:54 - Successfully started service 'sparkDriver' on port 43585.
2020-05-28 07:57:48 INFO SparkEnv:54 - Registering MapOutputTracker
2020-05-28 07:57:48 INFO SparkEnv:54 - Registering BlockManagerMaster
2020-05-28 07:57:48 INFO BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2020-05-28 07:57:48 INFO BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2020-05-28 07:57:48 INFO DiskBlockManager:54 - Created local directory at /tmp/blockmgr-4f5e4f96-3dcf-4757-a5b4-eb96e98ab10b
2020-05-28 07:57:48 INFO MemoryStore:54 - MemoryStore started with capacity 366.3 MB
2020-05-28 07:57:48 INFO SparkEnv:54 - Registering OutputCommitCoordinator
2020-05-28 07:57:48 INFO log:192 - Logging initialized #4539ms
2020-05-28 07:57:48 INFO Server:346 - jetty-9.3.z-SNAPSHOT
2020-05-28 07:57:48 INFO Server:414 - Started #4753ms
2020-05-28 07:57:48 WARN Utils:66 - Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
2020-05-28 07:57:48 WARN Utils:66 - Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
2020-05-28 07:57:48 INFO AbstractConnector:278 - Started ServerConnector#408b35bf{HTTP/1.1,[http/1.1]}{0.0.0.0:4042}
2020-05-28 07:57:48 INFO Utils:54 - Successfully started service 'SparkUI' on port 4042.
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#82c57b3{/jobs,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#1e886a5b{/jobs/json,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#d816dde{/jobs/job,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#6c451c9c{/jobs/job/json,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#31c269fd{/stages,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#372b0d86{/stages/json,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#47747fb9{/stages/stage,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#4e9658b5{/stages/stage/json,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#2a7b6f69{/stages/pool,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#20312893{/stages/pool/json,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#70eecdc2{/storage,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#c41709a{/storage/json,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#7db0565c{/storage/rdd,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#54ec8cc9{/storage/rdd/json,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#52eacb4b{/environment,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#5528a42c{/environment/json,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#2a551a63{/executors,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#1a6f5124{/executors/json,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#1edb61b1{/executors/threadDump,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#ec2bf82{/executors/threadDump/json,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#cc62a3b{/static,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#1fe8d51b{/,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#781e7326{/api,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#324dcd31{/jobs/job/kill,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#503d56b5{/stages/stage/kill,null,AVAILABLE,#Spark}
2020-05-28 07:57:48 INFO SparkUI:54 - Bound SparkUI to 0.0.0.0, and started at http://6e4338526cac:4042
2020-05-28 07:57:49 INFO SparkContext:54 - Added JAR file:/home/Documents/PyStage/Job-Manager/testing/target/scala-2.11/testing_2.11-0.1.0-SNAPSHOT.jar at spark://6e4338526cac:43585/jars/testing_2.11-0.1.0-SNAPSHOT.jar with timestamp 1590652669088
2020-05-28 07:57:49 INFO Executor:54 - Starting executor ID driver on host localhost
2020-05-28 07:57:49 INFO Utils:54 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 43227.
2020-05-28 07:57:49 INFO NettyBlockTransferService:54 - Server created on 6e4338526cac:43227
2020-05-28 07:57:49 INFO BlockManager:54 - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
2020-05-28 07:57:49 INFO BlockManagerMaster:54 - Registering BlockManager BlockManagerId(driver, 6e4338526cac, 43227, None)
2020-05-28 07:57:49 INFO BlockManagerMasterEndpoint:54 - Registering block manager 6e4338526cac:43227 with 366.3 MB RAM, BlockManagerId(driver, 6e4338526cac, 43227, None)
2020-05-28 07:57:49 INFO BlockManagerMaster:54 - Registered BlockManager BlockManagerId(driver, 6e4338526cac, 43227, None)
2020-05-28 07:57:49 INFO BlockManager:54 - Initialized BlockManager: BlockManagerId(driver, 6e4338526cac, 43227, None)
2020-05-28 07:57:49 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#25c5e994{/metrics/json,null,AVAILABLE,#Spark}
2020-05-28 07:57:50 INFO SharedState:54 - Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/home/Documents/PyStage/Job-Manager/testing/spark-warehouse/').
2020-05-28 07:57:50 INFO SharedState:54 - Warehouse path is 'file:/home/Documents/PyStage/Job-Manager/testing/spark-warehouse/'.
2020-05-28 07:57:50 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#5ca1f591{/SQL,null,AVAILABLE,#Spark}
2020-05-28 07:57:50 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#551de37d{/SQL/json,null,AVAILABLE,#Spark}
2020-05-28 07:57:50 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#997d532{/SQL/execution,null,AVAILABLE,#Spark}
2020-05-28 07:57:50 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#273842a6{/SQL/execution/json,null,AVAILABLE,#Spark}
2020-05-28 07:57:50 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler#b558294{/static/sql,null,AVAILABLE,#Spark}
2020-05-28 07:57:51 INFO StateStoreCoordinatorRef:54 - Registered StateStoreCoordinator endpoint
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/ignite/spark/IgniteDataFrameSettings$
at Main$.main(Main.scala:21)
at Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.ignite.spark.IgniteDataFrameSettings$
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 12 more
2020-05-28 07:57:51 INFO SparkContext:54 - Invoking stop() from shutdown hook
2020-05-28 07:57:51 INFO AbstractConnector:318 - Stopped Spark#408b35bf{HTTP/1.1,[http/1.1]}{0.0.0.0:4042}
2020-05-28 07:57:51 INFO SparkUI:54 - Stopped Spark web UI at http://6e4338526cac:4042
2020-05-28 07:57:51 INFO MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
2020-05-28 07:57:51 INFO MemoryStore:54 - MemoryStore cleared
2020-05-28 07:57:51 INFO BlockManager:54 - BlockManager stopped
2020-05-28 07:57:51 INFO BlockManagerMaster:54 - BlockManagerMaster stopped
2020-05-28 07:57:51 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
2020-05-28 07:57:51 INFO SparkContext:54 - Successfully stopped SparkContext
2020-05-28 07:57:51 INFO ShutdownHookManager:54 - Shutdown hook called
2020-05-28 07:57:51 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-53a4999c-b458-4dc0-b6d6-af52135c304a
2020-05-28 07:57:51 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-35f14174-cf0b-43cd-8438-c61323f2bc3d
Version that is being used:
Scala version - 2.11.0
Spark version - 2.3.0
Ignite version - 2.8.0
Sbt version - 1.3.3
Please help me out here, I am a newbie to this and not sure what or how I can resolve this. Thanks
Download Spark-Ignite jar try any of the below options
add it in your Spark's lib folder
add it in --jars option of spark-submit command.
or Add it in --driver-class-path option of spark-submit command.

Issue while setting up nifi cluster

I followed the below tutorial to setup nifi cluster. Performed everything as mentioned, but receiving the below error.
https://pierrevillard.com/2016/08/13/apache-nifi-1-0-0-cluster-setup/
But the zookeeper was unable to elect a leader and getting the following error message.
2018-04-20 09:27:40,942 INFO [main] org.wali.MinimalLockingWriteAheadLog Successfully recovered 0 records in 3 milliseconds 2018-04-20 09:27:41,007 INFO [main] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog#27caa186 checkpointed with 0 Records and 0 Swap Files in 65 milliseconds (Stop-the-world time = 1 milliseconds, Clear Edit Logs time = 7 millis), max Transaction ID -1 2018-04-20 09:27:41,300 INFO [main] o.a.z.server.DatadirCleanupManager autopurge.snapRetainCount set to 30 2018-04-20 09:27:41,300 INFO [main] o.a.z.server.DatadirCleanupManager autopurge.purgeInterval set to 24 2018-04-20 09:27:41,311 INFO [main] o.a.n.c.s.server.ZooKeeperStateServer Starting Embedded ZooKeeper Peer 2018-04-20 09:27:41,319 INFO [PurgeTask] o.a.z.server.DatadirCleanupManager Purge task started. 2018-04-20 09:27:41,343 INFO [PurgeTask] o.a.z.server.DatadirCleanupManager Purge task completed. 2018-04-20 09:27:41,621 INFO [main] o.apache.nifi.controller.FlowController Checking if there is already a Cluster Coordinator Elected... 2018-04-20 09:27:41,907 INFO [main] o.a.c.f.imps.CuratorFrameworkImpl Starting 2018-04-20 09:27:49,882 WARN [main] o.a.n.c.l.e.CuratorLeaderElectionManager Unable to determine the Elected Leader for role 'Cluster Coordinator' due to org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /nifi/leaders/Cluster Coordinator; assuming no leader has been elected 2018-04-20 09:27:49,885 INFO [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl backgroundOperationsLoop exiting 2018-04-20 09:27:49,986 INFO [main] o.apache.nifi.controller.FlowController It appears that no Cluster Coordinator has been Elected yet. Registering for Cluster Coordinator Role. 2018-04-20 09:27:49,988 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=true] Registered new Leader Selector for role Cluster Coordinator; this node is an active participant in the election. 2018-04-20 09:27:49,988 INFO [main] o.a.c.f.imps.CuratorFrameworkImpl Starting 2018-04-20 09:27:49,997 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=false] Registered new Leader Selector for role Cluster Coordinator; this node is an active participant in the election. 2018-04-20 09:27:49,997 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=false] started 2018-04-20 09:27:49,997 INFO [main] o.a.n.c.c.h.AbstractHeartbeatMonitor Heartbeat Monitor started 2018-04-20 09:27:58,179 INFO [main] o.e.jetty.server.handler.ContextHandler Started o.e.j.w.WebAppContext#76d9fe26{/nifi-api,file:///root/nifi-1.6.0/work/jetty/nifi-web-api-1.6.0.war/webapp/,AVAILABLE}{./work/nar/framework/nifi-framework-nar-1.6.0.nar-unpacked/META-INF/bundled-dependencies/nifi-web-api-1.6.0.war} 2018-04-20 09:27:59,657 INFO [main] o.e.j.a.AnnotationConfiguration Scanning elapsed time=861ms 2018-04-20 09:27:59,834 INFO [main] o.e.j.C./nifi-content-viewer No Spring WebApplicationInitializer types detected on classpath 2018-04-20 09:27:59,840 INFO [main] o.e.jetty.server.handler.ContextHandler Started o.e.j.w.WebAppContext#43b41cb9{/nifi-content-viewer,file:///root/nifi-1.6.0/work/jetty/nifi-web-content-viewer-1.6.0.war/webapp/,AVAILABLE}{./work/nar/framework/nifi-framework-nar-1.6.0.nar-unpacked/META-INF/bundled-dependencies/nifi-web-content-viewer-1.6.0.war} 2018-04-20 09:27:59,863 INFO [main] o.e.jetty.server.handler.ContextHandler Started o.e.j.s.h.ContextHandler#49825659{/nifi-docs,null,AVAILABLE} 2018-04-20 09:28:00,079 INFO [main] o.e.j.a.AnnotationConfiguration Scanning elapsed time=69ms 2018-04-20 09:28:00,083 INFO [main] o.e.jetty.ContextHandler./nifi-docs No Spring WebApplicationInitializer types detected on classpath 2018-04-20 09:28:00,171 INFO [main] o.e.jetty.server.handler.ContextHandler Started o.e.j.w.WebAppContext#2ab0ca04{/nifi-docs,file:///root/nifi-1.6.0/work/jetty/nifi-web-docs-1.6.0.war/webapp/,AVAILABLE}{./work/nar/framework/nifi-framework-nar-1.6.0.nar-unpacked/META-INF/bundled-dependencies/nifi-web-docs-1.6.0.war} 2018-04-20 09:28:00,343 INFO [main] o.e.j.a.AnnotationConfiguration Scanning elapsed time=119ms 2018-04-20 09:28:00,344 INFO [main] org.eclipse.jetty.ContextHandler./ No Spring WebApplicationInitializer types detected on classpath 2018-04-20 09:28:00,454 INFO [main] o.e.jetty.server.handler.ContextHandler Started o.e.j.w.WebAppContext#67e255cf{/,file:///root/nifi-1.6.0/work/jetty/nifi-web-error-1.6.0.war/webapp/,AVAILABLE}{./work/nar/framework/nifi-framework-nar-1.6.0.nar-unpacked/META-INF/bundled-dependencies/nifi-web-error-1.6.0.war} 2018-04-20 09:28:00,541 INFO [main] o.eclipse.jetty.server.AbstractConnector Started ServerConnector#6d9f624{HTTP/1.1,[http/1.1]}{node-1:8080} 2018-04-20 09:28:00,559 INFO [main] org.eclipse.jetty.server.Server Started #98628ms 2018-04-20 09:28:00,599 INFO [main] org.apache.nifi.web.server.JettyServer Loading Flow... 2018-04-20 09:28:00,688 INFO [main] org.apache.nifi.io.socket.SocketListener Now listening for connections from nodes on port 9999 2018-04-20 09:28:00,916 INFO [main] o.apache.nifi.controller.FlowController Successfully synchronized controller with proposed flow 2018-04-20 09:28:01,337 INFO [main] o.a.nifi.controller.StandardFlowService Connecting Node: node-1:8080 2018-04-20 09:28:07,922 INFO [Curator-Framework-0] o.a.c.f.state.ConnectionStateManager State change: SUSPENDED 2018-04-20 09:28:07,927 INFO [Curator-ConnectionStateManager-0] o.a.n.c.l.e.CuratorLeaderElectionManager org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener#188207a7 Connection State changed to SUSPENDED 2018-04-20 09:28:07,930 ERROR [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl Background operation retry gave up org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:728)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:857)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64)
at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2018-04-19 15:56:55,209 ERROR [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl Background operation retry gave up
org.apache.zookeeper.KeeperException$ConnectionLossException:
KeeperErrorCode = ConnectionLoss
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at
org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:728)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:857)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64)
at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
I set the following properties on all 3 nodes:
mkdir ./state
mkdir ./state/zookeeper
echo 1 > ./state/zookeeper/myid (2 & 3 for other nodes)
in zookeeper.properties
server.1=node-1:2888:3888
server.2=node-2:2888:3888
server.3=node-3:2888:3888
in nifi.properties
nifi.state.management.embedded.zookeeper.start=true
nifi.zookeeper.connect.string=node-1:2181,node-2:2181,node-3:2181
nifi.cluster.protocol.is.secure=false
nifi.cluster.is.node=true
nifi.cluster.node.address=node-1(node-2 & node-3)
nifi.cluster.node.protocol.port=9999
nifi.cluster.node.protocol.threads=10
nifi.cluster.node.event.history.size=25
nifi.cluster.node.connection.timeout=5 sec
nifi.cluster.node.read.timeout=5 sec
nifi.cluster.firewall.file=
nifi.remote.input.host=node-1(node-2 & node-3)
nifi.remote.input.secure=false
nifi.remote.input.socket.port=9998
nifi.remote.input.http.enabled=true
nifi.remote.input.http.transaction.ttl=30 sec
nifi.web.http.host=node-1(node-2 & node-3)
After Bryan's comment, got reminded that I forgot to disable the iptables (I guess I can even open the ports but since this is for my practice I am disabling the iptables) and its working now.

Unable to connect to Spark master

I start my DataStax cassandra instance with Spark:
dse cassandra -k
I then run this program (from within Eclipse):
import org.apache.spark.sql.SQLContext
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
object Start {
def main(args: Array[String]): Unit = {
println("***** 1 *****")
val sparkConf = new SparkConf().setAppName("Start").setMaster("spark://127.0.0.1:7077")
println("***** 2 *****")
val sparkContext = new SparkContext(sparkConf)
println("***** 3 *****")
}
}
And I get the following output
***** 1 *****
***** 2 *****
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/12/29 15:27:50 INFO SparkContext: Running Spark version 1.5.2
15/12/29 15:27:51 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/12/29 15:27:51 INFO SecurityManager: Changing view acls to: nayan
15/12/29 15:27:51 INFO SecurityManager: Changing modify acls to: nayan
15/12/29 15:27:51 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(nayan); users with modify permissions: Set(nayan)
15/12/29 15:27:52 INFO Slf4jLogger: Slf4jLogger started
15/12/29 15:27:52 INFO Remoting: Starting remoting
15/12/29 15:27:53 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver#10.0.1.88:55126]
15/12/29 15:27:53 INFO Utils: Successfully started service 'sparkDriver' on port 55126.
15/12/29 15:27:53 INFO SparkEnv: Registering MapOutputTracker
15/12/29 15:27:53 INFO SparkEnv: Registering BlockManagerMaster
15/12/29 15:27:53 INFO DiskBlockManager: Created local directory at /private/var/folders/pd/6rxlm2js10gg6xys5wm90qpm0000gn/T/blockmgr-21a96671-c33e-498c-83a4-bb5c57edbbfb
15/12/29 15:27:53 INFO MemoryStore: MemoryStore started with capacity 983.1 MB
15/12/29 15:27:53 INFO HttpFileServer: HTTP File server directory is /private/var/folders/pd/6rxlm2js10gg6xys5wm90qpm0000gn/T/spark-fce0a058-9264-4f2c-8220-c32d90f11bd8/httpd-2a0efcac-2426-49c5-982a-941cfbb48c88
15/12/29 15:27:53 INFO HttpServer: Starting HTTP Server
15/12/29 15:27:53 INFO Utils: Successfully started service 'HTTP file server' on port 55127.
15/12/29 15:27:53 INFO SparkEnv: Registering OutputCommitCoordinator
15/12/29 15:27:53 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/12/29 15:27:53 INFO SparkUI: Started SparkUI at http://10.0.1.88:4040
15/12/29 15:27:54 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
15/12/29 15:27:54 INFO AppClient$ClientEndpoint: Connecting to master spark://127.0.0.1:7077...
15/12/29 15:27:54 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkMaster#127.0.0.1:7077] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
15/12/29 15:28:14 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[appclient-registration-retry-thread,5,main]
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask#1f22aef0 rejected from java.util.concurrent.ThreadPoolExecutor#176cb4af[Running, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = 0]
So something is happening during the creation of the spark context.
When i look in $DSE_HOME/logs/spark, it is empty. Not sure where else to look.
It turns out that the problem was the spark library version AND the Scala version. DataStax was running Spark 1.4.1 and Scala 2.10.5, while my eclipse project was using 1.5.2 & 2.11.7 respectively.
Note that BOTH the Spark library and Scala appear to have to match. I tried other combinations, but it only worked when both matched.
I am getting pretty familiar with this part of your posted error:
WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://...
It can have numerous causes, pretty much all related to misconfigured IPs. First I would do whatever zero323 says, then here's my two cents: I have solved my own problems recently by using IP addresses, not hostnames, and the only config I use in a simple standalone cluster is SPARK_MASTER_IP.
SPARK_MASTER_IP in the $SPARK_HOME/conf/spark-env.sh on your master then should lead the master webui to show the IP address you set:
spark://your.ip.address.numbers:7077
And your SparkConf setup can refer to that.
Having said that, I am not familiar with your specific implementation but I notice in the error two occurrences containing:
/private/var/folders/pd/6rxlm2js10gg6xys5wm90qpm0000gn/T/
Have you looked there to see if there's a logs directory? Is that where $DSE_HOME points? Alternatively connect to the driver where it creates it's webui:
INFO SparkUI: Started SparkUI at http://10.0.1.88:4040
and you should see a link to an error log there somewhere.
More on the IP vs. hostname thing, this very old bug is marked as Resolved but I have not figured out what they mean by Resolved, so I just tend toward IP addresses.

How to set remoteHost in spark RetryingBlockFetcher IOException

I apologize for such an extremely long post, but I wanted to be better understood.
I have built up my cluster, where master in on another machine than workers. Workers are allocated on a quite efficient machine. Between these two machines no firewall is applied.
URL: spark://MASTER_IP:7077
Workers: 10
Cores: 10 Total, 0 Used
Memory: 40.0 GB Total, 0.0 B Used
Applications: 0 Running, 0 Completed
Drivers: 0 Running, 0 Completed
Status: ALIVE
Before launching the app, in the workers logfile is (an example for one worker)
15/03/06 18:52:19 INFO Worker: Registered signal handlers for [TERM, HUP, INT]
15/03/06 18:52:19 INFO SecurityManager: Changing view acls to: szymon
15/03/06 18:52:19 INFO SecurityManager: Changing modify acls to: szymon
15/03/06 18:52:19 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(szymon); users with modify permissions: Set(szymon)
15/03/06 18:52:20 INFO Slf4jLogger: Slf4jLogger started
15/03/06 18:52:20 INFO Remoting: Starting remoting
15/03/06 18:52:20 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkWorker#WORKER_MACHINE_IP:42240]
15/03/06 18:52:20 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkWorker#WORKER_MACHINE_IP:42240]
15/03/06 18:52:20 INFO Utils: Successfully started service 'sparkWorker' on port 42240.
15/03/06 18:52:20 INFO Worker: Starting Spark worker WORKER_MACHINE_IP:42240 with 1 cores, 4.0 GB RAM
15/03/06 18:52:20 INFO Worker: Spark home: /home/szymon/spark
15/03/06 18:52:20 INFO Utils: Successfully started service 'WorkerUI' on port 8081.
15/03/06 18:52:20 INFO WorkerWebUI: Started WorkerWebUI at http://WORKER_MACHINE_IP:8081
15/03/06 18:52:20 INFO Worker: Connecting to master spark://MASTER_IP:7077...
15/03/06 18:52:20 INFO Worker: Successfully registered with master spark://MASTER_IP:7077
I launch my application on a cluster (on the master machine)
./bin/spark-submit --class SimpleApp --master spark://MASTER_IP:7077 --executor-memory 3g --total-executor-cores 10 code/trial_2.11-0.9.jar
My app is then fetched by workers, this is an example of the log output for a worker (#WORKER_MACHINE)
15/03/06 18:07:45 INFO ExecutorRunner: Launch command: "/usr/java/jdk1.8.0_31/bin/java" "-cp" "::/home/machine/spark/conf:/home/machine/spark/assembly/target/scala-2.10/spark-assembly-1.2.1-hadoop2.4.0.jar" "-Dspark.driver.port=56753" "-Dlog4j.configuration=file:////home/machine/spark/conf/log4j.properties" "-Dspark.driver.host=MASTER_IP" "-Xms3072M" "-Xmx3072M" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "akka.tcp://sparkDriver#MASTER_IP:56753/user/CoarseGrainedScheduler" "4" "WORKER_MACHINE_IP" "1" "app-20150306181450-0000" "akka.tcp://sparkWorker#WORKER_MACHINE_IP:45288/user/Worker"
The app wants to connect to localhost at address 127.0.0.1 instead of MASTER_IP (I believe).
How could it be fixed?
15/03/06 18:58:52 ERROR RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding blocks
java.io.IOException: Failed to connect to localhost/127.0.0.1:56545
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:191)
The problem is caused by createClient method in TransportClientFactory which is in spark-network-common_2.10-1.2.1-sources.jar, String remoteHost is set up as localhost
/**
* Create a {#link TransportClient} connecting to the given remote host / port.
*
* We maintains an array of clients (size determined by spark.shuffle.io.numConnectionsPerPeer)
* and randomly picks one to use. If no client was previously created in the randomly selected
* spot, this function creates a new client and places it there.
*
* Prior to the creation of a new TransportClient, we will execute all
* {#link TransportClientBootstrap}s that are registered with this factory.
*
* This blocks until a connection is successfully established and fully bootstrapped.
*
* Concurrency: This method is safe to call from multiple threads.
*/
public TransportClient createClient(String remoteHost, int remotePort) throws IOException {
// Get connection from the connection pool first.
// If it is not found or not active, create a new one.
final InetSocketAddress address = new InetSocketAddress(remoteHost, remotePort);
.
.
.
clientPool.clients[clientIndex] = createClient(address);
}
Here is the file spark-env.sh on the workers site
export SPARK_HOME=/home/szymon/spark
export SPARK_MASTER_IP=MASTER_IP
export SPARK_MASTER_WEBUI_PORT=8081
export SPARK_LOCAL_IP=WORKER_MACHINE_IP
export SPARK_DRIVER_HOST=WORKER_MACHINE_IP
export SPARK_LOCAL_DIRS=/home/szymon/spark/slaveData
export SPARK_WORKER_INSTANCES=10
export SPARK_WORKER_CORES=1
export SPARK_WORKER_MEMORY=4g
export SPARK_WORKER_DIR=/home/szymon/spark/work
And on the master
export SPARK_MASTER_IP=MASTER_IP
export SPARK_LOCAL_IP=MASTER_IP
export SPARK_MASTER_WEBUI_PORT=8081
export SPARK_JAVA_OPTS="-Dlog4j.configuration=file:////home/szymon/spark/conf/log4j.properties -Dspark.driver.host=MASTER_IP"
export SPARK_MASTER_OPTS="-Dspark.deploy.defaultCores=10"
This is the full log output with more details
15/03/06 18:58:50 INFO Worker: Asked to launch executor app-20150306190555-0000/0 for Simple Application
15/03/06 18:58:50 INFO ExecutorRunner: Launch command: "/usr/java/jdk1.8.0_31/bin/java" "-cp" "::/home/szymon/spark/conf:/home/szymon/spark/assembly/target/scala-2.10/spark-assembly-1.2.1-hadoop2.4.0.jar" "-Dspark.driver.port=49407" "-Dlog4j.configuration=file:////home/szymon/spark/conf/log4j.properties" "-Dspark.driver.host=MASTER_IP" "-Xms3072M" "-Xmx3072M" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "akka.tcp://sparkDriver#MASTER_IP:49407/user/CoarseGrainedScheduler" "0" "WORKER_MACHINE_IP" "1" "app-20150306190555-0000" "akka.tcp://sparkWorker#WORKER_MACHINE_IP:42240/user/Worker"
15/03/06 18:58:50 INFO CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
15/03/06 18:58:51 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/03/06 18:58:51 INFO SecurityManager: Changing view acls to: szymon
15/03/06 18:58:51 INFO SecurityManager: Changing modify acls to: szymon
15/03/06 18:58:51 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(szymon); users with modify permissions: Set(szymon)
15/03/06 18:58:51 INFO Slf4jLogger: Slf4jLogger started
15/03/06 18:58:51 INFO Remoting: Starting remoting
15/03/06 18:58:51 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://driverPropsFetcher#WORKER_MACHINE_IP:52038]
15/03/06 18:58:51 INFO Utils: Successfully started service 'driverPropsFetcher' on port 52038.
15/03/06 18:58:52 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/03/06 18:58:52 INFO SecurityManager: Changing view acls to: szymon
15/03/06 18:58:52 INFO SecurityManager: Changing modify acls to: szymon
15/03/06 18:58:52 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(szymon); users with modify permissions: Set(szymon)
15/03/06 18:58:52 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/03/06 18:58:52 INFO Slf4jLogger: Slf4jLogger started
15/03/06 18:58:52 INFO Remoting: Starting remoting
15/03/06 18:58:52 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
15/03/06 18:58:52 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkExecutor#WORKER_MACHINE_IP:37114]
15/03/06 18:58:52 INFO Utils: Successfully started service 'sparkExecutor' on port 37114.
15/03/06 18:58:52 INFO CoarseGrainedExecutorBackend: Connecting to driver: akka.tcp://sparkDriver#MASTER_IP:49407/user/CoarseGrainedScheduler
15/03/06 18:58:52 INFO WorkerWatcher: Connecting to worker akka.tcp://sparkWorker#WORKER_MACHINE_IP:42240/user/Worker
15/03/06 18:58:52 INFO WorkerWatcher: Successfully connected to akka.tcp://sparkWorker#WORKER_MACHINE_IP:42240/user/Worker
15/03/06 18:58:52 INFO CoarseGrainedExecutorBackend: Successfully registered with driver
15/03/06 18:58:52 INFO Executor: Starting executor ID 0 on host WORKER_MACHINE_IP
15/03/06 18:58:52 INFO SecurityManager: Changing view acls to: szymon
15/03/06 18:58:52 INFO SecurityManager: Changing modify acls to: szymon
15/03/06 18:58:52 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(szymon); users with modify permissions: Set(szymon)
15/03/06 18:58:52 INFO AkkaUtils: Connecting to MapOutputTracker: akka.tcp://sparkDriver#MASTER_IP:49407/user/MapOutputTracker
15/03/06 18:58:52 INFO AkkaUtils: Connecting to BlockManagerMaster: akka.tcp://sparkDriver#MASTER_IP:49407/user/BlockManagerMaster
15/03/06 18:58:52 INFO DiskBlockManager: Created local directory at /home/szymon/spark/slaveData/spark-b09c3727-8559-4ab8-ab32-1f5ecf7aeaf2/spark-0c892a4d-c8b9-4144-a259-8077f5316b52/spark-89577a43-fb43-4a12-a305-34b267b01f8a/spark-7ad207c4-9d37-42eb-95e4-7b909b71c687
15/03/06 18:58:52 INFO MemoryStore: MemoryStore started with capacity 1589.8 MB
15/03/06 18:58:52 INFO NettyBlockTransferService: Server created on 51205
15/03/06 18:58:52 INFO BlockManagerMaster: Trying to register BlockManager
15/03/06 18:58:52 INFO BlockManagerMaster: Registered BlockManager
15/03/06 18:58:52 INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver#MASTER_IP:49407/user/HeartbeatReceiver
15/03/06 18:58:52 INFO CoarseGrainedExecutorBackend: Got assigned task 0
15/03/06 18:58:52 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
15/03/06 18:58:52 INFO Executor: Fetching http://MASTER_IP:57850/jars/trial_2.11-0.9.jar with timestamp 1425665154479
15/03/06 18:58:52 INFO Utils: Fetching http://MASTER_IP:57850/jars/trial_2.11-0.9.jar to /home/szymon/spark/slaveData/spark-b09c3727-8559-4ab8-ab32-1f5ecf7aeaf2/spark-0c892a4d-c8b9-4144-a259-8077f5316b52/spark-411cd372-224e-44c1-84ab-b0c3984a6361/fetchFileTemp7857926599487994869.tmp
15/03/06 18:58:52 INFO Utils: Copying /home/szymon/spark/slaveData/spark-b09c3727-8559-4ab8-ab32-1f5ecf7aeaf2/spark-0c892a4d-c8b9-4144-a259-8077f5316b52/spark-411cd372-224e-44c1-84ab-b0c3984a6361/-19284804851425665154479_cache to /home/szymon/spark/work/app-20150306190555-0000/0/./trial_2.11-0.9.jar
15/03/06 18:58:52 INFO Executor: Adding file:/home/szymon/spark/work/app-20150306190555-0000/0/./trial_2.11-0.9.jar to class loader
15/03/06 18:58:52 INFO TorrentBroadcast: Started reading broadcast variable 0
15/03/06 18:58:52 ERROR RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding blocks
java.io.IOException: Failed to connect to localhost/127.0.0.1:56545
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:191)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:156)
at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:78)
at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
at org.apache.spark.network.shuffle.RetryingBlockFetcher.start(RetryingBlockFetcher.java:120)
at org.apache.spark.network.netty.NettyBlockTransferService.fetchBlocks(NettyBlockTransferService.scala:87)
at org.apache.spark.network.BlockTransferService.fetchBlockSync(BlockTransferService.scala:89)
at org.apache.spark.storage.BlockManager$$anonfun$doGetRemote$2.apply(BlockManager.scala:595)
at org.apache.spark.storage.BlockManager$$anonfun$doGetRemote$2.apply(BlockManager.scala:593)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.storage.BlockManager.doGetRemote(BlockManager.scala:593)
at org.apache.spark.storage.BlockManager.getRemoteBytes(BlockManager.scala:587)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.org$apache$spark$broadcast$TorrentBroadcast$$anonfun$$getRemote$1(TorrentBroadcast.scala:126)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$1.apply(TorrentBroadcast.scala:136)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$1.apply(TorrentBroadcast.scala:136)
at scala.Option.orElse(Option.scala:257)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply$mcVI$sp(TorrentBroadcast.scala:136)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:119)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:119)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$readBlocks(TorrentBroadcast.scala:119)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:174)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1090)
at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164)
at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87)
at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:58)
at org.apache.spark.scheduler.Task.run(Task.scala:56)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: localhost/127.0.0.1:56545
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:208)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:287)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
... 1 more
15/03/06 18:58:52 INFO RetryingBlockFetcher: Retrying fetch (1/3) for 1 outstanding blocks after 5000 ms
15/03/06 18:58:57 ERROR RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding blocks (after 1 retries)
java.io.IOException: Failed to connect to localhost/127.0.0.1:56545
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:191)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:156)
at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:78)
at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
at org.apache.spark.network.shuffle.RetryingBlockFetcher.access$200(RetryingBlockFetcher.java:43)
at org.apache.spark.network.shuffle.RetryingBlockFetcher$1.run(RetryingBlockFetcher.java:170)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: localhost/127.0.0.1:56545
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:208)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:287)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
... 1 more
.
.
.
15/03/06 19:00:22 INFO RetryingBlockFetcher: Retrying fetch (1/3) for 1 outstanding blocks after 5000 ms
15/03/06 19:00:24 ERROR CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor#WORKER_MACHINE_IP:37114] -> [akka.tcp://sparkDriver#MASTER_IP:49407] disassociated! Shutting down.
15/03/06 19:00:24 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkDriver#MASTER_IP:49407] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
15/03/06 19:00:24 INFO Worker: Asked to kill executor app-20150306190555-0000/0
15/03/06 19:00:24 INFO ExecutorRunner: Runner thread for executor app-20150306190555-0000/0 interrupted
15/03/06 19:00:24 INFO ExecutorRunner: Killing process!
15/03/06 19:00:25 INFO Worker: Executor app-20150306190555-0000/0 finished with state KILLED exitStatus 1
15/03/06 19:00:25 INFO Worker: Cleaning up local directories for application app-20150306190555-0000
15/03/06 19:00:25 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkExecutor#WORKER_MACHINE_IP:37114] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
15/03/06 19:00:25 INFO LocalActorRef: Message [akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying] from Actor[akka://sparkWorker/deadLetters] to Actor[akka://sparkWorker/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FsparkWorker%40WORKER_MACHINE_IP%3A45806-2#1549100100] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
There is a warning, which I believe is not the case at this issue
15/03/06 18:07:46 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
You can try to set conf.set("spark.driver.host",""), the client host is a host where you start spark-shell or other script.