RuntimeException when running nutch generate with MongoDB

I'm new to Nutch. I have installed Nutch 2.3.1 and configured it to use MongoDB. The inject operation was successful, but when I run generate it throws an exception (see below).
NB: This error occurs with a seed file containing 60K URLs. When I tried with 100 URLs, everything went well.
Do you have any idea what causes this error? Thanks!
2016-12-30 00:01:48,446 INFO crawl.GeneratorJob - GeneratorJob: starting at 2016-12-30 00:01:48
2016-12-30 00:01:48,447 INFO crawl.GeneratorJob - GeneratorJob: Selecting best-scoring urls due for fetch.
2016-12-30 00:01:48,447 INFO crawl.GeneratorJob - GeneratorJob: starting
2016-12-30 00:01:48,448 INFO crawl.GeneratorJob - GeneratorJob: filtering: true
2016-12-30 00:01:48,448 INFO crawl.GeneratorJob - GeneratorJob: normalizing: true
2016-12-30 00:01:48,448 INFO crawl.GeneratorJob - GeneratorJob: topN: 100000
2016-12-30 00:01:48,816 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-12-30 00:01:48,857 INFO crawl.FetchScheduleFactory - Using FetchSchedule impl: org.apache.nutch.crawl.DefaultFetchSchedule
2016-12-30 00:01:48,867 INFO crawl.AbstractFetchSchedule - defaultInterval=2592000
2016-12-30 00:01:48,867 INFO crawl.AbstractFetchSchedule - maxInterval=7776000
2016-12-30 00:01:51,568 WARN conf.Configuration - file:/tmp/hadoop-mehdi/mapred/staging/mehdi1740651658/.staging/job_local1740651658_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2016-12-30 00:01:51,573 WARN conf.Configuration - file:/tmp/hadoop-mehdi/mapred/staging/mehdi1740651658/.staging/job_local1740651658_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2016-12-30 00:01:51,753 WARN conf.Configuration - file:/tmp/hadoop-mehdi/mapred/local/localRunner/mehdi/job_local1740651658_0001/job_local1740651658_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2016-12-30 00:01:51,760 WARN conf.Configuration - file:/tmp/hadoop-mehdi/mapred/local/localRunner/mehdi/job_local1740651658_0001/job_local1740651658_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2016-12-30 00:01:52,408 INFO crawl.FetchScheduleFactory - Using FetchSchedule impl: org.apache.nutch.crawl.DefaultFetchSchedule
2016-12-30 00:01:52,408 INFO crawl.AbstractFetchSchedule - defaultInterval=2592000
2016-12-30 00:01:52,408 INFO crawl.AbstractFetchSchedule - maxInterval=7776000
2016-12-30 00:01:52,591 INFO regex.RegexURLNormalizer - can't find rules for scope 'generate_host_count', using default
2016-12-30 00:02:03,229 ERROR mapreduce.GoraRecordReader - Error reading Gora records: Read operation to server localhost:27017 failed on database nutch
2016-12-30 00:02:04,607 WARN mapred.LocalJobRunner - job_local1740651658_0001
java.lang.Exception: java.lang.RuntimeException: com.mongodb.MongoException$Network: Read operation to server localhost:27017 failed on database nutch
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: com.mongodb.MongoException$Network: Read operation to server localhost:27017 failed on database nutch
at org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:122)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.mongodb.MongoException$Network: Read operation to server localhost:27017 failed on database nutch
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:298)
at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:269)
at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:235)
at com.mongodb.QueryResultIterator.getMore(QueryResultIterator.java:145)
at com.mongodb.QueryResultIterator.hasNext(QueryResultIterator.java:135)
at com.mongodb.DBCursor._hasNext(DBCursor.java:626)
at com.mongodb.DBCursor.hasNext(DBCursor.java:657)
at org.apache.gora.mongodb.query.MongoDBResult.nextInner(MongoDBResult.java:71)
at org.apache.gora.query.impl.ResultBase.next(ResultBase.java:111)
at org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:118)
... 12 more
Caused by: java.io.EOFException
at org.bson.io.Bits.readFully(Bits.java:75)
at org.bson.io.Bits.readFully(Bits.java:50)
at org.bson.io.Bits.readFully(Bits.java:37)
at com.mongodb.Response.<init>(Response.java:42)
at com.mongodb.DBPort$1.execute(DBPort.java:164)
at com.mongodb.DBPort$1.execute(DBPort.java:158)
at com.mongodb.DBPort.doOperation(DBPort.java:187)
at com.mongodb.DBPort.call(DBPort.java:158)
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:290)
... 21 more
2016-12-30 00:02:04,846 ERROR crawl.GeneratorJob - GeneratorJob: java.lang.RuntimeException: job failed: name=nutch-maven-1.0-SNAPSHOT.jar, jobid=job_local1740651658_0001
at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:120)
at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:227)
at org.apache.nutch.crawl.GeneratorJob.generate(GeneratorJob.java:256)
at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:322)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.crawl.GeneratorJob.main(GeneratorJob.java:330)

I figured out that the problem comes from the MongoDB version. Nutch uses mongo-java-driver-2.13.1.jar and I had installed MongoDB 3.4.1. So I installed MongoDB 2.6.7 and now it works fine. I'll try to update the driver in Nutch and tell you if it works with the newer version of MongoDB.

Related

neo4j - graphaware plugins

I downloaded the GraphAware plugins (nlp, open-nlp and framework) and copied the jar files to the plugins directory.
Following the installation steps, I added the following lines to the neo4j.conf file:
dbms.unmanaged_extension_classes=com.graphaware.server=/graphaware
com.graphaware.runtime.enabled=true
com.graphaware.module.NLP.2=com.graphaware.nlp.module.NLPBootstrapper
After adding these lines, localhost:7474 does not start.
When I comment these lines out, localhost starts and works properly, but the plugins are not loaded.
Version: Enterprise 3.1.3
Error on localhost after commenting out those lines:
Failed to invoke procedure `ga.nlp.annotate`: Caused by: java.lang.RuntimeException: java.lang.IllegalStateException: No GraphAware Runtime is registered with the given database
Error in log file:
2017-11-07 10:41:03.839+0000 INFO ======== Neo4j 3.1.3 ========
2017-11-07 10:41:04.120+0000 INFO Starting...
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/share/neo4j/lib/slf4j-nop-1.7.22.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/var/lib/neo4j/plugins/nlp-opennlp-3.1.3.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.helpers.NOPLoggerFactory]
2017-11-07 10:41:04.985+0000 INFO Bolt enabled on localhost:7687.
2017-11-07 10:41:05.010+0000 INFO Initiating metrics...
2017-11-07 10:41:07.374+0000 INFO [c.g.r.b.RuntimeKernelExtension] GraphAware Runtime enabled, bootstrapping...
2017-11-07 10:41:07.444+0000 INFO [c.g.r.b.RuntimeKernelExtension] Bootstrapping module with order 2, ID NLP, using com.graphaware.nlp.module.NLPBootstrapper
2017-11-07 10:41:07.523+0000 INFO Registering module NLP with GraphAware Runtime.
2017-11-07 10:41:07.523+0000 INFO [c.g.r.b.RuntimeKernelExtension] GraphAware Runtime bootstrapped, starting the Runtime...
2017-11-07 10:41:21.893+0000 INFO Starting GraphAware...
2017-11-07 10:41:21.894+0000 INFO Loading module metadata...
2017-11-07 10:41:21.894+0000 INFO Loading metadata for module NLP
2017-11-07 10:41:21.946+0000 INFO Module NLP seems to have been registered for the first time.
2017-11-07 10:41:21.947+0000 INFO Module NLP seems to have been registered for the first time, will try to initialize...
2017-11-07 10:41:21.947+0000 INFO InitializeUntil set to 9223372036854775807 and it is 1510051281947. Will initialize.
2017-11-07 10:41:24.709+0000 INFO Started.
2017-11-07 10:41:24.811+0000 INFO Mounted REST API at: /db/manage
2017-11-07 10:41:24.823+0000 INFO [c.g.s.f.b.GraphAwareServerBootstrapper] started
2017-11-07 10:41:24.825+0000 INFO Mounted unmanaged extension [com.graphaware.server] at [/graphaware]
Exception in thread "GraphAware Starter" java.lang.RuntimeException: Error while initializing model of class: class opennlp.tools.namefind.TokenNameFinderModel
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline.loadModel(OpenNLPPipeline.java:503)
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline.lambda$loadNamedEntitiesFinders$2(OpenNLPPipeline.java:161)
at java.util.HashMap$EntrySpliterator.forEachRemaining(HashMap.java:1691)
at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline.loadNamedEntitiesFinders(OpenNLPPipeline.java:158)
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline.init(OpenNLPPipeline.java:118)
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline.<init>(OpenNLPPipeline.java:108)
at com.graphaware.nlp.processor.opennlp.PipelineBuilder.build(PipelineBuilder.java:79)
at com.graphaware.nlp.processor.opennlp.OpenNLPTextProcessor.createPhrasePipeline(OpenNLPTextProcessor.java:106)
at com.graphaware.nlp.processor.opennlp.OpenNLPTextProcessor.init(OpenNLPTextProcessor.java:56)
at com.graphaware.nlp.processor.TextProcessorsManager.lambda$initiateTextProcessors$0(TextProcessorsManager.java:61)
at java.util.HashMap$Values.forEach(HashMap.java:980)
at com.graphaware.nlp.processor.TextProcessorsManager.initiateTextProcessors(TextProcessorsManager.java:60)
at com.graphaware.nlp.processor.TextProcessorsManager.<init>(TextProcessorsManager.java:37)
at com.graphaware.nlp.NLPManager.init(NLPManager.java:95)
at com.graphaware.nlp.module.NLPModule.initialize(NLPModule.java:52)
at com.graphaware.runtime.manager.ProductionTxDrivenModuleManager.initialize(ProductionTxDrivenModuleManager.java:57)
at com.graphaware.runtime.manager.BaseTxDrivenModuleManager.initializeIfAllowed(BaseTxDrivenModuleManager.java:128)
at com.graphaware.runtime.manager.BaseTxDrivenModuleManager.handleNoMetadata(BaseTxDrivenModuleManager.java:72)
at com.graphaware.runtime.manager.BaseTxDrivenModuleManager.handleNoMetadata(BaseTxDrivenModuleManager.java:39)
at com.graphaware.runtime.manager.BaseModuleManager.loadMetadata(BaseModuleManager.java:143)
at com.graphaware.runtime.manager.BaseModuleManager.loadMetadata(BaseModuleManager.java:125)
at com.graphaware.runtime.TxDrivenRuntime.loadMetadata(TxDrivenRuntime.java:130)
at com.graphaware.runtime.ProductionRuntime.loadMetadata(ProductionRuntime.java:80)
at com.graphaware.runtime.BaseGraphAwareRuntime.startModules(BaseGraphAwareRuntime.java:154)
at com.graphaware.runtime.TxDrivenRuntime.startModules(TxDrivenRuntime.java:146)
at com.graphaware.runtime.ProductionRuntime.startModules(ProductionRuntime.java:70)
at com.graphaware.runtime.BaseGraphAwareRuntime.start(BaseGraphAwareRuntime.java:134)
at com.graphaware.runtime.bootstrap.RuntimeKernelExtension.lambda$start$8(RuntimeKernelExtension.java:117)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedConstructorAccessor29.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline.loadModel(OpenNLPPipeline.java:499)
... 29 more
Caused by: java.lang.OutOfMemoryError: Java heap space
at opennlp.tools.ml.model.AbstractModelReader.getParameters(AbstractModelReader.java:140)
at opennlp.tools.ml.maxent.io.GISModelReader.constructModel(GISModelReader.java:78)
at opennlp.tools.ml.model.GenericModelReader.constructModel(GenericModelReader.java:62)
at opennlp.tools.ml.model.AbstractModelReader.getModel(AbstractModelReader.java:85)
at opennlp.tools.util.model.GenericModelSerializer.create(GenericModelSerializer.java:32)
at opennlp.tools.util.model.GenericModelSerializer.create(GenericModelSerializer.java:29)
at opennlp.tools.util.model.BaseModel.finishLoadingArtifacts(BaseModel.java:309)
at opennlp.tools.util.model.BaseModel.loadModel(BaseModel.java:239)
at opennlp.tools.util.model.BaseModel.<init>(BaseModel.java:173)
at opennlp.tools.namefind.TokenNameFinderModel.<init>(TokenNameFinderModel.java:103)
at sun.reflect.GeneratedConstructorAccessor29.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline.loadModel(OpenNLPPipeline.java:499)
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline.lambda$loadNamedEntitiesFinders$2(OpenNLPPipeline.java:161)
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline$$Lambda$239/1188677545.accept(Unknown Source)
at java.util.HashMap$EntrySpliterator.forEachRemaining(HashMap.java:1691)
at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline.loadNamedEntitiesFinders(OpenNLPPipeline.java:158)
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline.init(OpenNLPPipeline.java:118)
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline.<init>(OpenNLPPipeline.java:108)
at com.graphaware.nlp.processor.opennlp.PipelineBuilder.build(PipelineBuilder.java:79)
at com.graphaware.nlp.processor.opennlp.OpenNLPTextProcessor.createPhrasePipeline(OpenNLPTextProcessor.java:106)
at com.graphaware.nlp.processor.opennlp.OpenNLPTextProcessor.init(OpenNLPTextProcessor.java:56)
at com.graphaware.nlp.processor.TextProcessorsManager.lambda$initiateTextProcessors$0(TextProcessorsManager.java:61)
at com.graphaware.nlp.processor.TextProcessorsManager$$Lambda$234/2094381213.accept(Unknown Source)
at java.util.HashMap$Values.forEach(HashMap.java:980)
at com.graphaware.nlp.processor.TextProcessorsManager.initiateTextProcessors(TextProcessorsManager.java:60)
at com.graphaware.nlp.processor.TextProcessorsManager.<init>(TextProcessorsManager.java:37)
at com.graphaware.nlp.NLPManager.init(NLPManager.java:95)
at com.graphaware.nlp.module.NLPModule.initialize(NLPModule.java:52)
at com.graphaware.runtime.manager.ProductionTxDrivenModuleManager.initialize(ProductionTxDrivenModuleManager.java:57)
Please help me out.

You do not have sufficient memory for the NLP plugins to load (note the java.lang.OutOfMemoryError: Java heap space in the log), so the NLP module is not registered and therefore not available once the database has started.
As stated in the NLP plugin README, you need at least 4GB of heap for the modules to run; adjust the heap size in your neo4j.conf and restart.
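As a rough sketch of what that change might look like: in Neo4j 3.x the heap is controlled by dbms.memory.heap.initial_size and dbms.memory.heap.max_size in neo4j.conf. The 4096m value below simply mirrors the 4GB minimum mentioned above and assumes the machine has that much RAM to spare; check the configuration reference for your exact version for the accepted unit syntax.

```properties
# neo4j.conf: heap sizing (values are an assumption based on the 4GB minimum above)
dbms.memory.heap.initial_size=4096m
dbms.memory.heap.max_size=4096m
```

Restart Neo4j afterwards so the new heap size takes effect.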

spark-submit runs successfully, but the same job fails to connect to Hive when submitted through Oozie

I am using CDH 5.9.0, Spark 1.6 and Scala 2.10.0. I have written a Scala Spark program that creates a table and loads data from a file into Hive. When I run it using spark-submit, it completes successfully. But when the same program is submitted through Oozie, it throws the exception below.
Log Type: stdout
Log Upload Time: Fri Oct 27 10:08:28 -0400 2017
Log Length: 172584
2017-10-27 10:08:20,652 INFO [main] yarn.ApplicationMaster (SignalLogger.scala:register(47)) - Registered signal handlers for [TERM, HUP, INT]
2017-10-27 10:08:21,306 INFO [main] yarn.ApplicationMaster (Logging.scala:logInfo(58)) - ApplicationAttemptId: appattempt_1507999204018_0292_000001
2017-10-27 10:08:21,952 INFO [main] spark.SecurityManager (Logging.scala:logInfo(58)) - Changing view acls to: username
2017-10-27 10:08:21,953 INFO [main] spark.SecurityManager (Logging.scala:logInfo(58)) - Changing modify acls to: username
2017-10-27 10:08:21,956 INFO [main] spark.SecurityManager (Logging.scala:logInfo(58)) - SecurityManager: authentication enabled; ui acls disabled; users with view permissions: Set(username); users with modify permissions: Set(username)
2017-10-27 10:08:21,970 INFO [main] yarn.ApplicationMaster (Logging.scala:logInfo(58)) - Starting the user application in a separate Thread
2017-10-27 10:08:21,997 INFO [main] yarn.ApplicationMaster (Logging.scala:logInfo(58)) - Waiting for spark context initialization
2017-10-27 10:08:21,998 INFO [main] yarn.ApplicationMaster (Logging.scala:logInfo(58)) - Waiting for spark context initialization ...
2017-10-27 10:08:22,308 WARN [Driver] security.UserGroupInformation (UserGroupInformation.java:doAs(1701)) - PriviledgedActionException as:username (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2017-10-27 10:08:22,309 WARN [Driver] ipc.Client (Client.java:run(682)) - Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2017-10-27 10:08:22,310 WARN [Driver] security.UserGroupInformation (UserGroupInformation.java:doAs(1701)) - PriviledgedActionException as:username (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2017-10-27 10:08:22,391 INFO [Driver] spark.SparkContext (Logging.scala:logInfo(58)) - Running Spark version 1.6.0
2017-10-27 10:08:22,417 INFO [Driver] spark.SecurityManager (Logging.scala:logInfo(58)) - Changing view acls to: username
2017-10-27 10:08:22,418 INFO [Driver] spark.SecurityManager (Logging.scala:logInfo(58)) - Changing modify acls to: username
2017-10-27 10:08:22,418 INFO [Driver] spark.SecurityManager (Logging.scala:logInfo(58)) - SecurityManager: authentication enabled; ui acls disabled; users with view permissions: Set(username); users with modify permissions: Set(username)
2017-10-27 10:08:22,572 INFO [Driver] util.Utils (Logging.scala:logInfo(58)) - Successfully started service 'sparkDriver' on port 44049.
2017-10-27 10:08:22,901 INFO [sparkDriverActorSystem-akka.actor.default-dispatcher-4] slf4j.Slf4jLogger (Slf4jLogger.scala:applyOrElse(80)) - Slf4jLogger started
2017-10-27 10:08:22,936 INFO [sparkDriverActorSystem-akka.actor.default-dispatcher-4] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Starting remoting
2017-10-27 10:08:23,062 INFO [sparkDriverActorSystem-akka.actor.default-dispatcher-4] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@a.b.c.d:38305]
2017-10-27 10:08:23,064 INFO [sparkDriverActorSystem-akka.actor.default-dispatcher-4] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Remoting now listens on addresses: [akka.tcp://sparkDriverActorSystem@a.b.c.d:38305]
2017-10-27 10:08:23,174 INFO [Driver] util.Utils (Logging.scala:logInfo(58)) - Successfully started service 'sparkDriverActorSystem' on port 38305.
2017-10-27 10:08:23,195 INFO [Driver] spark.SparkEnv (Logging.scala:logInfo(58)) - Registering MapOutputTracker
2017-10-27 10:08:23,207 INFO [Driver] spark.SparkEnv (Logging.scala:logInfo(58)) - Registering BlockManagerMaster
2017-10-27 10:08:23,216 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/01/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-ba42749b-3498-4c1d-ba8b-dc6720e815a0
2017-10-27 10:08:23,217 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/02/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-d9375d30-699d-4e40-8b42-559f79f27f85
2017-10-27 10:08:23,217 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/03/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-fc2caf3b-3fa0-4f1e-be01-b33b6f6d52d5
2017-10-27 10:08:23,217 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/04/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-450319a4-2d4f-4159-a633-3dd2a71bafe1
2017-10-27 10:08:23,217 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/05/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-c3dbf9b3-cb95-4104-b4bf-9e7b1987e210
2017-10-27 10:08:23,217 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/06/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-5d9c58a6-29bb-4e8e-a8fb-3720db0004d4
2017-10-27 10:08:23,218 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/07/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-999eecaf-f183-4ede-8845-eeb57a87276b
2017-10-27 10:08:23,218 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/08/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-216d2449-14b1-45aa-b6c6-d6271815f485
2017-10-27 10:08:23,221 INFO [Driver] storage.MemoryStore (Logging.scala:logInfo(58)) - MemoryStore started with capacity 491.7 MB
2017-10-27 10:08:23,283 INFO [Driver] spark.SparkEnv (Logging.scala:logInfo(58)) - Registering OutputCommitCoordinator
2017-10-27 10:08:23,394 INFO [Driver] ui.JettyUtils (Logging.scala:logInfo(58)) - Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2017-10-27 10:08:23,413 INFO [Driver] server.Server (Server.java:doStart(272)) - jetty-8.y.z-SNAPSHOT
2017-10-27 10:08:23,448 INFO [Driver] server.AbstractConnector (AbstractConnector.java:doStart(338)) - Started SelectChannelConnector@0.0.0.0:36123
2017-10-27 10:08:23,448 INFO [Driver] util.Utils (Logging.scala:logInfo(58)) - Successfully started service 'SparkUI' on port 36123.
2017-10-27 10:08:23,449 INFO [Driver] ui.SparkUI (Logging.scala:logInfo(58)) - Started SparkUI at http://a.b.c.d:36123
2017-10-27 10:08:23,498 INFO [Driver] cluster.YarnClusterScheduler (Logging.scala:logInfo(58)) - Created YarnClusterScheduler
2017-10-27 10:08:23,524 INFO [Driver] util.Utils (Logging.scala:logInfo(58)) - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 44418.
2017-10-27 10:08:23,525 INFO [Driver] netty.NettyBlockTransferService (Logging.scala:logInfo(58)) - Server created on 44418
2017-10-27 10:08:23,527 INFO [Driver] storage.BlockManager (Logging.scala:logInfo(58)) - external shuffle service port = 7337
2017-10-27 10:08:23,527 INFO [Driver] storage.BlockManagerMaster (Logging.scala:logInfo(58)) - Trying to register BlockManager
2017-10-27 10:08:23,530 INFO [dispatcher-event-loop-11] storage.BlockManagerMasterEndpoint (Logging.scala:logInfo(58)) - Registering block manager a.b.c.d:44418 with 491.7 MB RAM, BlockManagerId(driver, a.b.c.d, 44418)
2017-10-27 10:08:23,533 INFO [Driver] storage.BlockManagerMaster (Logging.scala:logInfo(58)) - Registered BlockManager
2017-10-27 10:08:24,106 INFO [Driver] scheduler.EventLoggingListener (Logging.scala:logInfo(58)) - Logging events to hdfs://.../user/spark/applicationHistory/application_1507999204018_0292_1
2017-10-27 10:08:24,133 INFO [Driver] cluster.YarnClusterSchedulerBackend (Logging.scala:logInfo(58)) - SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
2017-10-27 10:08:24,133 INFO [Driver] cluster.YarnClusterScheduler (Logging.scala:logInfo(58)) - YarnClusterScheduler.postStartHook done
2017-10-27 10:08:24,140 INFO [dispatcher-event-loop-13] cluster.YarnSchedulerBackend$YarnSchedulerEndpoint (Logging.scala:logInfo(58)) - ApplicationMaster registered as NettyRpcEndpointRef(spark://YarnAM@a.b.c.d:44049)
2017-10-27 10:08:24,191 INFO [main] yarn.YarnRMClient (Logging.scala:logInfo(58)) - Registering the ApplicationMaster
2017-10-27 10:08:24,295 INFO [main] yarn.ApplicationMaster (Logging.scala:logInfo(58)) - Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals
2017-10-27 10:08:25,107 INFO [Driver] hive.HiveContext (Logging.scala:logInfo(58)) - Initializing execution hive, version 1.1.0
2017-10-27 10:08:25,146 INFO [Driver] client.ClientWrapper (Logging.scala:logInfo(58)) - Inspected Hadoop version: 2.6.0-cdh5.9.0
2017-10-27 10:08:25,147 INFO [Driver] client.ClientWrapper (Logging.scala:logInfo(58)) - Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0-cdh5.9.0
2017-10-27 10:08:25,582 INFO [Driver] metastore.HiveMetaStore (HiveMetaStore.java:newRawStore(644)) - 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
2017-10-27 10:08:25,600 INFO [Driver] metastore.ObjectStore (ObjectStore.java:initialize(333)) - ObjectStore, initialize called
2017-10-27 10:08:25,671 WARN [Driver] DataNucleus.General (Log4JLogger.java:warn(96)) - Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/data/03/yarn/nm/usercache/username/appcache/application_1507999204018_0292/container_e69_1507999204018_0292_01_000001/datanucleus-core-3.2.2.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/data/05/yarn/nm/filecache/507/datanucleus-core-3.2.2.jar."
2017-10-27 10:08:25,687 WARN [Driver] DataNucleus.General (Log4JLogger.java:warn(96)) - Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/data/03/yarn/nm/usercache/username/appcache/application_1507999204018_0292/container_e69_1507999204018_0292_01_000001/datanucleus-api-jdo-3.2.1.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/data/07/yarn/nm/filecache/582/datanucleus-api-jdo-3.2.1.jar."
2017-10-27 10:08:25,688 WARN [Driver] DataNucleus.General (Log4JLogger.java:warn(96)) - Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/data/08/yarn/nm/filecache/554/datanucleus-rdbms-3.2.1.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/data/03/yarn/nm/usercache/username/appcache/application_1507999204018_0292/container_e69_1507999204018_0292_01_000001/datanucleus-rdbms-3.2.1.jar."
2017-10-27 10:08:25,709 INFO [Driver] DataNucleus.Persistence (Log4JLogger.java:info(77)) - Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
2017-10-27 10:08:25,710 INFO [Driver] DataNucleus.Persistence (Log4JLogger.java:info(77)) - Property datanucleus.cache.level2 unknown - will be ignored
2017-10-27 10:08:26,178 WARN [Driver] bonecp.BoneCPConfig (BoneCPConfig.java:sanitize(1537)) - Max Connections < 1. Setting to 20
2017-10-27 10:08:26,180 ERROR [Driver] Datastore.Schema (Log4JLogger.java:error(125)) - Failed initialising database.
Unable to open a test connection to the given database. JDBC url = jdbc:derby:;databaseName=/data/03/yarn/nm/usercache/username/appcache/application_1507999204018_0292/container_e69_1507999204018_0292_01_000001/tmp/spark-633fb1f8-1f38-44ac-a54e-81465354bedc/metastore;create=true, username = APP. Terminating connection pool. Original Exception: ------
java.sql.SQLException: No suitable driver found for jdbc:derby:;databaseName=/data/03/yarn/nm/usercache/username/appcache/application_1507999204018_0292/container_e69_1507999204018_0292_01_000001/tmp/spark-633fb1f8-1f38-44ac-a54e-81465354bedc/metastore;create=true
at java.sql.DriverManager.getConnection(DriverManager.java:689)
at java.sql.DriverManager.getConnection(DriverManager.java:208)
at com.jolbox.bonecp.BoneCP.obtainRawInternalConnection(BoneCP.java:254)
at com.jolbox.bonecp.BoneCP.<init>(BoneCP.java:305)
at com.jolbox.bonecp.BoneCPDataSource.maybeInit(BoneCPDataSource.java:150)
at com.jolbox.bonecp.BoneCPDataSource.getConnection(BoneCPDataSource.java:112)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:479)
at org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:304)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
at org.datanucleus.NucleusContext.createStoreManagerForProperties(NucleusContext.java:1069)
at org.datanucleus.NucleusContext.initialise(NucleusContext.java:359)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:768)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:326)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:195)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at javax.jdo.JDOHelper$16.run(JDOHelper.java:1965)
at java.security.AccessController.doPrivileged(Native Method)
at javax.jdo.JDOHelper.invoke(JDOHelper.java:1960)
at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:411)
at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:440)
at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:335)
at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:291)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:57)
at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:66)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:648)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:626)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:675)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:484)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:78)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:84)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5999)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:203)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1528)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:67)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:82)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3037)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3056)
at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3281)
at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:217)
at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:201)
at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:324)
at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:285)
at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:260)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:514)
at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:194)
at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:238)
at org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveContext.scala:220)
at org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala:210)
at org.apache.spark.sql.hive.HiveContext.functionRegistry$lzycompute(HiveContext.scala:464)
at org.apache.spark.sql.hive.HiveContext.functionRegistry(HiveContext.scala:463)
at org.apache.spark.sql.UDFRegistration.<init>(UDFRegistration.scala:40)
at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:330)
at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:90)
at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:101)
at prfrx.externaltableerror$.main(externaltableerror.scala:28)
at prfrx.externaltableerror.main(externaltableerror.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
at com.jolbox.bonecp.BoneCP.<init>(BoneCP.java:312)
at com.jolbox.bonecp.BoneCPDataSource.maybeInit(BoneCPDataSource.java:150)
at com.jolbox.bonecp.BoneCPDataSource.getConnection(BoneCPDataSource.java:112)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:479)
at org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:304)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
at org.datanucleus.NucleusContext.createStoreManagerForProperties(NucleusContext.java:1069)
at org.datanucleus.NucleusContext.initialise(NucleusContext.java:359)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:768)
... 62 more
Caused by: java.sql.SQLException: No suitable driver found for jdbc:derby:;databaseName=/data/03/yarn/nm/usercache/username/appcache/application_1507999204018_0292/container_e69_1507999204018_0292_01_000001/tmp/spark-633fb1f8-1f38-44ac-a54e-81465354bedc/metastore;create=true
at java.sql.DriverManager.getConnection(DriverManager.java:689)
at java.sql.DriverManager.getConnection(DriverManager.java:208)
at com.jolbox.bonecp.BoneCP.obtainRawInternalConnection(BoneCP.java:254)
at com.jolbox.bonecp.BoneCP.<init>(BoneCP.java:305)
Below is the code I am using.
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.{SparkConf, SparkContext}

object externaltableerror {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration()
    conf.set("fs.defaultFS", "hdfs://...")
    conf.addResource("hdfs://.../core-site.xml")
    conf.addResource("hdfs://.../hdfs-site.xml")
    conf.addResource("hdfs://.../hive-site.xml")
    val fs = FileSystem.get(conf)
    // Any exception thrown below is written to this HDFS file
    val os = fs.create(new Path("/.../Error.txt"))
    try {
      //System.setProperty("hive.metastore.uris", "thrift://...");
      val sc = new SparkContext(new SparkConf().setAppName("withhive"))
      val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
      // Read the header line and turn each tab-separated field into a "<name> string" column
      val files = sc.textFile("hdfs://.../Example.txt").first()
      val rdd = sc.parallelize(List(files))
      val fm = rdd.flatMap(line => line.split("\t")).map(x => x.concat(" string"))
      val alternative = fm.reduce((s1, s2) => s1 + "," + s2)
      val ddl = "Create external table table_name(" + alternative + ") ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION 'hdfs://.../' tblproperties (\"skip.header.line.count\"=\"1\")"
      hiveContext.sql(ddl)
      sc.stop()
    } catch {
      // case e : Exception => new PrintWriter("hdfs://.../Error.txt") { write(e.getStackTrace.mkString("\n")); close }
      // println("H" + e.getStackTrace)
      case e: Exception => os.write(e.getStackTrace.mkString("\n").getBytes)
    } finally {
      // Close the stream so the error output is flushed to HDFS
      os.close()
    }
  }
}
Any suggestions on how to get the job running with oozie will be of great help. Thanks!
I had the same issue. I fixed it by passing the parameter --files /etc/hive/conf/hive-site.xml to my spark-submit job. (I tried it first in the shell and then in Oozie, where I launch a .sh file that contains the spark-submit command.)
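A minimal sketch of what such a launcher script might look like. It assumes YARN cluster mode (the stack trace shows main being invoked from the YARN ApplicationMaster), and the jar name and the DRY_RUN switch are hypothetical. Only the --files path comes from the answer above; the class name is taken from the stack trace. Shipping hive-site.xml this way presumably lets the driver find the real metastore settings instead of falling back to an embedded Derby metastore.

```shell
#!/bin/sh
# Path from the answer above.
HIVE_SITE="/etc/hive/conf/hive-site.xml"
# Hypothetical jar name; replace with the real application jar.
APP_JAR="externaltableerror.jar"

# Print the command when DRY_RUN=1 (the default in this sketch); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Submit the job and ship hive-site.xml to the YARN containers with --files.
submit_with_hive_site() {
  run spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --files "$HIVE_SITE" \
    --class prfrx.externaltableerror \
    "$APP_JAR"
}

submit_with_hive_site
```

With DRY_RUN left at its default the script only prints the command, which makes it easy to check what the Oozie shell action would run before setting DRY_RUN=0.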

GraphML import into Titan

I'm new to the Titan world. I would like to import data stored in a GraphML file into a database.
1. I downloaded titan-1.0.0-hadoop1
2. I ran ./titan.sh
3. I ran ./gremlin.sh
4. In the Gremlin console I wrote:
:remote connect tinkerpop.server ../conf/remote.yaml
5. Next, I wrote:
graph.io(IoCore.graphml()).readGraph("/tmp/file.graphml")
6. I got the message:
No such property: graph for class: groovysh_evaluate
Could you help me?
IMO the most interesting logs from gremlin-server.log:
84 [main] INFO org.apache.tinkerpop.gremlin.server.GremlinServer - Configuring Gremlin Server from conf/gremlin-server/gremlin-server.yaml
158 [main] INFO org.apache.tinkerpop.gremlin.server.util.MetricManager - Configured Metrics ConsoleReporter configured with report interval=180000ms
160 [main] INFO org.apache.tinkerpop.gremlin.server.util.MetricManager - Configured Metrics CsvReporter configured with report interval=180000ms to fileName=/tmp/gremlin-server-metrics.csv
196 [main] INFO org.apache.tinkerpop.gremlin.server.util.MetricManager - Configured Metrics JmxReporter configured with domain= and agentId=
197 [main] INFO org.apache.tinkerpop.gremlin.server.util.MetricManager - Configured Metrics Slf4jReporter configured with interval=180000ms and loggerName=org.apache.tinkerpop.gremlin.server.Settings$Slf4jReporterMetrics
1111 [main] WARN org.apache.tinkerpop.gremlin.server.GremlinServer - Graph [graph] configured at [conf/gremlin-server/titan-berkeleyje-server.properties] could not be instantiated and will not be available in Gremlin Server. GraphFactory message: GraphFactory could not instantiate this Graph implementation [class com.thinkaurelius.titan.core.TitanFactory]
java.lang.RuntimeException: GraphFactory could not instantiate this Graph implementation [class com.thinkaurelius.titan.core.TitanFactory]
...
1113 [main] INFO org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor - Initialized Gremlin thread pool. Threads in pool named with pattern gremlin-*
1499 [main] INFO org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines - Loaded nashorn ScriptEngine
2044 [main] INFO org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines - Loaded gremlin-groovy ScriptEngine
2488 [main] WARN org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor - Could not initialize gremlin-groovy ScriptEngine with scripts/empty-sample.groovy as script could not be evaluated - javax.script.ScriptException: groovy.lang.MissingPropertyException: No such property: graph for class: Script1
2488 [main] INFO org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor - Initialized GremlinExecutor and configured ScriptEngines.
2581 [main] WARN org.apache.tinkerpop.gremlin.server.AbstractChannelizer - Could not instantiate configured serializer class - org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0 - it will not be available. There is no graph named [graph] configured to be used in the useMapperFromGraph setting
2582 [main] INFO org.apache.tinkerpop.gremlin.server.AbstractChannelizer - Configured application/vnd.gremlin-v1.0+gryo-stringd with org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0
2719 [main] WARN org.apache.tinkerpop.gremlin.server.AbstractChannelizer - Could not instantiate configured serializer class - org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0 - it will not be available. There is no graph named [graph] configured to be used in the useMapperFromGraph setting
2720 [main] WARN org.apache.tinkerpop.gremlin.server.AbstractChannelizer - Could not instantiate configured serializer class - org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0 - it will not be available. There is no graph named [graph] configured to be used in the useMapperFromGraph setting
...
You need to create a graph. The graph variable isn't declared anywhere in your script.
In step 5, you need to submit your script command to the remote server. In the Gremlin Console, you do this by starting your command with :submit or :> for shorthand.
This is briefly covered in the Titan Server documentation, but it is easily overlooked:
The :> is the "submit" command which sends the Gremlin on that line to the currently active remote.
:> graph.io(IoCore.graphml()).readGraph("/tmp/file.graphml")
If you don't submit the script to the remote server, the Gremlin Console will attempt to process the script within the console's JVM. graph is not defined locally, and that is why you saw the error in step 6.
Update: Based on your gremlin-server.log it looks like the issue is that the user that starts Titan with ./bin/titan.sh start doesn't have the appropriate file permissions to create the directory (db/berkeley) used by the default graph configuration (titan-berkeleyje-server.properties). Try updating the file permissions on the $TITAN_HOME directory.
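The permission check suggested in the update can be sketched as a small shell function. This is only an illustration: the function name is hypothetical, and it assumes that db/berkeley is resolved relative to $TITAN_HOME, as the update implies. Run it as the same user that starts titan.sh.

```shell
#!/bin/sh
# Return 0 if the current user can write to (or create) db/berkeley under the given Titan home.
can_create_db_dir() {
  titan_home="$1"
  db_dir="$titan_home/db/berkeley"
  if [ -d "$db_dir" ]; then
    # The directory already exists: it must be writable.
    [ -w "$db_dir" ]
  else
    # The directory does not exist yet: its closest existing parent must be writable.
    parent="$titan_home/db"
    [ -d "$parent" ] || parent="$titan_home"
    [ -w "$parent" ]
  fi
}

TITAN_HOME="${TITAN_HOME:-.}"
if can_create_db_dir "$TITAN_HOME"; then
  echo "OK: $TITAN_HOME/db/berkeley is writable or can be created"
else
  echo "Not writable: fix ownership or permissions on $TITAN_HOME"
fi
```

If the check fails, changing the owner of $TITAN_HOME to the user that runs Titan, or adding write permission for that user, should let the server create the directory on the next start.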

Spark - Actor not found for: ActorSelection

I just cloned the master repository of Spark from GitHub. I am running it on OSX 10.9, Spark 1.4.1 and Scala 2.10.4
I tried to run the SparkPi example program using IntelliJ IDEA but got the error: akka.actor.ActorNotFound: Actor not found for: ActorSelection[Anchor(akka.tcp://sparkMaster@myhost:7077/)
I did check out a similar post on the mailing list but found no solution.
Find the complete stack trace below. Any help would be really appreciated.
2015-07-28 22:16:45,888 INFO [main] spark.SparkContext (Logging.scala:logInfo(59)) - Running Spark version 1.5.0-SNAPSHOT
2015-07-28 22:16:47,125 WARN [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-07-28 22:16:47,753 INFO [main] spark.SecurityManager (Logging.scala:logInfo(59)) - Changing view acls to: mac
2015-07-28 22:16:47,755 INFO [main] spark.SecurityManager (Logging.scala:logInfo(59)) - Changing modify acls to: mac
2015-07-28 22:16:47,756 INFO [main] spark.SecurityManager (Logging.scala:logInfo(59)) - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(mac); users with modify permissions: Set(mac)
2015-07-28 22:16:49,454 INFO [sparkDriver-akka.actor.default-dispatcher-2] slf4j.Slf4jLogger (Slf4jLogger.scala:applyOrElse(80)) - Slf4jLogger started
2015-07-28 22:16:49,695 INFO [sparkDriver-akka.actor.default-dispatcher-2] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Starting remoting
2015-07-28 22:16:50,167 INFO [sparkDriver-akka.actor.default-dispatcher-2] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.2.105:49981]
2015-07-28 22:16:50,215 INFO [main] util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'sparkDriver' on port 49981.
2015-07-28 22:16:50,372 INFO [main] spark.SparkEnv (Logging.scala:logInfo(59)) - Registering MapOutputTracker
2015-07-28 22:16:50,596 INFO [main] spark.SparkEnv (Logging.scala:logInfo(59)) - Registering BlockManagerMaster
2015-07-28 22:16:50,948 INFO [main] storage.DiskBlockManager (Logging.scala:logInfo(59)) - Created local directory at /private/var/folders/8k/jfw576r50m97rlk5qpj1n4l80000gn/T/blockmgr-309db4d1-d129-43e5-a90e-12cf51ad491f
2015-07-28 22:16:51,198 INFO [main] storage.MemoryStore (Logging.scala:logInfo(59)) - MemoryStore started with capacity 491.7 MB
2015-07-28 22:16:51,707 INFO [main] spark.HttpFileServer (Logging.scala:logInfo(59)) - HTTP File server directory is /private/var/folders/8k/jfw576r50m97rlk5qpj1n4l80000gn/T/spark-f28e24e7-b798-4365-8209-409d8b27ad2f/httpd-ce32c41d-b618-49e9-bec1-f409454f3679
2015-07-28 22:16:51,777 INFO [main] spark.HttpServer (Logging.scala:logInfo(59)) - Starting HTTP Server
2015-07-28 22:16:52,091 INFO [main] server.Server (Server.java:doStart(272)) - jetty-8.1.14.v20131031
2015-07-28 22:16:52,116 INFO [main] server.AbstractConnector (AbstractConnector.java:doStart(338)) - Started SocketConnector@0.0.0.0:49982
2015-07-28 22:16:52,116 INFO [main] util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'HTTP file server' on port 49982.
2015-07-28 22:16:52,249 INFO [main] spark.SparkEnv (Logging.scala:logInfo(59)) - Registering OutputCommitCoordinator
2015-07-28 22:16:54,253 INFO [main] server.Server (Server.java:doStart(272)) - jetty-8.1.14.v20131031
2015-07-28 22:16:54,315 INFO [main] server.AbstractConnector (AbstractConnector.java:doStart(338)) - Started SelectChannelConnector@0.0.0.0:4040
2015-07-28 22:16:54,317 INFO [main] util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'SparkUI' on port 4040.
2015-07-28 22:16:54,386 INFO [main] ui.SparkUI (Logging.scala:logInfo(59)) - Started SparkUI at http://192.168.2.105:4040
2015-07-28 22:16:54,924 WARN [main] metrics.MetricsSystem (Logging.scala:logWarning(71)) - Using default name DAGScheduler for source because spark.app.id is not set.
2015-07-28 22:16:55,132 INFO [appclient-register-master-threadpool-0] client.AppClient$ClientEndpoint (Logging.scala:logInfo(59)) - Connecting to master spark://myhost:7077...
2015-07-28 22:16:55,392 WARN [sparkDriver-akka.actor.default-dispatcher-14] client.AppClient$ClientEndpoint (Logging.scala:logWarning(71)) - Could not connect to myhost:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@myhost:7077]
2015-07-28 22:16:55,412 WARN [sparkDriver-akka.actor.default-dispatcher-14] remote.ReliableDeliverySupervisor (Slf4jLogger.scala:apply$mcV$sp(71)) - Association with remote system [akka.tcp://sparkMaster@myhost:7077] has failed, address is now gated for [5000] ms. Reason: [Association failed with [akka.tcp://sparkMaster@myhost:7077]] Caused by: [myhost: unknown error]
2015-07-28 22:16:55,447 WARN [appclient-register-master-threadpool-0] client.AppClient$ClientEndpoint (Logging.scala:logWarning(92)) - Failed to connect to master myhost:7077
akka.actor.ActorNotFound: Actor not found for: ActorSelection[Anchor(akka.tcp://sparkMaster@myhost:7077/), Path(/user/Master)]
at akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:65)
at akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:63)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:73)
at akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.unbatchedExecute(Future.scala:74)
at akka.dispatch.BatchingExecutor$class.execute(BatchingExecutor.scala:120)
at akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.execute(Future.scala:73)
at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:40)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:248)
at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:266)
at akka.actor.EmptyLocalActorRef.specialHandle(ActorRef.scala:533)
at akka.actor.DeadLetterActorRef.specialHandle(ActorRef.scala:569)
at akka.actor.DeadLetterActorRef.$bang(ActorRef.scala:559)
at akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef.$bang(RemoteActorRefProvider.scala:87)
at akka.remote.EndpointWriter.postStop(Endpoint.scala:557)
at akka.actor.Actor$class.aroundPostStop(Actor.scala:477)
at akka.remote.EndpointActor.aroundPostStop(Endpoint.scala:411)
at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
at akka.actor.ActorCell.terminate(ActorCell.scala:369)
at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462)
at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:263)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
2015-07-28 22:17:15,459 INFO [appclient-register-master-threadpool-0] client.AppClient$ClientEndpoint (Logging.scala:logInfo(59)) - Connecting to master spark://myhost:7077...
2015-07-28 22:17:15,463 WARN [sparkDriver-akka.actor.default-dispatcher-14] client.AppClient$ClientEndpoint (Logging.scala:logWarning(71)) - Could not connect to myhost:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@myhost:7077]
2015-07-28 22:17:15,464 WARN [sparkDriver-akka.actor.default-dispatcher-2] remote.ReliableDeliverySupervisor (Slf4jLogger.scala:apply$mcV$sp(71)) - Association with remote system [akka.tcp://sparkMaster@myhost:7077] has failed, address is now gated for [5000] ms. Reason: [Association failed with [akka.tcp://sparkMaster@myhost:7077]] Caused by: [myhost: unknown error]
2015-07-28 22:17:15,464 WARN [appclient-register-master-threadpool-0] client.AppClient$ClientEndpoint (Logging.scala:logWarning(92)) - Failed to connect to master myhost:7077
akka.actor.ActorNotFound: Actor not found for: ActorSelection[Anchor(akka.tcp://sparkMaster@myhost:7077/), Path(/user/Master)]
at akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:65)
at akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:63)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:73)
at akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.unbatchedExecute(Future.scala:74)
at akka.dispatch.BatchingExecutor$class.execute(BatchingExecutor.scala:120)
at akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.execute(Future.scala:73)
at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:40)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:248)
at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:266)
at akka.actor.EmptyLocalActorRef.specialHandle(ActorRef.scala:533)
at akka.actor.DeadLetterActorRef.specialHandle(ActorRef.scala:569)
at akka.actor.DeadLetterActorRef.$bang(ActorRef.scala:559)
at akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef.$bang(RemoteActorRefProvider.scala:87)
at akka.remote.EndpointWriter.postStop(Endpoint.scala:557)
at akka.actor.Actor$class.aroundPostStop(Actor.scala:477)
at akka.remote.EndpointActor.aroundPostStop(Endpoint.scala:411)
at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
at akka.actor.ActorCell.terminate(ActorCell.scala:369)
at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462)
at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:263)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
2015-07-28 22:17:35,136 INFO [appclient-register-master-threadpool-0] client.AppClient$ClientEndpoint (Logging.scala:logInfo(59)) - Connecting to master spark://myhost:7077...
2015-07-28 22:17:35,141 WARN [sparkDriver-akka.actor.default-dispatcher-13] client.AppClient$ClientEndpoint (Logging.scala:logWarning(71)) - Could not connect to myhost:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@myhost:7077]
2015-07-28 22:17:35,142 WARN [sparkDriver-akka.actor.default-dispatcher-13] remote.ReliableDeliverySupervisor (Slf4jLogger.scala:apply$mcV$sp(71)) - Association with remote system [akka.tcp://sparkMaster@myhost:7077] has failed, address is now gated for [5000] ms. Reason: [Association failed with [akka.tcp://sparkMaster@myhost:7077]] Caused by: [myhost: unknown error]
2015-07-28 22:17:35,142 WARN [appclient-register-master-threadpool-0] client.AppClient$ClientEndpoint (Logging.scala:logWarning(92)) - Failed to connect to master myhost:7077
akka.actor.ActorNotFound: Actor not found for: ActorSelection[Anchor(akka.tcp://sparkMaster@myhost:7077/), Path(/user/Master)]
at akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:65)
at akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:63)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:73)
at akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.unbatchedExecute(Future.scala:74)
at akka.dispatch.BatchingExecutor$class.execute(BatchingExecutor.scala:120)
at akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.execute(Future.scala:73)
at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:40)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:248)
at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:266)
at akka.actor.EmptyLocalActorRef.specialHandle(ActorRef.scala:533)
at akka.actor.DeadLetterActorRef.specialHandle(ActorRef.scala:569)
at akka.actor.DeadLetterActorRef.$bang(ActorRef.scala:559)
at akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef.$bang(RemoteActorRefProvider.scala:87)
at akka.remote.EndpointWriter.postStop(Endpoint.scala:557)
at akka.actor.Actor$class.aroundPostStop(Actor.scala:477)
at akka.remote.EndpointActor.aroundPostStop(Endpoint.scala:411)
at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
at akka.actor.ActorCell.terminate(ActorCell.scala:369)
at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462)
at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:263)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
2015-07-28 22:17:35,462 INFO [appclient-register-master-threadpool-0] client.AppClient$ClientEndpoint (Logging.scala:logInfo(59)) - Connecting to master spark://myhost:7077...
2015-07-28 22:17:35,464 WARN [appclient-register-master-threadpool-0] client.AppClient$ClientEndpoint (Logging.scala:logWarning(92)) - Failed to connect to master myhost:7077
akka.actor.ActorNotFound: Actor not found for: ActorSelection[Anchor(akka.tcp://sparkMaster@myhost:7077/), Path(/user/Master)]
at akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:65)
at akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:63)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:73)
at akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.unbatchedExecute(Future.scala:74)
at akka.dispatch.BatchingExecutor$class.execute(BatchingExecutor.scala:120)
at akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.execute(Future.scala:73)
at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:40)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:248)
at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:266)
at akka.actor.EmptyLocalActorRef.specialHandle(ActorRef.scala:533)
at akka.actor.DeadLetterActorRef.specialHandle(ActorRef.scala:569)
at akka.actor.DeadLetterActorRef.$bang(ActorRef.scala:559)
at akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef.$bang(RemoteActorRefProvider.scala:87)
at akka.remote.ReliableDeliverySupervisor$$anonfun$gated$1.applyOrElse(Endpoint.scala:335)
at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
at akka.remote.ReliableDeliverySupervisor.aroundReceive(Endpoint.scala:188)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
2015-07-28 22:17:55,135 INFO [appclient-register-master-threadpool-0] client.AppClient$ClientEndpoint (Logging.scala:logInfo(59)) - Connecting to master spark://myhost:7077...
2015-07-28 22:17:55,140 WARN [sparkDriver-akka.actor.default-dispatcher-19] client.AppClient$ClientEndpoint (Logging.scala:logWarning(71)) - Could not connect to myhost:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@myhost:7077]
2015-07-28 22:17:55,140 WARN [sparkDriver-akka.actor.default-dispatcher-3] remote.ReliableDeliverySupervisor (Slf4jLogger.scala:apply$mcV$sp(71)) - Association with remote system [akka.tcp://sparkMaster@myhost:7077] has failed, address is now gated for [5000] ms. Reason: [Association failed with [akka.tcp://sparkMaster@myhost:7077]] Caused by: [myhost: unknown error]
2015-07-28 22:17:55,178 ERROR [appclient-registration-retry-thread] util.SparkUncaughtExceptionHandler (Logging.scala:logError(96)) - Uncaught exception in thread Thread[appclient-registration-retry-thread,5,main]
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@3db0c61c rejected from java.util.concurrent.ThreadPoolExecutor@33773fda[Running, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = 4]
at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1.apply(AppClient.scala:96)
at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1.apply(AppClient.scala:95)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.deploy.client.AppClient$ClientEndpoint.tryRegisterAllMasters(AppClient.scala:95)
at org.apache.spark.deploy.client.AppClient$ClientEndpoint.org$apache$spark$deploy$client$AppClient$ClientEndpoint$$registerWithMaster(AppClient.scala:121)
at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2$$anonfun$run$1.apply$mcV$sp(AppClient.scala:132)
at org.apache.spark.util.Utils$.tryOrExit(Utils.scala:1218)
at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2.run(AppClient.scala:124)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2015-07-28 22:17:55,224 INFO [Thread-0] storage.DiskBlockManager (Logging.scala:logInfo(59)) - Shutdown hook called
2015-07-28 22:17:55,241 INFO [Thread-0] util.Utils (Logging.scala:logInfo(59)) - Shutdown hook called
2015-07-28 22:17:55,243 INFO [Thread-0] util.Utils (Logging.scala:logInfo(59)) - Deleting directory /private/var/folders/8k/jfw576r50m97rlk5qpj1n4l80000gn/T/spark-f28e24e7-b798-4365-8209-409d8b27ad2f/userFiles-5ccb1927-1499-4deb-b4b2-92a24d8ab7a3
The problem was that I was trying to start the example app in standalone cluster mode by passing in
-Dspark.master=spark://myhost:7077
as an argument to the JVM. I launched the example app locally using
-Dspark.master=local
and it worked.
I know this is an old question, but in case it helps users who arrive here after installing the Spark chart on a Kubernetes cluster:
After installing the chart, open the Spark UI on localhost:8080.
Find the Spark master name, for example: Spark Master at spark://newbie-cricket-master:7077
Then, on the master's command line, run: /bin/spark-shell --master spark://newbie-cricket-master:7077

Sqoop installation export and import from postgresql

I've just installed Sqoop and was testing it. I tried to export some data from HDFS to PostgreSQL using Sqoop. When I run it, it throws the following exception: java.io.IOException: Can't export data, please check task tracker logs. I think there may also have been a problem with the installation.
The file content is:
ustNU 45
MB1bA 0
gNbCO 76
iZP10 39
B2aoo 45
SI7eG 93
5sC4k 60
2IhFV 2
u2A48 16
yvy6R 51
LNhsV 26
mZ2yn 65
80Gp3 43
Wk5Ag 85
VUfyp 93
P077j 94
f1Oj5 11
LxJkg 72
0H7NP 99
Dk406 25
g4KRp 76
Fw3U0 80
6LD59 1
07KHx 91
F1S88 72
Bnb0v 85
A2qM7 79
Z6cAt 81
0M3DO 23
m0s09 44
KIvwd 13
GNUD0 78
um93a 20
19bHv 75
4Of3s 75
5hFen 16
This is the Postgres table:
Table "public.mysort"
Column | Type | Modifiers
--------+---------+-----------
name | text |
marks | integer |
The sqoop command is:
sqoop export --connect jdbc:postgresql://localhost/testdb --username akshay --password akshay --table mysort -m 1 --export-dir MySort/input
Followed by the error:
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
14/06/11 18:28:06 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/06/11 18:28:06 INFO manager.SqlManager: Using default fetchSize of 1000
14/06/11 18:28:06 INFO tool.CodeGenTool: Beginning code generation
14/06/11 18:28:06 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM "mysort" AS t LIMIT 1
14/06/11 18:28:06 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/hadoop
Note: /tmp/sqoop-hduser/compile/0402ad4b5cf7980040264af35de406cb/mysort.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/06/11 18:28:07 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hduser/compile/0402ad4b5cf7980040264af35de406cb/mysort.jar
14/06/11 18:28:07 INFO mapreduce.ExportJobBase: Beginning export of mysort
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hbase/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /usr/local/hadoop/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
14/06/11 18:28:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/06/11 18:28:22 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/06/11 18:28:23 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
14/06/11 18:28:23 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
14/06/11 18:28:23 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/06/11 18:28:23 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/06/11 18:28:24 INFO input.FileInputFormat: Total input paths to process : 1
14/06/11 18:28:24 INFO input.FileInputFormat: Total input paths to process : 1
14/06/11 18:28:25 INFO mapreduce.JobSubmitter: number of splits:1
14/06/11 18:28:25 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1402488523460_0003
14/06/11 18:28:25 INFO impl.YarnClientImpl: Submitted application application_1402488523460_0003
14/06/11 18:28:25 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1402488523460_0003/
14/06/11 18:28:25 INFO mapreduce.Job: Running job: job_1402488523460_0003
14/06/11 18:28:46 INFO mapreduce.Job: Job job_1402488523460_0003 running in uber mode : false
14/06/11 18:28:46 INFO mapreduce.Job: map 0% reduce 0%
14/06/11 18:29:04 INFO mapreduce.Job: Task Id : attempt_1402488523460_0003_m_000000_0, Status : FAILED
Error: java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.util.NoSuchElementException
at java.util.ArrayList$Itr.next(ArrayList.java:839)
at mysort.__loadFromFields(mysort.java:198)
at mysort.parse(mysort.java:147)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
... 10 more
14/06/11 18:29:23 INFO mapreduce.Job: Task Id : attempt_1402488523460_0003_m_000000_1, Status : FAILED
Error: java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.util.NoSuchElementException
at java.util.ArrayList$Itr.next(ArrayList.java:839)
at mysort.__loadFromFields(mysort.java:198)
at mysort.parse(mysort.java:147)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
... 10 more
14/06/11 18:29:42 INFO mapreduce.Job: Task Id : attempt_1402488523460_0003_m_000000_2, Status : FAILED
Error: java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.util.NoSuchElementException
at java.util.ArrayList$Itr.next(ArrayList.java:839)
at mysort.__loadFromFields(mysort.java:198)
at mysort.parse(mysort.java:147)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
... 10 more
14/06/11 18:30:03 INFO mapreduce.Job: map 100% reduce 0%
14/06/11 18:30:03 INFO mapreduce.Job: Job job_1402488523460_0003 failed with state FAILED due to: Task failed task_1402488523460_0003_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0
14/06/11 18:30:03 INFO mapreduce.Job: Counters: 9
Job Counters
Failed map tasks=4
Launched map tasks=4
Other local map tasks=3
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=69336
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=69336
Total vcore-seconds taken by all map tasks=69336
Total megabyte-seconds taken by all map tasks=71000064
14/06/11 18:30:03 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
14/06/11 18:30:03 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 100.1476 seconds (0 bytes/sec)
14/06/11 18:30:03 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
14/06/11 18:30:03 INFO mapreduce.ExportJobBase: Exported 0 records.
14/06/11 18:30:03 ERROR tool.ExportTool: Error during export: Export job failed!
This is the log file:
2014-06-11 17:54:37,601 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2014-06-11 17:54:37,602 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2014-06-11 17:54:52,678 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2014-06-11 17:54:52,777 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2014-06-11 17:54:52,846 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2014-06-11 17:54:52,847 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started
2014-06-11 17:54:52,855 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens:
2014-06-11 17:54:52,855 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1402488523460_0002, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier#971d0d8)
2014-06-11 17:54:52,901 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
2014-06-11 17:54:53,165 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /tmp/hadoop-hduser/nm-local-dir/usercache/hduser/appcache/application_1402488523460_0002
2014-06-11 17:54:53,249 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2014-06-11 17:54:53,249 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2014-06-11 17:54:53,393 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
2014-06-11 17:54:53,689 INFO [main] org.apache.hadoop.mapred.Task: Using ResourceCalculatorProcessTree : [ ]
2014-06-11 17:54:53,899 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: Paths:/user/hduser/MySort/input/data.txt:0+891082
2014-06-11 17:54:53,904 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: map.input.file is deprecated. Instead, use mapreduce.map.input.file
2014-06-11 17:54:53,904 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: map.input.start is deprecated. Instead, use mapreduce.map.input.start
2014-06-11 17:54:53,904 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: map.input.length is deprecated. Instead, use mapreduce.map.input.length
2014-06-11 17:54:54,028 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper:
2014-06-11 17:54:54,028 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: Exception raised during data export
2014-06-11 17:54:54,028 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper:
2014-06-11 17:54:54,028 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: Exception:
java.util.NoSuchElementException
at java.util.ArrayList$Itr.next(ArrayList.java:839)
at mysort.__loadFromFields(mysort.java:198)
at mysort.parse(mysort.java:147)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
2014-06-11 17:54:54,030 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: On input: ustNU 45
2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: On input file: hdfs://localhost:9000/user/hduser/MySort/input/data.txt
2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: At position 0
2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper:
2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: Currently processing split:
2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: Paths:/user/hduser/MySort/input/data.txt:0+891082
2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper:
2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: This issue might not necessarily be caused by current input
2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: due to the batching nature of export.
2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper:
2014-06-11 17:54:54,032 INFO [Thread-12] org.apache.sqoop.mapreduce.AutoProgressMapper: Auto-progress thread is finished. keepGoing=false
2014-06-11 17:54:54,033 WARN [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hduser (auth:SIMPLE) cause:java.io.IOException: Can't export data, please check task tracker logs
2014-06-11 17:54:54,033 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.util.NoSuchElementException
at java.util.ArrayList$Itr.next(ArrayList.java:839)
at mysort.__loadFromFields(mysort.java:198)
at mysort.parse(mysort.java:147)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
... 10 more
2014-06-11 17:54:54,037 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task
Any help in resolving the issue is appreciated.
Here is the complete procedure for installing Sqoop, along with the import and export commands. Hopefully it will be helpful to someone. I have tried and tested it, and it works.
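Download: apache.mirrors.tds.net/sqoop/1.4.4/sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz
Extract the archive and move the extracted directory to /usr/lib/sqoop:
tar -xzf sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz
sudo mv sqoop-1.4.4.bin__hadoop-2.0.4-alpha /usr/lib/sqoop
Add the following two lines to .bashrc: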
export SQOOP_HOME=/usr/lib/sqoop
export PATH=$PATH:$SQOOP_HOME/bin
Go to the /usr/lib/sqoop/conf folder, copy sqoop-env-template.sh to a new file sqoop-env.sh, and edit the export lines for HADOOP_HOME, HBASE_HOME, etc. so that they point to the corresponding installation directories.
Download the PostgreSQL connector jar file from jdbc.postgresql.org/download/postgresql-9.3-1101.jdbc41.jar
Create a directory manager.d in sqoop/conf/.
Create a file postgresql in conf/ and add the following line to it:
org.postgresql.Driver=/usr/lib/sqoop/lib/postgresql-9.3-1101.jdbc41.jar
If your connector jar has a different file name, adjust the path accordingly.
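As a quick sanity check on the steps above, a small Python sketch can list which of the files mentioned in this answer are absent. This is not part of Sqoop: `missing_sqoop_files` is a hypothetical helper, and it only assumes the layout described here (`bin/sqoop`, `conf/sqoop-env.sh`, and the JDBC jar under `lib/`).

```python
import os


def missing_sqoop_files(sqoop_home, jdbc_jar="postgresql-9.3-1101.jdbc41.jar"):
    """Return the expected paths under sqoop_home that do not exist.

    The list of expected files is an assumption taken from the steps in
    this answer, not an official Sqoop checklist.
    """
    expected = [
        os.path.join(sqoop_home, "bin", "sqoop"),
        os.path.join(sqoop_home, "conf", "sqoop-env.sh"),
        os.path.join(sqoop_home, "lib", jdbc_jar),
    ]
    return [path for path in expected if not os.path.exists(path)]


if __name__ == "__main__":
    # Falls back to the install location used in this answer.
    home = os.environ.get("SQOOP_HOME", "/usr/lib/sqoop")
    for path in missing_sqoop_files(home):
        print("missing:", path)
```

An empty result only means the files are in place; it does not verify that the values inside sqoop-env.sh are correct.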
For Export:
Create a user in postgres:
createuser -P -s -e ace
Enter password for new role: ace
Enter it again: ace
CREATE DATABASE testdb OWNER ace TABLESPACE ace;
create table stud1(id int,name text);
Create a file student.txt
Add lines such as:
1,Ace
2,iloveapis
hadoop fs -put student.txt
sqoop export --connect jdbc:postgresql://localhost:5432/testdb --username ace --password ace --table stud1 -m 1 --export-dir student.txt
Check the result in Postgres: SELECT * FROM stud1;
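Note that the sample lines in student.txt are comma-separated, which matches the default field delimiter Sqoop uses when parsing export input. The failing input in the question's log (`ustNU 45`) contains no comma, so the whole line is read as one field and the parser runs out of fields when it asks for the second column. The Python sketch below is only a simplified model of that behaviour, not Sqoop's generated code; `load_from_fields` is a hypothetical name.

```python
def load_from_fields(line, n_columns=2, delimiter=","):
    """Read n_columns fields from a delimited line, one at a time.

    Simplified model of a record parser that walks an iterator over the
    split fields; it fails when the line has fewer fields than columns.
    """
    fields = iter(line.split(delimiter))
    record = []
    for _ in range(n_columns):
        try:
            record.append(next(fields))
        except StopIteration:
            # Python analogue of Java's NoSuchElementException on Iterator.next()
            raise ValueError(
                f"expected {n_columns} fields, found {len(record)} in {line!r}"
            ) from None
    return record


print(load_from_fields("1,Ace"))                    # ['1', 'Ace']
print(load_from_fields("ustNU 45", delimiter=" "))  # ['ustNU', '45']
try:
    load_from_fields("ustNU 45")                    # default comma delimiter
except ValueError as err:
    print(err)  # expected 2 fields, found 1 in 'ustNU 45'
```

If the data cannot be rewritten with commas, Sqoop's export tool also accepts an `--input-fields-terminated-by` argument to declare the delimiter actually used in the file. Whether the file in the question is separated by spaces or tabs is not clear from the post, so check that first.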
For Import:
sqoop import --connect jdbc:postgresql://localhost:5432/testdb --username akshay --password akshay --table stud1 --m 1
hadoop fs -ls -R stud1
Expected Output:
-rw-r--r-- 1 hduser supergroup 0 2014-06-13 18:10 stud1/_SUCCESS
-rw-r--r-- 1 hduser supergroup 21 2014-06-13 18:10 stud1/part-m-00000
hadoop fs -cat stud1/part-m-00000
Expected Output:
1,Ace
2,iloveapis
hadoop fs -copyToLocal stud1/part-m-00000 $HOME/imported_data.txt
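To confirm the round trip, the local copy can be compared with the original student.txt. The sketch below is a minimal, hypothetical helper (`same_rows` is not a Sqoop or Hadoop tool); it compares the two files as multisets of lines, on the assumption that row order may differ between export and import.

```python
from collections import Counter


def same_rows(original_path, imported_path):
    """Return True if both files contain the same non-blank lines, ignoring order."""

    def rows(path):
        with open(path, encoding="utf-8") as handle:
            # Count each line so duplicated rows are compared too.
            return Counter(line.rstrip("\n") for line in handle if line.strip())

    return rows(original_path) == rows(imported_path)


# Example call, assuming both files are in the locations used above:
# same_rows("student.txt", os.path.expanduser("~/imported_data.txt"))
```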