Storing HDFS data into MongoDB using Pig
I am new to Hadoop and have a requirement to store Hadoop data into MongoDB; I am using Pig to move the data from Hadoop into MongoDB.
To do this, I downloaded the following drivers and registered them in the Pig Grunt shell with these commands:
REGISTER /home/miracle/Downloads/mongo-hadoop-pig-2.0.2.jar
REGISTER /home/miracle/Downloads/mongo-java-driver-3.4.2.jar
REGISTER /home/miracle/Downloads/mongo-hadoop-core-2.0.2.jar
After this, I successfully read data from MongoDB using the following command:
raw = LOAD 'mongodb://localhost:27017/pavan.pavan.in' USING com.mongodb.hadoop.pig.MongoLoader;
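Without arguments, MongoLoader returns each document as a single map field. My understanding from the mongo-hadoop docs is that it can also take an explicit Pig schema (and optionally an id alias), roughly like this, where the field names are only the ones I expect in my collection:
raw = LOAD 'mongodb://localhost:27017/pavan.pavan.in'
      USING com.mongodb.hadoop.pig.MongoLoader('firstname:chararray, lastname:chararray, phone:chararray, city:chararray');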
Then I tried the following command to insert the data from a Pig bag into MongoDB, which also succeeded:
STORE student INTO 'mongodb://localhost:27017/pavan.persons_info' USING com.mongodb.hadoop.pig.MongoInsertStorage('','');
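As far as I understand from the mongo-hadoop docs, the first argument of MongoInsertStorage is an idAlias naming the field whose value becomes _id; I left both arguments empty so the ids are generated. For example, to use the first name as the _id (just an illustration):
STORE student INTO 'mongodb://localhost:27017/pavan.persons_info'
      USING com.mongodb.hadoop.pig.MongoInsertStorage('firstname', '');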
Then I tried a Mongo update using the command below:
STORE student INTO 'mongodb://localhost:27017/pavan.persons_info1' USING com.mongodb.hadoop.pig.MongoUpdateStorage(' ','{first:"\$firstname", last:"\$lastname", phone:"\$phone", city:"\$city"}','firstname: chararray,lastname: chararray,phone: chararray,city: chararray');
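For reference, my understanding from the mongo-hadoop docs is that MongoUpdateStorage takes three arguments in this order: an update query selecting the documents to modify, the update document itself, and the schema of the relation being stored. Roughly (field names here are only an illustration):
-- arguments: (1) query matching documents to update, (2) update document, (3) schema of the relation
STORE student INTO 'mongodb://localhost:27017/pavan.persons_info1'
      USING com.mongodb.hadoop.pig.MongoUpdateStorage(
          '{first:"\$firstname", last:"\$lastname"}',
          '{$set:{phone:"\$phone"}}',
          'firstname:chararray, lastname:chararray, phone:chararray');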
But when I run my MongoUpdateStorage command above, I get the error below.
2017-03-22 11:16:42,516 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2017-03-22 11:16:43,064 [main] INFO com.mongodb.hadoop.pig.MongoUpdateStorage - Store location config: Configuration: ; for namespace: pavan.persons_info1; hosts: [localhost:27017]
2017-03-22 11:16:43,180 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2017-03-22 11:16:43,306 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2017-03-22 11:16:43,308 [main] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2017-03-22 11:16:43,309 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2017-03-22 11:16:43,310 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2017-03-22 11:16:43,314 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2017-03-22 11:16:43,314 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2017-03-22 11:16:43,415 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2017-03-22 11:16:43,419 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2017-03-22 11:16:43,423 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2017-03-22 11:16:43,425 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2017-03-22 11:16:43,438 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2017-03-22 11:16:43,603 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/miracle/Downloads/mongo-java-driver-3.0.4.jar to DistributedCache through /tmp/temp159471787/tmp643027494/mongo-java-driver-3.0.4.jar
2017-03-22 11:16:43,687 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/miracle/Downloads/mongo-hadoop-core-2.0.2.jar to DistributedCache through /tmp/temp159471787/tmp-1745369112/mongo-hadoop-core-2.0.2.jar
2017-03-22 11:16:43,822 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/miracle/Downloads/mongo-hadoop-pig-2.0.2.jar to DistributedCache through /tmp/temp159471787/tmp116725398/mongo-hadoop-pig-2.0.2.jar
2017-03-22 11:16:44,693 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/pig/pig-0.16.0/pig-0.16.0-core-h2.jar to DistributedCache through /tmp/temp159471787/tmp499355324/pig-0.16.0-core-h2.jar
2017-03-22 11:16:44,762 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/pig/pig-0.16.0/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp159471787/tmp413788756/automaton-1.11-8.jar
2017-03-22 11:16:44,830 [DataStreamer for file /tmp/temp159471787/tmp-380031198/antlr-runtime-3.4.jar block BP-1303579226-127.0.1.1-1489750707340:blk_1073742392_1568] WARN org.apache.hadoop.hdfs.DFSClient - Caught exception
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1249)
at java.lang.Thread.join(Thread.java:1323)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:609)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeInternal(DFSOutputStream.java:577)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:573)
2017-03-22 11:16:44,856 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/pig/pig-0.16.0/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp159471787/tmp-380031198/antlr-runtime-3.4.jar
2017-03-22 11:16:44,960 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/pig/pig-0.16.0/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp159471787/tmp1163422388/joda-time-2.9.3.jar
2017-03-22 11:16:44,996 [main] INFO com.mongodb.hadoop.pig.MongoUpdateStorage - Store location config: Configuration: ; for namespace: pavan.persons_info1; hosts: [localhost:27017]
2017-03-22 11:16:45,004 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2017-03-22 11:16:45,147 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2017-03-22 11:16:45,166 [JobControl] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2017-03-22 11:16:45,253 [JobControl] INFO com.mongodb.hadoop.pig.MongoUpdateStorage - Store location config: Configuration: ; for namespace: pavan.persons_info1; hosts: [localhost:27017]
2017-03-22 11:16:45,318 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2017-03-22 11:16:45,572 [JobControl] INFO org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2017-03-22 11:16:45,579 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2017-03-22 11:16:45,581 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2017-03-22 11:16:45,593 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2017-03-22 11:16:45,690 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2017-03-22 11:16:45,884 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_local1070093788_0006
2017-03-22 11:16:47,476 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /tmp/hadoop-miracle/mapred/local/1490206606120/mongo-java-driver-3.0.4.jar <- /home/miracle/mongo-java-driver-3.0.4.jar
2017-03-22 11:16:47,534 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://localhost:9000/tmp/temp159471787/tmp643027494/mongo-java-driver-3.0.4.jar as file:/tmp/hadoop-miracle/mapred/local/1490206606120/mongo-java-driver-3.0.4.jar
2017-03-22 11:16:47,534 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /tmp/hadoop-miracle/mapred/local/1490206606121/mongo-hadoop-core-2.0.2.jar <- /home/miracle/mongo-hadoop-core-2.0.2.jar
2017-03-22 11:16:47,674 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://localhost:9000/tmp/temp159471787/tmp-1745369112/mongo-hadoop-core-2.0.2.jar as file:/tmp/hadoop-miracle/mapred/local/1490206606121/mongo-hadoop-core-2.0.2.jar
2017-03-22 11:16:48,194 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /tmp/hadoop-miracle/mapred/local/1490206606122/mongo-hadoop-pig-2.0.2.jar <- /home/miracle/mongo-hadoop-pig-2.0.2.jar
2017-03-22 11:16:48,201 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://localhost:9000/tmp/temp159471787/tmp116725398/mongo-hadoop-pig-2.0.2.jar as file:/tmp/hadoop-miracle/mapred/local/1490206606122/mongo-hadoop-pig-2.0.2.jar
2017-03-22 11:16:48,329 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /tmp/hadoop-miracle/mapred/local/1490206606123/pig-0.16.0-core-h2.jar <- /home/miracle/pig-0.16.0-core-h2.jar
2017-03-22 11:16:48,337 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://localhost:9000/tmp/temp159471787/tmp499355324/pig-0.16.0-core-h2.jar as file:/tmp/hadoop-miracle/mapred/local/1490206606123/pig-0.16.0-core-h2.jar
2017-03-22 11:16:48,338 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /tmp/hadoop-miracle/mapred/local/1490206606124/automaton-1.11-8.jar <- /home/miracle/automaton-1.11-8.jar
2017-03-22 11:16:48,370 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://localhost:9000/tmp/temp159471787/tmp413788756/automaton-1.11-8.jar as file:/tmp/hadoop-miracle/mapred/local/1490206606124/automaton-1.11-8.jar
2017-03-22 11:16:48,371 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /tmp/hadoop-miracle/mapred/local/1490206606125/antlr-runtime-3.4.jar <- /home/miracle/antlr-runtime-3.4.jar
2017-03-22 11:16:48,384 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://localhost:9000/tmp/temp159471787/tmp-380031198/antlr-runtime-3.4.jar as file:/tmp/hadoop-miracle/mapred/local/1490206606125/antlr-runtime-3.4.jar
2017-03-22 11:16:48,389 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /tmp/hadoop-miracle/mapred/local/1490206606126/joda-time-2.9.3.jar <- /home/miracle/joda-time-2.9.3.jar
2017-03-22 11:16:48,409 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://localhost:9000/tmp/temp159471787/tmp1163422388/joda-time-2.9.3.jar as file:/tmp/hadoop-miracle/mapred/local/1490206606126/joda-time-2.9.3.jar
2017-03-22 11:16:48,798 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - file:/tmp/hadoop-miracle/mapred/local/1490206606120/mongo-java-driver-3.0.4.jar
2017-03-22 11:16:48,803 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - file:/tmp/hadoop-miracle/mapred/local/1490206606121/mongo-hadoop-core-2.0.2.jar
2017-03-22 11:16:48,803 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - file:/tmp/hadoop-miracle/mapred/local/1490206606122/mongo-hadoop-pig-2.0.2.jar
2017-03-22 11:16:48,804 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - file:/tmp/hadoop-miracle/mapred/local/1490206606123/pig-0.16.0-core-h2.jar
2017-03-22 11:16:48,806 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - file:/tmp/hadoop-miracle/mapred/local/1490206606124/automaton-1.11-8.jar
2017-03-22 11:16:48,807 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - file:/tmp/hadoop-miracle/mapred/local/1490206606125/antlr-runtime-3.4.jar
2017-03-22 11:16:48,807 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - file:/tmp/hadoop-miracle/mapred/local/1490206606126/joda-time-2.9.3.jar
2017-03-22 11:16:48,807 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://localhost:8080/
2017-03-22 11:16:48,809 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local1070093788_0006
2017-03-22 11:16:48,812 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases student1
2017-03-22 11:16:48,812 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: student1[7,11] C: R:
2017-03-22 11:16:48,889 [Thread-455] INFO org.apache.hadoop.mapred.LocalJobRunner - OutputCommitter set in config null
2017-03-22 11:16:48,915 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2017-03-22 11:16:48,915 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_local1070093788_0006]
2017-03-22 11:16:48,999 [Thread-455] INFO com.mongodb.hadoop.pig.MongoUpdateStorage - Store location config: Configuration: ; for namespace: pavan.persons_info1; hosts: [localhost:27017]
2017-03-22 11:16:49,011 [Thread-455] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2017-03-22 11:16:49,013 [Thread-455] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use mapreduce.reduce.markreset.buffer.percent
2017-03-22 11:16:49,013 [Thread-455] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2017-03-22 11:16:49,054 [Thread-455] INFO org.apache.hadoop.mapred.LocalJobRunner - OutputCommitter is org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter
2017-03-22 11:16:49,094 [Thread-455] INFO com.mongodb.hadoop.pig.MongoUpdateStorage - Store location config: Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, file:/tmp/hadoop-miracle/mapred/local/localRunner/miracle/job_local1070093788_0006/job_local1070093788_0006.xml; for namespace: pavan.persons_info1; hosts: [localhost:27017]
2017-03-22 11:16:49,104 [Thread-455] INFO com.mongodb.hadoop.output.MongoOutputCommitter - Setting up job.
2017-03-22 11:16:49,126 [Thread-455] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting for map tasks
2017-03-22 11:16:49,127 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner - Starting task: attempt_local1070093788_0006_m_000000_0
2017-03-22 11:16:49,253 [LocalJobRunner Map Task Executor #0] INFO com.mongodb.hadoop.pig.MongoUpdateStorage - Store location config: Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, file:/tmp/hadoop-miracle/mapred/local/localRunner/miracle/job_local1070093788_0006/job_local1070093788_0006.xml; for namespace: pavan.persons_info1; hosts: [localhost:27017]
2017-03-22 11:16:49,279 [LocalJobRunner Map Task Executor #0] INFO com.mongodb.hadoop.pig.MongoUpdateStorage - Store location config: Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, file:/tmp/hadoop-miracle/mapred/local/localRunner/miracle/job_local1070093788_0006/job_local1070093788_0006.xml; for namespace: pavan.persons_info1; hosts: [localhost:27017]
2017-03-22 11:16:49,290 [LocalJobRunner Map Task Executor #0] INFO com.mongodb.hadoop.output.MongoOutputCommitter - Setting up task.
2017-03-22 11:16:49,296 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.Task - Using ResourceCalculatorProcessTree : [ ]
2017-03-22 11:16:49,340 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.MapTask - Processing split: Number of splits :1
Total Length = 212
Input split[0]:
Length = 212
ClassName: org.apache.hadoop.mapreduce.lib.input.FileSplit
Locations:
-----------------------
2017-03-22 11:16:49,415 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2017-03-22 11:16:49,417 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split being processed hdfs://localhost:9000/input/student_dir/student_Info.txt:0+212
2017-03-22 11:16:49,459 [LocalJobRunner Map Task Executor #0] INFO com.mongodb.hadoop.pig.MongoUpdateStorage - Store location config: Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, file:/tmp/hadoop-miracle/mapred/local/localRunner/miracle/job_local1070093788_0006/job_local1070093788_0006.xml; for namespace: pavan.persons_info1; hosts: [localhost:27017]
2017-03-22 11:16:49,684 [LocalJobRunner Map Task Executor #0] INFO org.mongodb.driver.cluster - Cluster created with settings {hosts=[localhost:27017], mode=SINGLE, requiredClusterType=UNKNOWN, serverSelectionTimeout='30000 ms', maxWaitQueueSize=500}
2017-03-22 11:16:50,484 [LocalJobRunner Map Task Executor #0] INFO com.mongodb.hadoop.output.MongoRecordWriter - Writing to temporary file: /tmp/hadoop-miracle/attempt_local1070093788_0006_m_000000_0/_MONGO_OUT_TEMP/_out
2017-03-22 11:16:50,516 [LocalJobRunner Map Task Executor #0] INFO com.mongodb.hadoop.pig.MongoUpdateStorage - Preparing to write to com.mongodb.hadoop.output.MongoRecordWriter#1fd6ae6
2017-03-22 11:16:50,736 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (Tenured Gen) of size 699072512 to monitor. collectionUsageThreshold = 489350752, usageThreshold = 489350752
2017-03-22 11:16:50,739 [LocalJobRunner Map Task Executor #0] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2017-03-22 11:16:50,746 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: student1[7,11] C: R:
2017-03-22 11:16:50,880 [Thread-455] INFO org.apache.hadoop.mapred.LocalJobRunner - map task executor complete.
2017-03-22 11:16:50,919 [Thread-455] INFO com.mongodb.hadoop.pig.MongoUpdateStorage - Store location config: Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, file:/tmp/hadoop-miracle/mapred/local/localRunner/miracle/job_local1070093788_0006/job_local1070093788_0006.xml; for namespace: pavan.persons_info1; hosts: [localhost:27017]
2017-03-22 11:16:50,963 [Thread-455] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local1070093788_0006
java.lang.Exception: java.io.IOException: java.io.IOException: Couldn't convert tuple to bson:
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.io.IOException: java.io.IOException: Couldn't convert tuple to bson:
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.StoreFuncDecorator.putNext(StoreFuncDecorator.java:83)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:144)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:658)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:261)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:65)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Couldn't convert tuple to bson:
at com.mongodb.hadoop.pig.MongoUpdateStorage.putNext(MongoUpdateStorage.java:165)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.StoreFuncDecorator.putNext(StoreFuncDecorator.java:75)
... 17 more
Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
at java.util.ArrayList.rangeCheck(ArrayList.java:653)
at java.util.ArrayList.get(ArrayList.java:429)
at org.apache.pig.data.DefaultTuple.get(DefaultTuple.java:117)
at com.mongodb.hadoop.pig.JSONPigReplace.substitute(JSONPigReplace.java:120)
at com.mongodb.hadoop.pig.MongoUpdateStorage.putNext(MongoUpdateStorage.java:142)
... 18 more
2017-03-22 11:16:53,944 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2017-03-22 11:16:53,944 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local1070093788_0006 has failed! Stop running all dependent jobs
2017-03-22 11:16:53,945 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2017-03-22 11:16:53,949 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2017-03-22 11:16:53,954 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2017-03-22 11:16:53,962 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2017-03-22 11:16:53,981 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.7.3 0.16.0 miracle 2017-03-22 11:16:43 2017-03-22 11:16:53 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_local1070093788_0006 student1 MAP_ONLY Message: Job failed! mongodb://localhost:27017/pavan.persons_info1,
Input(s):
Failed to read data from "hdfs://localhost:9000/input/student_dir/student_Info.txt"
Output(s):
Failed to produce result in "mongodb://localhost:27017/pavan.persons_info1"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_local1070093788_0006
2017-03-22 11:16:53,983 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2017-03-22 11:16:54,004 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1002: Unable to store alias student1
Details at logfile: /home/miracle/pig_1490205716326.log
Here is the input that I dumped:
Input(s):
Successfully read 6 records (5378419 bytes) from: "hdfs://localhost:9000/input/student_dir/student_Info.txt"
Output(s):
Successfully stored 6 records (5378449 bytes) in: "hdfs://localhost:9000/tmp/temp-1419179625/tmp882976412"
Counters:
Total records written : 6
Total bytes written : 5378449
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_local1866034015_0001
2017-03-23 02:43:37,677 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2017-03-23 02:43:37,681 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2017-03-23 02:43:37,689 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2017-03-23 02:43:37,736 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2017-03-23 02:43:37,748 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2017-03-23 02:43:37,751 [main] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2017-03-23 02:43:37,793 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2017-03-23 02:43:37,793 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(Rajiv,Reddy,9848022337,Hyderabad)
(siddarth,Battacharya,9848022338,Kolkata)
(Rajesh,Khanna,9848022339,Delhi)
(Preethi,Agarwal,9848022330,Pune)
(Trupthi,Mohanthy,9848022336,Bhuwaneshwar)
(Archana,Mishra,9848022335,Chennai.)
I don't know what to do next. Please give me any suggestions on this.
I do not use Mongo, but from HBase experience and from the error, it looks like some fields are not FLATTENed enough to fit the target structure, and some of the columns you are trying to STORE don't exist.
Try DUMPing the data instead of STOREing it, and check whether it matches the STORE structure you have applied.
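In the Grunt shell that check is just the following (use whatever alias you actually STORE; from your log it is student1):
-- confirm the relation really has the four named fields before storing it
DESCRIBE student1;
DUMP student1;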
The relevant part of your error is this:
Caused by: java.io.IOException: Couldn't convert tuple to bson:
at com.mongodb.hadoop.pig.MongoUpdateStorage.putNext(MongoUpdateStorage.java:165)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.StoreFuncDecorator.putNext(StoreFuncDecorator.java:75)
... 17 more
Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
at java.util.ArrayList.rangeCheck(ArrayList.java:653)
at java.util.ArrayList.get(ArrayList.java:429)
at org.apache.pig.data.DefaultTuple.get(DefaultTuple.java:117)
at com.mongodb.hadoop.pig.JSONPigReplace.substitute(JSONPigReplace.java:120)
at com.mongodb.hadoop.pig.MongoUpdateStorage.putNext(MongoUpdateStorage.java:142)
... 18 more
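The IndexOutOfBoundsException (Index: 1, Size: 1) in JSONPigReplace.substitute means the tuple was asked for its second field but only has one, so the relation you are storing apparently carries a single field while your update template refers to four. That usually happens when the input is loaded without a schema or with the wrong delimiter. A sketch of what I would try, untested since I don't use Mongo (adjust the delimiter and paths to your data):
-- load with an explicit four-field schema; the comma delimiter is a guess
student = LOAD 'hdfs://localhost:9000/input/student_dir/student_Info.txt'
          USING PigStorage(',')
          AS (firstname:chararray, lastname:chararray, phone:chararray, city:chararray);
-- match documents on firstname and set the remaining fields
STORE student INTO 'mongodb://localhost:27017/pavan.persons_info1'
      USING com.mongodb.hadoop.pig.MongoUpdateStorage(
          '{first:"\$firstname"}',
          '{$set:{last:"\$lastname", phone:"\$phone", city:"\$city"}}',
          'firstname:chararray, lastname:chararray, phone:chararray, city:chararray');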
Related
occurred error while use zkcopy copy data from zookeeper
1、using zkcopy and run cmd as following: java -jar target/zkcopy.jar --source xxxx:2181/clickhouse --target xxxx:2181/clickhouse 2、cmd output as following: 2021-09-07 21:47:33,542 [main] INFO com.github.ksprojects.ZkCopy - using 10 concurrent workers to copy data 2021-09-07 21:47:33,542 [main] INFO com.github.ksprojects.ZkCopy - delete nodes = true 2021-09-07 21:47:33,542 [main] INFO com.github.ksprojects.ZkCopy - ignore ephemeral nodes = true 2021-09-07 21:47:33,543 [main] INFO com.github.ksprojects.zkcopy.reader.Reader - Reading /clickhouse from 10.201.226.32:2181 2021-09-07 21:47:34,590 [main] INFO com.github.ksprojects.zkcopy.reader.Reader - Processing, total=12417, processed=3142 2021-09-07 21:47:35,655 [main] INFO com.github.ksprojects.zkcopy.reader.Reader - Processing, total=12417, processed=6059 2021-09-07 21:47:36,655 [main] INFO com.github.ksprojects.zkcopy.reader.Reader - Processing, total=12417, processed=9172 2021-09-07 21:47:37,655 [main] INFO com.github.ksprojects.zkcopy.reader.Reader - Processing, total=12497, processed=12497 2021-09-07 21:47:37,657 [main] INFO com.github.ksprojects.zkcopy.reader.Reader - Completed. 2021-09-07 21:47:37,687 [main] INFO com.github.ksprojects.zkcopy.writer.Writer - Writing data... 2021-09-07 21:47:38,338 [main] INFO com.github.ksprojects.zkcopy.writer.Writer - Committing transaction 2021-09-07 21:47:38,954 [main] INFO com.github.ksprojects.zkcopy.writer.Writer - Committing transaction Exception in thread "main" picocli.CommandLine$ExecutionException: Error while calling command (com.github.ksprojects.ZkCopy#7960847b): java.lang.RuntimeException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss at picocli.CommandLine.execute(CommandLine.java:458) at picocli.CommandLine.access$300(CommandLine.java:134) at picocli.CommandLine$RunLast.handleParseResult(CommandLine.java:538) at picocli.CommandLine.parseWithHandlers(CommandLine.java:656) at picocli.CommandLine.call(CommandLine.java:883) at picocli.CommandLine.call(CommandLine.java:834) at com.github.ksprojects.ZkCopy.main(ZkCopy.java:69) Caused by: java.lang.RuntimeException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss at com.github.ksprojects.zkcopy.writer.AutoCommitTransactionWrapper.maybeCommitTransaction(AutoCommitTransactionWrapper.java:74) at com.github.ksprojects.zkcopy.writer.AutoCommitTransactionWrapper.create(AutoCommitTransactionWrapper.java:39) at com.github.ksprojects.zkcopy.writer.Writer.upsertNode(Writer.java:147) at com.github.ksprojects.zkcopy.writer.Writer.update(Writer.java:96) at com.github.ksprojects.zkcopy.writer.Writer.update(Writer.java:104) at com.github.ksprojects.zkcopy.writer.Writer.update(Writer.java:104) at com.github.ksprojects.zkcopy.writer.Writer.update(Writer.java:104) at com.github.ksprojects.zkcopy.writer.Writer.update(Writer.java:104) at com.github.ksprojects.zkcopy.writer.Writer.update(Writer.java:104) at com.github.ksprojects.zkcopy.writer.Writer.write(Writer.java:65) at com.github.ksprojects.ZkCopy.call(ZkCopy.java:86) at com.github.ksprojects.ZkCopy.call(ZkCopy.java:14) at picocli.CommandLine.execute(CommandLine.java:456)
Issue while setting up nifi cluster
I followed the below tutorial to setup nifi cluster. Performed everything as mentioned, but receiving the below error. https://pierrevillard.com/2016/08/13/apache-nifi-1-0-0-cluster-setup/ But the zookeeper was unable to elect a leader and getting the following error message. 2018-04-20 09:27:40,942 INFO [main] org.wali.MinimalLockingWriteAheadLog Successfully recovered 0 records in 3 milliseconds 2018-04-20 09:27:41,007 INFO [main] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog#27caa186 checkpointed with 0 Records and 0 Swap Files in 65 milliseconds (Stop-the-world time = 1 milliseconds, Clear Edit Logs time = 7 millis), max Transaction ID -1 2018-04-20 09:27:41,300 INFO [main] o.a.z.server.DatadirCleanupManager autopurge.snapRetainCount set to 30 2018-04-20 09:27:41,300 INFO [main] o.a.z.server.DatadirCleanupManager autopurge.purgeInterval set to 24 2018-04-20 09:27:41,311 INFO [main] o.a.n.c.s.server.ZooKeeperStateServer Starting Embedded ZooKeeper Peer 2018-04-20 09:27:41,319 INFO [PurgeTask] o.a.z.server.DatadirCleanupManager Purge task started. 2018-04-20 09:27:41,343 INFO [PurgeTask] o.a.z.server.DatadirCleanupManager Purge task completed. 2018-04-20 09:27:41,621 INFO [main] o.apache.nifi.controller.FlowController Checking if there is already a Cluster Coordinator Elected... 2018-04-20 09:27:41,907 INFO [main] o.a.c.f.imps.CuratorFrameworkImpl Starting 2018-04-20 09:27:49,882 WARN [main] o.a.n.c.l.e.CuratorLeaderElectionManager Unable to determine the Elected Leader for role 'Cluster Coordinator' due to org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /nifi/leaders/Cluster Coordinator; assuming no leader has been elected 2018-04-20 09:27:49,885 INFO [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl backgroundOperationsLoop exiting 2018-04-20 09:27:49,986 INFO [main] o.apache.nifi.controller.FlowController It appears that no Cluster Coordinator has been Elected yet. Registering for Cluster Coordinator Role. 2018-04-20 09:27:49,988 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=true] Registered new Leader Selector for role Cluster Coordinator; this node is an active participant in the election. 2018-04-20 09:27:49,988 INFO [main] o.a.c.f.imps.CuratorFrameworkImpl Starting 2018-04-20 09:27:49,997 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=false] Registered new Leader Selector for role Cluster Coordinator; this node is an active participant in the election. 
2018-04-20 09:27:49,997 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=false] started 2018-04-20 09:27:49,997 INFO [main] o.a.n.c.c.h.AbstractHeartbeatMonitor Heartbeat Monitor started 2018-04-20 09:27:58,179 INFO [main] o.e.jetty.server.handler.ContextHandler Started o.e.j.w.WebAppContext#76d9fe26{/nifi-api,file:///root/nifi-1.6.0/work/jetty/nifi-web-api-1.6.0.war/webapp/,AVAILABLE}{./work/nar/framework/nifi-framework-nar-1.6.0.nar-unpacked/META-INF/bundled-dependencies/nifi-web-api-1.6.0.war} 2018-04-20 09:27:59,657 INFO [main] o.e.j.a.AnnotationConfiguration Scanning elapsed time=861ms 2018-04-20 09:27:59,834 INFO [main] o.e.j.C./nifi-content-viewer No Spring WebApplicationInitializer types detected on classpath 2018-04-20 09:27:59,840 INFO [main] o.e.jetty.server.handler.ContextHandler Started o.e.j.w.WebAppContext#43b41cb9{/nifi-content-viewer,file:///root/nifi-1.6.0/work/jetty/nifi-web-content-viewer-1.6.0.war/webapp/,AVAILABLE}{./work/nar/framework/nifi-framework-nar-1.6.0.nar-unpacked/META-INF/bundled-dependencies/nifi-web-content-viewer-1.6.0.war} 2018-04-20 09:27:59,863 INFO [main] o.e.jetty.server.handler.ContextHandler Started o.e.j.s.h.ContextHandler#49825659{/nifi-docs,null,AVAILABLE} 2018-04-20 09:28:00,079 INFO [main] o.e.j.a.AnnotationConfiguration Scanning elapsed time=69ms 2018-04-20 09:28:00,083 INFO [main] o.e.jetty.ContextHandler./nifi-docs No Spring WebApplicationInitializer types detected on classpath 2018-04-20 09:28:00,171 INFO [main] o.e.jetty.server.handler.ContextHandler Started o.e.j.w.WebAppContext#2ab0ca04{/nifi-docs,file:///root/nifi-1.6.0/work/jetty/nifi-web-docs-1.6.0.war/webapp/,AVAILABLE}{./work/nar/framework/nifi-framework-nar-1.6.0.nar-unpacked/META-INF/bundled-dependencies/nifi-web-docs-1.6.0.war} 2018-04-20 09:28:00,343 INFO [main] o.e.j.a.AnnotationConfiguration Scanning elapsed time=119ms 2018-04-20 09:28:00,344 INFO [main] org.eclipse.jetty.ContextHandler./ No Spring WebApplicationInitializer types detected on classpath 2018-04-20 09:28:00,454 INFO [main] o.e.jetty.server.handler.ContextHandler Started o.e.j.w.WebAppContext#67e255cf{/,file:///root/nifi-1.6.0/work/jetty/nifi-web-error-1.6.0.war/webapp/,AVAILABLE}{./work/nar/framework/nifi-framework-nar-1.6.0.nar-unpacked/META-INF/bundled-dependencies/nifi-web-error-1.6.0.war} 2018-04-20 09:28:00,541 INFO [main] o.eclipse.jetty.server.AbstractConnector Started ServerConnector#6d9f624{HTTP/1.1,[http/1.1]}{node-1:8080} 2018-04-20 09:28:00,559 INFO [main] org.eclipse.jetty.server.Server Started #98628ms 2018-04-20 09:28:00,599 INFO [main] org.apache.nifi.web.server.JettyServer Loading Flow... 
2018-04-20 09:28:00,688 INFO [main] org.apache.nifi.io.socket.SocketListener Now listening for connections from nodes on port 9999 2018-04-20 09:28:00,916 INFO [main] o.apache.nifi.controller.FlowController Successfully synchronized controller with proposed flow 2018-04-20 09:28:01,337 INFO [main] o.a.nifi.controller.StandardFlowService Connecting Node: node-1:8080 2018-04-20 09:28:07,922 INFO [Curator-Framework-0] o.a.c.f.state.ConnectionStateManager State change: SUSPENDED 2018-04-20 09:28:07,927 INFO [Curator-ConnectionStateManager-0] o.a.n.c.l.e.CuratorLeaderElectionManager org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener#188207a7 Connection State changed to SUSPENDED 2018-04-20 09:28:07,930 ERROR [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl Background operation retry gave up org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:728) at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:857) at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809) at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64) at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) 2018-04-19 15:56:55,209 ERROR [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl Background operation retry gave up org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:728) at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:857) at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809) at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64) at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) I set the following properties on all 3 nodes: mkdir ./state mkdir ./state/zookeeper echo 1 > ./state/zookeeper/myid (2 & 3 for other nodes) in zookeeper.properties 
server.1=node-1:2888:3888 server.2=node-2:2888:3888 server.3=node-3:2888:3888 in nifi.properties nifi.state.management.embedded.zookeeper.start=true nifi.zookeeper.connect.string=node-1:2181,node-2:2181,node-3:2181 nifi.cluster.protocol.is.secure=false nifi.cluster.is.node=true nifi.cluster.node.address=node-1(node-2 & node-3) nifi.cluster.node.protocol.port=9999 nifi.cluster.node.protocol.threads=10 nifi.cluster.node.event.history.size=25 nifi.cluster.node.connection.timeout=5 sec nifi.cluster.node.read.timeout=5 sec nifi.cluster.firewall.file= nifi.remote.input.host=node-1(node-2 & node-3) nifi.remote.input.secure=false nifi.remote.input.socket.port=9998 nifi.remote.input.http.enabled=true nifi.remote.input.http.transaction.ttl=30 sec nifi.web.http.host=node-1(node-2 & node-3)
After Bryan's comment, got reminded that I forgot to disable the iptables (I guess I can even open the ports but since this is for my practice I am disabling the iptables) and its working now.
Spark submit runs successfully but when submitted through oozie it fails to connect to hive
I am using CDH 5.9.0, Spark 1.6 and Scala 2.10.0. I have created a scala and spark program to create a table and load data from a file to hive. When I run it using spark submit, it completes. But the same program when submitted through oozie, it throws the below exception. Below is the exception. Log Type: stdout Log Upload Time: Fri Oct 27 10:08:28 -0400 2017 Log Length: 172584 2017-10-27 10:08:20,652 INFO [main] yarn.ApplicationMaster (SignalLogger.scala:register(47)) - Registered signal handlers for [TERM, HUP, INT] 2017-10-27 10:08:21,306 INFO [main] yarn.ApplicationMaster (Logging.scala:logInfo(58)) - ApplicationAttemptId: appattempt_1507999204018_0292_000001 2017-10-27 10:08:21,952 INFO [main] spark.SecurityManager (Logging.scala:logInfo(58)) - Changing view acls to: username 2017-10-27 10:08:21,953 INFO [main] spark.SecurityManager (Logging.scala:logInfo(58)) - Changing modify acls to: username 2017-10-27 10:08:21,956 INFO [main] spark.SecurityManager (Logging.scala:logInfo(58)) - SecurityManager: authentication enabled; ui acls disabled; users with view permissions: Set(username); users with modify permissions: Set(username) 2017-10-27 10:08:21,970 INFO [main] yarn.ApplicationMaster (Logging.scala:logInfo(58)) - Starting the user application in a separate Thread 2017-10-27 10:08:21,997 INFO [main] yarn.ApplicationMaster (Logging.scala:logInfo(58)) - Waiting for spark context initialization 2017-10-27 10:08:21,998 INFO [main] yarn.ApplicationMaster (Logging.scala:logInfo(58)) - Waiting for spark context initialization ... 2017-10-27 10:08:22,308 WARN [Driver] security.UserGroupInformation (UserGroupInformation.java:doAs(1701)) - PriviledgedActionException as:username (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error 2017-10-27 10:08:22,309 WARN [Driver] ipc.Client (Client.java:run(682)) - Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error 2017-10-27 10:08:22,310 WARN [Driver] security.UserGroupInformation (UserGroupInformation.java:doAs(1701)) - PriviledgedActionException as:username (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error 2017-10-27 10:08:22,391 INFO [Driver] spark.SparkContext (Logging.scala:logInfo(58)) - Running Spark version 1.6.0 2017-10-27 10:08:22,417 INFO [Driver] spark.SecurityManager (Logging.scala:logInfo(58)) - Changing view acls to: username 2017-10-27 10:08:22,418 INFO [Driver] spark.SecurityManager (Logging.scala:logInfo(58)) - Changing modify acls to: username 2017-10-27 10:08:22,418 INFO [Driver] spark.SecurityManager (Logging.scala:logInfo(58)) - SecurityManager: authentication enabled; ui acls disabled; users with view permissions: Set(username); users with modify permissions: Set(username) 2017-10-27 10:08:22,572 INFO [Driver] util.Utils (Logging.scala:logInfo(58)) - Successfully started service 'sparkDriver' on port 44049. 
2017-10-27 10:08:22,901 INFO [sparkDriverActorSystem-akka.actor.default-dispatcher-4] slf4j.Slf4jLogger (Slf4jLogger.scala:applyOrElse(80)) - Slf4jLogger started 2017-10-27 10:08:22,936 INFO [sparkDriverActorSystem-akka.actor.default-dispatcher-4] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Starting remoting 2017-10-27 10:08:23,062 INFO [sparkDriverActorSystem-akka.actor.default-dispatcher-4] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem#a.b.c.d:38305] 2017-10-27 10:08:23,064 INFO [sparkDriverActorSystem-akka.actor.default-dispatcher-4] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Remoting now listens on addresses: [akka.tcp://sparkDriverActorSystem#a.b.c.d:38305] 2017-10-27 10:08:23,174 INFO [Driver] util.Utils (Logging.scala:logInfo(58)) - Successfully started service 'sparkDriverActorSystem' on port 38305. 2017-10-27 10:08:23,195 INFO [Driver] spark.SparkEnv (Logging.scala:logInfo(58)) - Registering MapOutputTracker 2017-10-27 10:08:23,207 INFO [Driver] spark.SparkEnv (Logging.scala:logInfo(58)) - Registering BlockManagerMaster 2017-10-27 10:08:23,216 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/01/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-ba42749b-3498-4c1d-ba8b-dc6720e815a0 2017-10-27 10:08:23,217 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/02/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-d9375d30-699d-4e40-8b42-559f79f27f85 2017-10-27 10:08:23,217 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/03/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-fc2caf3b-3fa0-4f1e-be01-b33b6f6d52d5 2017-10-27 10:08:23,217 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/04/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-450319a4-2d4f-4159-a633-3dd2a71bafe1 2017-10-27 10:08:23,217 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/05/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-c3dbf9b3-cb95-4104-b4bf-9e7b1987e210 2017-10-27 10:08:23,217 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/06/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-5d9c58a6-29bb-4e8e-a8fb-3720db0004d4 2017-10-27 10:08:23,218 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/07/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-999eecaf-f183-4ede-8845-eeb57a87276b 2017-10-27 10:08:23,218 INFO [Driver] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /data/08/yarn/nm/usercache/username/appcache/application_1507999204018_0292/blockmgr-216d2449-14b1-45aa-b6c6-d6271815f485 2017-10-27 10:08:23,221 INFO [Driver] storage.MemoryStore (Logging.scala:logInfo(58)) - MemoryStore started with capacity 491.7 MB 2017-10-27 10:08:23,283 INFO [Driver] spark.SparkEnv (Logging.scala:logInfo(58)) - Registering OutputCommitCoordinator 2017-10-27 10:08:23,394 INFO [Driver] ui.JettyUtils (Logging.scala:logInfo(58)) - Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter 2017-10-27 10:08:23,413 INFO [Driver] server.Server (Server.java:doStart(272)) 
- jetty-8.y.z-SNAPSHOT 2017-10-27 10:08:23,448 INFO [Driver] server.AbstractConnector (AbstractConnector.java:doStart(338)) - Started SelectChannelConnector#0.0.0.0:36123 2017-10-27 10:08:23,448 INFO [Driver] util.Utils (Logging.scala:logInfo(58)) - Successfully started service 'SparkUI' on port 36123. 2017-10-27 10:08:23,449 INFO [Driver] ui.SparkUI (Logging.scala:logInfo(58)) - Started SparkUI at http://a.b.c.d:36123 2017-10-27 10:08:23,498 INFO [Driver] cluster.YarnClusterScheduler (Logging.scala:logInfo(58)) - Created YarnClusterScheduler 2017-10-27 10:08:23,524 INFO [Driver] util.Utils (Logging.scala:logInfo(58)) - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 44418. 2017-10-27 10:08:23,525 INFO [Driver] netty.NettyBlockTransferService (Logging.scala:logInfo(58)) - Server created on 44418 2017-10-27 10:08:23,527 INFO [Driver] storage.BlockManager (Logging.scala:logInfo(58)) - external shuffle service port = 7337 2017-10-27 10:08:23,527 INFO [Driver] storage.BlockManagerMaster (Logging.scala:logInfo(58)) - Trying to register BlockManager 2017-10-27 10:08:23,530 INFO [dispatcher-event-loop-11] storage.BlockManagerMasterEndpoint (Logging.scala:logInfo(58)) - Registering block manager a.b.c.d:44418 with 491.7 MB RAM, BlockManagerId(driver, a.b.c.d, 44418) 2017-10-27 10:08:23,533 INFO [Driver] storage.BlockManagerMaster (Logging.scala:logInfo(58)) - Registered BlockManager 2017-10-27 10:08:24,106 INFO [Driver] scheduler.EventLoggingListener (Logging.scala:logInfo(58)) - Logging events to hdfs://.../user/spark/applicationHistory/application_1507999204018_0292_1 2017-10-27 10:08:24,133 INFO [Driver] cluster.YarnClusterSchedulerBackend (Logging.scala:logInfo(58)) - SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8 2017-10-27 10:08:24,133 INFO [Driver] cluster.YarnClusterScheduler (Logging.scala:logInfo(58)) - YarnClusterScheduler.postStartHook done 2017-10-27 10:08:24,140 INFO [dispatcher-event-loop-13] cluster.YarnSchedulerBackend$YarnSchedulerEndpoint (Logging.scala:logInfo(58)) - ApplicationMaster registered as NettyRpcEndpointRef(spark://YarnAM#a.b.c.d:44049) 2017-10-27 10:08:24,191 INFO [main] yarn.YarnRMClient (Logging.scala:logInfo(58)) - Registering the ApplicationMaster 2017-10-27 10:08:24,295 INFO [main] yarn.ApplicationMaster (Logging.scala:logInfo(58)) - Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals 2017-10-27 10:08:25,107 INFO [Driver] hive.HiveContext (Logging.scala:logInfo(58)) - Initializing execution hive, version 1.1.0 2017-10-27 10:08:25,146 INFO [Driver] client.ClientWrapper (Logging.scala:logInfo(58)) - Inspected Hadoop version: 2.6.0-cdh5.9.0 2017-10-27 10:08:25,147 INFO [Driver] client.ClientWrapper (Logging.scala:logInfo(58)) - Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0-cdh5.9.0 2017-10-27 10:08:25,582 INFO [Driver] metastore.HiveMetaStore (HiveMetaStore.java:newRawStore(644)) - 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore 2017-10-27 10:08:25,600 INFO [Driver] metastore.ObjectStore (ObjectStore.java:initialize(333)) - ObjectStore, initialize called 2017-10-27 10:08:25,671 WARN [Driver] DataNucleus.General (Log4JLogger.java:warn(96)) - Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. 
The URL "file:/data/03/yarn/nm/usercache/username/appcache/application_1507999204018_0292/container_e69_1507999204018_0292_01_000001/datanucleus-core-3.2.2.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/data/05/yarn/nm/filecache/507/datanucleus-core-3.2.2.jar." 2017-10-27 10:08:25,687 WARN [Driver] DataNucleus.General (Log4JLogger.java:warn(96)) - Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/data/03/yarn/nm/usercache/username/appcache/application_1507999204018_0292/container_e69_1507999204018_0292_01_000001/datanucleus-api-jdo-3.2.1.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/data/07/yarn/nm/filecache/582/datanucleus-api-jdo-3.2.1.jar." 2017-10-27 10:08:25,688 WARN [Driver] DataNucleus.General (Log4JLogger.java:warn(96)) - Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/data/08/yarn/nm/filecache/554/datanucleus-rdbms-3.2.1.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/data/03/yarn/nm/usercache/username/appcache/application_1507999204018_0292/container_e69_1507999204018_0292_01_000001/datanucleus-rdbms-3.2.1.jar." 2017-10-27 10:08:25,709 INFO [Driver] DataNucleus.Persistence (Log4JLogger.java:info(77)) - Property hive.metastore.integral.jdo.pushdown unknown - will be ignored 2017-10-27 10:08:25,710 INFO [Driver] DataNucleus.Persistence (Log4JLogger.java:info(77)) - Property datanucleus.cache.level2 unknown - will be ignored 2017-10-27 10:08:26,178 WARN [Driver] bonecp.BoneCPConfig (BoneCPConfig.java:sanitize(1537)) - Max Connections < 1. Setting to 20 2017-10-27 10:08:26,180 ERROR [Driver] Datastore.Schema (Log4JLogger.java:error(125)) - Failed initialising database. Unable to open a test connection to the given database. JDBC url = jdbc:derby:;databaseName=/data/03/yarn/nm/usercache/username/appcache/application_1507999204018_0292/container_e69_1507999204018_0292_01_000001/tmp/spark-633fb1f8-1f38-44ac-a54e-81465354bedc/metastore;create=true, username = APP. Terminating connection pool. 
Original Exception: ------ java.sql.SQLException: No suitable driver found for jdbc:derby:;databaseName=/data/03/yarn/nm/usercache/username/appcache/application_1507999204018_0292/container_e69_1507999204018_0292_01_000001/tmp/spark-633fb1f8-1f38-44ac-a54e-81465354bedc/metastore;create=true at java.sql.DriverManager.getConnection(DriverManager.java:689) at java.sql.DriverManager.getConnection(DriverManager.java:208) at com.jolbox.bonecp.BoneCP.obtainRawInternalConnection(BoneCP.java:254) at com.jolbox.bonecp.BoneCP.<init>(BoneCP.java:305) at com.jolbox.bonecp.BoneCPDataSource.maybeInit(BoneCPDataSource.java:150) at com.jolbox.bonecp.BoneCPDataSource.getConnection(BoneCPDataSource.java:112) at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:479) at org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:304) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631) at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301) at org.datanucleus.NucleusContext.createStoreManagerForProperties(NucleusContext.java:1069) at org.datanucleus.NucleusContext.initialise(NucleusContext.java:359) at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:768) at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:326) at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:195) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at javax.jdo.JDOHelper$16.run(JDOHelper.java:1965) at java.security.AccessController.doPrivileged(Native Method) at javax.jdo.JDOHelper.invoke(JDOHelper.java:1960) at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166) at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808) at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701) at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:411) at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:440) at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:335) at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:291) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:57) at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:66) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:648) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:626) at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:675) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:484) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:78) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:84) at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5999) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:203) at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1528) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:67) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:82) at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3037) at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3056) at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3281) at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:217) at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:201) at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:324) at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:285) at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:260) at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:514) at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:194) at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:238) at org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveContext.scala:220) at org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala:210) at org.apache.spark.sql.hive.HiveContext.functionRegistry$lzycompute(HiveContext.scala:464) at org.apache.spark.sql.hive.HiveContext.functionRegistry(HiveContext.scala:463) at org.apache.spark.sql.UDFRegistration.<init>(UDFRegistration.scala:40) at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:330) at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:90) at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:101) at prfrx.externaltableerror$.main(externaltableerror.scala:28) at prfrx.externaltableerror.main(externaltableerror.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542) at com.jolbox.bonecp.BoneCP.<init>(BoneCP.java:312) at com.jolbox.bonecp.BoneCPDataSource.maybeInit(BoneCPDataSource.java:150) at com.jolbox.bonecp.BoneCPDataSource.getConnection(BoneCPDataSource.java:112) at 
org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:479) at org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:304) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631) at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301) at org.datanucleus.NucleusContext.createStoreManagerForProperties(NucleusContext.java:1069) at org.datanucleus.NucleusContext.initialise(NucleusContext.java:359) at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:768) ... 62 more Caused by: java.sql.SQLException: No suitable driver found for jdbc:derby:;databaseName=/data/03/yarn/nm/usercache/username/appcache/application_1507999204018_0292/container_e69_1507999204018_0292_01_000001/tmp/spark-633fb1f8-1f38-44ac-a54e-81465354bedc/metastore;create=true at java.sql.DriverManager.getConnection(DriverManager.java:689) at java.sql.DriverManager.getConnection(DriverManager.java:208) at com.jolbox.bonecp.BoneCP.obtainRawInternalConnection(BoneCP.java:254) at com.jolbox.bonecp.BoneCP.<init>(BoneCP.java:305)
Below is the code I am using:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.{SparkConf, SparkContext}

object externaltableerror {
  def main(args: Array[String]) {
    val conf = new Configuration()
    conf.set("fs.defaultFS", "hdfs://...")
    conf.addResource("hdfs://.../core-site.xml")
    conf.addResource("hdfs://.../hdfs-site.xml")
    conf.addResource("hdfs://.../hive-site.xml")
    val fs = FileSystem.get(conf)
    val os = fs.create(new Path("/.../Error.txt"))
    try {
      // System.setProperty("hive.metastore.uris", "thrift://...");
      val sc = new SparkContext(new SparkConf().setAppName("withhive"))
      val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
      // Read the header line and turn each tab-separated column name into "<name> string"
      val files = sc.textFile("hdfs://.../Example.txt").first()
      val rdd = sc.parallelize(List(files))
      val fm = rdd.flatMap(line => line.split("\t")).map(x => x.concat(" string"))
      val alternative = fm.reduce((s1, s2) => s1 + "," + s2)
      val ddl = "Create external table table_name(" + alternative +
        ") ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION 'hdfs://.../' " +
        "tblproperties (\"skip.header.line.count\"=\"1\")"
      hiveContext.sql(ddl)
      sc.stop()
    } catch {
      // Earlier attempt, kept for reference:
      // case e: Exception => new PrintWriter("hdfs://.../Error.txt") { write(e.getStackTrace.mkString("\n")); close }
      case e: Exception =>
        os.write(e.getStackTrace.mkString("\n").getBytes)
        os.close() // close the stream, or the HDFS error file may never be flushed
    }
  }
}

Any suggestions on how to get the job running with Oozie will be of great help. Thanks!
I had the same issue - I fixed it by using the parameter --files /etc/hive/conf/hive-site.xml in my spark-submit job. (First I tried it in the shell and then in Oozie, because I launch a .sh file that contains the spark-submit command.)
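For reference, a minimal sketch of such a submit command, reusing the class name from the question above; the jar name, master, and deploy mode are illustrative assumptions:

spark-submit \
  --class prfrx.externaltableerror \
  --master yarn \
  --deploy-mode cluster \
  --files /etc/hive/conf/hive-site.xml \
  externaltableerror.jar

Shipping hive-site.xml this way lets the driver and executors see the real metastore configuration; without it, HiveContext falls back to an embedded Derby metastore inside the YARN container, which is typically what produces the "No suitable driver found for jdbc:derby" error above.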
Sqoop installation export and import from postgresql
I've just installed Sqoop and was testing it. I tried to export some data from HDFS to PostgreSQL using Sqoop. When I run it, it throws the following exception: java.io.IOException: Can't export data, please check task tracker logs. I think there may also have been a problem in the installation. The file content is:
ustNU 45
MB1bA 0
gNbCO 76
iZP10 39
B2aoo 45
SI7eG 93
5sC4k 60
2IhFV 2
u2A48 16
yvy6R 51
LNhsV 26
mZ2yn 65
80Gp3 43
Wk5Ag 85
VUfyp 93
P077j 94
f1Oj5 11
LxJkg 72
0H7NP 99
Dk406 25
g4KRp 76
Fw3U0 80
6LD59 1
07KHx 91
F1S88 72
Bnb0v 85
A2qM7 79
Z6cAt 81
0M3DO 23
m0s09 44
KIvwd 13
GNUD0 78
um93a 20
19bHv 75
4Of3s 75
5hFen 16
This is the Postgres table:
Table "public.mysort"
 Column |  Type   | Modifiers
--------+---------+-----------
 name   | text    |
 marks  | integer |
The sqoop command is:
sqoop export --connect jdbc:postgresql://localhost/testdb --username akshay --password akshay --table mysort -m 1 --export-dir MySort/input
Followed by the error:
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail. Please set $HCAT_HOME to the root of your HCatalog installation. 14/06/11 18:28:06 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 14/06/11 18:28:06 INFO manager.SqlManager: Using default fetchSize of 1000 14/06/11 18:28:06 INFO tool.CodeGenTool: Beginning code generation 14/06/11 18:28:06 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM "mysort" AS t LIMIT 1 14/06/11 18:28:06 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/hadoop Note: /tmp/sqoop-hduser/compile/0402ad4b5cf7980040264af35de406cb/mysort.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. 14/06/11 18:28:07 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hduser/compile/0402ad4b5cf7980040264af35de406cb/mysort.jar 14/06/11 18:28:07 INFO mapreduce.ExportJobBase: Beginning export of mysort SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/usr/lib/hbase/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /usr/local/hadoop/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now. It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'. 14/06/11 18:28:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 14/06/11 18:28:22 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar 14/06/11 18:28:23 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative 14/06/11 18:28:23 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative 14/06/11 18:28:23 INFO Configuration.deprecation: mapred.map.tasks is deprecated.
Instead, use mapreduce.job.maps 14/06/11 18:28:23 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032 14/06/11 18:28:24 INFO input.FileInputFormat: Total input paths to process : 1 14/06/11 18:28:24 INFO input.FileInputFormat: Total input paths to process : 1 14/06/11 18:28:25 INFO mapreduce.JobSubmitter: number of splits:1 14/06/11 18:28:25 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1402488523460_0003 14/06/11 18:28:25 INFO impl.YarnClientImpl: Submitted application application_1402488523460_0003 14/06/11 18:28:25 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1402488523460_0003/ 14/06/11 18:28:25 INFO mapreduce.Job: Running job: job_1402488523460_0003 14/06/11 18:28:46 INFO mapreduce.Job: Job job_1402488523460_0003 running in uber mode : false 14/06/11 18:28:46 INFO mapreduce.Job: map 0% reduce 0% 14/06/11 18:29:04 INFO mapreduce.Job: Task Id : attempt_1402488523460_0003_m_000000_0, Status : FAILED Error: java.io.IOException: Can't export data, please check task tracker logs at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145) at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: java.util.NoSuchElementException at java.util.ArrayList$Itr.next(ArrayList.java:839) at mysort.__loadFromFields(mysort.java:198) at mysort.parse(mysort.java:147) at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83) ... 10 more 14/06/11 18:29:23 INFO mapreduce.Job: Task Id : attempt_1402488523460_0003_m_000000_1, Status : FAILED Error: java.io.IOException: Can't export data, please check task tracker logs at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145) at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: java.util.NoSuchElementException at java.util.ArrayList$Itr.next(ArrayList.java:839) at mysort.__loadFromFields(mysort.java:198) at mysort.parse(mysort.java:147) at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83) ... 
10 more 14/06/11 18:29:42 INFO mapreduce.Job: Task Id : attempt_1402488523460_0003_m_000000_2, Status : FAILED Error: java.io.IOException: Can't export data, please check task tracker logs at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145) at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: java.util.NoSuchElementException at java.util.ArrayList$Itr.next(ArrayList.java:839) at mysort.__loadFromFields(mysort.java:198) at mysort.parse(mysort.java:147) at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83) ... 10 more 14/06/11 18:30:03 INFO mapreduce.Job: map 100% reduce 0% 14/06/11 18:30:03 INFO mapreduce.Job: Job job_1402488523460_0003 failed with state FAILED due to: Task failed task_1402488523460_0003_m_000000 Job failed as tasks failed. failedMaps:1 failedReduces:0 14/06/11 18:30:03 INFO mapreduce.Job: Counters: 9 Job Counters Failed map tasks=4 Launched map tasks=4 Other local map tasks=3 Data-local map tasks=1 Total time spent by all maps in occupied slots (ms)=69336 Total time spent by all reduces in occupied slots (ms)=0 Total time spent by all map tasks (ms)=69336 Total vcore-seconds taken by all map tasks=69336 Total megabyte-seconds taken by all map tasks=71000064 14/06/11 18:30:03 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead 14/06/11 18:30:03 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 100.1476 seconds (0 bytes/sec) 14/06/11 18:30:03 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead 14/06/11 18:30:03 INFO mapreduce.ExportJobBase: Exported 0 records. 14/06/11 18:30:03 ERROR tool.ExportTool: Error during export: Export job failed! This is the log file : 2014-06-11 17:54:37,601 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 2014-06-11 17:54:37,602 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 2014-06-11 17:54:52,678 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 2014-06-11 17:54:52,777 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties 2014-06-11 17:54:52,846 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s). 
2014-06-11 17:54:52,847 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started 2014-06-11 17:54:52,855 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens: 2014-06-11 17:54:52,855 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1402488523460_0002, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier#971d0d8) 2014-06-11 17:54:52,901 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now. 2014-06-11 17:54:53,165 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /tmp/hadoop-hduser/nm-local-dir/usercache/hduser/appcache/application_1402488523460_0002 2014-06-11 17:54:53,249 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 2014-06-11 17:54:53,249 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 2014-06-11 17:54:53,393 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id 2014-06-11 17:54:53,689 INFO [main] org.apache.hadoop.mapred.Task: Using ResourceCalculatorProcessTree : [ ] 2014-06-11 17:54:53,899 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: Paths:/user/hduser/MySort/input/data.txt:0+891082 2014-06-11 17:54:53,904 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: map.input.file is deprecated. Instead, use mapreduce.map.input.file 2014-06-11 17:54:53,904 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: map.input.start is deprecated. Instead, use mapreduce.map.input.start 2014-06-11 17:54:53,904 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: map.input.length is deprecated. 
Instead, use mapreduce.map.input.length 2014-06-11 17:54:54,028 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: 2014-06-11 17:54:54,028 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: Exception raised during data export 2014-06-11 17:54:54,028 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: 2014-06-11 17:54:54,028 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: Exception: java.util.NoSuchElementException at java.util.ArrayList$Itr.next(ArrayList.java:839) at mysort.__loadFromFields(mysort.java:198) at mysort.parse(mysort.java:147) at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83) at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145) at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) 2014-06-11 17:54:54,030 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: On input: ustNU 45 2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: On input file: hdfs://localhost:9000/user/hduser/MySort/input/data.txt 2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: At position 0 2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: 2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: Currently processing split: 2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: Paths:/user/hduser/MySort/input/data.txt:0+891082 2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: 2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: This issue might not necessarily be caused by current input 2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: due to the batching nature of export. 2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: 2014-06-11 17:54:54,032 INFO [Thread-12] org.apache.sqoop.mapreduce.AutoProgressMapper: Auto-progress thread is finished. 
keepGoing=false 2014-06-11 17:54:54,033 WARN [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hduser (auth:SIMPLE) cause:java.io.IOException: Can't export data, please check task tracker logs 2014-06-11 17:54:54,033 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: Can't export data, please check task tracker logs at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145) at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: java.util.NoSuchElementException at java.util.ArrayList$Itr.next(ArrayList.java:839) at mysort.__loadFromFields(mysort.java:198) at mysort.parse(mysort.java:147) at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83) ... 10 more 2014-06-11 17:54:54,037 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task Any help in resolving the issue is appreciated.
Here is the complete procedure for installation, plus import and export commands, for Sqoop. Hopefully it is helpful to someone; this one is tried and tested by me and actually works.

Installation:

Download: apache.mirrors.tds.net/sqoop/1.4.4/sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz

sudo mv sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz /usr/lib/sqoop

Copy-paste the following two lines into .bashrc:

export SQOOP_HOME=/usr/lib/sqoop
export PATH=$PATH:$SQOOP_HOME/bin

Go to the /usr/lib/sqoop/conf folder, copy sqoop-env-template.sh to a new file sqoop-env.sh, and modify export HADOOP_HOME, HBASE_HOME, etc. to point at the installation directories.

Download the PostgreSQL connector jar file from jdbc.postgresql.org/download/postgresql-9.3-1101.jdbc41.jar

Create a directory manager.d in sqoop/conf/, create a file postgresql in conf/, and add the following line to it (name the connector jar file accordingly):

org.postgresql.Driver=/usr/lib/sqoop/lib/postgresql-9.3-1101.jdbc41.jar

For export:

Create a user in Postgres:

createuser -P -s -e ace
Enter password for new role: ace
Enter it again: ace

CREATE DATABASE testdb OWNER ace TABLESPACE ace;

create table stud1(id int, name text);

Create a file student.txt and add lines such as:

1,Ace
2,iloveapis

hadoop fs -put student.txt

sqoop export --connect jdbc:postgresql://localhost:5432/testdb --username ace --password ace --table stud1 -m 1 --export-dir student.txt

Check in Postgres: select * from stud1;

For import:

sqoop import --connect jdbc:postgresql://localhost:5432/testdb --username akshay --password akshay --table stud1 --m 1

hadoop fs -ls -R stud1

Expected output:
-rw-r--r-- 1 hduser supergroup 0 2014-06-13 18:10 stud1/_SUCCESS
-rw-r--r-- 1 hduser supergroup 21 2014-06-13 18:10 stud1/part-m-00000

hadoop fs -cat stud1/part-m-00000

Expected output:
1,Ace
2,iloveapis

hadoop fs -copyToLocal stud1/part-m-00000 $HOME/imported_data.txt
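A note on the question above: the sample file there ("ustNU 45" and so on) appears to be tab- or space-delimited, while sqoop export assumes comma-separated fields by default. That mismatch makes the generated mysort.parse() run out of fields, which is exactly the java.util.NoSuchElementException in the task logs. If the data really is tab-delimited, declaring the delimiter should help; a hedged sketch reusing the connect string and paths from the question:

sqoop export --connect jdbc:postgresql://localhost/testdb \
  --username akshay --password akshay \
  --table mysort -m 1 \
  --export-dir MySort/input \
  --input-fields-terminated-by '\t'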
com.thinkaurelius.titan.core.TitanException: Could not acquire new ID block from storage
I am running a simple program:

TitanGraph bg = TitanFactory.open("/home/titan-all-0.4.2/conf/titan-cassandra-es.properties");
IdGraph g = new IdGraph(bg, true, false);
Vertex v = g.addVertex("xyz132456");
g.commit();

No vertex "xyz132456" exists beforehand, but I am getting the following exception, and I am getting it very frequently. My Titan server (0.4.2) is configured with all default settings, as I am just testing simple operations with it.

253 [main] INFO org.elasticsearch.node - [Chtylok] version[0.90.5], pid[4015], build[c8714e8/2013-09-17T12:50:20Z] 253 [main] INFO org.elasticsearch.node - [Chtylok] initializing ... 260 [main] INFO org.elasticsearch.plugins - [Chtylok] loaded [], sites [] 2303 [main] INFO org.elasticsearch.node - [Chtylok] initialized 2303 [main] INFO org.elasticsearch.node - [Chtylok] starting ... 2309 [main] INFO org.elasticsearch.transport - [Chtylok] bound_address {local[1]}, publish_address {local[1]} 2316 [elasticsearch[Chtylok][clusterService#updateTask][T#1]] INFO org.elasticsearch.cluster.service - [Chtylok] new_master [Chtylok][1][local[1]]{local=true}, reason: local-disco-initial_connect(master) 2325 [main] INFO org.elasticsearch.discovery - [Chtylok] elasticsearch/1 2420 [main] INFO org.elasticsearch.http - [Chtylok] bound_address {inet[/0:0:0:0:0:0:0:0:9201]}, publish_address {inet[/10.0.0.5:9201]} 2420 [main] INFO org.elasticsearch.node - [Chtylok] started 2987 [elasticsearch[Chtylok][clusterService#updateTask][T#1]] INFO org.elasticsearch.gateway - [Chtylok] recovered [1] indices into cluster_state 3234 [main] INFO com.thinkaurelius.titan.diskstorage.Backend - Initiated backend operations thread pool of size 2 3542 [main] INFO com.thinkaurelius.titan.graphdb.configuration.GraphDatabaseConfiguration - Configuring edge store cache size: 201476830 5105 [main] WARN com.thinkaurelius.titan.graphdb.database.idassigner.StandardIDPool - Waiting for id renewal thread on partition 2 [1000 ms] 6105 [main] WARN com.thinkaurelius.titan.graphdb.database.idassigner.StandardIDPool - Waiting for id renewal thread on partition 2 [2000 ms] 7105 [main] WARN com.thinkaurelius.titan.graphdb.database.idassigner.StandardIDPool - Waiting for id renewal thread on partition 2 [3000 ms] 8106 [main] WARN com.thinkaurelius.titan.graphdb.database.idassigner.StandardIDPool - Waiting for id renewal thread on partition 2 [4001 ms] 9106 [main] WARN com.thinkaurelius.titan.graphdb.database.idassigner.StandardIDPool - Waiting for id renewal thread on partition 2 [5001 ms] 10106 [main] WARN com.thinkaurelius.titan.graphdb.database.idassigner.StandardIDPool - Waiting for id renewal thread on partition 2 [6001 ms] 11107 [main] WARN com.thinkaurelius.titan.graphdb.database.idassigner.StandardIDPool - Waiting for id renewal thread on partition 2 [7002 ms] 12107 [main] WARN com.thinkaurelius.titan.graphdb.database.idassigner.StandardIDPool - Waiting for id renewal thread on partition 2 [8002 ms] 13107 [main] WARN com.thinkaurelius.titan.graphdb.database.idassigner.StandardIDPool - Waiting for id renewal thread on partition 2 [9002 ms] 14108 [main] WARN com.thinkaurelius.titan.graphdb.database.idassigner.StandardIDPool - Waiting for id renewal thread on partition 2 [10003 ms] Exception in thread "Thread-4" com.thinkaurelius.titan.core.TitanException: Could not acquire new ID block from storage at com.thinkaurelius.titan.graphdb.database.idassigner.StandardIDPool.renewBuffer(StandardIDPool.java:117) at
com.thinkaurelius.titan.graphdb.database.idassigner.StandardIDPool.access$100(StandardIDPool.java:14) at com.thinkaurelius.titan.graphdb.database.idassigner.StandardIDPool$IDBlockThread.run(StandardIDPool.java:172) Caused by: com.thinkaurelius.titan.diskstorage.PermanentStorageException: Permanent failure in storage backend at com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.convertException(CassandraThriftKeyColumnValueStore.java:311) at com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getNamesSlice(CassandraThriftKeyColumnValueStore.java:196) at com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getSlice(CassandraThriftKeyColumnValueStore.java:120) at com.thinkaurelius.titan.diskstorage.idmanagement.ConsistentKeyIDManager$1.call(ConsistentKeyIDManager.java:106) at com.thinkaurelius.titan.diskstorage.idmanagement.ConsistentKeyIDManager$1.call(ConsistentKeyIDManager.java:103) at com.thinkaurelius.titan.diskstorage.util.BackendOperation.execute(BackendOperation.java:90) at com.thinkaurelius.titan.diskstorage.idmanagement.ConsistentKeyIDManager.getCurrentID(ConsistentKeyIDManager.java:103) at com.thinkaurelius.titan.diskstorage.idmanagement.ConsistentKeyIDManager.getIDBlock(ConsistentKeyIDManager.java:159) at com.thinkaurelius.titan.graphdb.database.idassigner.StandardIDPool.renewBuffer(StandardIDPool.java:111) ... 2 more Caused by: TimedOutException() at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.read(Cassandra.java:11623) at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.read(Cassandra.java:11560) at org.apache.cassandra.thrift.Cassandra$multiget_slice_result.read(Cassandra.java:11486) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78) at org.apache.cassandra.thrift.Cassandra$Client.recv_multiget_slice(Cassandra.java:701) at org.apache.cassandra.thrift.Cassandra$Client.multiget_slice(Cassandra.java:685) at com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getNamesSlice(CassandraThriftKeyColumnValueStore.java:176) ... 
9 more Exception in thread "main" java.lang.IllegalArgumentException: -1 at com.google.common.base.Preconditions.checkArgument(Preconditions.java:92) at com.thinkaurelius.titan.graphdb.database.idassigner.StandardIDPool.nextBlock(StandardIDPool.java:88) at com.thinkaurelius.titan.graphdb.database.idassigner.StandardIDPool.nextID(StandardIDPool.java:134) at com.thinkaurelius.titan.graphdb.database.idassigner.VertexIDAssigner.assignID(VertexIDAssigner.java:269) at com.thinkaurelius.titan.graphdb.database.idassigner.VertexIDAssigner.assignID(VertexIDAssigner.java:155) at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.assignID(StandardTitanGraph.java:226) at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.addPropertyInternal(StandardTitanTx.java:521) at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.setProperty(StandardTitanTx.java:552) at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.addProperty(StandardTitanTx.java:495) at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.addVertex(StandardTitanTx.java:345) at com.thinkaurelius.titan.graphdb.blueprints.TitanBlueprintsTransaction.addVertex(TitanBlueprintsTransaction.java:72) at com.thinkaurelius.titan.graphdb.blueprints.TitanBlueprintsGraph.addVertex(TitanBlueprintsGraph.java:157) at com.thinkaurelius.titan.graphdb.blueprints.TitanBlueprintsGraph.addVertex(TitanBlueprintsGraph.java:24) at com.tinkerpop.blueprints.util.wrappers.id.IdGraph.addVertex(IdGraph.java:131) at newpackage.CityBizzStarting.main(CityBizzStarting.java:24)
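In case someone else hits this: the root failure is the Cassandra TimedOutException raised while Titan reads its ID-authority column family, and the later IllegalArgumentException: -1 in main is just the downstream symptom of the failed block acquisition. Titan 0.4.x exposes settings for the ID allocator; a hedged sketch of properties one might add to titan-cassandra-es.properties (key names as I recall them from the 0.4 configuration reference; the values are illustrative assumptions, not tested defaults):

# allocate larger ID blocks so allocation round-trips happen less often (illustrative value)
ids.block-size=20000
# allow the ID renewal thread more time before giving up (milliseconds; assumption)
ids.renew-timeout=300000
# base wait between attempts to contact the ID authority (milliseconds; assumption)
storage.idauthority-wait-time=1000

Since the timeout originates in Cassandra itself, it is also worth confirming that the Cassandra node is healthy and not overloaded before tuning Titan.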