nutch2.0 with cassandra - eclipse

Exception in thread "main" org.apache.gora.util.GoraException: java.io.IOException
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:167)
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:75)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:214)
at org.apache.nutch.crawl.Crawler.runTool(Crawler.java:68)
at org.apache.nutch.crawl.Crawler.run(Crawler.java:136)
at org.apache.nutch.crawl.Crawler.run(Crawler.java:250)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.crawl.Crawler.main(Crawler.java:257)
Caused by: java.io.IOException
at org.apache.gora.cassandra.store.CassandraStore.initialize(CassandraStore.java:88)
at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
... 8 more
Caused by: java.lang.NullPointerException
at org.apache.gora.cassandra.store.CassandraMapping.<init>(CassandraMapping.java:117)
at org.apache.gora.cassandra.store.CassandraMappingManager.get(CassandraMappingManager.java:84)
at org.apache.gora.cassandra.store.CassandraClient.initialize(CassandraClient.java:84)
at org.apache.gora.cassandra.store.CassandraStore.initialize(CassandraStore.java:85)
... 10 more
I just run nutch2.0 on cassandra. It's the output of crawl, and the output of TestGoreStorage is as following:
Starting!
Exception in thread "main" org.apache.gora.util.GoraException: java.io.IOException
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:167)
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:75)
at org.apache.nutch.storage.TestGoraStorage.main(TestGoraStorage.java:204)
Caused by: java.io.IOException
at org.apache.gora.cassandra.store.CassandraStore.initialize(CassandraStore.java:88)
at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
... 3 more
Caused by: java.lang.NullPointerException
at org.apache.gora.cassandra.store.CassandraMapping.<init>(CassandraMapping.java:117)
at org.apache.gora.cassandra.store.CassandraMappingManager.get(CassandraMappingManager.java:84)
at org.apache.gora.cassandra.store.CassandraClient.initialize(CassandraClient.java:84)
at org.apache.gora.cassandra.store.CassandraStore.initialize(CassandraStore.java:85)
... 5 more
I can connect cassandra with cassandra-cli, and just check out the nutch from svn.
Here is the effect config in gora.properties:
gora.datastore.default=org.apache.gora.cassandra.store.CassandraStore
gora.sqlstore.jdbc.driver=org.hsqldb.jdbc.JDBCDriver
gora.sqlstore.jdbc.url=jdbc:hsqldb:hsql://210.44.138.8/nutchtest
gora.sqlstore.jdbc.user=sa
gora.sqlstore.jdbc.password=
gora.cassandrastore.servers=210.44.138.8:9160
and the config in gora-cassandra-mapping:
<keyspace name="webpage" cluster="My Cluster" host="210.44.138.8">
<family name="p"/>
<family name="f"/>
<family name="sc" type="super"/>
</keyspace>
210.44.138.8 is a node of my cluster, and the name of cluster is "My Cluster",
more info: closed firewall, run in eclipse. I'm very pleasure if someone give me any help.

I'm not sure if I had the exact same problem, but I found that in the gora-cassandra-mapping.xml file I had to add a keyspace attribute (keyspace="ks1") to the class element:
<keyspace name="ks1" cluster="My Cluster" host="1.2.3.4">
...
</keyspace>
<class keyspace="ks1" keyClass="java.lang.String" name="org.apache.nutch.storage.WebPage">
...
</class>

Related

Weblogic 12c deployment null pointer exception

encounter null pointer exception but I cannot find any hints in log:
EJB deployment state REMOVED for [pas-app-master-app-4.0.5-SNAPSHOT-1:pas-app-master-app-4.0.5-SNAPSHOT-1, pas-app-master-ejb-4.0.5-SNAPSHOT.jar:pas-app-master-ejb-4.0.5-SNAPSHOT:pas-app-master-ejb-4.0.5-SNAPSHOT.jar](PASServiceFacade, SampleSessionBean)
Failure occurred in the execution of deployment request with ID "680923808590731" for task "116" on [partition-name: DOMAIN]. Error is: "weblogic.management.DeploymentException: java.lang.NullPointerException" weblogic.management.DeploymentException: java.lang.NullPointerException at weblogic.application.internal.flow.EnvContextFlow.activate(EnvContextFlow.java:81) at weblogic.application.internal.BaseDeployment$2.next(BaseDeployment.java:752) at weblogic.application.utils.StateMachineDriver.nextState(StateMachineDriver.java:45) at weblogic.application.internal.BaseDeployment.activate(BaseDeployment.java:262) at weblogic.application.internal.EarDeployment.activate(EarDeployment.java:66) at weblogic.application.internal.DeploymentStateChecker.activate(DeploymentStateChecker.java:165) at weblogic.deploy.internal.targetserver.AppContainerInvoker.activate(AppContainerInvoker.java:90) at weblogic.deploy.internal.targetserver.operations.AbstractOperation.activate(AbstractOperation.java:631) at weblogic.deploy.internal.targetserver.operations.ActivateOperation.activateDeployment(ActivateOperation.java:171) at weblogic.deploy.internal.targetserver.operations.ActivateOperation.doCommit(ActivateOperation.java:121) at weblogic.deploy.internal.targetserver.operations.StartOperation.doCommit(StartOperation.java:151) at weblogic.deploy.internal.targetserver.operations.AbstractOperation.commit(AbstractOperation.java:348) at weblogic.deploy.internal.targetserver.DeploymentManager.handleDeploymentCommit(DeploymentManager.java:907) at weblogic.deploy.internal.targetserver.DeploymentManager.activateDeploymentList(DeploymentManager.java:1468) at weblogic.deploy.internal.targetserver.DeploymentManager.handleCommit(DeploymentManager.java:459) at weblogic.deploy.internal.targetserver.DeploymentServiceDispatcher.commit(DeploymentServiceDispatcher.java:181) at weblogic.deploy.service.internal.targetserver.DeploymentReceiverCallbackDeliverer.doCommitCallback(DeploymentReceiverCallbackDeliverer.java:217) at weblogic.deploy.service.internal.targetserver.DeploymentReceiverCallbackDeliverer.access$100(DeploymentReceiverCallbackDeliverer.java:14) at weblogic.deploy.service.internal.targetserver.DeploymentReceiverCallbackDeliverer$2.run(DeploymentReceiverCallbackDeliverer.java:69) at weblogic.work.SelfTuningWorkManagerImpl$WorkAdapterImpl.run(SelfTuningWorkManagerImpl.java:678) at weblogic.invocation.ComponentInvocationContextManager._runAs(ComponentInvocationContextManager.java:352) at weblogic.invocation.ComponentInvocationContextManager.runAs(ComponentInvocationContextManager.java:337) at weblogic.work.LivePartitionUtility.doRunWorkUnderContext(LivePartitionUtility.java:57) at weblogic.work.PartitionUtility.runWorkUnderContext(PartitionUtility.java:41) at weblogic.work.SelfTuningWorkManagerImpl.runWorkUnderContext(SelfTuningWorkManagerImpl.java:652) at weblogic.work.ExecuteThread.execute(ExecuteThread.java:420) at weblogic.work.ExecuteThread.run(ExecuteThread.java:360) Caused By: java.lang.NullPointerException at weblogic.wsee.jaxws.ServiceRefProcessorImpl.colocated(ServiceRefProcessorImpl.java:508) at weblogic.wsee.jaxws.ServiceRefProcessorImpl.getWsdlURL(ServiceRefProcessorImpl.java:338) at weblogic.wsee.jaxws.ServiceRefProcessorImpl.createService(ServiceRefProcessorImpl.java:262) at weblogic.wsee.jaxws.ServiceRefProcessorImpl.createTargetRef(ServiceRefProcessorImpl.java:130) at weblogic.wsee.jaxws.ServiceRefProcessorImpl.bindServiceRef(ServiceRefProcessorImpl.java:420) at weblogic.application.naming.EnvironmentBuilder.bindServiceRef(EnvironmentBuilder.java:1179) at weblogic.application.naming.EnvironmentBuilder.bindServiceReferences(EnvironmentBuilder.java:1146) at weblogic.application.naming.EnvironmentBuilder.bindServiceReferences(EnvironmentBuilder.java:1518) at weblogic.application.naming.EnvironmentBuilder.bindEnvEntriesFromDDs(EnvironmentBuilder.java:2118) at weblogic.application.naming.EnvironmentBuilder.bindEnvEntriesFromDDs(EnvironmentBuilder.java:2089) at weblogic.application.internal.flow.EnvContextFlow.activate(EnvContextFlow.java:79) at weblogic.application.internal.BaseDeployment$2.next(BaseDeployment.java:752) at weblogic.application.utils.StateMachineDriver.nextState(StateMachineDriver.java:45) at weblogic.application.internal.BaseDeployment.activate(BaseDeployment.java:262) at weblogic.application.internal.EarDeployment.activate(EarDeployment.java:66) at weblogic.application.internal.DeploymentStateChecker.activate(DeploymentStateChecker.java:165) at weblogic.deploy.internal.targetserver.AppContainerInvoker.activate(AppContainerInvoker.java:90) at weblogic.deploy.internal.targetserver.operations.AbstractOperation.activate(AbstractOperation.java:631) at weblogic.deploy.internal.targetserver.operations.ActivateOperation.activateDeployment(ActivateOperation.java:171) at weblogic.deploy.internal.targetserver.operations.ActivateOperation.doCommit(ActivateOperation.java:121) at weblogic.deploy.internal.targetserver.operations.StartOperation.doCommit(StartOperation.java:151) at weblogic.deploy.internal.targetserver.operations.AbstractOperation.commit(AbstractOperation.java:348) at weblogic.deploy.internal.targetserver.DeploymentManager.handleDeploymentCommit(DeploymentManager.java:907) at weblogic.deploy.internal.targetserver.DeploymentManager.activateDeploymentList(DeploymentManager.java:1468) at weblogic.deploy.internal.targetserver.DeploymentManager.handleCommit(DeploymentManager.java:459) at weblogic.deploy.internal.targetserver.DeploymentServiceDispatcher.commit(DeploymentServiceDispatcher.java:181) at weblogic.deploy.service.internal.targetserver.DeploymentReceiverCallbackDeliverer.doCommitCallback(DeploymentReceiverCallbackDeliverer.java:217) at weblogic.deploy.service.internal.targetserver.DeploymentReceiverCallbackDeliverer.access$100(DeploymentReceiverCallbackDeliverer.java:14) at weblogic.deploy.service.internal.targetserver.DeploymentReceiverCallbackDeliverer$2.run(DeploymentReceiverCallbackDeliverer.java:69) at weblogic.work.SelfTuningWorkManagerImpl$WorkAdapterImpl.run(SelfTuningWorkManagerImpl.java:678) at weblogic.invocation.ComponentInvocationContextManager._runAs(ComponentInvocationContextManager.java:352) at weblogic.invocation.ComponentInvocationContextManager.runAs(ComponentInvocationContextManager.java:337) at weblogic.work.LivePartitionUtility.doRunWorkUnderContext(LivePartitionUtility.java:57) at weblogic.work.PartitionUtility.runWorkUnderContext(PartitionUtility.java:41) at weblogic.work.SelfTuningWorkManagerImpl.runWorkUnderContext(SelfTuningWorkManagerImpl.java:652) at weblogic.work.ExecuteThread.execute(ExecuteThread.java:420) at weblogic.work.ExecuteThread.run(ExecuteThread.java:360)

App can't run from eclipse which created by 'jhipster --skipClient'

java.lang.IllegalStateException: Error processing condition on org.springframework.boot.autoconfigure.context.MessageSourceAutoConfiguration
at org.springframework.boot.autoconfigure.condition.SpringBootCondition.matches(SpringBootCondition.java:64)
at org.springframework.context.annotation.ConditionEvaluator.shouldSkip(ConditionEvaluator.java:108)
at org.springframework.context.annotation.ConfigurationClassBeanDefinitionReader$TrackedConditionEvaluator.shouldSkip(ConfigurationClassBeanDefinitionReader.java:441)
at org.springframework.context.annotation.ConfigurationClassBeanDefinitionReader.loadBeanDefinitionsForConfigurationClass(ConfigurationClassBeanDefinitionReader.java:129)
at org.springframework.context.annotation.ConfigurationClassBeanDefinitionReader.loadBeanDefinitions(ConfigurationClassBeanDefinitionReader.java:118)
at org.springframework.context.annotation.ConfigurationClassPostProcessor.processConfigBeanDefinitions(ConfigurationClassPostProcessor.java:328)
at org.springframework.context.annotation.ConfigurationClassPostProcessor.postProcessBeanDefinitionRegistry(ConfigurationClassPostProcessor.java:233)
at org.springframework.context.support.PostProcessorRegistrationDelegate.invokeBeanDefinitionRegistryPostProcessors(PostProcessorRegistrationDelegate.java:271)
at org.springframework.context.support.PostProcessorRegistrationDelegate.invokeBeanFactoryPostProcessors(PostProcessorRegistrationDelegate.java:91)
at org.springframework.context.support.AbstractApplicationContext.invokeBeanFactoryPostProcessors(AbstractApplicationContext.java:692)
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:530)
at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:754)
at org.springframework.boot.SpringApplication.refreshContext(SpringApplication.java:386)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:307)
at com.einfochips.demo.TestApp.main(TestApp.java:63)
Caused by: java.lang.IllegalStateException: Failed to introspect Class [org.springframework.security.config.annotation.web.configuration.WebSecurityConfiguration] from ClassLoader [sun.misc.Launcher$AppClassLoader#659e0bfd]

Elasticsearch cluster: java.lang.IllegalStateException: Future got interrupted

After a restart, our Elasticsearch cluster is showing this in the logs:
[2018-08-30T14:13:15,111][ERROR][o.e.x.m.c.c.ClusterStatsCollector] [myelastic01] collector [cluster_stats] failed to collect data
java.lang.IllegalStateException: Future got interrupted
at org.elasticsearch.common.util.concurrent.FutureUtils.get(FutureUtils.java:53) ~[elasticsearch-6.3.2.jar:6.3.2]
at org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:34) ~[elasticsearch-6.3.2.jar:6.3.2]
(...)
Caused by: java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998) ~[?:1.8.0_181]
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) ~[?:1.8.0_181]
(...)
[2018-08-30T14:15:22,972][ERROR][o.e.x.m.c.i.IndexStatsCollector] [myelastic01] collector [index-stats] failed to collect data
org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];
at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:166) ~[elasticsearch-6.3.2.jar:6.3.2]
at org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.checkGlobalBlock(TransportIndicesStatsAction.java:68) ~[elasticsearch-6.3.2.jar:6.3.2]
at org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.checkGlobalBlock(TransportIndicesStatsAction.java:45) ~[elasticsearch-6.3.2.jar:6.3.2]
at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction.<init>(TransportBroadcastByNodeAction.java:255) ~[elasticsearch-6.3.2.j
ar:6.3.2]
[2018-08-30T14:18:50,307][WARN ][r.suppressed ] path: /.kibana/_search, params: {ignore_unavailable=true, index=.kibana, filter_path=aggregations.types.buckets}
org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed
at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:288) ~[elasticsearch-6.3.2.jar:6.3.2]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:128) ~[elasticsearch-6.3.2.jar:6.3.2]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:249) ~[elasticsearch-6.3.2.jar:6.3.2]
(...)
[2018-08-30T14:18:50,304][WARN ][r.suppressed ] path: /.kibana/doc/config%3A6.3.2, params: {index=.kibana, id=config:6.3.2, type=doc}
org.elasticsearch.action.NoShardAvailableActionException: No shard available for [get [.kibana][doc][config:6.3.2]: routing [null]]
at org.elasticsearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction.perform(TransportSingleShardAction.java:207) ~[elasticsearch-6.3.2.jar:
6.3.2]
at org.elasticsearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction.start(TransportSingleShardAction.java:186) ~[elasticsearch-6.3.2.jar:6.
3.2]
at org.elasticsearch.action.support.single.shard.TransportSingleShardAction.doExecute(TransportSingleShardAction.java:95) ~[elasticsearch-6.3.2.jar:6.3.2]
(...)
This is a ELK stack which was working before, the configuration has not been modified. What could be the cause? I've searched around but found nothing relevant.

Titan JAVA API connection error?

I want to load data into titan ,which's backend is hbase.So I changed the example from https://github.com/thinkaurelius/titan/blob/titan10/titan-core/src/main/java/com/thinkaurelius/titan/example/GraphOfTheGodsFactory.java
a little.But it ocurred many Exceptions when i connect.
Exception in thread "main" com.thinkaurelius.titan.core.TitanException: Could not open global configuration
at com.thinkaurelius.titan.diskstorage.Backend.getStandaloneGlobalConfiguration(Backend.java:451)
at com.thinkaurelius.titan.graphdb.configuration.GraphDatabaseConfiguration.(GraphDatabaseConfiguration.java:1322)
at com.thinkaurelius.titan.core.TitanFactory.open(TitanFactory.java:94)
at com.thinkaurelius.titan.core.TitanFactory.open(TitanFactory.java:84)
at com.thinkaurelius.titan.core.TitanFactory$Builder.open(TitanFactory.java:139)
at titantest.TitanLoad.main(TitanLoad.java:106)
Caused by: com.thinkaurelius.titan.diskstorage.TemporaryBackendException: Temporary failure in storage backend
at com.thinkaurelius.titan.diskstorage.hbase.HBaseStoreManager.ensureTableExists(HBaseStoreManager.java:759)
at com.thinkaurelius.titan.diskstorage.hbase.HBaseStoreManager.ensureColumnFamilyExists(HBaseStoreManager.java:831)
at com.thinkaurelius.titan.diskstorage.hbase.HBaseStoreManager.openDatabase(HBaseStoreManager.java:456)
at com.thinkaurelius.titan.diskstorage.keycolumnvalue.KeyColumnValueStoreManager.openDatabase(KeyColumnValueStoreManager.java:29)
at com.thinkaurelius.titan.diskstorage.Backend.getStandaloneGlobalConfiguration(Backend.java:449)
... 5 more
Caused by: org.apache.hadoop.hbase.DoNotRetryIOException: java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.()V from class org.apache.hadoop.hbase.zookeeper.MetaTableLocator
at org.apache.hadoop.hbase.client.RpcRetryingCaller.translateException(RpcRetryingCaller.java:229)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:202)
at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:299)
at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:278)
at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:140)
at org.apache.hadoop.hbase.client.ClientScanner.(ClientScanner.java:135)
at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:845)
at org.apache.hadoop.hbase.MetaTableAccessor.fullScan(MetaTableAccessor.java:600)
at org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccessor.java:364)
at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:281)
at com.thinkaurelius.titan.diskstorage.hbase.HBaseAdmin1_0.tableExists(HBaseAdmin1_0.java:70)
at com.thinkaurelius.titan.diskstorage.hbase.HBaseStoreManager.ensureTableExists(HBaseStoreManager.java:753)
... 9 more
Caused by: java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.()V from class org.apache.hadoop.hbase.zookeeper.MetaTableLocator
at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:434)
at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getMetaRegionLocation(ZooKeeperRegistry.java:60)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateMeta(ConnectionManager.java:1139)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1106)
at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:293)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:147)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:56)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
... 19 more
Here is the code:
TitanGraph g = TitanFactory.build().set("storage.backend","hbase").set("storage.hostname","192.168.200.121,192.168.200.115")
.set("storage.hbase.table","mytitan").set("storage.hbase.ext.zookeeper.znode.parent","/hbase-unsecure").set("storage.port",2181).open();
BTW,I can connect to the Hbase cluster,it looks fine.

:INIT_FAILURE, Fail to create InputInitializerManager

I am using EMR services from Amazon Web Services and am attempting to run a count query on an external table that I've built. The data for the table is stored in mongodb and the table is an external table in Hive. The query I'm trying to run is
select user_id, count (*) from myTable group by user_id;
I can query select * from myTable but I cannot do any other queries. When I try to I get this error:
----------------------------------------------------------------------------------------------
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
----------------------------------------------------------------------------------------------
Map 1 container FAILED -1 0 0 -1 0 0
Reducer 2 container KILLED 1 0 0 1 0 0
----------------------------------------------------------------------------------------------
VERTICES: 00/02 [>>--------------------------] 0% ELAPSED TIME: 0.03 s
----------------------------------------------------------------------------------------------
Status: Failed
Vertex failed, vertexName=Map 1, vertexId=vertex_1476467351971_0008_2_00, diagnostics=[Vertex vertex_1476467351971_0008_2_00 [Map 1] killed/failed due to:INIT_FAILURE, Fail to create InputInitializerManager, org.apache.tez.dag.api.TezReflectionException: Unable to instantiate class with 1 arguments: org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator
at org.apache.tez.common.ReflectionUtils.getNewInstance(ReflectionUtils.java:70)
at org.apache.tez.common.ReflectionUtils.createClazzInstance(ReflectionUtils.java:89)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$1.run(RootInputInitializerManager.java:151)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$1.run(RootInputInitializerManager.java:148)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.dag.app.dag.RootInputInitializerManager.createInitializer(RootInputInitializerManager.java:148)
at org.apache.tez.dag.app.dag.RootInputInitializerManager.runInputInitializers(RootInputInitializerManager.java:121)
at org.apache.tez.dag.app.dag.impl.VertexImpl.setupInputInitializerManager(VertexImpl.java:3986)
at org.apache.tez.dag.app.dag.impl.VertexImpl.access$3100(VertexImpl.java:204)
at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.handleInitEvent(VertexImpl.java:2818)
at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.transition(VertexImpl.java:2765)
at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.transition(VertexImpl.java:2747)
at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:59)
at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1888)
at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:203)
at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2242)
at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2228)
at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.tez.common.ReflectionUtils.getNewInstance(ReflectionUtils.java:68)
... 25 more
Caused by: java.lang.RuntimeException: Failed to load plan: hdfs://ip-172-31-33-88.ec2.internal:8020/tmp/hive/hadoop/8c1eca9a-84ba-4d79-b39c-e633f1c6a646/hive_2016-10-14_18-57-38_728_4862537458701624850-1/hadoop/_tez_scratch_dir/157c0e27-0bfa-4acf-b0e9-b2fed595a8de/map.xml: org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: com.mongodb.hadoop.hive.input.HiveMongoInputFormat
Serialization trace:
inputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc)
aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork)
at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:451)
at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:298)
at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.<init>(HiveSplitGenerator.java:131)
... 30 more
Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: com.mongodb.hadoop.hive.input.HiveMongoInputFormat
Serialization trace:
inputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc)
aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork)
at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:156)
at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:670)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClass(SerializationUtilities.java:180)
at org.apache.hive.com.esotericsoftware.kryo.serializers.DefaultSerializers$ClassSerializer.read(DefaultSerializers.java:326)
at org.apache.hive.com.esotericsoftware.kryo.serializers.DefaultSerializers$ClassSerializer.read(DefaultSerializers.java:314)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:759)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObjectOrNull(SerializationUtilities.java:198)
at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:132)
at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:175)
at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:161)
at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:39)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:213)
at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:686)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:205)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeObjectByKryo(SerializationUtilities.java:583)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializePlan(SerializationUtilities.java:492)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializePlan(SerializationUtilities.java:469)
at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:411)
... 32 more
Caused by: java.lang.ClassNotFoundException: com.mongodb.hadoop.hive.input.HiveMongoInputFormat
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:154)
... 55 more
]
Vertex killed, vertexName=Reducer 2, vertexId=vertex_1476467351971_0008_2_01, diagnostics=[Vertex received Kill in NEW state., Vertex vertex_1476467351971_0008_2_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1476467351971_0008_2_00, diagnostics=[Vertex vertex_1476467351971_0008_2_00 [Map 1] killed/failed due to:INIT_FAILURE, Fail to create InputInitializerManager, org.apache.tez.dag.api.TezReflectionException: Unable to instantiate class with 1 arguments: org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator
at org.apache.tez.common.ReflectionUtils.getNewInstance(ReflectionUtils.java:70)
at org.apache.tez.common.ReflectionUtils.createClazzInstance(ReflectionUtils.java:89)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$1.run(RootInputInitializerManager.java:151)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$1.run(RootInputInitializerManager.java:148)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.dag.app.dag.RootInputInitializerManager.createInitializer(RootInputInitializerManager.java:148)
at org.apache.tez.dag.app.dag.RootInputInitializerManager.runInputInitializers(RootInputInitializerManager.java:121)
at org.apache.tez.dag.app.dag.impl.VertexImpl.setupInputInitializerManager(VertexImpl.java:3986)
at org.apache.tez.dag.app.dag.impl.VertexImpl.access$3100(VertexImpl.java:204)
at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.handleInitEvent(VertexImpl.java:2818)
at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.transition(VertexImpl.java:2765)
at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.transition(VertexImpl.java:2747)
at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:59)
at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1888)
at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:203)
at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2242)
at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2228)
at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.tez.common.ReflectionUtils.getNewInstance(ReflectionUtils.java:68)
... 25 more
Caused by: java.lang.RuntimeException: Failed to load plan: hdfs://ip-172-31-33-88.ec2.internal:8020/tmp/hive/hadoop/8c1eca9a-84ba-4d79-b39c-e633f1c6a646/hive_2016-10-14_18-57-38_728_4862537458701624850-1/hadoop/_tez_scratch_dir/157c0e27-0bfa-4acf-b0e9-b2fed595a8de/map.xml: org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: com.mongodb.hadoop.hive.input.HiveMongoInputFormat
Serialization trace:
inputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc)
aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork)
at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:451)
at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:298)
at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.<init>(HiveSplitGenerator.java:131)
... 30 more
Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: com.mongodb.hadoop.hive.input.HiveMongoInputFormat
Serialization trace:
inputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc)
aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork)
at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:156)
at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:670)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClass(SerializationUtilities.java:180)
at org.apache.hive.com.esotericsoftware.kryo.serializers.DefaultSerializers$ClassSerializer.read(DefaultSerializers.java:326)
at org.apache.hive.com.esotericsoftware.kryo.serializers.DefaultSerializers$ClassSerializer.read(DefaultSerializers.java:314)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:759)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObjectOrNull(SerializationUtilities.java:198)
at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:132)
at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:175)
at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:161)
at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:39)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:213)
at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:686)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:205)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeObjectByKryo(SerializationUtilities.java:583)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializePlan(SerializationUtilities.java:492)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializePlan(SerializationUtilities.java:469)
at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:411)
... 32 more
Caused by: java.lang.ClassNotFoundException: com.mongodb.hadoop.hive.input.HiveMongoInputFormat
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:154)
... 55 more
]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1476467351971_0008_2_01, diagnostics=[Vertex received Kill in NEW state., Vertex vertex_1476467351971_0008_2_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
hive>
I am using a single node EMR cluster from AWS with the "Core Hadoop" configuration, but I had the same errors using a three node cluster. I have already added the Mongo Hadoop Core, Mongo Hadoop Hive, Mongo Java driver to the master node.. These are the jars I'm using:
mongo-hadoop-hive-2.0.1.jar
mongo-java-driver-3.3.0.jar
remotecontent?filepath=org%2Fmongodb%2Fmongo-hadoop%2Fmongo-hadoop-core%2F2.0.1%2Fmongo-hadoop-core-2.0.1.jar
These jars were downloaded from Maven.
A similar question had previously been asked on StackOverflow and was resolved by adding the src for Apache Tez 0.8.4 and building it. Apache Tez 0.8.4 is already on the cluster though, at least according to the EMR configuration we chose.
We are using Hadoop 2.7.2 and Hive 2.1.0.