Segment cache is not getting created in Druid

I am submitting an ingestion task in Druid. The task completes successfully and the segments are created in HDFS.
Previously, with the same config, the segment cache was getting updated.
However, the Druid segment cache is not getting updated with the segments of the new datasource.
I have checked and found that all the Druid services are up and running.
Below is the exception that is being logged:
io.druid.query.lookup.LookupReferencesManager.start()] on object[io.druid.query.lookup.LookupReferencesManager@7336fd8f].
INFO [main] io.druid.query.lookup.LookupReferencesManager - LookupReferencesManager is starting.
ERROR [main] io.druid.curator.discovery.ServerDiscoverySelector - No server instance found for [druid/coordinator]
INFO [NodeTypeWatcher[coordinator]] io.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider$NodeTypeWatcher - Received INITIALIZED in node watcher for type [coordinator].
WARN [main] io.druid.java.util.common.RetryUtils - Failed on try 1, retrying in 1,481ms.
io.druid.java.util.common.IOE: No known server
at io.druid.discovery.DruidLeaderClient.getCurrentKnownLeader(DruidLeaderClient.java:276) ~[druid-server-0.12.2.jar:0.12.2]
at io.druid.discovery.DruidLeaderClient.makeRequest(DruidLeaderClient.java:128) ~[druid-server-0.12.2.jar:0.12.2]
at io.druid.query.lookup.LookupReferencesManager.fetchLookupsForTier(LookupReferencesManager.java:569) ~[druid-server-0.12.2.jar:0.12.2]
at io.druid.query.lookup.LookupReferencesManager.tryGetLookupListFromCoordinator(LookupReferencesManager.java:420) ~[druid-server-0.12.2.jar:0.12.2]
at io.druid.query.lookup.LookupReferencesManager.lambda$getLookupListFromCoordinator$4(LookupReferencesManager.java:398) ~[druid-server-0.12.2.jar:0.12.2]
at io.druid.java.util.common.RetryUtils.retry(RetryUtils.java:63) [java-util-0.12.2.jar:0.12.2]
at io.druid.java.util.common.RetryUtils.retry(RetryUtils.java:81) [java-util-0.12.2.jar:0.12.2]
at io.druid.query.lookup.LookupReferencesManager.getLookupListFromCoordinator(LookupReferencesManager.java:388) [druid-server-0.12.2.jar:0.12.2]
at io.druid.query.lookup.LookupReferencesManager.getLookupsList(LookupReferencesManager.java:365) [druid-server-0.12.2.jar:0.12.2]
at io.druid.query.lookup.LookupReferencesManager.loadAllLookupsAndInitStateRef(LookupReferencesManager.java:348) [druid-server-0.12.2.jar:0.12.2]
at io.druid.query.lookup.LookupReferencesManager.start(LookupReferencesManager.java:153) [druid-server-0.12.2.jar:0.12.2]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_162]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_162]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_162]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_162]
at io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler.start(Lifecycle.java:413) [java-util-0.12.2.jar:0.12.2]
at io.druid.java.util.common.lifecycle.Lifecycle.start(Lifecycle.java:311) [java-util-0.12.2.jar:0.12.2]
at io.druid.guice.LifecycleModule$2.start(LifecycleModule.java:134) [druid-api-0.12.2.jar:0.12.2]
at io.druid.cli.GuiceRunnable.initLifecycle(GuiceRunnable.java:101) [druid-services-0.12.2.jar:0.12.2]
at io.druid.cli.CliPeon.run(CliPeon.java:301) [druid-services-0.12.2.jar:0.12.2]
at io.druid.cli.Main.main(Main.java:116) [druid-services-0.12.2.jar:0.12.2]
ERROR [main] io.druid.curator.discovery.ServerDiscoverySelector - No server instance found for [druid/coordinator]

You can check whether the following properties are configured:
hive.druid.broker.address.default : MyIP:8082
hive.druid.coordinator.address.default : MyIP:8081
hive.druid.http.numConnection: 20
hive.druid.http.read.timeout: PT10M
hive.druid.indexer.memory.rownum.max: 75000
hive.druid.indexer.partition.size.max: 1000000
hive.druid.indexer.segments.granularity: DAY
hive.druid.metadata.base: druid
hive.druid.metadata.db.type: mysql
hive.druid.metadata.password: druid
hive.druid.metadata.uri: jdbc:mysql://MyIP:3306/druid
hive.druid.metadata.username: druid
hive.druid.storage.storageDirectory: /apps/hive/warehouse
hive.druid.working.directory: /tmp/druid-indexing
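Separately from the Hive-side properties, the repeated "No server instance found for [druid/coordinator]" errors show that the peon cannot discover the coordinator. A quick sanity check along these lines may help (a sketch only; MyIP, the ports, and the ZooKeeper discovery path are assumptions based on the defaults and the properties above):
# The coordinator should answer on its status endpoint if it is really up.
curl http://MyIP:8081/status
# With the default discovery path, the coordinator should have registered
# itself in ZooKeeper under /druid/internal-discovery (assumes a ZK CLI is available).
zkCli.sh -server MyIP:2181 ls /druid/internal-discovery/coordinator
If the status endpoint responds but the ZooKeeper entry is missing, the discovery announcement is the problem rather than the coordinator process itself.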

Related

Telemetry data unable to pass through root rule chain at the 'save timeseries' node

ThingsBoard Version: v3.4.1 CE
OS: Windows
Database: PostgreSQL (TimescaleDB)
Queue: RabbitMQ
I discovered that telemetry data is unable to pass through the ThingsBoard root rule chain at the node named 'save timeseries'. I am not sure what is happening; I have confirmed there should be no problem with the connection between ThingsBoard and PostgreSQL...
From the debug output I can see that the problem is a failure to save the timeseries data:
2022-11-02 09:04:27,148 [sql-queue-2-ts timescale-11-thread-1] ERROR o.t.s.dao.sql.TbSqlBlockingQueue - [TS Timescale] Failed to save 2 entities
org.springframework.transaction.TransactionSystemException: Could not roll back JPA transaction; nested exception is org.hibernate.TransactionException: Unable to rollback against JDBC Connection
at org.springframework.orm.jpa.JpaTransactionManager.doRollback(JpaTransactionManager.java:593)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.processRollback(AbstractPlatformTransactionManager.java:835)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.rollback(AbstractPlatformTransactionManager.java:809)
at org.springframework.transaction.interceptor.TransactionAspectSupport.completeTransactionAfterThrowing(TransactionAspectSupport.java:672)
at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:392)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:119)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:763)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:708)
at org.thingsboard.server.dao.sqlts.insert.timescale.TimescaleInsertTsRepository$$EnhancerBySpringCGLIB$$693764a7.saveOrUpdate()
at org.thingsboard.server.dao.sqlts.timescale.TimescaleTimeseriesDao.lambda$init$1(TimescaleTimeseriesDao.java:89)
at org.thingsboard.server.dao.sql.TbSqlBlockingQueue.lambda$init$2(TbSqlBlockingQueue.java:74)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.hibernate.TransactionException: Unable to rollback against JDBC Connection
at org.hibernate.resource.jdbc.internal.AbstractLogicalConnectionImplementor.rollback(AbstractLogicalConnectionImplementor.java:127)
at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl$TransactionDriverControlImpl.rollback(JdbcResourceLocalTransactionCoordinatorImpl.java:304)
at org.hibernate.engine.transaction.internal.TransactionImpl.rollback(TransactionImpl.java:142)
at org.springframework.orm.jpa.JpaTransactionManager.doRollback(JpaTransactionManager.java:589)
... 16 common frames omitted
Caused by: java.sql.SQLException: Connection is closed
at com.zaxxer.hikari.pool.ProxyConnection$ClosedConnection.lambda$getClosedConnection$0(ProxyConnection.java:515)
at com.sun.proxy.$Proxy153.rollback(Unknown Source)
at com.zaxxer.hikari.pool.ProxyConnection.rollback(ProxyConnection.java:396)
at com.zaxxer.hikari.pool.HikariProxyConnection.rollback(HikariProxyConnection.java)
at org.hibernate.resource.jdbc.internal.AbstractLogicalConnectionImplementor.rollback(AbstractLogicalConnectionImplementor.java:121)
... 19 common frames omitted
2022-11-02 09:04:27,148 [tb-rule-engine-consumer-37-thread-35 | QK(Main,TB_RULE_ENGINE,system)-10] INFO o.t.s.s.q.DefaultTbRuleEngineConsumerService - Failed to process 1 messages
2022-11-02 09:04:27,148 [tb-rule-engine-consumer-37-thread-35 | QK(Main,TB_RULE_ENGINE,system)-10] INFO o.t.s.s.q.DefaultTbRuleEngineConsumerService - [c1737420-58eb-11eb-808a-dfdc947dc52b] Failed to process message: TbMsg(queueName=Main, id=1318aa98-0755-49b0-9685-a71a2326ff7d, ts=1667351067141, type=POST_TELEMETRY_REQUEST, originator=354d8300-aa84-11ec-9a47-4727b3504d5d, customerId=d7094170-5c4c-11eb-b06a-c93fc5e45132, metaData=TbMsgMetaData(data={deviceType=Sensor, deviceName=RMS Voltage Sensor, ts=1667351067141}), dataType=JSON, data={"timestamp":1667351069011,"values":[{"id":"CnB Prai Gateway.RMS Shearline.Sensor5_Active","v":true,"t":1667291491472},{"id":"CnB Prai Gateway.RMS Shearline.Sensor5_Battery","v":296,"t":1667342191745},{"id":"CnB Prai Gateway.RMS Shearline.Sensor5_Signal","v":65478,"t":1667350910940},{"id":"CnB Prai Gateway.RMS Shearline.Sensor5_Voltage","v":0,"t":1667351068948}]}, ruleChainId=c1c082b0-58eb-11eb-808a-dfdc947dc52b, ruleNodeId=null, ctx=org.thingsboard.server.common.msg.TbMsgProcessingCtx@4c99aecc, callback=org.thingsboard.server.common.msg.queue.TbMsgCallback$1@415dca17), Last Rule Node: [RuleChain: Root Rule Chain|RuleNode: Save raw telemetry(71b87e70-177d-11ec-9530-3197ec48e7c5)]
I would suggest using the generator node to test where the problem is.
First, test whether you can save a basic message (like the one you get when you open the generator node). This will confirm that you can save data to the database.
After that, configure the generator node to act as your device, with the same data and metadata you would get from your device/integration.
Reach back out here with your findings.
Generator rule node ref: https://thingsboard.io/docs/user-guide/rule-engine-2-0/action-nodes/#generator-node
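As a complementary check (not part of the suggestion above), you can push a minimal telemetry message through the HTTP device API and watch whether the same save error appears in the log. This is a sketch; the host, port, and $ACCESS_TOKEN are placeholders for your installation:
# Post one key-value pair as telemetry for the device that owns $ACCESS_TOKEN.
curl -v -X POST http://localhost:8080/api/v1/$ACCESS_TOKEN/telemetry \
     -H "Content-Type: application/json" \
     -d '{"temperature": 42}'
Since the underlying exception is HikariCP reporting "Connection is closed", it is also worth checking the PostgreSQL/TimescaleDB logs for dropped connections around the same timestamps.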

I'm getting an XA_END error while trying to send a message from my banking application (T24) to WMQ through my application server (JBoss)

I'm setting up the banking application I use (T24) to send and receive messages from IBM MQ. I'm getting errors when I connect my application server (JBoss 7) to MQ.
I've tried altering the MDBs, but to no avail.
This is a snippet from my log file:
2019-03-22 17:00:17,269 INFO [org.jboss.as.connector.deployers.RaXmlDeployer] (default-threads - 13) wmq.jmsra.rar: MQJCA4026:Transaction backed out with reason: 'The method 'xa_end' has failed with errorCode '100'.'.
2019-03-22 17:00:17,269 WARN [org.jboss.jca.core.connectionmanager.pool.strategy.OnePool] (default-threads - 24) IJ000612: Destroying connection that could not be successfully matched: org.jboss.jca.core.connectionmanager.listener.TxConnectionListener@213a8b6c[state=NORMAL managed connection=com.ibm.mq.connector.outbound.ManagedConnectionImpl@3db98551 connection handles=0 lastReturned=1553274017260 lastValidated=1553274017260 lastCheckedOut=1553274017260 trackByTx=false pool=org.jboss.jca.core.connectionmanager.pool.strategy.OnePool@2185ef44 mcp=SemaphoreConcurrentLinkedQueueManagedConnectionPool@52753b07[pool=WMQ Connection Pool] xaResource=XAResourceWrapperImpl@518c200e[xaResource=com.ibm.mq.connector.xa.XARWrapper@67583e6a pad=false overrideRmValue=null productName=IBM MQ productVersion=@(#) MQMBID sn=p910-L180709.TRIAL su=_QGElQYNUEeidSaRkJ_p2Kg pn=com.ibm.mq.connector/src/com/ibm/mq/connector/outbound/ManagedConnectionMetaDataImpl.java jndiName=java:jboss/jms/MQConnectionFactory] txSync=null]
2019-03-22 17:00:17,269 WARN [com.arjuna.ats.jta] (default-threads - 13) ARJUNA016045: attempted rollback of < formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffffc0a81701:-783f63b7:5c951441:40a3, node_name=1, branch_uid=0:ffffc0a81701:-783f63b7:5c951441:40b4, subordinatenodename=null, eis_name=java:jboss/jms/MQConnectionFactory > (XAResourceWrapperImpl@44eb1bdd[xaResource=com.ibm.mq.connector.xa.XARWrapper@c57055e pad=false overrideRmValue=null productName=IBM MQ productVersion=@(#) MQMBID sn=p910-L180709.TRIAL su=_QGElQYNUEeidSaRkJ_p2Kg pn=com.ibm.mq.connector/src/com/ibm/mq/connector/outbound/ManagedConnectionMetaDataImpl.java jndiName=java:jboss/jms/MQConnectionFactory]) failed with exception code XAException.XAER_NOTA: javax.transaction.xa.XAException: The method 'xa_rollback' has failed with errorCode '-4'.
at com.ibm.mq.jmqi.JmqiXAResource.rollback(JmqiXAResource.java:874)
at com.ibm.mq.connector.xa.XARWrapper.rollback(XARWrapper.java:605)
at org.jboss.jca.core.tx.jbossts.XAResourceWrapperImpl.rollback(XAResourceWrapperImpl.java:196)
at com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord.topLevelAbort(XAResourceRecord.java:369)
at com.arjuna.ats.arjuna.coordinator.BasicAction.doAbort(BasicAction.java:2999)
at com.arjuna.ats.arjuna.coordinator.BasicAction.doAbort(BasicAction.java:2978)
at com.arjuna.ats.arjuna.coordinator.BasicAction.Abort(BasicAction.java:1658)
at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.cancel(TwoPhaseCoordinator.java:127)
at com.arjuna.ats.arjuna.AtomicAction.abort(AtomicAction.java:186)
at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.rollbackAndDisassociate(TransactionImple.java:1282)
at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.rollback(BaseTransaction.java:143)
at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.rollback(BaseTransactionManagerDelegate.java:134)
at org.jboss.as.ejb3.inflow.MessageEndpointInvocationHandler.afterDelivery(MessageEndpointInvocationHandler.java:69)
at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.jboss.as.ejb3.inflow.AbstractInvocationHandler.handle(AbstractInvocationHandler.java:60)
at org.jboss.as.ejb3.inflow.MessageEndpointInvocationHandler.doInvoke(MessageEndpointInvocationHandler.java:135)
at org.jboss.as.ejb3.inflow.AbstractInvocationHandler.invoke(AbstractInvocationHandler.java:73)
at com.temenos.tafj.mdb.TransactedMDB$$$endpoint1.afterDelivery(Unknown Source)
at com.ibm.mq.connector.inbound.AbstractWorkImpl.run(AbstractWorkImpl.java:343)
at org.jboss.jca.core.workmanager.WorkWrapper.run(WorkWrapper.java:223)
at org.jboss.threads.SimpleDirectExecutor.execute(SimpleDirectExecutor.java:33)
at org.jboss.threads.QueueExecutor.runTask(QueueExecutor.java:808)
at org.jboss.threads.QueueExecutor.access$100(QueueExecutor.java:45)
at org.jboss.threads.QueueExecutor$Worker.run(QueueExecutor.java:849)
at java.lang.Thread.run(Thread.java:748)
at org.jboss.threads.JBossThread.run(JBossThread.java:320)
That error can be safely ignored.
XAER_NOTA is a valid return from xa_rollback().
The XA specification indicates that "[a]n RM can also unilaterally roll back and forget a branch any time except after a successful prepare".
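If you want to confirm that nothing was actually left in doubt on the queue manager after such a rollback, a check along these lines can be run on the MQ server (a sketch; QM1 is a placeholder queue manager name):
# List externally coordinated (XA) in-doubt transactions on the queue manager.
dspmqtrn -e -m QM1
An empty result means the branch was already rolled back and forgotten, which is consistent with the XAER_NOTA return described above.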

How to configure the PostgreSQL database for deploying Alfresco on Tomcat 8?

I have built Alfresco (version 5.2) from source on Ubuntu 16.04. I want to deploy Alfresco on Tomcat 8. The deployment is successful; however, the PostgreSQL database is not getting configured as required. I have followed the steps given at http://docs.alfresco.com/5.1/tasks/postgresql-config.html
I observe the home page as shown in the attached image (alfresco_page).
Am I missing something here, given that the PostgreSQL database is not getting configured? Is there any other configuration that needs to be done that I have missed?
UPDATE
The alfresco.log gave me this:
2017-08-01 05:53:54,406 WARN [org.alfresco.web.scripts.servlet.X509ServletFilterBase] [localhost-startStop-1] clientAuth does not appear to be set for Tomcat. clientAuth must be set to 'want' for X509 Authentication
2017-08-01 05:53:54,416 WARN [org.alfresco.web.scripts.servlet.X509ServletFilterBase] [localhost-startStop-1] Attempting to set clientAuth=want through JMX...
2017-08-01 05:53:54,427 WARN [org.alfresco.web.scripts.servlet.X509ServletFilterBase] [localhost-startStop-1] Unable to set clientAuth=want through JMX.
2017-08-01 05:53:55,139 ERROR [org.apache.solr.core.CoreContainer] [coreLoadExecutor-5-thread-1] Error creating core [collection1]: Could not load conf for core collection1: Error loading solr config from solr/collection1/conf/solrconfig.xml
org.apache.solr.common.SolrException: Could not load conf for core collection1: Error loading solr config from solr/collection1/conf/solrconfig.xml
at org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:66)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:489)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:255)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:249)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.solr.common.SolrException: Error loading solr config from solr/collection1/conf/solrconfig.xml
at org.apache.solr.core.SolrConfig.readFromResourceLoader(SolrConfig.java:154)
at org.apache.solr.core.ConfigSetService.createSolrConfig(ConfigSetService.java:80)
at org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:61)
... 7 more
Caused by: java.io.IOException: Can't find resource 'solrconfig.xml' in classpath or '/root/tomcat85/output/build/webapps/solr/collection1/conf'
at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:362)
at org.apache.solr.core.SolrResourceLoader.openConfig(SolrResourceLoader.java:308)
at org.apache.solr.core.Config.<init>(Config.java:117)
at org.apache.solr.core.Config.<init>(Config.java:87)
at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:167)
at org.apache.solr.core.SolrConfig.readFromResourceLoader(SolrConfig.java:145)
... 9 more
2017-08-01 05:54:09,634 WARN [org.hibernate.cfg.SettingsFactory] [localhost-startStop-1] Could not obtain connection metadata
org.apache.commons.dbcp.SQLNestedException: Cannot create PoolableConnectionFactory (Connection refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.)
at org.apache.commons.dbcp.BasicDataSource.createPoolableConnectionFactory(BasicDataSource.java:1549)
at org.apache.commons.dbcp.BasicDataSource.createDataSource(BasicDataSource.java:1388)
at org.apache.commons.dbcp.BasicDataSource.getConnection(BasicDataSource.java:1044)
at org.springframework.orm.hibernate3.LocalDataSourceConnectionProvider.getConnection(LocalDataSourceConnectionProvider.java:83)
at org.hibernate.cfg.SettingsFactory.buildSettings(SettingsFactory.java:84)
at org.hibernate.cfg.Configuration.buildSettings(Configuration.java:2079)
at org.hibernate.cfg.Configuration.buildSessionFactory(Configuration.java:1304)
at org.springframework.orm.hibernate3.LocalSessionFactoryBean.newSessionFactory(LocalSessionFactoryBean.java:863)
at org.springframework.orm.hibernate3.LocalSessionFactoryBean.buildSessionFactory(LocalSessionFactoryBean.java:782)
at org.springframework.orm.hibernate3.AbstractSessionFactoryBean.afterPropertiesSet(AbstractSessionFactoryBean.java:188)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1573)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1511)
Things to check:
Is postgres running (ps -ef|grep postgres)?
Can you use psql to connect to postgres using the db.name, db.username, and db.password that are configured in alfresco-global.properties?
Did you follow the step in the docs about editing pg_hba.conf to make sure that postgres is configured to allow password based authentication?
Also, it is exceedingly rare to need to build Alfresco from source unless you are making changes to the low-level classes themselves, which is not recommended.
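For the second and third checks, a minimal session along these lines can verify connectivity and credentials (a sketch; the host, port, user, and database names are placeholders that should match db.url, db.username, and db.name in your alfresco-global.properties):
# Is the server running at all?
ps -ef | grep postgres
# Connect with the same credentials Alfresco uses; this fails with
# "Connection refused" if postgres is not listening on TCP/IP,
# matching the exception in alfresco.log above.
psql -h localhost -p 5432 -U alfresco -d alfresco
If the connection is refused, check that postgresql.conf has listen_addresses set appropriately and that pg_hba.conf contains a password-based (md5) entry for the host you are connecting from.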

HCatOutputFormat ClassNotFoundException

I have a MapReduce program in which I use HCatalog to read details from a Hive table 'A' with HCatInputFormat, process them, and write them back to Hive table 'B' using HCatOutputFormat.
I wrote the program in Eclipse, created a runnable 'Hadooptest' jar from the project, and I run the jar using the hadoop jar command on the Hadoop cluster (with the -libjars parameter).
When I create the runnable jar by extracting all referenced jars into the jar file and then execute it on the Hadoop cluster, the MapReduce job runs fine and finishes successfully.
The issue is, when I create the runnable jar using the 'copy required libraries into a sub-folder next to the generated JAR' option, then move both the jar and the referenced libraries to the Hadoop cluster and execute it, it shows:
"org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.mapreduce.HCatOutputFormat not found"
Below is the full YARN log:
2016-06-29 12:17:57,951 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1466834505106_0057_000002
2016-06-29 12:17:58,672 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-06-29 12:17:58,773 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
2016-06-29 12:17:58,773 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (appAttemptId { application_id { id: 57 cluster_timestamp: 1466834505106 } attemptId: 2 } keyId: 783034855)
2016-06-29 12:17:58,974 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Using mapred newApiCommitter.
2016-06-29 12:17:59,840 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config null
2016-06-29 12:17:59,920 INFO [main] org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.mapreduce.HCatOutputFormat not found
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.mapreduce.HCatOutputFormat not found
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:472)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:452)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1538)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:452)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:371)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1496)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1493)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1426)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.mapreduce.HCatOutputFormat not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
at org.apache.hadoop.mapreduce.task.JobContextImpl.getOutputFormatClass(JobContextImpl.java:222)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:468)
... 11 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.mapreduce.HCatOutputFormat not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
... 13 more
End of LogType:syslog
It is necessary for me to export the required libraries into a separate folder rather than extracting them into the jar itself.
Any help figuring this out would be appreciated.
All the jars already exist in your Hadoop cluster; you just need to pass their locations to your program. It's described in the official docs:
export HADOOP_HOME=<path_to_hadoop_install>
export HCAT_HOME=<path_to_hcat_install>
export HIVE_HOME=<path_to_hive_install>
export LIB_JARS=$HCAT_HOME/share/hcatalog/hcatalog-core-0.5.0.jar,\
$HIVE_HOME/lib/hive-metastore-0.10.0.jar,\
$HIVE_HOME/lib/libthrift-0.7.0.jar,\
$HIVE_HOME/lib/hive-exec-0.10.0.jar,\
$HIVE_HOME/lib/libfb303-0.7.0.jar,\
$HIVE_HOME/lib/jdo2-api-2.3-ec.jar,\
$HIVE_HOME/lib/slf4j-api-1.6.1.jar
export HADOOP_CLASSPATH=$HCAT_HOME/share/hcatalog/hcatalog-core-0.5.0.jar:\
$HIVE_HOME/lib/hive-metastore-0.10.0.jar:\
$HIVE_HOME/lib/libthrift-0.7.0.jar:\
$HIVE_HOME/lib/hive-exec-0.10.0.jar:\
$HIVE_HOME/lib/libfb303-0.7.0.jar:\
$HIVE_HOME/lib/jdo2-api-2.3-ec.jar:\
$HIVE_HOME/conf:$HADOOP_HOME/conf:\
$HIVE_HOME/lib/slf4j-api-1.6.1.jar
$HADOOP_HOME/bin/hadoop --config $HADOOP_HOME/conf jar <path_to_jar> \
<main_class> -libjars $LIB_JARS <program_arguments>
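Applied to your setup, where the referenced libraries sit in a sub-folder next to the generated jar, a sketch along these lines builds the same lists from that folder (the paths, folder name, and main class are placeholders):
# Assume Eclipse copied the dependencies into Hadooptest_lib/ next to the jar.
LIB_JARS=$(ls /path/to/Hadooptest_lib/*.jar | tr '\n' ',' | sed 's/,$//')
export HADOOP_CLASSPATH=$(echo /path/to/Hadooptest_lib/*.jar | tr ' ' ':')
# -libjars ships the jars to the cluster so the AM and tasks can load
# HCatOutputFormat; HADOOP_CLASSPATH makes them visible to the client JVM.
hadoop jar /path/to/Hadooptest.jar <main_class> -libjars "$LIB_JARS" <program_arguments>
Note that -libjars is only honoured when the driver parses its arguments through GenericOptionsParser (e.g. by running via ToolRunner); otherwise the option is passed to your program untouched.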

error writing to mongodb from pig

I'm trying to use the mongo-hadoop connector with Pig or streaming to load/store data from MongoDB. Using Pig, I have the following problem:
$cat process.pig
REGISTER /usr/hdp/2.2.4.2-2/hadoop/lib/mongo-java-driver-3.0.2.jar
REGISTER /usr/hdp/2.2.4.2-2/hadoop/lib/mongo-hadoop-core-1.4.0.jar
REGISTER /usr/hdp/2.2.4.2-2/hadoop/lib/mongo-hadoop-pig-1.4.0.jar
SET mapreduce.map.speculative false
SET mapreduce.reduce.speculative false
SET mapreduce.fileoutputcommitter.marksuccessfuljobs false
SET mongo.auth.uri 'mongodb://hadoop:password@127.0.0.1:27017/admin'
raw = LOAD 'mongodb://hadoop:password@127.0.0.1:27017/hadoop.collection'
USING com.mongodb.hadoop.pig.MongoLoader('id:chararray, t:chararray, c_s:map[]');
Writing the data into a BSON file with
STORE raw
INTO 'file:///tmp/pig_without_limit_bson'
USING com.mongodb.hadoop.pig.BSONStorage('id');
works, and I'm able to import the file with mongorestore.
Writing to MongoDB with
STORE raw
INTO 'mongodb://hadoop:password@127.0.0.1:27017/hadoop.out'
USING com.mongodb.hadoop.pig.MongoInsertStorage('id:chararray, t:chararray', 'id');
does not work and produces the following error:
Input(s):
Failed to read data from "mongodb://hadoop:password@127.0.0.1:27017/hadoop.collection"
Output(s):
Failed to produce result in "mongodb://hadoop:password@127.0.0.1:27017/hadoop.out"
$cat pig.log
Error: java.lang.IllegalStateException: state should be: open
at com.mongodb.assertions.Assertions.isTrue(Assertions.java:70)
at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:79)
at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:75)
at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:71)
at com.mongodb.binding.ClusterBinding.getWriteConnectionSource(ClusterBinding.java:68)
at com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:175)
at com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:141)
at com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:72)
at com.mongodb.Mongo.execute(Mongo.java:745)
at com.mongodb.Mongo$2.execute(Mongo.java:728)
at com.mongodb.DBCollection.executeBulkWriteOperation(DBCollection.java:1968)
at com.mongodb.DBCollection.executeBulkWriteOperation(DBCollection.java:1962)
at com.mongodb.BulkWriteOperation.execute(BulkWriteOperation.java:98)
at com.mongodb.hadoop.output.MongoOutputCommitter.commitTask(MongoOutputCommitter.java:133)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.commitTask(PigOutputCommitter.java:356)
at org.apache.hadoop.mapred.Task.commit(Task.java:1163)
at org.apache.hadoop.mapred.Task.done(Task.java:1025)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:345)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Pig Stack Trace
---------------
ERROR 0: java.io.IOException: No FileSystem for scheme: mongodb
org.apache.pig.backend.executionengine.ExecException: ERROR 0: java.io.IOException: No FileSystem for scheme: mongodb
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:535)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:280)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1390)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1375)
at org.apache.pig.PigServer.execute(PigServer.java:1364)
at org.apache.pig.PigServer.executeBatch(PigServer.java:415)
at org.apache.pig.PigServer.executeBatch(PigServer.java:398)
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:171)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:234)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:495)
at org.apache.pig.Main.main(Main.java:170)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.IOException: No FileSystem for scheme: mongodb
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2607)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2614)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2653)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2635)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.pig.StoreFunc.cleanupOnFailureImpl(StoreFunc.java:193)
at org.apache.pig.StoreFunc.cleanupOnFailure(StoreFunc.java:161)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:526)
... 18 more
However, when using the LIMIT operator (even with an enormously large limit), all documents are saved into MongoDB.
raw_limited = limit raw 1000000;
STORE raw_limited
INTO 'mongodb://hadoop:password@127.0.0.1:27017/hadoop.out'
USING com.mongodb.hadoop.pig.MongoInsertStorage('id:chararray, t:chararray', 'id');
results in
Input(s):
Successfully read 100 records (638 bytes) from:
Output(s):
Successfully stored 100 records (18477 bytes) in:
$mongo hadoop
>> db.out.count()
100
Why is that, and how can it be fixed? Did I miss something?
This seems to be a bug in the MongoDB Java driver.
It works when using version 3.0.4 of the driver.
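A sketch of the fix, assuming the standard Maven Central location for the driver jar (adjust the HDP library path to your installation):
# Fetch the 3.0.4 driver next to the other connector jars.
wget https://repo1.maven.org/maven2/org/mongodb/mongo-java-driver/3.0.4/mongo-java-driver-3.0.4.jar \
  -P /usr/hdp/2.2.4.2-2/hadoop/lib/
# Point the REGISTER statement in process.pig at the new version.
sed -i 's/mongo-java-driver-3.0.2.jar/mongo-java-driver-3.0.4.jar/' process.pig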