Writing from Pig to MongoDB - error 2116 - MongoDB schema not found

I am using Hadoop on Windows Server 2008 (Hortonworks distribution).
We are using Pig and trying to write data into MongoDB, but I am not able to read from or write to MongoDB. I am not sure what the issue is; we get error 2116, which states that the MongoDB output schema is null.
Commands used:
register 'D:\hdp\pig-0.12.1.2.1.1.0-1621\lib\mongo-hadoop-core-1.2.0.jar'
register 'D:\mongo-hadoop-2.2-1.2.0\mongo-hadoop-2.2-1.2.0\mongo-hadoop-1.2.0.jar'
register 'D:\mongo-hadoop-2.2-1.2.0\mongo-hadoop-2.2-1.2.0\mongo-hadoop-pig-1.2.0.jar'
register 'D:\hdp\hadoop-2.4.0.2.1.1.0-1621\lib\mongo-2.6.1.jar'
set mapred.map.tasks.speculative.execution false;
set mapred.reduce.tasks.speculative.execution false;
SET mapreduce.fileoutputcommitter.marksuccessfuljobs false;
SalesLoading = load 'mongodb://localhost/benvenuedb.SalesData' using com.mongodb.hadoop.pig.MongoLoader();
store SalesLoading into 'mongodb://localhost:27017/benvenuedb.SalesData1' using com.mongodb.hadoop.pig.MongoStorage();
Error Messages
Pig Stack Trace
---------------
ERROR 2116:
<line 5, column 0> Output Location Validation Failed for: 'mongodb://127.0.0.1:27017/benvenuedb.SalesData More info to follow:
The value of property mongo.pig.output.schema must not be null
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store alias salesLoading
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1637)
at org.apache.pig.PigServer.registerQuery(PigServer.java:577)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1093)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:501)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
at org.apache.pig.Main.run(Main.java:541)
at org.apache.pig.Main.main(Main.java:156)
Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 2116:
<line 5, column 0> Output Location Validation Failed for: 'mongodb://127.0.0.1:27017/benvenuedb.SalesData More info to follow:
The value of property mongo.pig.output.schema must not be null
at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:75)
at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:66)
at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
at org.apache.pig.newplan.logical.rules.InputOutputFileValidator.validate(InputOutputFileValidator.java:45)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:303)
at org.apache.pig.PigServer.compilePp(PigServer.java:1382)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1307)
at org.apache.pig.PigServer.execute(PigServer.java:1299)
at org.apache.pig.PigServer.access$400(PigServer.java:124)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1632)
... 8 more
Caused by: java.lang.IllegalArgumentException: The value of property mongo.pig.output.schema must not be null
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
at org.apache.hadoop.conf.Configuration.set(Configuration.java:971)
at org.apache.hadoop.conf.Configuration.set(Configuration.java:953)
at com.mongodb.hadoop.pig.MongoStorage.setStoreLocation(MongoStorage.java:249)
at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:68)
... 20 more
I ran netstat -an to see the open ports.
The local address is 10.69.148.89; I do not see port 27017 open on this IP, but 127.0.0.1 has 27017 open. There must be something simple we are overlooking.
We need some help; we have spent over two days on this with no resolution.

Have you tried setting the property it says is missing?
The value of property mongo.pig.output.schema must not be null

There are some issues with writing to MongoDB from Pig, especially when you use the Hortonworks Windows distribution. I worked around them by breaking the job into three steps:
Write to the HDFS filesystem as a JSON file using JsonStorage()
Move the HDFS file to the Windows filesystem
Load the JSON file into MongoDB
I am open to suggestions if anyone has done this a different way.
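Steps two and three of this workaround could be scripted roughly as below. This is a minimal sketch, not a tested pipeline: the function name and the example paths are hypothetical, and it assumes the `hdfs` and `mongoimport` command-line tools are installed and on the PATH. It only builds the command lines, so nothing is executed until you pass them to something like `subprocess.run(cmd, check=True)`.

```python
def build_export_commands(hdfs_dir, local_file, db, collection):
    """Return the commands that move Pig's JSON output from HDFS into MongoDB.

    hdfs_dir   -- HDFS directory that Pig wrote with JsonStorage() (hypothetical)
    local_file -- destination file on the local filesystem (hypothetical)
    """
    # Merge only the part-* data files into one local file; the glob is
    # expanded by HDFS itself, and it skips side files such as .pig_schema.
    fetch = ["hdfs", "dfs", "-getmerge", hdfs_dir.rstrip("/") + "/part-*", local_file]
    # mongoimport reads one JSON document per line by default, which is the
    # layout JsonStorage() is assumed to produce here.
    load = ["mongoimport", "--db", db, "--collection", collection,
            "--file", local_file]
    return [fetch, load]


if __name__ == "__main__":
    # Example values only; the HDFS and local paths are made up.
    for cmd in build_export_commands("/user/pig/sales_json", "sales.json",
                                     "benvenuedb", "SalesData1"):
        print(" ".join(cmd))
```

Building the commands as lists rather than one shell string avoids quoting problems with Windows paths.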

wsadmin script timing out when executing against DMGR via SOAP

I'm attempting to start and stop an application on a single JVM via the wsadmin console since the Web UI for IBM BPM PS Adv. doesn't allow for that kind of operation. So, I have the following script:
https://gist.github.com/predatorian3/b8661c949617727630152cbe04f78d7e
and when I run it against the DMGR from the Cell Host, I receive the following errors.
[wasadmin@server01 ~]$ cat /usr/local/bin/Run_wsadmin.sh
#!/bin/bash
#
#
#
/opt/IBM/WebSphere/AppServer/bin/wsadmin.sh -lang jython -user serviceAccount -password password $*
[wasadmin@cessoapscrt00 ~]$ time Run_wsadmin.sh -f /opt/IBM/wsadmin/wsadmin_Restart_Application.py WPS00 CRT00WPS01 redirectResource_war
WASX7209I: Connected to process "dmgr" on node CRTDMGR using SOAP connector; The type of process is: DeploymentManager
WASX7303I: The following options are passed to the scripting environment and are available as arguments that are stored in the argv variable: "[WPS00, CRT00WPS01, redirectResource_war]"
WASX7017E: Exception received while running file "/opt/IBM/wsadmin/wsadmin_Restart_Application.py"; exception information: com.ibm.websphere.management.exception.ConnectorException
org.apache.soap.SOAPException: [SOAPException: faultCode=SOAP-ENV:Client; msg=Read timed out; targetException=java.net.SocketTimeoutException: Read timed out]
real 3m21.275s
user 0m17.411s
sys 0m0.796s
So, I'm not specifying the connection type and am using the default, which is SOAP. However, from reading about the other connection types, none of them seems any better, though I attribute that to the vagueness of the IBM documentation. Is there an option to increase the timeout period or turn it off, or is there a better connection type?
Also, when I run this directly in the wsadmin console, it seems to hang while fetching the application manager string.
[wasadmin@server01 ~]$ Run_wsadmin.sh
WASX7209I: Connected to process "dmgr" on node CRTDMGR using SOAP connector; The type of process is: DeploymentManager
WASX7031I: For help, enter: "print Help.help()"
wsadmin>appManager = AdminControl.queryNames('cell=CRTCELL,node=WPS00,type=ApplicatoinManager,process=CRT00WPS01,*')
WASX7015E: Exception running command: "appManager = AdminControl.queryNames('cell=CRTCELL,node=WPS00,type=ApplicationManager,process=CRT00WPS01,*')"; exception information:
com.ibm.websphere.management.exception.ConnectorException
org.apache.soap.SOAPException: [SOAPException: faultCode=SOAP-ENV:Client; msg=Read timed out; targetException=java.net.SocketTimeoutException: Read timed out]
wsadmin>
You can increase the timeout value in {profile}/properties/soap.client.props:
com.ibm.SOAP.requestTimeout=180
The value is in seconds. To turn the timeout off, set com.ibm.SOAP.requestTimeout=0; for a longer timeout, change 180 to a larger value.
Also, about your query command: there is a typo in the MBean type. You wrote type=ApplicatoinManager; it should be type=ApplicationManager.
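If you need to make this change on several profiles, the edit can be scripted. The following is a minimal sketch in plain Python (not wsadmin Jython); `set_property` is a hypothetical helper, and it assumes the file uses simple `key=value` lines with `#` comments. It works on the file's text, so you read and write the file yourself.

```python
def set_property(text, key, value):
    """Return properties-file text with `key` set to `value`.

    Replaces the first uncommented `key=...` line, or appends a new
    line if the key is not present.
    """
    lines = text.splitlines()
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Skip comments and lines that are not key=value pairs.
        if stripped.startswith("#") or "=" not in stripped:
            continue
        name = stripped.split("=", 1)[0].strip()
        if name == key:
            lines[i] = f"{key}={value}"
            break
    else:
        # Key not found anywhere: append it.
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = "# sample content only\ncom.ibm.SOAP.requestTimeout=180\n"
    print(set_property(sample, "com.ibm.SOAP.requestTimeout", 600), end="")
```

Keep a backup of soap.client.props before overwriting it, since wsadmin reads other connection settings from the same file.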
I had the same issue and wanted to override the timeout property temporarily. The following worked for me. Follow the steps exactly; I made some mistakes at first and the property was not picked up until I corrected them.
Copy the soap.client.props file from /properties and give it a new name such as mysoap.client.props.
Edit mysoap.client.props and update the value of com.ibm.SOAP.requestTimeout as required.
Create a new Java properties file soap_override.props and enter the following line:
com.ibm.SOAP.ConfigURL=file:/mysoap.client.props
Pass soap_override.props into wsadmin using the -p option: wsadmin -p soap_override.props...
REFERENCE:
https://www.ibm.com/developerworks/community/blogs/timdp/entry/avoiding_wsadmin_request_timeouts_the_neat_way32?lang=en
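The one-line override file from the steps above can be generated rather than typed by hand, which avoids mistakes in the file: URL. This is a minimal sketch under assumptions: `soap_override_line` is a hypothetical helper, the example path is made up, and it assumes a Unix-style absolute path like the ones in the question.

```python
from pathlib import PurePosixPath


def soap_override_line(props_copy):
    """Return the soap_override.props line pointing at a copied props file.

    props_copy -- absolute path to your edited copy of soap.client.props
    """
    path = PurePosixPath(props_copy)
    if not path.is_absolute():
        # A relative path would depend on wsadmin's working directory.
        raise ValueError("props_copy must be an absolute path")
    return f"com.ibm.SOAP.ConfigURL=file:{path}\n"


if __name__ == "__main__":
    # Hypothetical location of the edited copy.
    print(soap_override_line("/opt/IBM/wsadmin/mysoap.client.props"), end="")
```

Write the returned line to soap_override.props and pass that file to wsadmin with -p as described.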

Configure MongoDB with Nutch2.3, some error about indexerJob?

I have successfully configured MongoDB (5.3.1) and Nutch (2.3). The inject/generate/fetch/parse/updatedb commands work, but when I run the command "./bin/nutch index -all" some errors are printed. The error details are:
SolrIndexerJob: java.lang.RuntimeException: job failed: name=apache-nutch-2.3.1.jar, jobid=job_local140530148_0001
at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:120)
at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:154)
at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:176)
at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:202)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:211)
I configured the file $NUTCH_HOME/runtime/local/conf/nutch-site.xml
details:
If all the other steps ran, the problem is probably not with MongoDB but with Solr (your nutch-site.xml suggests that you want to index your data in Solr). As far as I remember, when I used Solr I had to specify the core name in the URL, something like this:
http://localhost:8983/solr/mycore/

OrientDB 2.1.9 crashes with OStorageException EOFException when running SQL script in console

I've been using my SQL database initialization script for a while, but recently the database has started crashing in the middle of the execution and I don't know why. Here are some details:
I am running OrientDB on Ubuntu 14 Trusty x64 (via Vagrant)
It always seems to crash while the script attempts to create a UNIQUE_HASH_INDEX, but doesn't always crash at the same UNIQUE_HASH_INDEX instruction
The script creates a lot of vertices and edges, but for example, it will crash here (see line with UNIQUE_HASH_INDEX):
CREATE CLASS Channel EXTENDS V;
CREATE PROPERTY Channel.version LONG;
CREATE PROPERTY Channel.channelId STRING;
CREATE INDEX Channel.uq_channelId ON Channel(channelId) UNIQUE_HASH_INDEX;
The database crashes entirely with the following error:
Creating index... Error:
com.orientechnologies.orient.core.exception.OStorageException: Error
on executing command: sql.create INDEX Channel.uq_channelId ON
Channel(channelId) UNIQUE_HASH_INDEX
Error: java.io.EOFException
Looking at the log files, the only hint I get are the last two lines:
2016-01-14 17:17:05:437 INFO Received signal: SIGTERM [OSignalHandler]
2016-01-14 17:17:05:454 INFO Received signal: SIGTERM [OSignalHandler]
How can I resolve this issue, or at least get better hints as to what is making the database crash?
I also tested with OrientDB 2.1.6, as I was running the older version initially. Same problem.
Sorry, false alarm -- this is a Vagrant issue, not an OrientDB issue. Running the exact same script on a 32bit instance instead of 64bit solved my problem, and installing the same script on a real 64bit server also works.

cannot attach to service manager-error

I am new to Firebird and I would like to trace my Firebird database activity, so I am trying to use the Audit/Trace Services.
My Firebird database is on server 10.7.105.8.
I am running this command in my cmd:
C:\Program Files\Firebird\Firebird_2_5\bin>fbtracemgr -se 10.7.105.8:3050:service_mgr -user SYSDBA -password masterkey -start -name "User Trace 1" -config "fbtrace.conf" > C:\Users\Babak\Desktop\trace.out
but I get this error:
Can not attach to service manager
Service 3050 : Service_mgr is not defined
What should I do to solve this problem?
thank you so much
EDIT
Thank you for your hints. I think my trace process works fine now, but I can't find the information I need in my trace.out file.
When I start my trace, my command prompt looks like this:
If I look at trace.out at this point, I can only see this:
Trace Session ID 3 Started
I run some select queries in my Firebird database and then stop the trace with Ctrl+C. After that, the only things I can see in trace.out are something like this:
Trace session ID 3 started
2015-07-08 10:49:59.868874 ***** loading fbclient.dll proc=4116 64Bit DLL Preload
2015-07-08 10:49:59.869066 GetDllDirectoryA=""
2015-07-08 10:49:59.869075 GetModuleFileNameA="C:\Program Files\Firebird\Firebird_2_5\bin\fbclient.dll"
2015-07-08 10:49:59.869086 Log-Level is set to 0
2015-07-08 10:49:59.869096 fbclient.dll loaded by: C:\Program Files\Firebird\Firebird_2_5\bin\fbtracemgr.exe
2015-07-08 10:49:59.869113 ***** dimensio integration successfully fbclient.dll
2015-07-08 10:58:10.091330 ***** cleanup unload fbclientorg.dll proc=4116
and no further information about the queries I ran.
Could you please tell me what I have done wrong, or what else I should do?
As Mark says, check file "fbtrace.conf". This is a text file and you will see something like this:
# default database section
#
<database>
# Do we trace database events or not
enabled false
# Operations log file name. For use by system audit trace only
#log_filename
....
....
# Put transaction start/end records
log_transactions false <--- TO TEST, SET THIS TO TRUE
# Put sql statement prepare records
log_statement_prepare false <-- TO TEST, SET THIS TO TRUE
Set to true what you need to trace, save the file and check the result.
Firebird connection strings are of the format:
host/port:database
Where /port is optional and defaults to 3050, and database is either the alias or path of a database, or the name of a service. Replace :3050 with /3050 (or leave it off entirely).
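The format described above can be captured in a couple of small helpers. This is a minimal sketch; both function names are hypothetical, and the second one only handles the specific mistake from the question (a numeric port separated by a colon).

```python
def firebird_service_target(host, service="service_mgr", port=None):
    """Build a 'host/port:name' string; the port part is omitted when None."""
    if port is None:
        return f"{host}:{service}"
    return f"{host}/{port}:{service}"


def fix_colon_port(target):
    """Rewrite the mistaken 'host:port:name' form to 'host/port:name'.

    Anything that does not look like host:<digits>:name is returned unchanged,
    so Windows paths such as host:C:\\data\\db.fdb are left alone.
    """
    parts = target.split(":")
    if len(parts) == 3 and parts[1].isdigit():
        host, port, name = parts
        return f"{host}/{port}:{name}"
    return target


if __name__ == "__main__":
    print(fix_colon_port("10.7.105.8:3050:service_mgr"))
    print(firebird_service_target("10.7.105.8"))
```

Either resulting string can then be passed to fbtracemgr's -se option in place of the original value.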
The following worked for me:
Open start menu
Search for services and open it
Search Firebird Guardian in the services list.
Start Firebird Guardian if it is stopped or restart if it is running.
Now try to connect your server. It will work.

Jboss shows error with datasource on startup

On starting jboss I am getting the following error :
--- MBEANS THAT ARE THE ROOT CAUSE OF THE PROBLEM ---
ObjectName: jboss.jca:service=DataSourceBinding,name=DefaultDS
State: NOTYETINSTALLED
Depends On Me:
jboss.ejb:service=EJBTimerService,persistencePolicy=database
jboss:service=KeyGeneratorFactory,type=HiLo
jboss.mq:service=StateManager
jboss.mq:service=PersistenceManager
And for all database connections in the servlet I get the following exception :
org.postgresql.util.PSQLException: FATAL: password authentication failed for user "poll"
It was working fine and all of a sudden I started getting these errors. My password is correct. I even tried changing the password and then tried again it showed the same exception. What is happening here?
The DefaultDS datasource is what the name suggests: the default datasource. It ships with JBoss and is configured to use the Hypersonic (i.e. in-memory) database. JBoss uses the DefaultDS datasource to read and write internal queues, timed events, etc.
Check the file ../conf/standardjbosscmp-jdbc.xml to see what you've got configured for the DefaultDS datasource. It sounds like you've edited that file unintentionally. Unless you need to persist internal queues etc across boots, just leave it as shipped using Hypersonic.
See the JBoss doc for more.