Mappers fail for pig to insert data into MongoDB - mongodb

I am trying to import a file from HDFS to MongoDB using MongoInsertStorage with PIG. The files are large, around 5GB. The script runs fine when I run it in local mode with
pig -x local example.pig
However if I run it in the mapreduce mode, Most of the mappers fail with the following error:
Error: com.mongodb.ConnectionString.getReadConcern()Lcom/mongodb/ReadConcern;
Container killed by the ApplicationMaster.
Container killed on request.
Exit code is 143 Container exited with a non-zero exit code 143
Can someone help me solve this issue?? I also increased the memory allocated to YARN containers but that hasnt helped.
Some mappers are also timing out after 300 seconds.
Pig Script is as follows
REGISTER mongo-java-driver-3.2.2.jar
REGISTER mongo-hadoop-core-1.4.0.jar
REGISTER mongo-hadoop-pig-1.4.0.jar
REGISTER mongodb-driver-3.2.2.jar
DEFINE MongoInsertStorage com.mongodb.hadoop.pig.MongoInsertStorage();
SET mapreduce.reduce.speculative true
BIG_DATA = LOAD 'hdfs://example.com:8020/user/someuser/sample.csv' using PigStorage(',') As (a:chararray,b:chararray,c:chararray);
STORE BIG_DATA INTO 'mongodb://insert.some.ip.here:27017/test.samplecollection' USING MongoInsertStorage('', '')

Found a solution.
For the error
Error: com.mongodb.ConnectionString.getReadConcern()Lcom/mongodb/ReadConcern;
Container killed by the ApplicationMaster.
Container killed on request.
Exit code is 143 Container exited with a non-zero exit code 143
I changed the JAR versions - hadoopcore and hadooppig from 1.4.0 to 2.0.2 and for Mongo Java driver from 3.2.2 to 3.4.2. This eliminated the ReadConcern Error on the mappers!
For the timeout, I added this after registering the jars:
SET mapreduce.task.timeout 1800000
I had been using SET mapred.task.timeout which didnt work
Hope this helps anyone who has a similar issue!

Related

Undefined method [] for nil when trying to run cap deploy:restart

I have a Rails 5.2 app, and with cap 3.4.1 we are suddenly getting this weird error:
[b35efe76] Phusion Passenger(R) 6.0.8
DEBUG [b35efe76] Finished in 0.305 seconds with exit status 0 (successful).
(Backtrace restricted to imported tasks)
cap aborted!
SSHKit::Runner::ExecuteError: Exception while executing as deploy#host.com: undefined method `[]' for nil:NilClass
Trying to narrow it down, it happens when attempting to restart, because this is the line that fails:
cap production deploy:restart
The question is, how do I go about finding what file is trying to call [] on a nil value on? Running cap with --trace is of no value because it just gives me internal errors - nothing in my code. Basically, how do I find out what is nil?
One more clue, currently, I have a bunch of servers, if I run the cap restart command on server A, it restarts fine, on server B, it throws this error, so I'm guessing there's an environment variable on server A but not B, but the error is just so opaque I don't know where to begin.
Thanks for any help,
kevin
A wild guess: I had a similar problem and I could solve it by upgrading capistrano-passenger to >= 0.2.1
Looks like the version change of passenger from 6.0.7 to 6.0.8 has introduces a problem. I see you are also on 6.0.8, so it might affect you, too!
Link to the capistrano-passenger issue

IBM BLUEMIX BLOCKCHAIN SDK-DEMO failing

I have been working with HFC SDK for Node.js and it used to work, but since last night I am having some problems.
When running helloblockchain.js only few times works, most time I get this error when it tries to enroll a new user:
E0113 11:56:05.983919636 5288 handshake.c:128] Security handshake failed: {"created":"#1484304965.983872199","description":"Handshake read failed","file":"../src/core/lib/security/transport/handshake.c","file_line":237,"referenced_errors":[{"created":"#1484304965.983866102","description":"FD shutdown","file":"../src/core/lib/iomgr/ev_epoll_linux.c","file_line":948}]}
Error: Failed to register and enroll JohnDoe: Error
Other times, the enroll works and the failure appears deploying the chaincode:
Enrolled and registered JohnDoe successfully
Deploying chaincode ...
E0113 12:14:27.341527043 5455 handshake.c:128] Security handshake failed: {"created":"#1484306067.341430168","description":"Handshake read failed","file":"../src/core/lib/security/transport/handshake.c","file_line":237,"referenced_errors":[{"created":"#1484306067.341421859","description":"FD shutdown","file":"../src/core/lib/iomgr/ev_epoll_linux.c","file_line":948}]}
Failed to deploy chaincode: request={"fcn":"init","args":["a","100","b","200"],"chaincodePath":"chaincode","certificatePath":"/certs/peer/cert.pem"}, error={"error":{"code":14,"metadata":{"_internal_repr":{}}},"msg":"Error"}
Or:
Enrolled and registered JohnDoe successfully
Deploying chaincode ...
E0113 12:15:27.448867739 5483 handshake.c:128] Security handshake failed: {"created":"#1484306127.448692244","description":"Handshake read failed","file":"../src/core/lib/security/transport/handshake.c","file_line":237,"referenced_errors":[{"created":"#1484306127.448668047","description":"FD shutdown","file":"../src/core/lib/iomgr/ev_epoll_linux.c","file_line":948}]}
events.js:160
throw er; // Unhandled 'error' event
^
Error
at ClientDuplexStream._emitStatusIfDone (/usr/lib/node_modules/hfc/node_modules/grpc/src/node/src/client.js:189:19)
at ClientDuplexStream._readsDone (/usr/lib/node_modules/hfc/node_modules/grpc/src/node/src/client.js:158:8)
at readCallback (/usr/lib/node_modules/hfc/node_modules/grpc/src/node/src/client.js:217:12)
E0113 12:15:27.563487641 5483 handshake.c:128] Security handshake failed: {"created":"#1484306127.563437122","description":"Handshake read failed","file":"../src/core/lib/security/transport/handshake.c","file_line":237,"referenced_errors":[{"created":"#1484306127.563429661","description":"FD shutdown","file":"../src/core/lib/iomgr/ev_epoll_linux.c","file_line":948}]}
This code worked yesterday, so I don't know what could be happening.
Does anybody know how can I fix it?
Thanks,
Javier.
ibm-bluemix
blockchain
These types of intermittent issues are usually related to GRPC. An initial suggestion is to ensure that you are using at least GRPC version 1.0.0.
If you are using a Mac, then the maximum number of open file descriptors should be checked (using ulimit -n). Sometimes this is initially set to a low value such as 256, so increasing the value could help.
There are a couple of GRPC issues with similar symptoms.
https://github.com/grpc/grpc/issues/8732
https://github.com/grpc/grpc/issues/8839
https://github.com/grpc/grpc/issues/8382
There is a grpc.initial_reconnect_backoff_ms property that is mentioned in some of these issues. Increasing the value past the 1000 ms level might help reduce the frequency of issues. Below are instructions for how the helloblockchain.js file can be modified to set this property to a higher value.
Open the helloblockchain.js file in the Hyperledger Fabric Client example and find the enrollAndRegisterUsers function.
Add “grpc.initial_reconnect_backoff_ms": 5000 to the setMemberServicesUrl call.
chain.setMemberServicesUrl(ca_url, {
pem: cert, "grpc.initial_reconnect_backoff_ms": 5000
});
Add “grpc.initial_reconnect_backoff_ms": 5000 to the addPeer call.
chain.addPeer("grpcs://" + peers[i].discovery_host + ":" + peers[i].discovery_port,
{pem: cert, "grpc.initial_reconnect_backoff_ms": 5000
});
Note that setting the grpc.initial_reconnect_backoff_ms property may reduce the frequency of issues, but it will not necessarily eliminate all issues.
The connection to the eventhub that is made in the helloblockchain.js file can also be a factor. There is an earlier version of the Hyperledger Fabric Client that does not utilize the eventhub. This earlier version could be tried to determine if this makes a difference. After running git clone https://github.com/IBM-Blockchain/SDK-Demo.git, run git checkout b7d5195 to use this prior level. Before running node helloblockchain.js from a Node.js command window, the git status command can be used to check the code level that is being used.

Can't start OrientDB Server

I installed OrientDB 2.2.0 on my server but can't start it running server.sh script. Previous version started fine on the server and the current version is running on my notebook. The server is a droplet from Digital Ocean with Ubuntu 15.10 32-bit. The error I get is below.
Invalid maximum direct memory size: -XX:MaxDirectMemorySize=512g The
specified size exceeds the maximum representable size. Error: Could
not create the Java Virtual Machine. Error: A fatal exception has
occurred. Program will exit.
Update
The problem was with -XX:MaxDirectMemorySize=512g. I changed it to -XX:MaxDirectMemorySize=512m and the error disappeared. The problem now is that the server tries to start but gives me the message below:
Creating the system database 'OSystem' for current server
[OSystemDatabase]Exception in thread "main"
java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:693)
at java.nio.DirectByteBuffer.(DirectByteBuffer.java:123)
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
at com.orientechnologies.common.directmemory.OByteBufferPool.allocateBuffer(OByteBufferPool.java:309)
at com.orientechnologies.common.directmemory.OByteBufferPool.acquireDirect(OByteBufferPool.java:228)
at com.orientechnologies.orient.core.storage.cache.local.OWOWCache.cacheFileContent(OWOWCache.java:1255)
at com.orientechnologies.orient.core.storage.cache.local.OWOWCache.load(OWOWCache.java:617)
at com.orientechnologies.orient.core.storage.cache.local.twoq.O2QCache.updateCache(O2QCache.java:1200)
at com.orientechnologies.orient.core.storage.cache.local.twoq.O2QCache.doLoad(O2QCache.java:439)
at com.orientechnologies.orient.core.storage.cache.local.twoq.O2QCache.allocateNewPage(O2QCache.java:489)
at com.orientechnologies.orient.core.storage.impl.local.paginated.atomicoperations.OAtomicOperation.commitChanges(OAtomicOperation.java:426)
at com.orientechnologies.orient.core.storage.impl.local.paginated.atomicoperations.OAtomicOperationsManager.endAtomicOperation(OAtomicOperationsManager.java:420)
at com.orientechnologies.orient.core.storage.impl.local.paginated.base.ODurableComponent.endAtomicOperation(ODurableComponent.java:118)
at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.create(OPaginatedCluster.java:197)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.addClusterInternal(OAbstractPaginatedStorage.java:3349)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.doAddCluster(OAbstractPaginatedStorage.java:3330)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.create(OAbstractPaginatedStorage.java:381)
at com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.create(OLocalPaginatedStorage.java:120)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.create(ODatabaseDocumentTx.java:378)
at com.orientechnologies.orient.server.OSystemDatabase.init(OSystemDatabase.java:106)
at com.orientechnologies.orient.server.OSystemDatabase.(OSystemDatabase.java:42)
at com.orientechnologies.orient.server.OServer.initSystemDatabase(OServer.java:1217)
at com.orientechnologies.orient.server.OServer.activate(OServer.java:343)
at com.orientechnologies.orient.server.OServerMain.main(OServerMain.java:41)
(Posted on behalf of the OP).
Everything is okay now. I just changed -XX:MaxDirectMemorySize=512m to -XX:MaxDirectMemorySize=2g.
I suppose you have 32-bit JVM. You should set this value to -XX:MaxDirectMemorySize=2g .

OrientDB 2.1.9 crashes with OStorageException EOFException when running SQL script in console

I've been using my SQL database initialization script for a while, but it seems that recently the database crashes in the middle of the execution and I don't know why, but here's some details:
I am running OrientDB on Ubuntu 14 Trusty x64 (via Vagrant)
It always seems to crash while the script attempts to create a UNIQUE_HASH_INDEX, but doesn't always crash at the same UNIQUE_HASH_INDEX instruction
The script creates a lot of vertices and edges, but for example, it will crash here (see line with UNIQUE_HASH_INDEX):
CREATE CLASS Channel EXTENDS V;
CREATE PROPERTY Channel.version LONG;
CREATE PROPERTY Channel.channelId STRING;
CREATE INDEX Channel.uq_channelId ON Channel(channelId) UNIQUE_HASH_INDEX;
The database crashes entirely with the following error:
Creating index... Error:
com.orientechnologies.orient.core.exception.OStorageException: Error
on executing command: sql.create INDEX Channel.uq_channelId ON
Channel(channelId) UNIQUE_HASH_INDEX
Error: java.io.EOFException
Looking at the log files, the only hint I get are the last two lines:
2016-01-14 17:17:05:437 INFO Received signal: SIGTERM [OSignalHandler]
2016-01-14 17:17:05:454 INFO Received signal: SIGTERM [OSignalHandler]
How can I resolve this issue, or at least get better hints as to what is making the database crash?
I also test with OrientDB 2.1.6, as I was running the older version initially. Same problem.
Sorry, false alarm -- this is a Vagrant issue, not an OrientDB issue. Running the exact same script on a 32bit instance instead of 64bit solved my problem, and installing the same script on a real 64bit server also works.

How do I fix server Oops error when deploying Play Framework 2.0 application as a Windows service?

I was following the processes in this very helpful post:
How do I run a Play Framework 2.0 application as a Windows service?
when I ran into trouble in Step 9. When I execute runConsole.bat, the service cycles between the running and restarting states. The full log is here:
wrapper.log
but what jumps out at me in the log are the following:
INFO|7268/0|play.core.server.NettyServer|13-12-28 13:07:28|Oops, cannot start the server.
INFO|7268/0|play.core.server.NettyServer|13-12-28 13:07:28|Configuration error: Configuration error[Cannot connect to database [default]]
...
INFO|7268/0|play.core.server.NettyServer|13-12-28 13:07:28|Caused by: org.h2.jdbc.JdbcSQLException: Database may be already in use: "Locked by another process". Possible solutions: close all other connection(s); use the server mode [90020-168]
...
INFO|wrapper|play.core.server.NettyServer|13-12-28 13:07:28|restart process due to default exit code rule
INFO|wrapper|play.core.server.NettyServer|13-12-28 13:07:28|restart internal RUNNING
INFO|wrapper|play.core.server.NettyServer|13-12-28 13:07:28|stopping process with pid/timeout 7268 45000
INFO|wrapper|play.core.server.NettyServer|13-12-28 13:07:30|process exit code: -1
...
INFO|7812/1|play.core.server.NettyServer|13-12-28 13:07:45|[error] c.j.b.h.AbstractConnectionHook - Failed to obtain initial connection Sleeping for 0ms and trying again. Attempts left: 0. Exception: null
INFO|7812/1|play.core.server.NettyServer|13-12-28 13:07:45|Oops, cannot start the server.
INFO|7812/1|play.core.server.NettyServer|13-12-28 13:07:45|Configuration error: Configuration error[Cannot connect to database [default]]
repeats several times...
Half way through writing this post, I realized that before step 9., you should terminate the start.bat script which you start up in step 6. When I did this, runConsole.bat would execute normally without all the errors I list above.