Writing to MongoDB via the Java driver. Getting the following error:
com.mongodb.WriteConcernException: { "serverUsed" : "127.0.0.1:27017" , "err" : "_a != -1" , "n" : 0 , "connectionId" : 3 , "ok" : 1.0}
at com.mongodb.CommandResult.getWriteException(CommandResult.java:90)
at com.mongodb.CommandResult.getException(CommandResult.java:79)
at com.mongodb.CommandResult.throwOnError(CommandResult.java:131)
at com.mongodb.DBTCPConnector._checkWriteError(DBTCPConnector.java:135)
at com.mongodb.DBTCPConnector.access$000(DBTCPConnector.java:39)
at com.mongodb.DBTCPConnector$1.execute(DBTCPConnector.java:186)
at com.mongodb.DBTCPConnector$1.execute(DBTCPConnector.java:181)
at com.mongodb.DBTCPConnector.doOperation(DBTCPConnector.java:210)
at com.mongodb.DBTCPConnector.say(DBTCPConnector.java:181)
at com.mongodb.DBCollectionImpl.insertWithWriteProtocol(DBCollectionImpl.java:528)
at com.mongodb.DBCollectionImpl.insert(DBCollectionImpl.java:193)
at com.mongodb.DBCollectionImpl.insert(DBCollectionImpl.java:165)
at com.mongodb.DBCollection.insert(DBCollection.java:161)
at com.mongodb.DBCollection.insert(DBCollection.java:147)
at com.mongodb.DBCollection$insert.call(Unknown Source)
Can't find any reference in the docs to "err" : "_a != -1". Any thoughts?
EDIT:
Adding the code I used (not all of it, as it relies on other libraries to parse the files):
MongoClient mongoClient = new MongoClient()
mongoClient.setWriteConcern(WriteConcern.SAFE)
DB db = mongoClient.getDB("vcf")
List<DBObject> documents = new ArrayList<DBObject>()
DBCollection recordsColl = db.getCollection("records")
//loop through file
BasicDBObject mongoRecord = new BasicDBObject()
//add data to mongoRecord
documents.add(mongoRecord)
//end loop
recordsColl.insert(documents)
mongoClient.close()
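In case it helps debugging, the full getLastError document behind that "err" value travels with the exception, so one option is to catch it around the insert and dump it. A minimal sketch, assuming the 2.x driver's com.mongodb.WriteConcernException and its getCommandResult() accessor, with the recordsColl from above:

try {
    recordsColl.insert(documents);
} catch (WriteConcernException e) {
    // Raw server response, including the "err" field; useful to
    // correlate with the mongod log.
    System.err.println("server response: " + e.getCommandResult());
    throw e;
}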
Related
I have a strict requirement to save null values to MongoDB. I am aware that storing nulls is not recommended in NoSQL, but my business requirement has this scenario.
A sample CSV file which has a null value:
a,b,c,id
,2,3,A
4,4,4,B
Code to save the CSV to MongoDB:
StructType schema = DataTypes.createStructType(new StructField[] {
DataTypes.createStructField("a", DataTypes.IntegerType, false),
DataTypes.createStructField("b", DataTypes.IntegerType, true),
DataTypes.createStructField("c", DataTypes.IntegerType, true),
DataTypes.createStructField("id", DataTypes.StringType, true),
});
Dataset<Row> g = spark.read()
.format("csv")
.schema(schema)
.option("header", "true")
.option("inferSchema","false")
.load("/home/Documents/SparkLogs/a.csv");
MongoSpark.save(g.write()
    .option("database", "A")
    .option("collection", "b")
    .mode("overwrite"));
MongoDB output
{
"_id" : ObjectId("5d663b6bec20c94c990e6d0c"),
"a" : 4,
"b" : 4,
"c" : 4,
"id" : "B"
}
{
"_id" : ObjectId("5d663b6bec20c94c990e6d0d"),
"b" : 2,
"c" : 3,
"id" : "A"
}
My requirement is to have the 'a' field present with a null value in it.
Saving a Dataset with MongoSpark ignores null-valued keys by default, so my workaround is to convert the Dataset to a JavaPairRDD of BSONObject.
Code
/** imports ***/
import scala.Tuple2;
import org.apache.hadoop.conf.Configuration;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;
import org.bson.BSONObject;
import org.bson.BasicBSONObject;
import com.mongodb.hadoop.MongoOutputFormat;
/** imports ***/
private static void saveToMongoDB_With_Null(Dataset<Row> ds, Configuration outputConfig, String[] cols) {
    // Copy every column by name into a BSONObject, so null values are kept
    // instead of being dropped the way MongoSpark.save does by default.
    JavaPairRDD<Object, BSONObject> document = ds
        .toJavaRDD()
        .mapToPair(f -> {
            BSONObject doc = new BasicBSONObject();
            for (String p : cols)
                doc.put(p, f.getAs(p));
            // The pair's key is used as the document _id when it is non-null.
            return new Tuple2<Object, BSONObject>(null, doc);
        });
    // The path is ignored by MongoOutputFormat; the real target comes from
    // the "mongo.output.uri" entry in outputConfig.
    document.saveAsNewAPIHadoopFile(
        "file:///this-is-completely-unused",
        Object.class,
        BSONObject.class,
        MongoOutputFormat.class,
        outputConfig);
}
Configuration outputConfig = new Configuration();
outputConfig.set("mongo.output.uri",
"mongodb://192.168.0.19:27017/database.collection");
outputConfig.set("mongo.output.format",
"com.mongodb.hadoop.MongoOutputFormat");
Dataset<Row> g = spark.read()
.format("csv")
.schema(schema)
.option("header", "true")
.option("inferSchema","false")
.load("/home/Documents/SparkLogs/a.csv");
saveToMongoDB_With_Null(g, outputConfig,g.columns());
Needed Maven Dependency
<!-- https://mvnrepository.com/artifact/org.mongodb.mongo-hadoop/mongo-hadoop-core -->
<dependency>
<groupId>org.mongodb.mongo-hadoop</groupId>
<artifactId>mongo-hadoop-core</artifactId>
<version>2.0.2</version>
</dependency>
MongoDB output after the workaround
{
"_id" : "a62e9b02-da97-493b-9563-fc19054df60e",
"a" : null,
"b" : 2,
"c" : 3,
"id" : "A"
}
{
"_id" : "fed373a8-e671-44a4-8b85-7c7e2ff59585",
"a" : 4,
"b" : 4,
"c" : 4,
"id" : "B"
}
Downsides
Dropping from a high-level API like Dataset to low-level RDDs loses Spark's ability to optimize the query plans, so the trade-off is performance.
I have a big speed problem on my website using Flask/MongoDB as the backend. A basic request (get one user, for example) takes about 4 seconds to respond.
Here is the Python code:
@users_apis.route('/profile/<string:user_id>', methods=['GET', 'PUT', 'DELETE'])
@auth_token_required
def profile(user_id):
if request.method == "GET":
avatar = ''
if user_id == str(current_user.id):
if(current_user.birthday):
age = (date.today().year - current_user.birthday.year)
else:
age = ''
return make_response(jsonify({
"id" : str(current_user.id),
"username" : current_user.username,
"email" : current_user.email,
"first_name": current_user.first_name,
"last_name" : current_user.last_name,
"age" : age,
"birthday" : current_user.birthday,
"gender" : current_user.gender,
"city" : current_user.city,
"country" : current_user.country,
"languages" : current_user.languages,
"description" : current_user.description,
"phone_number" : current_user.phone_number,
"countries_visited" : current_user.countries_visited,
"countries_to_visit" : current_user.countries_to_visit,
"zip_code" : str(current_user.zip_code),
"address" : current_user.address,
"pictures" : current_user.pictures,
"avatar" : "",
"interests" : current_user.interests,
"messages" : current_user.messages,
"invitations" : current_user.invitations,
"events" : current_user.events
}), 200)
And my MongoDB database is built like this:
The selected user is nearly empty (has no friends, no events, no pictures...).
class BaseUser(db.Document, UserMixin):
username = db.StringField(max_length=64, unique=True, required=True)
email = db.EmailField(unique=True, required=True)
password = db.StringField(max_length=255, required=True)
active = db.BooleanField(default=True)
joined_on = db.DateTimeField(default=datetime.now)  # pass the callable, not datetime.now(), so each document gets its own timestamp
roles = db.ListField(db.ReferenceField(Role), default=[])
class User(BaseUser):
# Identity
first_name = db.StringField(max_length=255)
last_name = db.StringField(max_length=255)
birthday = db.DateTimeField()
gender = db.StringField(max_length=1,choices=GENDER,default='N')
# Coordinates
address = db.StringField(max_length=255)
zip_code = db.IntField()
city = db.StringField(max_length=64)
region = db.StringField(max_length=64)
country = db.StringField(max_length=32)
phone_number = db.StringField(max_length=18)
# Community
description = db.StringField(max_length=1000)
activities = db.StringField(max_length=1000)
languages = db.ListField(db.StringField(max_length=32))
countries_visited = db.ListField(db.StringField(max_length=32))
countries_to_visit = db.ListField(db.StringField(max_length=32))
interests = db.ListField(db.ReferenceField('Tags'))
friends = db.ListField(db.ReferenceField('User'))
friend_requests = db.ListField(db.ReferenceField('User'))
pictures = db.ListField(db.ReferenceField('Picture'))
events = db.ListField(db.ReferenceField('Event'))
messages = db.ListField(db.ReferenceField('PrivateMessage'))
invitations = db.ListField(db.ReferenceField('Invitation'))
email_validated = db.BooleanField(default=False)
validation_date = db.DateTimeField()
I have a Debian server with 6 GB RAM and 1 vcore at 2.4 GHz.
When I check the MongoDB logs, I don't see any request that takes more than 378 ms (for a search request).
If I use top during a request on my server, I see Python at 97% CPU for about one second.
When I check the Python server output, I see 4 seconds between the OPTIONS request and the GET request.
I finally managed to "fix" my issue.
It seems the whole problem was due to @auth_token_required.
Each request made by the front end to the back end with "headers.append('Authentication-Token',currentUser.token);" created a huge delay.
I replaced @auth_token_required with @login_required, and I am now using cookies.
Hope it helps someone.
I have a query to match all products in Elasticsearch. It runs perfectly, but I want to add a sort to this query. I can't find an example that runs, and I don't understand why it generates an error.
This is the code for the sorted query:
$match = new \Elastica\Query\MatchAll();
$query = new \Elastica\Query($match);
$query->addSort([
'product.price' => ['order' => 'asc']
]);
return $this->find($query);
It generates this error:
Error: Wrong parameters for Exception([string $exception [, long $code
[, Exception $previous = NULL]]])
I tried a lot of things before posting this, but the error is always the same.
ElasticSearch : 5.2.2
FosElasticaBundle : 3.2.2
PHP : 5.6.30
Symfony : 2.8
This error means there is an incompatibility between ES, Elastica, and FosElasticaBundle. Watch the compatibility between the ES server version and the base PHP library Elastica...
This code runs perfectly:
$query = new Query();
$queryRange = new \Elastica\Query\Range('product.price', array('gt' => 0, 'lt' => 20));
$query->setQuery($queryRange);
return $this->find($query);
ElasticSearch : 1.7.4
FosElasticaBundle : 3.2.2
PHP : 5.6.30
Symfony : 2.8
I am using Spring Data MongoDB to interact with my MongoDB setup. I was testing different write concerns and noticed that with the Unacknowledged write concern, the time to update 1000 documents was around 5-6 seconds, even though Unacknowledged doesn't wait for any acknowledgement.
I tested the same thing with the raw Java driver and the time was around 40 ms.
What could be the cause of this huge time difference between the raw Java driver and Spring Data MongoDB updates?
Note that I am using the Unacknowledged write concern and MongoDB v2.6.1 with the default configuration.
Adding the code used for the comparison:
Raw Java driver code:
MongoClient mongoClient = new MongoClient("localhost", 27017);
DB db = mongoClient.getDB( "testdb" );
DBCollection collection = db.getCollection("product");
WriteResult wr = null;
try {
long start = System.currentTimeMillis();
wr = collection.update(
new BasicDBObject("productId", new BasicDBObject("$gte", 10000000)
.append("$lt", 10001000)),
new BasicDBObject("$inc", new BasicDBObject("price", 100)),
false, true, WriteConcern.UNACKNOWLEDGED);
long end = System.currentTimeMillis();
System.out.println(wr + " Time taken: " + (end - start) + " ms.");
} finally {
    mongoClient.close();
}
Spring code:
Config.xml
<mongo:mongo host="localhost" port="27017" />
<mongo:db-factory dbname="testdb" mongo-ref="mongo" />
<bean id="Unacknowledged" class="com.mongodb.WriteConcern">
<constructor-arg name="w" type="int" value="0"/>
</bean>
<bean id="mongoTemplate" class="org.springframework.data.mongodb.core.MongoTemplate">
<constructor-arg name="mongoDbFactory" ref="mongoDbFactory" />
<property name="writeConcern" ref="Unacknowledged"/>
</bean>
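For reference, a programmatic equivalent of this XML configuration (a sketch, assuming spring-data-mongodb 1.x and the 2.x driver; not part of the original setup):

MongoClient mongo = new MongoClient("localhost", 27017);
MongoTemplate mongoTemplate = new MongoTemplate(new SimpleMongoDbFactory(mongo, "testdb"));
// Same effect as the "Unacknowledged" bean above: w=0, no acknowledgement.
mongoTemplate.setWriteConcern(WriteConcern.UNACKNOWLEDGED);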
Java code for the update function, which is part of ProductDAOImpl:
public int update(long fromProductId, long toProductId, double changeInPrice)
{
Query query = new Query(new Criteria().andOperator(
Criteria.where("productId").gte(fromProductId),
Criteria.where("productId").lt(toProductId)));
Update update = new Update().inc("price", changeInPrice);
WriteResult writeResult =
mongoTemplate.updateMulti(query, update, Product.class);
return writeResult.getN();
}
Accessing code:
ProductDAOImpl productDAO = new ProductDAOImpl();
productDAO.setMongoTemplate(mongoTemplate);
long start = System.currentTimeMillis();
productDAO.update(10000000, 10001000, 100);
long end = System.currentTimeMillis();
System.out.println("Time taken = " + (end - start) + " ms.");
Schema:
{
"_id" : ObjectId("53b64d000cf273a0d95a1a3d"),
"_class" : "springmongo.domain.Product",
"productId" : NumberLong(6),
"productName" : "product6",
"manufacturer" : "company30605739",
"supplier" : "supplier605739",
"category" : "category30605739",
"mfgDate" : ISODate("1968-04-26T05:00:00.881Z"),
"price" : 665689.7224373372,
"tags" : [
"tag82",
"tag61",
"tag17"
],
"reviews" : [
{
"name" : "name528965",
"rating" : 6.5
},
{
"name" : "name818975",
"rating" : 7.5
},
{
"name" : "name436239",
"rating" : 3.9
}
],
"manufacturerAdd" : {
"state" : "state55",
"country" : "country155",
"zipcode" : 718
},
"supplierAdd" : {
"state" : "state69",
"country" : "country69",
"zipcode" : 691986
}
}
Hope it helps.
I am using mongo-java-driver-2.9.1 to interact with MongoDB, and I want to log the queries that are fired at the MongoDB server. E.g., in Java this is the code that I write for inserting a document:
DBCollection coll = db.getCollection("mycollection");
BasicDBObject doc = new BasicDBObject("name", "MongoDB")
.append("type", "database")
.append("count", 1);
coll.insert(doc);
For this, the equivalent code in the "mongo" shell for inserting the document is:
db.mycollection.insert({
"name" : "MongoDB",
"type" : "database",
"count" : 1
})
I want to log this second form; is there any way to do it?
I think the MongoDB Java driver has no logging support, so you have to write the log messages on your own. Here is an example:
DBCollection coll = db.getCollection("mycollection");
BasicDBObject doc = new BasicDBObject("name", "MongoDB")
        .append("type", "database")
        .append("count", 1);
WriteResult insert = coll.insert(doc);
String msg;
if (insert.getError() == null) {
    msg = "insert into: " + coll.toString() + " ; Object " + doc.toString();
} else {
    msg = "ERROR by insert into: " + coll.toString() + " ; Object " + doc.toString();
    msg = msg + " Error message: " + insert.getError();
}
// log the message
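As an alternative to hand-rolled messages (a different technique, not part of the driver): MongoDB's server-side profiler records every operation in the database's system.profile collection, which gives you a shell-style view of each insert. A sketch with the same 2.x driver, assuming a database named "mydb":

// Enable profiling level 2 = record all operations on this database.
DB db = mongoClient.getDB("mydb");
db.command(new BasicDBObject("profile", 2));
// ... run the inserts to be logged ...
// Each recorded operation is a document in system.profile.
DBCursor ops = db.getCollection("system.profile").find();
while (ops.hasNext()) {
    System.out.println(ops.next());
}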