BulkWriteError - MongoDB - mongodb

I understand this error occurs because of _id duplication in my code. I'm writing an aggregation pipeline whose _id contains no unique fields, and I guess that is where the duplication occurs. I want to either insert documents with a duplicate _id (which I suppose I can't) or create a unique _id object.
I tried using BSON.ObjectId() to generate a unique id with insert_many(), but it says it cannot encode object: {ObjectId('5d4a71227e16ce9599a8d6ac')}, of type:
col.aggregate([
    {
        '$project': {
            # parse the createdAt string and reduce it to a YYYY-MM-DD day
            'date': {'$dateToString': {
                'format': '%Y-%m-%d',
                'date': {'$dateFromString': {'dateString': '$createdAt'}}}},
            'cinemaid': 1,
            'planid': 1,
            'location': 1,
            'creditworth': 1,
            'amountpaid': 1,
            'upsize': 1
        }
    },
    {
        '$group': {
            # compound group key; the accumulators sit next to it, not inside it
            '_id': {'location': '$location', 'cinemaid': '$cinemaid',
                    'planid': '$planid', 'date': '$date'},
            'creditworth': {'$sum': '$creditworth'},
            'addons': {'$sum': '$amountpaid'},
            'upsize': {'$sum': '$upsize'}
        }
    }
])
I expect each aggregated document to be inserted into the collection, whether it ends up with a unique _id or a duplicate one.
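One possible way around the duplicate key, sketched under assumptions: if the aggregated documents carry no _id when they reach insert_many(), PyMongo assigns a fresh ObjectId to each one. The helper below moves the compound group key out of _id into ordinary fields before inserting. The name prepare_for_insert and the target collection in the comment are hypothetical, not part of the question. As a side note, the braces in the quoted error message look like a Python set wrapping the ObjectId, which BSON cannot encode; that is only a guess from the message.

```python
def prepare_for_insert(grouped_docs):
    """Return copies of $group results with the group key moved out of `_id`.

    Documents without `_id` get a new ObjectId from the driver on insert,
    so re-inserting the same group twice no longer collides.
    """
    prepared = []
    for doc in grouped_docs:
        doc = dict(doc)              # shallow copy; the caller's data stays untouched
        key = doc.pop('_id', None)   # compound key produced by $group
        if isinstance(key, dict):
            doc.update(key)          # location, cinemaid, planid, date become plain fields
        elif key is not None:
            doc['group_key'] = key   # hypothetical field name for a scalar key
        prepared.append(doc)
    return prepared


# Assumed usage, with `target` standing in for the destination collection:
# target.insert_many(prepare_for_insert(col.aggregate(pipeline)))
```

Keeping the group key as plain fields also leaves room for a compound index on them later, if uniqueness per group is needed after all.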

Related

MongoDB aggregate return averages only where date key is greater than a specific date in PyMongo

I have a collection that looks something like this;
{_id: 1204187,
'name' : 'name',
'date' : 2020-06-21T00:00:00.000+00:00,
'metric_a' : 88.14502,
'metric_b' : 31.26421,
'metric_c' : 1544.32414,
'info' : {'foreign_key' : 156789,
'country' : 'US',
'tags' : ['a', 'b', 'c']}}
I would like to return aggregated docs, but only aggregate documents where the date is greater than 7-1-2020.
Here is my first attempt;
date_obj = dt.datetime(today.year, today.month, 1)
docs = loads(dumps(collection.aggregate(
[{
'$match' : {
'_id' : '$name',
'metric_a' : {'$avg' : '$metric_a'},
'metric_b' : {'$avg' : '$metric_b'},
metric_c: {'$avg' : '$metric_c'},
},'$match' : {'date' : {'$gte' : date_obj}}
}])))
This ends up grouping each document with a different date separately. What am I missing?
Found a solution;
docs = loads(dumps(collection.aggregate(
[
{'$match' : {'date' : {'$gte' : date_obj}}},
{
'$group' : {
'_id' : '$name',
'metric_a' : {'$avg' : '$metric_a'},
'metric_b' : {'$avg' : '$metric_b'},
'metric_c' : {'$avg' : '$metric_c'},
'info' : {'$last' : '$info'}
}
}
])))
Pretty intuitive: put the $match conditions first, then the $group stage that defines how to aggregate and what to return.
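To make the stage order concrete, here is a plain-Python sketch of what the two stages compute: filter by date first, then average per name. It does not talk to MongoDB; the function name average_since and the sample values are made up for illustration.

```python
import datetime as dt
from collections import defaultdict
from statistics import mean


def average_since(docs, since, metric='metric_a'):
    """In-memory equivalent of $match on date followed by $group by name with $avg."""
    buckets = defaultdict(list)
    for doc in docs:
        if doc['date'] >= since:                      # the $match stage
            buckets[doc['name']].append(doc[metric])  # the $group key is the name
    return {name: mean(values) for name, values in buckets.items()}


# Hypothetical sample data, not taken from the question.
sample = [
    {'name': 'x', 'date': dt.datetime(2020, 6, 21), 'metric_a': 10.0},
    {'name': 'x', 'date': dt.datetime(2020, 7, 2), 'metric_a': 20.0},
    {'name': 'x', 'date': dt.datetime(2020, 7, 9), 'metric_a': 40.0},
    {'name': 'y', 'date': dt.datetime(2020, 7, 3), 'metric_a': 5.0},
]
result = average_since(sample, dt.datetime(2020, 7, 1))
```

The June document is dropped before any averaging happens, so it never influences the group for 'x'.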

What is an efficient MongoDB query to skip the update and return an error if a value already exists in an array?

I have an entry stored on my collection like this:
{
"_id" : ObjectId("5d416c595f19962ff0680dbc"),
"data" : {
"a" : 6,
"b" : [
"5c35f04c4e92b8337885d9a6"
]
},
"image" : "123.jpg",
"hyperlinks" : "google.com",
"expirydate" : ISODate("2019-08-27T06:10:35.074Z"),
"createdate" : ISODate("2019-07-31T10:24:25.311Z"),
"lastmodified" : ISODate("2019-07-31T10:24:25.311Z"),
"__v" : 0
},
{
"_id" : ObjectId("5d416c595f19962ff0680dbd"),
"data" : {
"a" : 90,
"b" : [
"5c35f04c4e92b8337885d9a7"
]
},
"image" : "456.jpg",
"hyperlinks" : "google.com",
"expirydate" : ISODate("2019-08-27T06:10:35.074Z"),
"createdate" : ISODate("2019-07-31T10:24:25.311Z"),
"lastmodified" : ISODate("2019-07-31T10:24:25.311Z"),
"__v" : 0
}
I have to write a query that pushes a user id onto the b array (under the data object) and increments the a counter (also under the data object).
For that, I wrote this code:
db.collection.updateOne({_id: ObjectId("5d416c595f19962ff0680dbd")},
{$inc: {'data.a': 1}, $push: {'data.b': '124sdff54f5s4fg5'}}
)
I also want to check whether that id already exists in the array and, if it does, return a response saying so. For that I wrote an extra query that checks for the id and returns an error response if it exists.
My question: can a single query do this? I don't want to write two queries for a single task.
Any help is really appreciated.
You can add one more check in the update query on "data.b". Following would be the query:
db.collection.updateOne(
{
_id: ObjectId("5d416c595f19962ff0680dbd"),
"data.b":{
$ne: "124sdff54f5s4fg5"
}
},
{
$inc: {'data.a': 1},
$push: {'data.b': '124sdff54f5s4fg5'}
}
)
For duplicate entry, you would get the following response:
{ "acknowledged" : true, "matchedCount" : 0, "modifiedCount" : 0 }
If matched count is 0, you can show the error that the id already exists.
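The same check can be sketched in PyMongo, assuming collection is a pymongo Collection (or anything with a compatible update_one). The helper name push_user_once is hypothetical. Note that matched_count is also 0 when no document has the given _id, so a False result means either "already present" or "no such document".

```python
def push_user_once(collection, doc_id, user_id):
    """Push user_id onto data.b and bump data.a, unless user_id is already there.

    Returns True when a document matched (and was updated), False otherwise.
    """
    result = collection.update_one(
        # The $ne condition fails when user_id is already in data.b,
        # so nothing matches and neither operator is applied.
        {'_id': doc_id, 'data.b': {'$ne': user_id}},
        {'$inc': {'data.a': 1}, '$push': {'data.b': user_id}},
    )
    return result.matched_count == 1
```

Because the filter and the update run as one operation on a single document, there is no window between the check and the push.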
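You can use the $addToSet operator, which adds the element only if it does not already exist in the array. Note that $inc is still applied on every call, so data.a is incremented even when the id is already present.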
db.collection.updateOne({_id: ObjectId("5d416c595f19962ff0680dbd")},
{$inc: {'data.a': 1}, $addToSet: {'data.b': '124sdff54f5s4fg5'}}
)

MongoDB update latest subdocument

Here is my Mongo document:
{
"_id" : ObjectId("5a69d0acb76d1c2e08e4ccd8"),
"subscriptions" : [
{
"sub_id" : "5a56fd399dd78e33948c9b8e",
"invoice_id" : "5a56fd399dd78e33948c9b8d"
},
{
"sub_id" : "5a56fd399dd78e33948c9b8e"
}
]
}
I want to update (upsert) invoice_id into the last element of the subscriptions array.
I have tried:
sort: {$natural: -1},
subscription.$.invoice
What I want it to be is:
{
"_id" : ObjectId("5a69d0acb76d1c2e08e4ccd8"),
"subscriptions" : [
{
"sub_id" : "5a56fd399dd78e33948c9b8e",
"invoice_id" : "5a56fd399dd78e33948c9b8d"
},
{
"sub_id" : "5a56fd399dd78e33948c9b8e",
"invoice_id" : "5a56fd399dd78e33948c9b8f"
}
]
}
While there are ways to get the last array element, like Saravana shows in her answer, I don't recommend doing it that way because it introduces race conditions. For example, if two subs are added simultaneously, you can't depend on which one is 'last' in the array.
If an invoice_id has to be tied to a specific sub_id, then it's far better to query and find that specific element in the array, then add the invoice_id to it.
In the comments, the OP indicated that the current order of operations is 1) add sub_id, 2) insert the invoice record into the INVOICE collection and get the invoice_id, 3) add the invoice_id into the new subscription.
However, if you already have the sub_id, then it's better to re-order your operations this way: 1) insert the invoice record and get the invoice_id 2) add both sub_id and invoice_id with a single operation.
Doing this improves performance (eliminates the second update operation), but more importantly, eliminates race conditions because you're adding both sub_id and invoice_id at the same time.
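The two approaches described above can be sketched in PyMongo as follows, assuming collection behaves like a pymongo Collection. The function names are hypothetical. The first helper adds sub_id and invoice_id together; the second targets a specific existing element by sub_id, restricted to one that has no invoice_id yet, instead of relying on array position.

```python
def add_subscription(collection, doc_id, sub_id, invoice_id):
    """Append a complete subscription in one operation (invoice created beforehand)."""
    return collection.update_one(
        {'_id': doc_id},
        {'$push': {'subscriptions': {'sub_id': sub_id, 'invoice_id': invoice_id}}},
    )


def attach_invoice(collection, doc_id, sub_id, invoice_id):
    """Set invoice_id on the first subscription with this sub_id that lacks one."""
    return collection.update_one(
        {
            '_id': doc_id,
            # $elemMatch makes both conditions apply to the same array element
            'subscriptions': {'$elemMatch': {
                'sub_id': sub_id,
                'invoice_id': {'$exists': False},
            }},
        },
        # the positional $ refers to the element matched by the filter
        {'$set': {'subscriptions.$.invoice_id': invoice_id}},
    )
```

The $exists condition matters for data like the sample document, where two elements share the same sub_id and only one of them is still missing its invoice.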
We can fetch the document and update the last element by index:
> var doc = db.sub.findOne({"_id" : ObjectId("5a69d0acb76d1c2e08e4ccd8")})
> if ( doc.subscriptions.length - 1 >= 0 )
doc.subscriptions[doc.subscriptions.length-1].invoice_id="5a56fd399dd78e33948c9b8f"
> db.sub.update({_id:doc._id},doc)
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
Or write an aggregation pipeline that builds the document and use its output for the update:
db.sub.aggregate(
[
{$match : { "_id" : ObjectId("5a69d0acb76d1c2e08e4ccd8") }},
{$addFields : { last : { $subtract : [{$size : "$subscriptions"},1]}}},
{$unwind : { path :"$subscriptions" , includeArrayIndex : "idx"}},
{$project : { "subscriptions.sub_id" : 1,
"subscriptions.invoice_id" : {
$cond : {
if: { $eq: [ "$idx", "$last" ] },
then: "5a56fd399dd78e33948c9b8f",
else: "$$REMOVE"
}
}
}
},
{$group : {_id : "$_id", subscriptions : {$push : "$subscriptions"}}}
]
).pretty()
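Resulting document (note that "$$REMOVE" in the else branch also drops the existing invoice_id from the earlier elements):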
{
"_id" : ObjectId("5a69d0acb76d1c2e08e4ccd8"),
"subscriptions" : [
{
"sub_id" : "5a56fd399dd78e33948c9b8e"
},
{
"sub_id" : "5a56fd399dd78e33948c9b8e",
"invoice_id" : "5a56fd399dd78e33948c9b8f"
}
]
}

How to group in MongoDB without including _id

I want the result to look like this:
{
"id" : 888789999,
"name" : "Malaysian with Attendance Allowance",
}
But when I tried
{$group : {
'id' : '$profiles.id',
'name' : {$first:'$profiles.name'},
}}
I get an error:
"errmsg" : "The field 'id' must be an accumulator object",
You can try this to group by profile id and take the first name; add a $project stage if you need id without the underscore.
In $group, every field other than _id must be an accumulator expression.
{$group : {
'_id' : '$profiles.id',
'name' : {$first:'$profiles.name'},
}}
If you don't want to group on a specific field, set _id to null (all documents then fall into a single group):
{$group : {
_id: null,
'id' : {$first:'$profiles.id'},// any accumulation which you need
'name' : {$first:'$profiles.name'},
}}
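To return the key as id instead of _id, group on _id and rename it in a following $project stage:
db.collectionName.aggregate(
    [
        // group by profile id, keeping the first name seen for each group
        { $group : { _id : '$profiles.id', name : { $first : '$profiles.name' } } },
        // expose the group key as id and hide _id
        { $project : { _id : 0, id : '$_id', name : 1 } }
    ]
)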

$add operation returns null value

I have collection with document structure :
{
'year' : 2014,
'month' : 1
}
I am executing the following operation :
db.collname.aggregate(
[
{
$project : {
'year100' : {$multiply : ["$year" , 100]},
'result' : { '$add' : ['$year100', '$month'] }
}
}
]
);
I get the following result :
{
"result" : [
{
"_id" : ObjectId("5563596c515a88832210f0e4"),
"year100" : 201400.0000000000000000,
"result" : null
},
}
Why is the $add operation returning null instead of the actual value? Please help.
A field computed in a $project stage cannot be referenced by another expression in the same stage: every expression is evaluated against the input document, where year100 does not exist yet, so $add receives a missing value and returns null. Instead of one $project, use two separate stages like this:
db.collname.aggregate([
    { $project : { 'year100' : { $multiply : ["$year", 100] }, 'month' : "$month" } },
    { $project : { 'year100' : 1, 'result' : { $add : ["$year100", "$month"] } } }
])
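As an alternative sketch, the multiplication can be nested inside $add so that a single $project stage never refers to a field it is creating itself. The pipeline below is written as a PyMongo-style list of dicts; the function names are hypothetical, and expected_result only spells out the arithmetic the pipeline is meant to perform.

```python
def single_stage_pipeline():
    """One $project stage: result is computed from the input fields only."""
    year_times_100 = {'$multiply': ['$year', 100]}
    return [
        {'$project': {
            'year100': year_times_100,
            # reuse the expression itself rather than the not-yet-existing year100 field
            'result': {'$add': [year_times_100, '$month']},
        }},
    ]


def expected_result(year, month):
    """Plain-Python version of the same computation, e.g. 2014 and 1 give 201401."""
    return year * 100 + month


# Assumed usage with a pymongo collection named `coll`:
# docs = list(coll.aggregate(single_stage_pipeline()))
```

Both variants produce the same values; the two-stage form is easier to read when the intermediate field is reused in several later expressions.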