Query and Update Child Documents without knowing keys - mongodb

I have a collection with documents having the following format
{
    name: "A",
    details : {
        matchA: {
            comment: "Hello",
            score: 5
        },
        matchI: {
            score: 10
        },
        lastMatch: {
            score: 5
        }
    }
},
{
    name: "B",
    details : {
        match2: {
            score: 5
        },
        match7: {
            score: 10
        },
        firstMatch: {
            score: 5
        }
    }
}
I don't immediately know the names of the keys that are children of details; they don't follow a known format, and there can be different numbers of them.
I would like to write a query which will update the children in such a manner that any subdocument with a score less than 5 gets a new field added (say lowScore: true).
I've looked around a bit and I found $ and $elemMatch, but those only work on arrays. Is there an equivalent for subdocuments? Is there some way of doing it using the aggregation pipeline?

I don't think you can do that using a normal update(). There is a way through the aggregation framework, which, however, cannot itself alter any persisted data. So you will need to loop through the results and update your documents individually, as described here: Aggregation with update in mongoDB
This is the required query to transform your data into what you need for the subsequent update:
collection.aggregate({
    $addFields: {
        "details": {
            $objectToArray: "$details" // transform "details" into a uniform array of key-value pairs
        }
    }
}, {
    $unwind: "$details" // flatten the array created above
}, {
    $match: {
        "details.v.score": {
            $lt: 10 // filter out anything that's not relevant to us
                    // (please note that I used a different filter than the "score less than 5"
                    // you wanted, in order to get some results from your sample data)
        },
        "details.v.lowScore": { // this filter is not strictly required, but it seems sensible to check
            $exists: false      // for the absence of the field you want to create, in case you run the query repeatedly
        }
    }
}, {
    $project: {
        "fieldsToUpdate": "$details.k" // keep only the keys of the subdocuments that need the update
    }
})
Running this query returns:
/* 1 */
{
    "_id" : ObjectId("59cc0b6afab2f8c9e1404641"),
    "fieldsToUpdate" : "matchA"
}
/* 2 */
{
    "_id" : ObjectId("59cc0b6afab2f8c9e1404641"),
    "fieldsToUpdate" : "lastMatch"
}
/* 3 */
{
    "_id" : ObjectId("59cc0b6afab2f8c9e1404643"),
    "fieldsToUpdate" : "match2"
}
/* 4 */
{
    "_id" : ObjectId("59cc0b6afab2f8c9e1404643"),
    "fieldsToUpdate" : "firstMatch"
}
You could then $set your new field "lowScore" using a cursor as described in the linked answer above.
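For completeness, a minimal sketch of that loop in the shell, assuming the collection handle is literally collection as above (adjust to your driver):
db.collection.aggregate([
    { $addFields: { "details": { $objectToArray: "$details" } } },
    { $unwind: "$details" },
    { $match: { "details.v.score": { $lt: 10 }, "details.v.lowScore": { $exists: false } } },
    { $project: { "fieldsToUpdate": "$details.k" } }
]).forEach(function(doc) {
    // build the dynamic field path, e.g. "details.matchA.lowScore"
    var setter = {};
    setter["details." + doc.fieldsToUpdate + ".lowScore"] = true;
    db.collection.update({ "_id": doc._id }, { "$set": setter });
});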


How to extract grouped results from array in $group stage and return as separate fields?

I'm running an aggregation query, and the $group stage is as follows:
$group:
{
    _id:
    {
        year_month: { $dateToString: { "date": "$updated_at", "format": "%Y-%m" } },
        client_name: "$clients_docs.client_name",
        client_label: "$clients_docs.client_label",
        client_code: "$clients_docs.client_code",
        client_country: "$clients_docs.client_country",
        base_curr: "$clients_docs.client_base_currency",
        inv_curr: "$clients_docs.client_invoice_currency",
        dest_curr: "$store.destination_currency"
    },
    total_vol: { $sum: "$USD_Value" },
    total_tran: { $sum: 1 }
}
It returns the correct results, and returns all the grouped results in the _id:{} array.
I now want to extract all those fields from the array and return them outside of it, so I can more easily export the output to a spreadsheet.
I tried using this stage:
{
    $project:
    {
        year_month: 1,
        client_name: 1,
        client_label: 1,
        client_code: 1,
        client_country: 1,
        base_curr: 1,
        inv_curr: 1,
        dest_curr: 1,
        total_vol: 1,
        total_tran: 1
    }
},
But that returned the same results as the $group stage:
{
    "_id" : {
        "year_month" : "2022-01",
        "client_name" : "client A",
        "client_label" : "client A",
        "client_code" : NumberInt(0000),
        "client_country" : "TH",
        "base_curr" : "USD",
        "inv_curr" : "USD",
        "dest_curr" : "HKD"
    },
    "total_vol" : 100000,
    "total_tran" : 100.0
}
I want the "year_month" through "dest_curr" fields at the same level as the "total_vol" and "total_tran", so that when the data is exported they all appear as separate columns (now it's all captured as one "_id" column, and a "total_vol" and "total_tran" column). What's the best way to do this?
From a terminology perspective, you currently have an embedded document (or nested fields) rather than an array.
The straightforward way to do this is to simply enumerate each field in a $project stage, e.g.:
"year_month": "$_id.year_month",
There are fancier ways to do this, but as you only have a handful of fields this should suffice. Working playground example here.
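In full, that $project stage would look something like this (a sketch using the field names from the question; _id is suppressed so only the flattened columns remain):
{
    $project: {
        _id: 0,
        year_month: "$_id.year_month",
        client_name: "$_id.client_name",
        client_label: "$_id.client_label",
        client_code: "$_id.client_code",
        client_country: "$_id.client_country",
        base_curr: "$_id.base_curr",
        inv_curr: "$_id.inv_curr",
        dest_curr: "$_id.dest_curr",
        total_vol: 1,
        total_tran: 1
    }
}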
Edit
An alternative ("fancier") approach is to leverage the $replaceWith stage using the $mergeObjects operator inside of it. Then you can $unset the previous _id field afterwards. It would look like this:
db.collection.aggregate([
    {
        "$replaceWith": {
            "$mergeObjects": [
                "$$ROOT",
                "$_id"
            ]
        }
    },
    {
        $unset: "_id"
    }
])
Playground link here
I also fixed the earlier playground link that had a typo for the client_label field.

In mongodb know index of array element matched with $in operator?

I am using aggregation with MongoDB and I am facing a problem: I am trying to match documents that are present in my input array using the $in operator. Now I want to know the index of the element in the input array that was matched. Can anyone please tell me how I can do that?
My code
var coupon_ids = ["58455a5c1f65d363bd5d2600", "58455a5c1f65d363bd5d2601", "58455a5c1f65d363bd5d2602"];
couponmodel.aggregate(
    { $match : { '_id': { $in : coupon_ids } } },
    /* Here I want to know the index of the coupon_ids element that was matched,
       because I want to perform some operation on it in the code below */
    function(err, docs) {
        if (err) {
        } else {
        }
    });
Couponmodel Schema
var CouponSchema = new Schema({
    category: { type: String },
    coupon_name: { type: String }, // this is a string
});
UPDATE:
As suggested by user3124885, aggregation may not be better in performance. Can anyone please tell me the performance difference between aggregation and a normal query in MongoDB, and which one is better?
UPDATE:
I read this question on SO: mongodb-aggregation-match-vs-find-speed. There the user himself commented that both take the same time, but going by vlad-z's answer I think aggregation is better. If any of you have worked with MongoDB, please share your opinion on this.
UPDATE:
I used sample JSON data containing 30,000 rows and tried a $match with aggregation vs. a find query: the aggregation executed in 180 ms, while the find query took 220 ms. I also ran $lookup, and it took no more than 500 ms, so I think aggregation is a bit faster than a normal query. Please correct me if any of you have tried aggregation, and if not, why not?
UPDATE-
I read this post where a user uses the code below as a replacement for $zip (SERVER-20163), but I am not getting how I can solve my problem using it. Can anybody please tell me how I can use the code below to solve my issue?
{$map: {
    input: {
        elt1: "$array1",
        elt2: "$array2"
    },
    in: ["$elt1", "$elt2"]
}}
Can anyone please help me? It would really be a great favor for me.
So say we have the following in the database collection:
> db.couponmodel.find()
{ "_id" : "a" }
{ "_id" : "b" }
{ "_id" : "c" }
{ "_id" : "d" }
and we wish to search for the following ids in the collection:
var coupons_ids = ["c", "a" ,"z"];
We'll then have to build up a dynamic projection stage so that we can project the correct indexes, mapping each id to its corresponding index:
var conditions = coupons_ids.map(function(value, index) {
    return { $cond: { if: { $eq: ['$_id', value] }, then: index, else: -1 } };
});
Then we can inject this into our aggregation pipeline:
db.couponmodel.aggregate([
    { $match : { '_id' : { $in : coupons_ids } } },
    { $project : { indexes : conditions } },
    { $project : {
        index : {
            $filter : {
                input: "$indexes", as: "indexes", cond: { $ne: [ "$$indexes", -1 ] }
            }
        }
    } },
    { $unwind: '$index' }
]);
Running the above will now output each _id and its corresponding index within the coupons_ids array:
{ "_id" : "a", "index" : 1 }
{ "_id" : "c", "index" : 0 }
However, we can also add more stages to the end of the pipeline and reference $index to get the current matched index.
I think you could do it in a faster way by simply retrieving the array and searching it manually. Remember that aggregation doesn't automatically give you better performance.
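A sketch of that manual, client-side approach (names taken from the question; the exact driver API may differ):
couponmodel.find({ _id: { $in: coupon_ids } }, function(err, docs) {
    if (err) return;
    docs.forEach(function(doc) {
        // indexOf gives the position of the matched _id in the input array
        var index = coupon_ids.indexOf(doc._id.toString());
        // ... perform the index-dependent operation here
    });
});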
// $match combining $in and $and
db.collection.aggregate([{
    $match: {
        $and: [
            { "uniqueID": { $in: ["CONV0001"] } },
            { "parentID": { $in: ["null"] } }
        ]
    }
}])

How to project whether field exists

I have documents with a structure similar to the one below. I am updating them with the results of computations, and I want to know whether a result has already been inserted into a document or not. Let's say for each document I run computation 'c' and computation 'd'. Now I want to display a table of all documents and show whether computation 'd' has already been carried out. For this table I do not care about computation 'c'.
{
    "_id" : 1,
    "a" : 1,
    "resultsOfComputation" : {
        "c" : {large embedded document},
        "d" : {large embedded document}
    }
}
{
    "_id" : 2,
    "a" : 1,
    "resultsOfComputation" : {
        "c" : {large embedded document}
    }
}
I would like to get a result that tells me whether a document contains a specific field. For example, I would like to know whether it contains the field "resultsOfComputation.d", regardless of that field's value.
An example of the result of the query for "resultsOfComputation.d" would be:
{
    "_id" : 1,
    "a" : 1,
    "resultsOfComputation" : {
        "d" : true
    }
}
{
    "_id" : 2,
    "resultsOfComputation" : {
        "d" : false
    }
}
If "resultsOfComputation.d" is not in the document it can also be undefined, which is also ok:
{
    "_id" : 1,
    "a" : 1,
    "resultsOfComputation" : {
        "d" : true
    }
}
{
    "_id" : 2,
    "a" : 1,
    "resultsOfComputation" : {}
}
In general, the idea is to get all the root elements of the documents, but only true/false/undefined for the selected (one) result of computation, since the result of computation is a large embedded document.
Run the following aggregation pipeline to get the desired results. It relies on MongoDB's BSON comparison order, in which any existing, non-null value compares greater than null, while a missing field does not:
db.collection.aggregate([
    {
        "$project": {
            "a": 1,
            "resultsOfComputation": {
                "d": { "$gt": ["$resultsOfComputation.d", null] }
            }
        }
    }
])
Sample Output
/* 1 */
{
    "_id" : 1,
    "a" : 1,
    "resultsOfComputation" : {
        "d" : true
    }
}
/* 2 */
{
    "_id" : 2,
    "a" : 1,
    "resultsOfComputation" : {
        "d" : false
    }
}
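As an alternative sketch (not from the original answer): the $type operator, available since MongoDB 3.4, returns the string "missing" for absent fields, so the same check can be expressed as:
db.collection.aggregate([
    {
        "$project": {
            "a": 1,
            "resultsOfComputation": {
                // true whenever the field exists with any type, false when absent
                "d": { "$ne": [{ "$type": "$resultsOfComputation.d" }, "missing"] }
            }
        }
    }
])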

MongoDB Aggregation with DBRef

Is it possible to aggregate on data that is stored via DBRef?
Mongo 2.6
Let's say I have transaction data like:
{
    _id : ObjectId(...),
    user : DBRef("user", ObjectId(...)),
    product : DBRef("product", ObjectId(...)),
    source : DBRef("website", ObjectId(...)),
    quantity : 3,
    price : 40.95,
    total_price : 122.85,
    sold_at : ISODate("2015-07-08T09:09:40.262-0700")
}
The trick is "source" is polymorphic in nature - it could be different $ref values such as "webpage", "call_center", etc that also have different ObjectIds. For example DBRef("webpage", ObjectId("1")) and DBRef("webpage",ObjectId("2")) would be two different webpages where a transaction originated.
I would like to ultimately aggregate by source over a period of time (like a month):
db.coll.aggregate(
    { $match : { sold_at : { $gte : start, $lt : end } } },
    { $project : { source : 1, total_price : 1 } },
    { $group : {
        _id : { "source.$ref" : "$source.$ref" },
        count : { $sum : "$total_price" }
    } } );
The trick is that you get a path error when trying to use a field whose name starts with $, whether by grouping on it or by transforming it using expressions via $project.
Is there any way to do this? I'm actually trying to push this data via aggregation into a subcollection so I can operate on it there, and I'm trying to avoid a large cursor operation over millions of records just to transform the data so I can aggregate it.
Mongo 4: I solved this issue in the following way.
Having this structure:
{
    "_id" : LUUID("144e690f-9613-897c-9eab-913933bed9a7"),
    "owner" : {
        "$ref" : "person",
        "$id" : NumberLong(10)
    },
    ...
}
I needed to use the "owner.$id" field, but because of the "$" in the field name I was unable to use it in aggregation.
I transformed "owner.$id" -> "owner" using the following snippet:
db.activities.aggregate([
    {
        $addFields: {
            "owner": {
                // $objectToArray turns the DBRef into [{ k: "$ref", v: ... }, { k: "$id", v: ... }];
                // element 1 is the $id entry
                $arrayElemAt: [{ $objectToArray: "$owner" }, 1]
            }
        }
    },
    {
        $addFields: {
            "owner": "$owner.v"
        }
    },
    { "$group" : { _id: "$owner", count: { $sum: 1 } } },
    { $sort: { "count": -1 } }
])
Detailed explanations here - https://dev.to/saurabh73/mongodb-using-aggregation-pipeline-to-extract-dbref-using-lookup-operator-4ekl
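Building on that, if you also need the referenced documents themselves, a $lookup can be chained after the extraction. A minimal sketch, assuming the DBRefs point at a person collection as in the structure above:
db.activities.aggregate([
    // pull the $id out of the DBRef, as above
    { $addFields: { ownerId: { $arrayElemAt: [{ $objectToArray: "$owner" }, 1] } } },
    { $addFields: { ownerId: "$ownerId.v" } },
    // join against the referenced collection
    { $lookup: {
        from: "person",
        localField: "ownerId",
        foreignField: "_id",
        as: "ownerDoc"
    } }
])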
You cannot use DBRef values with the aggregation framework. Instead you need the JavaScript processing of mapReduce in order to access the property naming that they use:
db.coll.mapReduce(
    function() {
        emit( this.source.$ref, this["total_price"] )
    },
    function(key, values) {
        return Array.sum( values );
    },
    {
        "query": { "sold_at": { "$gte": start, "$lt": end } },
        "out": { "inline": 1 }
    }
)
You really should not be using DBRef at all. Its usage is basically deprecated now, and if you feel you need external referencing you should be "manually referencing" it with your own code or another library, which lets you do so in a much more supported way.
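For illustration, a manual reference just stores the target collection and key as plain fields (the field names below are hypothetical), and such fields aggregate without any trouble:
// Document shape with a manual reference instead of a DBRef:
// { source: { kind: "webpage", id: ObjectId("...") }, total_price: 122.85, ... }
db.coll.aggregate([
    { $match: { sold_at: { $gte: start, $lt: end } } },
    { $group: { _id: "$source.kind", count: { $sum: "$total_price" } } }
])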

MongoDB conditionally $addToSet sub-document in array by specific field

Is there a way to conditionally $addToSet based on a specific key field in a subdocument on an array?
Here's an example of what I mean - given the collection produced by the following sample bootstrap;
cls
db.so.remove();
db.so.insert({
    "Name": "fruitBowl",
    "pfms" : [
        {
            "n" : "apples"
        }
    ]
});
n defines a unique document key. I only want one entry with the same n value in the array at any one time. So I want to be able to update the pfms array using n so that I end up with just this;
{
    "Name": "fruitBowl",
    "pfms" : [
        {
            "n" : "apples",
            "mState": 1111234
        }
    ]
}
Here's where I am at the moment;
db.so.update({
    "Name": "fruitBowl",
}, {
    // not allowed to do this of course
    // "$pull": {
    //     "pfms": { n: "apples" },
    // },
    "$addToSet": {
        "pfms": {
            "$each": [
                {
                    "n": "apples",
                    "mState": 1111234
                }
            ]
        }
    }
})
Unfortunately, this adds another array element;
db.so.find().toArray();
[
    {
        "Name" : "fruitBowl",
        "_id" : ObjectId("53ecfef5baca2b1079b0f97c"),
        "pfms" : [
            {
                "n" : "apples"
            },
            {
                "n" : "apples",
                "mState" : 1111234
            }
        ]
    }
]
I need to effectively upsert the apples document matching on n as the unique identifier and just set mState whether or not an entry already exists. It's a shame I can't do a $pull and $addToSet in the same document (I tried).
What I really need here is dictionary semantics, but that's not an option right now, nor is breaking out the document - can anyone come up with another way?
FWIW - the existing format is a result of language/driver serialization, I didn't choose it exactly.
further
I've gotten a little further in the case where I know the array element already exists I can do this;
db.so.update({
    "Name": "fruitBowl",
    "pfms.n": "apples",
}, {
    $set: {
        "pfms.$.mState": 1111234,
    },
})
But of course that only works:
- for a single array element
- as long as I know it exists
The first limitation isn't a disaster, but if I can't effectively upsert or combine $addToSet with the previous $set (which of course I can't), then the only workarounds I can think of for now involve two DB round-trips.
The $addToSet operator of course requires that the "whole" document being "added to the set" is in fact unique, so you cannot change "part" of the document or otherwise consider it to be a "partial match".
You stumbled on to your best approach using $pull to remove any element with the "key" field that would result in "duplicates", but of course you cannot modify the same path in different update operators like that.
So the closest thing you will get is issuing separate operations but also doing that with the "Bulk Operations API" which is introduced with MongoDB 2.6. This allows both to be sent to the server at the same time for the closest thing to a "contiguous" operations list you will get:
var bulk = db.so.initializeOrderedBulkOp();
bulk.find({ "Name": "fruitBowl", "pfms.n": "apples" }).updateOne({
    "$pull": { "pfms": { "n": "apples" } }
});
bulk.find({ "Name": "fruitBowl" }).updateOne({
    "$push": { "pfms": { "n": "apples", "mState": 1111234 } }
});
bulk.execute();
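On newer servers and shells, the same ordered pair of operations can be expressed with bulkWrite; a sketch of the equivalent call:
db.so.bulkWrite([
    { updateOne: {
        filter: { "Name": "fruitBowl", "pfms.n": "apples" },
        update: { "$pull": { "pfms": { "n": "apples" } } }
    } },
    { updateOne: {
        filter: { "Name": "fruitBowl" },
        update: { "$push": { "pfms": { "n": "apples", "mState": 1111234 } } }
    } }
]);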
That pretty much is your best approach if it is not possible or practical to move the elements to another collection and rely on "upserts" and $set in order to have the same functionality but on a collection rather than array.
I have faced the exact same scenario; I was inserting and removing likes from a post.
What I did was use mongoose's findOneAndUpdate function (which is similar to the update or findAndModify function in MongoDB).
The key concept is:
- Insert when the field is not present
- Delete when the field is present
The insert is
findOneAndUpdate({ _id: theId, 'likes.userId': { $ne: theUserId } },
    { $push: { likes: { userId: theUserId, createdAt: new Date() } } },
    { 'new': true }, function(err, post) { /* do the needful */ });
The delete is
findOneAndUpdate({ _id: theId, 'likes.userId': theUserId },
    { $pull: { likes: { userId: theUserId } } },
    { 'new': true }, function(err, post) { /* do the needful */ });
This makes the whole operation atomic, and there are no duplicates with respect to the userId field.
I hope this helps. If you have any questions, feel free to ask.
As far as I know, MongoDB (from v4.2) allows the use of aggregation pipelines in updates; the arrayFilters option used below has been available since v3.6.
A more or less elegant way to make it work (according to the question) looks like the following:
db.runCommand({
    update: "your-collection-name",
    updates: [
        {
            q: {},
            u: {
                $set: {
                    "pfms.$[elem]": {
                        "n": "apples",
                        "mState": NumberInt(1111234)
                    }
                }
            },
            arrayFilters: [
                {
                    "elem.n": {
                        $eq: "apples"
                    }
                }
            ],
            multi: true
        }
    ]
})
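One caveat: $set on "pfms.$[elem]" replaces the matched element wholesale. If you only want to add or update mState and keep any other fields on the element, a sketch of that variant (same arrayFilters idea, MongoDB 3.6+) would be:
db.so.updateMany(
    { "Name": "fruitBowl" },
    { $set: { "pfms.$[elem].mState": NumberInt(1111234) } },
    { arrayFilters: [ { "elem.n": "apples" } ] }
);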
In my scenario, the data needs to be initialized if it does not exist and a field updated if it does, and the data is never deleted. If your data has these states, you might want to try the following method.
// Mongoose, but mostly the same as plain MongoDB.
// Update the tag on the user, if one already exists.
const user = await UserModel.findOneAndUpdate(
    {
        user: userId,
        'tags.name': tag_name,
    },
    {
        $set: {
            'tags.$.description': tag_description,
        },
    }
)
    .lean()
    .exec();
// Otherwise add a default tag to the user.
if (user == null) {
    await UserModel.findOneAndUpdate(
        {
            user: userId,
        },
        {
            $push: {
                tags: new Tag({
                    name: tag_name,
                    description: tag_description,
                }),
            },
        }
    );
}
This is the cleanest and fastest method for this scenario.
As a business analyst, I had the same problem, and after hours of investigation I hopefully have a solution.
// The customer document:
{
    "id" : "1212",
    "customerCodes" : [
        {
            "code" : "I"
        },
        {
            "code" : "YK"
        }
    ]
}
// The problem: I want to insert dateField "01.01.2016" into customer documents whose customerCodes list has a subdocument with code "YK" but no dateField. The final document must be as follows:
{
    "id" : "1212",
    "customerCodes" : [
        {
            "code" : "I"
        },
        {
            "code" : "YK",
            "dateField" : "01.01.2016"
        }
    ]
}
// The solution comes in three steps:
// PART 1 - Find the customers with customerCodes "YK" but without dateField
// PART 2 - Find the index of the subdocument with "YK" in the customerCodes list
// PART 3 - Insert the value into the document
// Here is the code
// PART 1
var myCursor = db.customers.find({ customerCodes: { $elemMatch: { code: "YK", dateField: { $exists: false } } } });
// PART 2
myCursor.forEach(function(customer) {
    if (customer.customerCodes != null) {
        var size = customer.customerCodes.length;
        if (size > 0) {
            var iFoundTheIndexOfSubDocument = -1;
            var index = 0;
            customer.customerCodes.forEach(function(clazz) {
                if (clazz.code == "YK" && clazz.dateField == null) {
                    iFoundTheIndexOfSubDocument = index;
                }
                index++;
            })
            // PART 3
            // What happens here is: if I found the index of the
            // "YK" subdocument, I create an "updates" document which
            // corresponds to the new data to be inserted
            if (iFoundTheIndexOfSubDocument != -1) {
                var toSet = "customerCodes." + iFoundTheIndexOfSubDocument + ".dateField";
                var updates = {};
                updates[toSet] = "01.01.2016";
                db.customers.update({ "id" : customer.id }, { $set: updates });
                // This statement is actually interpreted like this:
                // db.customers.update({ "id" : "1212" }, { $set: { "customerCodes.1.dateField" : "01.01.2016" } });
            }
        }
    }
});
Have a nice day !