How do I update a field in a sub-document array with a field from the document in MongoDB? - mongodb

I have a large amount of data (~160M items) where a date value wasn't populated on the sub-document array fields, but was populated on the parent document. I'm very new to MongoDB and having trouble figuring out how to $set the field to match. Here's a sample of the data:
{
"_id": "5f11d4c48663f32e940696ed",
"Widgets":[{
"WidgetId":663,
"Name":"Super Widget 2.0",
"Created":null,
"LastUpdated":null
}],
"Status":3,
"LastUpdated":null,
"Created": "2018-11-09T18:22:16.000Z"
}
My knowledge of MongoDB is pretty limited but here's the basic aggregation I have created for part of the pipeline and where I'm struggling:
db.sample.aggregate(
[
{
"$match" : {
"Donors.$.Created" : {
"$exists" : true
}
}
},
{
"$match" : {
"Widgets.$.Created" : null
}
},
{
"$set" : {
"Widgets.$.Created" : "Created" // <- This is where I can't figure out how to define the reference to the parent "Created" field
}
}
]
);
The desired output would be:
{
"_id": "5f11d4c48663f32e940696ed",
"Widgets":[{
"WidgetId":663,
"Name":"Super Widget 2.0",
"Created":"2018-11-09T18:22:16.000Z",
"LastUpdated":null
}],
"Status":3,
"LastUpdated":null,
"Created": "2018-11-09T18:22:16.000Z"
}
Thanks for any assistance.

Are you attempting to add the Created field to sub documents on query/aggregation? Or are you attempting to update/save the Created field on the subdocuments?
The $ is an update operator, to be used with updateMany or updateOne. Not aggregate.
https://docs.mongodb.com/manual/reference/operator/query-array/
https://docs.mongodb.com/manual/reference/operator/update-array/
If you just want to add the parent's Created field to all subdocuments at query/aggregation time, this is all you have to do: https://mongoplayground.net/p/yHDHULCSTIz
db.collection.aggregate([
{
"$addFields": {
"Widgets.Created": "$Created"
}
}
])
If you're attempting to save the parent's Created field to all subdocuments (a pipeline-style update like this requires MongoDB 4.2+):
db.sample.updateMany({"Widgets.Created" : null}, [{$set: {"Widgets.Created" : "$Created"}}])
Note: This matches any document that has at least one subdocument with a null Created field, and then overwrites Created on all of that document's subdocuments, not only the null ones.
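If only the elements whose Created is null should change, one possible variant rebuilds the array with $map. This is a minimal sketch, not something tested against the asker's data: it assumes MongoDB 4.2+ and the field names from the sample, and the names filter and pipeline are just illustrative. Note that $ifNull also falls back when Created is missing, not only when it is null.

```javascript
// Filter: documents with at least one widget whose Created is null.
const filter = { "Widgets.Created": null };

// Update pipeline: rebuild Widgets, keeping each element's own Created
// when it is set and falling back to the parent's Created otherwise.
const pipeline = [
  {
    $set: {
      Widgets: {
        $map: {
          input: "$Widgets",
          as: "w",
          in: {
            $mergeObjects: [
              "$$w",
              { Created: { $ifNull: ["$$w.Created", "$Created"] } }
            ]
          }
        }
      }
    }
  }
];

// In the mongo shell (not executed here):
// db.sample.updateMany(filter, pipeline);
```

With ~160M items it is probably worth trying this on a small copy of the collection first.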

Related

How to extract grouped results from array in $group stage and return as separate fields?

I'm running an aggregation query, and the $group stage is as follows
$group:
{
_id:
{
year_month: { $dateToString: { "date": "$updated_at", "format": "%Y-%m" } }
,client_name: "$clients_docs.client_name"
,client_label: "$clients_docs.client_label"
,client_code: "$clients_docs.client_code"
,client_country: "$clients_docs.client_country"
,base_curr: "$clients_docs.client_base_currency"
,inv_curr: "$clients_docs.client_invoice_currency"
,dest_curr: "$store.destination_currency"
}
,total_vol: { $sum: "$USD_Value" }
,total_tran: { $sum: 1 }
}
It returns the correct results, and returns all the grouped results in the _id:{} array.
I now want to extract all those fields from the array and return them not within the array so I can more easily export the output to a spreadsheet.
I tried using this stage:
{
$project:
{
year_month: 1
,client_name: 1
,client_label: 1
,client_code: 1
,client_country: 1
,base_curr: 1
,inv_curr: 1
,dest_curr: 1
,total_vol: 1
,total_tran : 1
}
},
But that returned the same results as the $group stage:
{
"_id" : {
"year_month" : "2022-01",
"client_name" : "client A",
"client_label" : "client A",
"client_code" : NumberInt(0000),
"client_country" : "TH",
"base_curr" : "USD",
"inv_curr" : "USD",
"dest_curr" : "HKD"
},
"total_vol" : 100000,
"total_tran" : 100.0
}
I want the "year_month" through "dest_curr" fields at the same level as the "total_vol" and "total_tran", so that when the data is exported they all appear as separate columns (now it's all captured as one "_id" column, and a "total_vol" and "total_tran" column). What's the best way to do this?
From a terminology perspective, you currently have an embedded document (or nested fields) rather than an array.
The straightforward way to do this is to enumerate each field in a $project stage after the $group, e.g.:
"year_month": "$_id.year_month",
There are fancier ways to do this, but as you only have a handful of fields this should suffice. Working playground example here.
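Spelled out for all of the grouped fields in the question, the enumeration might look like the stage below. It is a sketch: the name flattenStage is made up for illustration, and dropping _id with _id: 0 is an assumption that the nested copy is no longer wanted once its fields are lifted out.

```javascript
// Stage to place after the $group: lift each _id sub-field to the top level.
const flattenStage = {
  $project: {
    _id: 0, // assumption: the nested _id is not needed in the export
    year_month: "$_id.year_month",
    client_name: "$_id.client_name",
    client_label: "$_id.client_label",
    client_code: "$_id.client_code",
    client_country: "$_id.client_country",
    base_curr: "$_id.base_curr",
    inv_curr: "$_id.inv_curr",
    dest_curr: "$_id.dest_curr",
    total_vol: 1,
    total_tran: 1
  }
};

// In the shell, append it to the existing pipeline (not executed here):
// db.collection.aggregate([ /* ...earlier stages, $group... */ flattenStage ]);
```

The original $project returned the same shape as the $group because year_month: 1 and the other inclusions refer to top-level fields, which do not exist after grouping; only _id, total_vol and total_tran do.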
Edit
An alternative ("fancier") approach is to leverage the $replaceWith stage using the $mergeObjects operator inside of it. Then you can $unset the previous _id field afterwards. It would look like this:
db.collection.aggregate([
{
"$replaceWith": {
"$mergeObjects": [
"$$ROOT",
"$_id"
]
}
},
{
$unset: "_id"
}
])
Playground link here
I also fixed the earlier playground link that had a typo for the client_label field.

Mongodb- Query to check if a field in the json array exists or not

I have a JSON in MongoDB and I am trying to check if at least one of the items in the JSON doesn't contain a specific field.
{
"_id" : 12345,
"orderItems" : [
{
"itemId" : 45678,
"isAvailable" : true,
"isEligible" : false
},
{
"itemId" : 87653,
"isAvailable" : true
}
]
}
In the JSON above, the second entry under orderItems doesn't contain the isEligible field, so I need to get this _id.
I tried the query below, which didn't work:
db.getCollection('orders').find({"orderItems.iseligible":{$exists:false})
You can use $elemMatch to evaluate the presence of the nested key. Once that's accomplished, project out the _id value.
db.orders.find({
orderItems: {
$elemMatch: {
"isEligible": {
$exists: false
}
}
}
},
{
_id: 1
})
Here is a Mongo playground with the finished code, and a similar SO answer.
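To see why the dotted-path attempt behaves differently from $elemMatch, here is a plain JavaScript simulation of the two conditions. It does not call MongoDB; it only mirrors the matching rules as I understand them (a dotted path with $exists: false matches when no array element has the field, while $elemMatch matches when at least one element satisfies the condition). The second order is made up for contrast.

```javascript
// Plain JavaScript stand-ins for the two query semantics (not MongoDB API).
const orders = [
  { _id: 12345, orderItems: [
    { itemId: 45678, isAvailable: true, isEligible: false },
    { itemId: 87653, isAvailable: true }
  ]},
  { _id: 67890, orderItems: [ // hypothetical second order
    { itemId: 11111, isAvailable: true, isEligible: true }
  ]}
];

const hasField = (item) =>
  Object.prototype.hasOwnProperty.call(item, "isEligible");

// Like { "orderItems.isEligible": { $exists: false } }:
// true only when no element has the field.
const noneHaveField = orders.filter((o) => !o.orderItems.some(hasField));

// Like { orderItems: { $elemMatch: { isEligible: { $exists: false } } } }:
// true when at least one element lacks the field.
const someLackField = orders.filter((o) =>
  o.orderItems.some((item) => !hasField(item))
);

console.log(noneHaveField.map((o) => o._id)); // []
console.log(someLackField.map((o) => o._id)); // [ 12345 ]
```

Field names are also case-sensitive, so iseligible in the original query would not refer to isEligible in any case.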

How to remove duplicate entries from an array?

In the following example, "Algorithms in C++" is present twice.
The $unset modifier can remove a particular field, but how can I remove a duplicate entry from an array field?
{
"_id" : ObjectId("4f6cd3c47156522f4f45b26f"),
"favorites" : {
"books" : [
"Algorithms in C++",
"The Art of Computer Programming",
"Graph Theory",
"Algorithms in C++"
]
},
"name" : "robert"
}
As of MongoDB 2.2 you can use the aggregation framework with an $unwind, $group and $project stage to achieve this:
db.users.aggregate([{$unwind: '$favorites.books'},
{$group: {_id: '$_id',
books: {$addToSet: '$favorites.books'},
name: {$first: '$name'}}},
{$project: {'favorites.books': '$books', name: '$name'}}
])
Note the need for the $project to rename the favorites field, since $group aggregate fields cannot be nested.
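The easiest solution is to use $setUnion (available since MongoDB 2.6; the $addFields stage used below requires 3.4+). Note that $setUnion does not guarantee the order of the resulting array: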
db.users.aggregate([
{'$addFields': {'favorites.books': {'$setUnion': ['$favorites.books', []]}}}
])
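The aggregation above only returns deduplicated documents; it does not change what is stored. To persist the result, the same expression could be used in a pipeline-style update. This is a sketch that assumes MongoDB 4.2+; the name dedupeUpdate and the $type filter are my own additions, the latter so that documents without a books array are left alone.

```javascript
// Update pipeline reusing the $setUnion trick from the aggregation above.
const dedupeUpdate = [
  { $set: { "favorites.books": { $setUnion: ["$favorites.books", []] } } }
];

// Only touch documents where favorites.books is actually an array.
const onlyArrays = { "favorites.books": { $type: "array" } };

// In the mongo shell (not executed here):
// db.users.updateMany(onlyArrays, dedupeUpdate);
```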
Another (lengthier) version that is based on the idea from @kynan's answer, but preserves all the other fields without explicitly specifying them (Mongo 3.4+):
> db.users.aggregate([
{'$unwind': {
'path': '$favorites.books',
// output the document even if its list of books is empty
'preserveNullAndEmptyArrays': true
}},
{'$group': {
'_id': '$_id',
'books': {'$addToSet': '$favorites.books'},
// arbitrary name that doesn't exist on any document
'_other_fields': {'$first': '$$ROOT'},
}},
{
// the field, in the resulting document, has the value from the last document merged for the field. (c) docs
// so the new deduped array value will be used
'$replaceRoot': {'newRoot': {'$mergeObjects': ['$_other_fields', "$$ROOT"]}}
},
// this stage wouldn't be necessary if the field wasn't nested
{'$addFields': {'favorites.books': '$books'}},
{'$project': {'_other_fields': 0, 'books': 0}}
])
{ "_id" : ObjectId("4f6cd3c47156522f4f45b26f"), "name" : "robert", "favorites" :
{ "books" : [ "The Art of Computer Programming", "Graph Theory", "Algorithms in C++" ] } }
One option is to use map-reduce to detect and count the duplicate entries, then use $set to replace the entire books array for the document with "_id" : ObjectId("4f6cd3c47156522f4f45b26f").
This has been discussed several times here; please see:
Removing duplicate records using MapReduce
Fast way to find duplicates on indexed column in mongodb
http://csanz.posterous.com/look-for-duplicates-using-mongodb-mapreduce
http://www.mongodb.org/display/DOCS/MapReduce
How to remove duplicate record in MongoDB by MapReduce?
function unique(arr) {
var hash = {}, result = [];
for (var i = 0, l = arr.length; i < l; ++i) {
if (!hash.hasOwnProperty(arr[i])) {
hash[arr[i]] = true;
result.push(arr[i]);
}
}
return result;
}
db.collection.find({}).forEach(function (doc) {
db.collection.update({ _id: doc._id }, { $set: { "favorites.books": unique(doc.favorites.books) } });
})
Starting in Mongo 4.4, the $function aggregation operator allows applying a custom JavaScript function to implement behaviour not supported by the MongoDB Query Language.
For instance, in order to remove duplicates from an array:
// {
// "favorites" : { "books" : [
// "Algorithms in C++",
// "The Art of Computer Programming",
// "Graph Theory",
// "Algorithms in C++"
// ]},
// "name" : "robert"
// }
db.collection.aggregate([
{ $set:
{ "favorites.books":
{ $function: {
body: function(books) { return books.filter((v, i, a) => a.indexOf(v) === i) },
args: ["$favorites.books"],
lang: "js"
}}
}
}
])
// {
// "favorites" : { "books" : [
// "Algorithms in C++",
// "The Art of Computer Programming",
// "Graph Theory"
// ]},
// "name" : "robert"
// }
This has the advantages of:
keeping the original order of the array (if that's not a requirement, then prefer @Dennis Golomazov's $setUnion answer)
being more efficient than a combination of expensive $unwind and $group stages.
$function takes 3 parameters:
body, which is the function to apply, whose parameter is the array to modify.
args, which contains the fields from the record that the body function takes as parameter. In our case "$favorites.books".
lang, which is the language in which the body function is written. Only js is currently available.
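The body function is ordinary JavaScript, so its behaviour can be checked outside the database. The sketch below runs the same filter/indexOf logic in Node.js next to a Set-based alternative; the function names are made up here, and whether the Set version is accepted inside $function on a given server is an assumption I have not verified.

```javascript
// Same logic as the $function body: keep the first occurrence of each value.
function dedupeKeepOrder(books) {
  return books.filter((v, i, a) => a.indexOf(v) === i);
}

// Alternative: a Set iterates in insertion order, so order is kept as well.
function dedupeWithSet(books) {
  return [...new Set(books)];
}

const books = [
  "Algorithms in C++",
  "The Art of Computer Programming",
  "Graph Theory",
  "Algorithms in C++"
];

console.log(dedupeKeepOrder(books));
// [ 'Algorithms in C++', 'The Art of Computer Programming', 'Graph Theory' ]
console.log(dedupeWithSet(books));
// [ 'Algorithms in C++', 'The Art of Computer Programming', 'Graph Theory' ]
```

The indexOf version is quadratic in the array length, which is unlikely to matter for short lists of favourites.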

How do I update Array Elements matching criteria in a MongoDB document?

I have a document with an array field, similar to this:
{
"_id" : "....",
"Statuses" : [
{ "Type" : 1, "Timestamp" : ISODate(...) },
{ "Type" : 2, "Timestamp" : ISODate(...) },
//Etc. etc.
]
}
How can I update a specific Status item's Timestamp, by specifying its Type value?
From the mongo shell you can do this with:
db.your_collection.update(
{ _id: ObjectId("your_objectid"), "Statuses.Type": 1 },
{ $set: { "Statuses.$.Timestamp": "new timestamp" } }
)
and the C# equivalent (legacy driver API):
var query = Query.And(
Query.EQ("_id", "your_doc_id"),
Query.EQ("Statuses.Type", 1)
);
var result = your_collection.Update(
query,
Update.Set("Statuses.$.Timestamp", "new timestamp"),
UpdateFlags.Multi,
SafeMode.True
);
This updates the matched document. To update matching documents across the whole collection, remove the _id filter (in the shell, also pass { multi: true }, since update modifies a single document by default).
Starting with MongoDB 3.6, the $[<identifier>] positional operator may be used. Unlike the $ positional operator — which updates at most one array element per document — the $[<identifier>] operator will update every matching array element. This is useful for scenarios where a given document may have multiple matching array elements that need to be updated.
db.yourCollection.update(
{ _id: "...." },
{ $set: {"Statuses.$[element].Timestamp": ISODate("2021-06-23T03:47:18.548Z")} },
{ arrayFilters: [{"element.Type": 1}] }
);
The arrayFilters option selects which array elements to update, and $[element] is used within the $set update operator to indicate that only the array elements matching arrayFilters should be updated.
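The difference between the two positional operators can be illustrated with a plain JavaScript simulation. Nothing below talks to MongoDB; the helper names and the sample document (with two Type 1 entries and placeholder timestamps) are made up for the illustration.

```javascript
// Plain JavaScript stand-ins (not MongoDB API) for the two positional operators.
const makeDoc = () => ({
  _id: "doc1", // hypothetical id
  Statuses: [
    { Type: 1, Timestamp: "old" },
    { Type: 2, Timestamp: "old" },
    { Type: 1, Timestamp: "old" }
  ]
});

// Like "Statuses.$.Timestamp": only the first matching element changes.
function setFirstMatch(doc, type, ts) {
  const idx = doc.Statuses.findIndex((s) => s.Type === type);
  if (idx !== -1) doc.Statuses[idx].Timestamp = ts;
  return doc;
}

// Like "Statuses.$[element].Timestamp" with
// arrayFilters [{ "element.Type": type }]: every matching element changes.
function setAllMatches(doc, type, ts) {
  for (const s of doc.Statuses) {
    if (s.Type === type) s.Timestamp = ts;
  }
  return doc;
}

console.log(setFirstMatch(makeDoc(), 1, "new").Statuses.map((s) => s.Timestamp));
// [ 'new', 'old', 'old' ]
console.log(setAllMatches(makeDoc(), 1, "new").Statuses.map((s) => s.Timestamp));
// [ 'new', 'old', 'new' ]
```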
