Mongo pull multiple elements inside an array of objects - mongodb

I'm trying to pull one or multiple objects from an array and I noticed something odd.
Let's take the following document.
{
"_id" : UUID("f7e80c8e-6b4a-4741-95a3-2567cccf9e5f"),
"createdAt" : ISODate("2021-07-19T17:07:28.499Z"),
"description" : null,
"externalLinks" : [
{
"id" : "ZV8xMjM0NQ==",
"type" : "event"
},
{
"id" : "cF8xMjM0NQ==",
"type" : "planning"
}
],
"updatedAt" : ISODate("2021-07-19T17:07:28.499Z")
}
I wrote a basic query to pull one element of externalLinks which looks like
db.getCollection('Collection').update(
{
_id: {
$in: [UUID("f7e80c8e-6b4a-4741-95a3-2567cccf9e5f")]
}
}, {
$pull: {
externalLinks: {
"type": "planning",
"id": "cF8xMjM0NQ=="
}
}
})
And it's working fine. But it gets trickier when I want to pull multiple elements from externalLinks, and I'm using the $in operator for that.
Here is the strange behaviour:
db.getCollection('Collection').update(
{
_id: {
$in: [UUID("f7e80c8e-6b4a-4741-95a3-2567cccf9e5f")]
}
}, {
$pull: {
externalLinks: {
$in: [{
"type": "planning",
"id": "cF8xMjM0NQ=="
}]
}
}
})
And this query doesn't work. The solution is to switch both fields of externalLinks
and do something like:
$in: [{
"id": "cF8xMjM0NQ==",
"type": "planning"
}]
I tried multiple things like $elemMatch and positional operators, but it should be possible to pull multiple externalLinks.
I also tried the $and operator without success.
I could easily iterate over the externalLinks and update them one by one, but that would be the easy way out, and it bugs me to settle for that solution.
Any help would be appreciated, thank you!

Document fields have an order, and MongoDB compares documents field by field in that order (see the documentation on comparing embedded documents), so which field you put first matters.
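For reference, here is the multi-pull with the fields listed in their stored order (id first, then type), which is the fix described above; this is just a sketch reusing the two links from the sample document:
db.getCollection('Collection').update(
    { _id: { $in: [UUID("f7e80c8e-6b4a-4741-95a3-2567cccf9e5f")] } },
    { $pull: {
        externalLinks: {
            $in: [
                { "id": "cF8xMjM0NQ==", "type": "planning" },
                { "id": "ZV8xMjM0NQ==", "type": "event" }
            ]
        }
    }}
)
Because $in compares whole embedded documents for equality, each document in the list must spell out its fields in exactly the stored order.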
Since MongoDB 4.2 we can also do pipeline updates; they can sometimes be more verbose, but they are much more powerful and feel more like programming (less declarative pattern matching).
This doesn't mean that you need a pipeline update in your case, but it's worth checking this way as well.
Query
pipeline update
filter the array and keep only the members that don't match the condition
Test code here
db.collection.update(
{_id: {$in: ["f7e80c8e-6b4a-4741-95a3-2567cccf9e5f"]}},
[{"$set":
{"externalLinks":
{"$filter":
{"input": "$externalLinks",
"cond":
{"$not":
[{"$and":
[{"$eq": ["$$this.id", "ZV8xMjM0NQ=="]},
{"$eq": ["$$this.type", "event"]}]}]}}}}}])

Related

Using cond to specify _id fields for group in mongodb aggregation

New to Mongo. I'm trying to group across different sub-fields of a document based on a condition. The condition is a regex on a field value. It looks like this:
db.collection.aggregate([{
"$group": {
"$cond": [{
"upper.leaf": {
$not: {
$regex: /flower/
}
}
},
{
"_id": {
"leaf": "$upper.leaf",
"stem": "$upper.stem"
}
},
{
"_id": {
"stem": "$upper.stem",
"petal": "$upper.petal"
}
}
]
}
}])
Using API v4.0: $cond in the docs shows { $cond: [ <boolean-expression>, <true-case>, <false-case> ] }
The error I get with the above code is - "Syntax error: dotted field name 'upper.leaf' can not used in a sub object."
Reading up on that, I tried $let to re-assign the dotted field name, but I started to hit various syntax errors with no obvious issue in the query.
Also tried using $project to rename the fields, but got - Field names may not start with '$'
Thoughts on the best approach here? I can always address this at the application level and split my query into two, but it would be attractive to solve it natively in Mongo.
$group syntax is wrong
{
$group:
{
_id: <expression>, // Group By Expression
<field1>: { <accumulator1> : <expression1> },
...
}
}
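For reference, a minimal valid use of that syntax, borrowing a field name from the question (the accumulator is just an example):
{
  "$group": {
    "_id": "$upper.stem",
    "count": { "$sum": 1 }
  }
}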
You tried to do
{
$group:
<expression>
}
And even if your expression resulted in the same structure, it's invalid syntax for $group (check in the documentation where you are allowed to use expressions).
One other problem is that you used the query operator for regex rather than the aggregation regex operators. Inside an aggregation you can use only aggregation operators; $match is the only exception, where you can use both, provided you wrap aggregation expressions in $expr.
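As an aside, that $match exception looks like the stage below; $regexMatch is an aggregation operator available from MongoDB 4.2, and this stage is only an illustration, not part of the fix:
{ "$match": { "$expr": { "$regexMatch": { "input": "$upper.leaf", "regex": "flower" } } } }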
I think you need this:
[{
    "$group": {
        "_id": {
            "$cond": [
                { "$not": [
                    { "$regexMatch": { "input": "$upper.leaf", "regex": "flower" } }
                ]},
                { "leaf": "$upper.leaf", "stem": "$upper.stem" },
                { "stem": "$upper.stem", "petal": "$upper.petal" }
            ]
        }
    }
}]
It's similar code, but the expression becomes the value of "_id", and $regexMatch, which is an aggregation operator, is used.
I haven't tested the code.

Update an array item of Mongodb with $and query

Hi I am trying to increment the count of the matching requirement in an array. My sample collection looks like the following:
{
"_id": ObjectId("60760ba2e870fa518f2ae48b"),
"userId": "6075f7289822d94dca8066b4",
"requirements": [
{
"searchText": "zee5",
"planType": "basic",
"mode": "PRIVATE",
"count": 32.0
},
{
"searchText": "sony",
"planType": "standard",
"mode": "PUBLIC",
"count": 12.0
},
{
"searchText": "prime",
"planType": "premium",
"mode": "PRIVATE",
"count": 2
}
]
}
If a user searches for prime, with filters premium and PRIVATE, then the count of the last requirement should be updated. If he searches for prime, with filters standard and PRIVATE, then a new requirement should be inserted with count 1.
I am doing this in two steps: first I fire an update with the following query, and then, if nothing was updated, I fire a push query with count 1:
db.getCollection('userProfile').update(
    { "$and": [
        { "requirements.searchText": { $eq: "prime" } },
        { "requirements.mode": { $eq: "PUBLIC" } },
        { "requirements.planType": { $eq: "standard" } },
        { "userId": "6075f7289822d94dca8066b4" }
    ]},
    { $inc: { "requirements.$.count": 1 } }
)
I was expecting that the above query would not update any requirement, since there is no exact match. Interestingly, it increments the count of the second requirement (sony, standard, PUBLIC). What is wrong with the query? How can I get it right?
Demo - with Update - https://mongoplayground.net/p/-ISXaAayxxv
Demo No update - https://mongoplayground.net/p/88bTj3lz7U_
Use $elemMatch to make sure all properties are present in the same object inside the array
The $elemMatch operator matches documents that contain an array field with at least one element that matches all the specified query criteria.
db.collection.update(
{
"requirements": {
$elemMatch: { "searchText": "prime","mode": "PUBLIC", "planType": "standard" }
},
"userId": "6075f7289822d94dca8066b4"
},
{ $inc: { "requirements.$.count": 1 } }
)
Problem -
Your current query will match any document that has all these fields somewhere in the requirements array, in any elements; if one property matches in one array element and another property matches in a different element, the query will still find the document valid:
"searchText": "prime",
"mode": "PUBLIC",
"planType": "standard"

How is findById() + save() different from update() in MongoDB

While trying to update a MongoDB document using Mongoose, can I use a findById() with a save() in the callback, or should I stick with traditional update methods such as findByIdAndModify, findOneAndModify, update(), etc.? Say I want to update the name field of the following document (please see a more elaborate example in the edit at the end, which motivated my question):
{
"_id": ObjectId("123"),
"name": "Development"
}
(Mongoose model name for the collection is Category)
I could do this:
Category.update({ "_id" : "123" }, { "name" : "Software Development" }, { new: true })
or I could do this:
Category.findById("123", function(err, category) {
if (err) throw err;
category.name = "Software Development";
category.save();
});
For more elaborate examples, it feels easier to manipulate a JavaScript object that can simply be saved, as opposed to devising a relatively complex update document for the .update() operation. Am I missing something fundamentally important?
Edited 7/21/2016: Responding to the comment from @Cameron, I think a better example is warranted:
{
"_id": ObjectId("123"),
"roles": [{
"roleId": ObjectId("1234"),
"name": "Leader"
}, {
"roleId": ObjectId("1235"),
"name": "Moderator"
}, {
"roleId": ObjectId("1236"),
"name": "Arbitrator"
}]
}
What I am trying to do is remove some roles as well as add some roles in the roles array of sub-documents in a single operation. To add role sub-documents, $push can be used and to remove role sub-documents, $pull is used. But if I did something like this:
Person.update({
"_id": "123"
}, {
$pull : {
"roles" : {
"roleId" : {
$in : [ "1235", "1236" ]
}
}
},
$push : {
"roles" : {
$each: [{
"roleId" : ObjectId("1237"),
"name" : "Developer"
}]
}
}
})
When I try to execute this, I get the error Cannot update 'roles' and 'roles' at the same time, of course. That's when I felt it is easier to find a document, manipulate it any way I want and then save it. In that scenario, I don't know if there is really any other choice for updating the document.
I typically like to use findById() when I am performing more elaborate updates and don't think you are missing anything fundamentally important.
However, one method to be aware of in Mongoose is findByIdAndUpdate(); this issues a MongoDB findAndModify update command and would allow you to perform your first example with the following code: Category.findByIdAndUpdate("123", { "name": "Software Development" }, function(err, savedDoc) {...}).
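As for the roles example: since a single update cannot $pull and $push the same field, one alternative to the find-and-save round trip is to send both updates together with bulkWrite (available on newer Mongoose/driver versions). This is only a sketch against the Person model from the question; the operations run in order, so the $pull happens before the $push:
Person.bulkWrite([
  { updateOne: {
      filter: { "_id": "123" },
      update: { $pull: { "roles": { "roleId": { $in: ["1235", "1236"] } } } }
  } },
  { updateOne: {
      filter: { "_id": "123" },
      update: { $push: { "roles": { $each: [{ "roleId": ObjectId("1237"), "name": "Developer" }] } } }
  } }
], { ordered: true });
This is still two write operations on the server, but it avoids loading and re-saving the whole document.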

Filtering and Matching Arrays within Arrays

I am looking to query a JSON document which has a nested array structure. Each design element has multiple SLIDs with a status. I want to write a MongoDB query to get designs whose highest SLID has status "OLD".
Here is the sample JSON:
{
"_id" : ObjectId("55cddc30f1a3c59ca1e88f30"),
"designs" : [
{
"Deid" : 1,
"details" : [
{
"SLID" : 1,
"status" : "OLD"
},
{
"SLID" : 2,
"status" : "NEW"
}
]
},
{
"Deid" : 2,
"details" : [
{
"SLID" : 1,
"status" : "NEW"
},
{
"SLID" : 2,
"status" : "NEW"
},
{
"SLID" : 3,
"status" : "OLD"
}
]
}
]
}
In this sample the expected query should return the following, as the highest SLID has status "OLD".
{
"_id" : ObjectId("55cddc30f1a3c59ca1e88f30"),
"designs" : [
{
"Deid" : 2,
"details" : [
{
"SLID" : 3,
"status" : "OLD"
}
]
}
]
}
I have tried the following query, but it kept returning the other details array elements (which have status "NEW") along with the element above.
db.Collection.find({"designs": {$all: [{$elemMatch: {"details.status": "OLD"}}]}},
{"designs.details":{$slice:-1}})
Edit:
To summarize the problem:
The requirement is to get all designs from the document set whose highest SLID (always the last item in the details array) has status "OLD".
Present Problem
What you should have been picking up from the previously linked question is that the positional $ operator itself is only capable of matching the first matched element within an array. When you have nested arrays like you do, this means the reported position can only ever refer to the "outer" array, never the actual matched position within the inner array, and never more than a single match.
Other examples show usage of the aggregation framework for MongoDB in order to "filter" elements from the array, generally by processing with $unwind and then using conditions to match the array elements that you require. This is generally what you need to do in this case to get matches from your "inner" array. While there have been improvements since the first answers, your "last match", or effectively a "slice" condition, excludes the other present possibilities. Therefore:
db.junk.aggregate([
{ "$match": {
"designs.details.status": "OLD"
}},
{ "$unwind": "$designs" },
{ "$unwind": "$designs.details" },
{ "$group": {
"_id": {
"_id": "$_id",
"Deid": "$designs.Deid"
},
"details": { "$last": "$designs.details"}
}},
{ "$match": {
"details.status": "OLD"
}},
{ "$group": {
"_id": "$_id",
"details": { "$push": "$details"}
}},
{ "$group": {
"_id": "$_id._id",
"designs": {
"$push": {
"Deid": "$_id.Deid",
"details": "$details"
}
}
}}
])
Which would return on your document or any others like it a result like:
{
"_id" : ObjectId("55cddc30f1a3c59ca1e88f30"),
"designs" : [
{
"Deid" : 2,
"details" : [
{
"SLID" : 3,
"status" : "OLD"
}
]
}
]
}
The key there being to $unwind both arrays and then $group back on the relevant unique elements in order to "slice" the $last element from each "inner" array.
The next of your conditions requires a $match in order to see that the "status" field of that element is the value that you want. Then of course, since the documents have been essentially "de-normalized" by the $unwind operations, and even with the subsequent $group, the following $group statements re-construct the document into its original form.
Aggregation pipelines can either be quite simple or quite difficult depending on what you want to do, and reconstruction of documents with filtering like this means you need to take care in the steps, particularly if there are other fields involved. As you should also appreciate here, this process of $unwind to de-normalize and $group operations is not very efficient, and can cause significant overhead depending on the number of possible documents that can be met by the initial $match query.
Better Solution
While currently only available in the present development branch, there are some new operators available to the aggregation pipeline that make this much more efficient, and effectively "on par" with the performance of a general query. These are notably the $filter and $slice operators, which can be employed in this case as follows:
db.junk.aggregate([
{ "$match": {
"designs.details.status": "OLD"
}},
{ "$redact": {
"$cond": [
{ "$gt": [
{ "$size":{
"$filter": {
"input": "$designs",
"as": "designs",
"cond": {
"$anyElementTrue":[
{ "$map": {
"input": {
"$slice": [
"$$designs.details",
-1
]
},
"as": "details",
"in": {
"$eq": [ "$$details.status", "OLD" ]
}
}}
]
}
}
}},
0
]},
"$$KEEP",
"$$PRUNE"
]
}},
{ "$project": {
"designs": {
"$map": {
"input": {
"$filter": {
"input": "$designs",
"as": "designs",
"cond": {
"$anyElementTrue":[
{ "$map": {
"input": {
"$slice": [
"$$designs.details",
-1
]
},
"as": "details",
"in": {
"$eq": [ "$$details.status", "OLD" ]
}
}}
]
}
}
},
"as": "designs",
"in": {
"Deid": "$$designs.Deid",
"details": { "$slice": [ "$$designs.details", -1] }
}
}
}
}}
])
This effectively makes the operations just a $match and a $project stage, which is basically what is done with a general .find() operation. The only real addition here is a $redact stage, which allows the documents to be additionally filtered from the initial query condition by further logical conditions that can inspect the document.
In this case, we check not only that the document contains an "OLD" status, but also that at least one of the inner arrays has that status in its own last entry; otherwise the document is "pruned" from the results for not meeting that condition.
In both the $redact and the $project, the $slice operator is used to get the last entry from the "details" array within the "designs" array. In the initial case it is applied with $filter to remove from the "outer" or "designs" array any elements where the condition did not match, and then later in the $project to show just the last element of each "details" array in the final presentation. That last "reshape" is done with $map to replace the whole "details" arrays with the last-element slice only.
Whilst the logic there seems much more long winded than the initial statement, the performance gain is potentially "huge" due to being able to treat each document as a "unit" without the need to denormalize or otherwise de-construct until the final projection is made.
Best Solution for Now
In summary, the current processes you can use to achieve the result are simply not efficient for solving the problem. It would in fact be more efficient to simply match the documents that meet the basic condition (contain a "status" that is "OLD") in conjunction with a $where condition to test the last element of each array. However, the actual "filtering" of the arrays in the output is best left to client code:
db.junk.find({
"designs.details.status": "OLD",
"$where": function() {
return this.designs.some(function(design){
return design.details.slice(-1)[0].status == "OLD";
});
}
}).forEach(function(doc){
doc.designs = doc.designs.filter(function(design) {
return design.details.slice(-1)[0].status == "OLD";
}).map(function(design) {
design.details = design.details.slice(-1);
return design;
});
printjson(doc);
});
So the query condition at least only returns the documents that match all conditions, and then the client side code filters out the content from arrays by testing the last elements and then just slicing out that final element as the content to display.
Right now, that is probably the most efficient way to do this as it mirrors the future aggregation operation capabilities.
The problems are really in the structure of your data. While it may suit the display purposes of your application, the usage of nested arrays makes this notoriously difficult to query as well as "impossible" to atomically update, due to the limitations of the positional operator mentioned before.
While you can always $push new elements to the inner array by matching its existence within, or just the presence of, the outer array element, what you cannot do is alter the "status" of an inner element in an atomic operation. In order to modify it in such a way, you need to retrieve the entire document, then modify the contents in code, and save back the result.
The problems with that process mean you are likely to "collide" on updates and possibly overwrite the changes made by another concurrent request to the one you are currently processing.
For these reasons you really should reconsider all of your design goals for the application and the suitability of such a structure. Keeping a more denormalized form may cost you in some other areas, but it sure would make things much simpler to both query and update with the kind of inspection level this seems to require.
The end conclusion here should be that you reconsider the design. Though getting your results is both possible now and in the future, the other operational blockers should be enough to warrant a change.
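For illustration, one possible flatter shape is to store each design as its own document, so the details array is only one level deep and $elemMatch and the positional operator work directly on it. This is only a sketch of the kind of denormalization meant here (the parentId field is hypothetical):
{
    "_id" : ObjectId("..."),
    "parentId" : ObjectId("55cddc30f1a3c59ca1e88f30"),
    "Deid" : 2,
    "details" : [
        { "SLID" : 1, "status" : "NEW" },
        { "SLID" : 2, "status" : "NEW" },
        { "SLID" : 3, "status" : "OLD" }
    ]
}
With that shape, a query such as db.designs.find({ "details": { "$elemMatch": { "SLID": 3, "status": "OLD" } } }), or an atomic update of a single detail's status via "details.$.status", becomes straightforward.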

Group result mongoDB

I have a collection with per-country values stored like this. I want to sum the values of the countries across documents.
{
"_id": ObjectId("54cd5e7804f3b06c3c247428"),
"country_json": {
"AE": NumberLong("13"),
"RU": NumberLong("16"),
"BA": NumberLong("10"),
...
}
},
{
"_id": ObjectId("54cd5e7804f3b06c3c247429"),
"country_json": {
"RU": NumberLong("12"),
"ES": NumberLong("28"),
"DE": NumberLong("16"),
"AU": NumberLong("44"),
...
}
}
How to sum the values of countries to get a result like this?
{
"AE": 13,
"RU": 28,
..
}
This can simply be done using aggregation
> db.test.aggregate([
{$project: {
RU: "$country_json.RU",
AE: "$country_json.AE",
BA: "$country_json.BA"
}},
{$group: {
_id: null,
RU: {$sum: "$RU"},
AE: {$sum: "$AE"},
BA: {$sum: "$BA"}
}}
])
Output:
{
"_id" : null,
"RU" : NumberLong(28),
"AE" : NumberLong(13),
"BA" : NumberLong(10)
}
This isn't a very good document structure if you intend to aggregate statistics across the "keys" like this. Not really a fan of "data as key names" anyway, but the main point is it does not "play well" with many MongoDB query forms due to the key names being different everywhere.
Particularly with the aggregation framework, a better form to store the data is within an actual array, like so:
{
"_id": ObjectId("54cd5e7804f3b06c3c247428"),
"countries": [
{ "key": "AE", "value": NumberLong("13"),
{ "key": "RU", "value": NumberLong("16"),
{ "key": "BA", "value": NumberLong("10")
]
}
With that you can simply use the aggregation operations:
db.collection.aggregate([
{ "$unwind": "$countries" },
{ "$group": {
"_id": "$countries.key",
"value": { "$sum": "$countries.value" }
}}
])
Which would give you results like:
{ "_id": "AE", "value": NumberLong(13) },
{ "_id": "RU", "value": NumberLong(28) }
That kind of structure does "play well" with the aggregation framework and other MongoDB query patterns because it really is how it's "expected" to be done when you want to use the data in this way.
Without changing the structure of the document you are forced to use JavaScript evaluation methods in order to traverse the keys of your documents because that is the only way to do it with MongoDB:
db.collection.mapReduce(
function() {
var country = this.country_json;
Object.keys(country).forEach(function(key) {
emit( key, country[key] );
});
},
function(key,values) {
return values.reduce(function(p,v) { return NumberLong(p+v) });
},
{ "out": { "inline": 1 } }
)
And that would produce exactly the same result as shown from the aggregation example output, but working with the current document structure. Of course, the use of JavaScript evaluation is not as efficient as the native methods used by the aggregation framework so it's not going to perform as well.
Also note the possible problems here with "large values" in your cast NumberLong fields, since the main reason they are represented that way to JavaScript is that JavaScript itself has limitations on the size of the value that can be represented. Likely your values are just trivial and simply "cast" that way, but for large enough numbers, as per the intent, the math will simply fail.
So it's generally a good idea to consider changing how you structure this data to make things easier. As a final note, the sort of output you were expecting, with all the keys in a single document, is similarly counter-intuitive, as again it requires traversing the keys of a "hash/map" rather than using the natural iterators of arrays or cursors.
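If restructuring isn't an option and you are on MongoDB 3.4.4 or newer, $objectToArray can traverse the keys natively in the aggregation pipeline instead of mapReduce. A sketch, assuming the original country_json shape:
db.collection.aggregate([
  { "$project": { "countries": { "$objectToArray": "$country_json" } } },
  { "$unwind": "$countries" },
  { "$group": { "_id": "$countries.k", "value": { "$sum": "$countries.v" } } }
])
This yields one document per country code with the summed value, much like the array-based pipeline above.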