Mongoose - Find object with any key and specific subkey - mongodb

Let's say I have a Mongo database that contains objects such as:
[
{
"test": {
"123123": {
"someField": null
}
}
},
{
"test": {
"323143": {
"someField": "lalala"
},
"121434": {
"someField": null
}
}
},
{
"test": {
"4238023": {
"someField": "afafa"
}
}
},
]
As you can see, the keys right under "test" can vary.
I want to find all documents that have at least one someField that is not null.
Something like find: "test.*.someField": { $ne: null } (where * represents any key).
How can I do this in Mongoose? I'm thinking an aggregation pipeline will be needed here, but I'm not exactly sure how.
Constraints:
I don't have much control over the db schema in this scenario.
Ideally I don't want to have to do this logic in Node.js; I would like to query directly via the db.

The trickiest part here is that you cannot search for keys that match a pattern. Luckily, there is a workaround. Yes, you do need an aggregation pipeline.
Let's look at an individual document:
{
"test": {
"4238023": {
"someField": "afafa"
}
}
}
We need to query someField, but to get to it, we need to somehow circumvent 4238023 because it varies with each document. What if we could break that test object down and look at it presented like so:
{
"k": "4238023",
"v": {
"someField": "afafa"
}
}
Suddenly, it gets a heck of a lot easier to query. Well, MongoDB aggregation offers an operator called $objectToArray which does exactly that.
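For example, applying $objectToArray to the test field of the second sample document above produces an array like:
[
  { "k": "323143", "v": { "someField": "lalala" } },
  { "k": "121434", "v": { "someField": null } }
]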
So what we are going to do is:
Convert the test object into an array for each document.
Match only documents where AT LEAST ONE v.someField is not null.
Put it back together to look like your original documents, minus the ones that do not match the criterion.
So, here is the pipeline you need:
db.collection.aggregate([
{
"$project": {
"arr": {
"$objectToArray": "$$ROOT.test"
}
}
},
{
"$match": {
arr: {
$elemMatch: {
"v.someField": {
$ne: null
}
}
}
}
},
{
"$project": {
"_id": 1,
"test": {
$arrayToObject: "$arr"
}
}
}
])
Playground: https://mongoplayground.net/p/b_VNuOLgUb2
Note that in Mongoose you run this aggregation the same way you would in the shell... well, plus the .then.
YourCollection.aggregate([
...
...
])
.then(result => console.log(result))
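For completeness, here is a minimal sketch with the pipeline from above filled in (assuming YourCollection is the Mongoose model bound to this collection):
YourCollection.aggregate([
  // break "test" into [{ k, v }, ...] so the unknown keys no longer matter
  { "$project": { "arr": { "$objectToArray": "$$ROOT.test" } } },
  // keep documents where at least one entry has a non-null someField
  { "$match": { "arr": { "$elemMatch": { "v.someField": { "$ne": null } } } } },
  // rebuild the original shape
  { "$project": { "_id": 1, "test": { "$arrayToObject": "$arr" } } }
])
  .then(result => console.log(result))
  .catch(err => console.error(err));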

Related

MongoDB: Can't update in nested arrays

I've been trying to modify a value inside nested arrays across multiple documents, and I can't find documentation on how to do this.
My collection looks like this:
"rates": [
{
"category": "Web",
"seniorityRates": [
{
"seniority": "junior",
"rate": 100
},
{
"seniority": "intermediate",
"rate": 135
},
{
"seniority": "senior",
"rate": 165
}
]
}
]
I'm just trying to change "junior" to "beginner"; this should be simple.
Thanks to these answers:
How can I update a multi level nested array in MongoDB?
MongoDB updating fields in nested array
I've managed to write this Python code (PyMongo), but it doesn't work...
result = my_coll.update_many({},
{
"$set":
{
"rates.$[].seniorityRates.$[j].seniority" : new
}
},
upsert=False,
array_filters= [
{
"j.seniority": old
}
]
)
It fails with the error:
The path 'rates' must exist in the document in order to apply array updates.
It corresponds to this mongo shell command, which doesn't work either:
db.projects.updateMany({},
{
$set:
{
"rates.$[].seniorityRates.$[j].seniority" : "debutant"
}
},
{ arrayFilters = [
{
"j.seniority": "junior"
}
]
}
)
This one fails with a garbled error ending in "... could not be cloned".
What am I doing wrong?
Any help would be much appreciated.
The other option could be to apply the condition to the rate field instead:
db.collection.update({},
{
$set: {
"rates.$[].seniorityRates.$[j].seniority": "debutant"
}
},
{
arrayFilters: [
{
"j.rate": { //As per your data, you can apply the condition o rate field to modify the level
$lte: 100
}
}
]
})
Or, the query you actually intended should work:
db.collection.update({},
{
$set: {
"rates.$[].seniorityRates.$[j].seniority": "debutant"
}
},
{
arrayFilters: [
{
"j.seniority": "junior"
}
]
})
The same should work in Python (PyMongo).
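As a rough PyMongo sketch of the full call (the database name "mydb" below is hypothetical; the "projects" collection name comes from the shell command above):
from pymongo import MongoClient

client = MongoClient()                 # assumes a local mongod; adjust the URI as needed
my_coll = client["mydb"]["projects"]   # "mydb" is hypothetical; double-check you grab the right collection

result = my_coll.update_many(
    {},
    {"$set": {"rates.$[].seniorityRates.$[j].seniority": "debutant"}},
    array_filters=[{"j.seniority": "junior"}],
)
print(result.modified_count)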
So I was just being dumb here: I swapped two parameters, so I didn't have the correct collection in the Python code...
Thanks Gibbs for pointing out where the mistake was in the mongo command.
I will not delete this post, as it can help others learn how to do this kind of query.

how to filter the fields of a document within another document in mongodb?

My document is the following
{
"name":"Name1",
"status":"active",
"points":[
{
"lag":"final"
},
{
"lag":"final"
}
]
},
{
"name":"Name2",
"status":"active",
"points":[
{
"lag":"final"
},
{
"lag":""
}
]
}
I need to get all the documents that have some value in the lag field; for this example, both documents should be returned.
I tried this query, but it only returns documents where every points entry has a non-empty lag:
{ "points.lag":{$not:{ $eq:"" }},status:{$in:['active']}}
You need to use $elemMatch to check whether at least one element matches the condition.
db.collection.find({
"points": {
"$elemMatch": {
"lag": {
$ne: null
}
}
}
})
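If empty strings should be excluded as well (which the question seems to want), a small variation using $nin covers both cases:
db.collection.find({
  "points": {
    "$elemMatch": {
      "lag": { "$nin": [null, ""] }   // reject both missing/null and empty-string lag values
    }
  }
})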

MongoDB: adding fields based on partial match query - expression vs query

So I have one collection that I'd like to query/aggregate. The query is made up of several parts that are OR'ed together. For every part of the query, I have a specific set of fields that need to be shown.
So my hope was to do this with an aggregation that would $match the queries OR'ed together all at once, and then use $project with $cond to decide which fields are needed. The problem here is that $cond uses expressions, while $match uses queries, and some query features are not available as expressions, so a simple conversion is not an option.
So I need another solution:
- I could just run a separate aggregation per subquery, because there I know which fields to match, and then merge the results together. But this will not work if I use pagination in the queries (limit/skip etc.).
- Find some other way to tag every document so I can (afterwards) remove any fields not needed. It might not be super efficient, but it would work. No clue yet how to do that.
- Figure out a way to build queries that are made only of expressions. For my purpose that might be good enough, but it would mean a rewrite of the query parser. It could work, but is not ideal.
So this is the next incarnation. It deduplicates and merges records, and finally transforms everything back into something resembling a normal query result:
db.getCollection('somecollection').aggregate(
[
{
"$facet": {
"f1": [
{
"$match": {
<some query 1>
},
{
"$project: {<some fixed field projection>}
}
],
"f2": [
{
"$match": {
<some query 1>
}
},
{
"$project: {<some fixed field projection>}
}
]
}
},
{
$project: {
"rt": { $concatArrays: [ "$f1", "$f2"] }
}
},
{ $unwind: { path: "$rt"} },
{ $replaceRoot: {newRoot:"$rt"}},
{ $group: {_id: "$_id", items: {$push: {item:"$$ROOT"} } }},
{
$project: {
"rt": { $mergeObjects: "$items" }
}
},
{ $replaceRoot: {newRoot:"$rt.item"}},
]
);
There might still be some optimisation to be done, so any comments are welcome.
I found an extra option using $facet. This way, I can make a facet for every group of fields/subqueries. This seems to work fine, except that the result is a single document with a bunch of arrays; I'm not yet sure how to convert that back to multiple documents.
Okay, so now I have it figured out. I'm not sure yet about all of the intricacies of this solution, but it seems to work in general. Here is an example:
db.getCollection('somecollection').aggregate(
[
{
"$facet": {
"f1": [
{
"$match": {
<some query 1>
},
{
"$project: {<some fixed field projection>
}
],
"f2": [
{
"$match": {
<some query 1>
}
},
{
"$project: {<some fixed field projection>
}
]
}
},
{
$project: {
"rt": { $concatArrays: [ "$f1", "$f2"] }
}
},
{ $unwind: { path: "$rt"} },
{ $replaceRoot: {newRoot:"$rt"}}
]
);

Mongodb aggregate match query with priority on full match

I am attempting to do a mongodb regex query on a field. I'd like the query to prioritize a full match if it finds one and then partials afterwards.
For instance if I have a database full of the following entries.
{
"username": "patrick"
},
{
"username": "robert"
},
{
"username": "patrice"
},
{
"username": "pat"
},
{
"username": "patter"
},
{
"username": "john_patrick"
}
If I query for the username 'pat', I'd like to get back the results with the direct match first, followed by the partials. So the results would be ordered ['pat', 'patrick', 'patrice', 'patter', 'john_patrick'].
Is it possible to do this with a mongo query alone? If so could someone point me towards a resource detailing how to accomplish it?
Here is the query that I am attempting to use to perform this.
db.accounts.aggregate({ $match :
{
$or : [
{ "usernameLowercase" : "pat" },
{ "usernameLowercase" : { $regex : "pat" } }
]
} })
Given your precise example, this could be accomplished in the following way - if your real world scenario is a little bit more complex you may hit problems, though:
db.accounts.aggregate([{
$match: {
"username": /pat/i // find all documents that somehow match "pat" in a case-insensitive fashion
}
}, {
$addFields: {
"exact": {
$eq: [ "$username", "pat" ] // add a field that indicates if a document matches exactly
},
"startswith": {
$eq: [ { $substr: [ "$username", 0, 3 ] }, "pat" ] // add a field that indicates if a document matches at the start
}
}
}, {
$sort: {
"exact": -1, // sort by our primary temporary field
"startswith": -1 // sort by our seconday temporary
}
}, {
$project: {
"exact": 0, // get rid of the "exact" field,
"startswith": 0 // same for "startswith"
}
}])
Another way would be using $facet which may prove a bit more powerful by enabling more complex scenarios but slower (several people here will hate me, though, for this proposal):
db.accounts.aggregate([{
$facet: { // run two pipelines against all documents
"exact": [{ // this one will capture all exact matches
$match: {
"username": "pat"
}
}],
"others": [{ // this one will capture all others
$match: {
"username": { $ne: "pat", $regex: /pat/i }
}
}]
}
}, {
$project: {
"result": { // merge the two arrays
$concatArrays: [ "$exact", "$others" ]
}
}
}, {
$unwind: "$result" // flatten the resulting array into separate documents
}, {
$replaceRoot: { // restore the original document structure
"newRoot": "$result"
}
}])

How to use $regex inside $or as an Aggregation Expression

I have a query which allows the user to filter by some string field using a format that looks like: "Where description of the latest inspection is any of: foo or bar". This works great with the following query:
db.getCollection('permits').find({
'$expr': {
'$let': {
vars: {
latestInspection: {
'$arrayElemAt': ['$inspections', {
'$indexOfArray': ['$inspections.inspectionDate', {
'$max': '$inspections.inspectionDate'
}]
}]
}
},
in: {
'$in': ['$$latestInspection.description', ['Fire inspection on property', 'Health inspection']]
}
}
}
})
What I want is for the user to be able to use wildcards which I turn into regular expressions: "Where description of the latest inspection is any of: Health inspection or Found a * at the property".
I can build the regex myself; I don't need help with that. The problem I'm facing is that, apparently, the aggregation $in operator does not support matching by regular expressions. So I thought I'd build this using $or, since the docs don't say I can't use regex. This was my best attempt:
db.getCollection('permits').find({
'$expr': {
'$let': {
vars: {
latestInspection: {
'$arrayElemAt': ['$inspections', {
'$indexOfArray': ['$inspections.inspectionDate', {
'$max': '$inspections.inspectionDate'
}]
}]
}
},
in: {
'$or': [{
'$$latestInspection.description': {
'$regex': /^Found a .* at the property$/
}
}, {
'$$latestInspection.description': 'Health inspection'
}]
}
}
}
})
Except I'm getting the error:
"Unrecognized expression '$$latestInspection.description'"
I'm thinking I can't use $$latestInspection.description as an object key but I'm not sure (my knowledge here is limited) and I can't figure out another way to do what I want. So you see I wasn't even able to get far enough to see if I can use $regex in $or. I appreciate all the help I can get.
Everything inside $expr is an aggregation expression, and while the documentation may not explicitly "say you cannot", the lack of any named operator and the JIRA issue SERVER-11947 certainly say that. So if you need a regular expression, then you really have no other option than using $where instead:
db.getCollection('permits').find({
"$where": function() {
var description = this.inspections
.sort((a,b) => b.inspectionDate.valueOf() - a.inspectionDate.valueOf())
.shift().description;
return /^Found a .* at the property$/.test(description) ||
description === "Health Inspection";
}
})
You can still use $expr and aggregation expressions for an exact match, or just keep the comparison within the $where anyway. But at this time the only regular expression MongoDB understands is $regex within a "query" expression.
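For the exact-match part, a minimal sketch reusing the $let structure from the question (no regex involved) could look like this:
db.getCollection('permits').find({
  "$expr": {
    "$let": {
      "vars": {
        "latestInspection": {
          "$arrayElemAt": ["$inspections", {
            "$indexOfArray": ["$inspections.inspectionDate", { "$max": "$inspections.inspectionDate" }]
          }]
        }
      },
      // exact comparison works fine as an aggregation expression
      "in": { "$eq": ["$$latestInspection.description", "Health inspection"] }
    }
  }
})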
If you did actually "require" an aggregation pipeline expression that precludes you from using $where, then the only current valid approach is to first "project" the field separately from the array and then $match with the regular query expression:
db.getCollection('permits').aggregate([
{ "$addFields": {
"lastDescription": {
"$arrayElemAt": [
"$inspections.description",
{ "$indexOfArray": [
"$inspections.inspectionDate",
{ "$max": "$inspections.inspectionDate" }
]}
]
}
}},
{ "$match": {
"lastDescription": {
"$in": [/^Found a .* at the property$/,/Health Inspection/]
}
}}
])
Which leads us to the fact that you appear to be looking for the item in the array with the maximum date value. The JavaScript syntax should be making it clear that the correct approach here is instead to $sort the array on "update". In that way the "first" item in the array can be the "latest". And this is something you can do with a regular query.
To maintain the order, ensure new items are added to the array with $push and $sort like this:
db.getCollection('permits').updateOne(
{ "_id": _idOfDocument },
{
"$push": {
"inspections": {
"$each": [{ /* Detail of inspection object */ }],
"$sort": { "inspectionDate": -1 }
}
}
}
)
In fact with an empty array argument to $each an updateMany() will update all your existing documents:
db.getCollection('permits').updateMany(
{ },
{
"$push": {
"inspections": {
"$each": [],
"$sort": { "inspectionDate": -1 }
}
}
}
)
These really only should be necessary when you in fact "alter" the date stored during updates, and those updates are best issued with bulkWrite() to effectively do "both" the update and the "sort" of the array:
db.getCollection('permits').bulkWrite([
{ "updateOne": {
"filter": { "_id": _idOfDocument, "inspections._id": indentifierForArrayElement },
"update": {
"$set": { "inspections.$.inspectionDate": new Date() }
}
}},
{ "updateOne": {
"filter": { "_id": _idOfDocument },
"update": {
"$push": { "inspections": { "$each": [], "$sort": { "inspectionDate": -1 } } }
}
}}
])
However if you did not ever actually "alter" the date, then it probably makes more sense to simply use the $position modifier and "pre-pend" to the array instead of "appending", and avoiding any overhead of a $sort:
db.getCollection('permits').updateOne(
{ "_id": _idOfDocument },
{
"$push": {
"inspections": {
"$each": [{ /* Detail of inspection object */ }],
"$position": 0
}
}
}
)
With the array permanently sorted or at least constructed so the "latest" date is actually always the "first" entry, then you can simply use a regular query expression:
db.getCollection('permits').find({
"inspections.0.description": {
"$in": [/^Found a .* at the property$/,/Health Inspection/]
}
})
So the lesson here is don't try and force calculated expressions upon your logic where you really don't need to. There should be no compelling reason why you cannot order the array content as "stored" to have the "latest date first", and even if you thought you needed the array in any other order then you probably should weigh up which usage case is more important.
Once reordered, you can even take advantage of an index to some extent, as long as the regular expressions are either anchored to the beginning of the string, or at least something else in the query expression does an exact match.
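As a rough illustration (the index below on the positional path is an assumption about your data and access pattern, not something from the question):
// hypothetical index on the description of the "latest" (first) inspection
db.getCollection('permits').createIndex({ "inspections.0.description": 1 })

// anchored patterns can walk the index in order; unanchored ones cannot
db.getCollection('permits').find({
  "inspections.0.description": { "$in": [/^Found a /, /^Health/] }
})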
In the event you feel you really cannot reorder the array, then the $where query is your only present option until the JIRA issue resolves, which is hopefully in the 4.1 release as currently targeted, but that is more than likely six months to a year away at best estimate.