Combine $in with $split in mongo - mongodb

I want to combine $in with $split like in the following example, but it fails saying "$in needs an array". I understand the output of $split is an array so I don't know why it fails. Do you know how to solve it or another way to do it?
Thanks
db.mydoc.aggregate([
{
'$match': {
'myid': {
'$in': {
'$split': [
'136618,136620,136622',
',',
],
},
},
},
},
{
'$project': { ... },
},
]);

This failure is the expected behavior. Let's understand why.
We must first look at how $match behaves, as specified in the docs:
$match takes a document that specifies the query conditions. The query syntax is identical to the read operation query syntax; i.e. $match does not accept raw aggregation expressions. Instead, use a $expr query expression to include aggregation expression in $match.
This means $match uses the query language by default. The issue comes from the difference between the two $in operators: the query $in operator (which is being used) and the aggregation $in operator (which you assume is being used).
It is true that $split resolves to an array, but $split is an aggregation operator. The query language does not evaluate it, so the query $in operator receives a plain document instead of an array, which is why it complains that "$in needs an array". The aggregation $in operator, however, does accept raw aggregation expressions.
This means all you have to do is convert your $match query to use $expr so you can use the aggregation version of $in within the match, like so:
db.collection.aggregate([
{
"$match": {
$expr: {
$in: [
"$myid",
{
"$split": [
"136618,136620,136622",
","
]
}
]
}
}
}
])

@Tom Slabbaert gave a very comprehensive and good answer. Just for the sake of completeness, an alternative solution (if you work with JavaScript or the mongo shell) is this one:
db.mydoc.aggregate([
{
'$match': {
'myid': { '$in': '136618,136620,136622'.split(',') }
}
},
{
'$project': { ... },
},
]);
Be aware that both solutions create an array of strings, i.e. [ "136618", "136620", "136622" ]. This does not match if your collection has numeric values, e.g. { myid: 136618 }.
In that case you may use
'136618,136620,136622'.split(',').map(x => NumberInt(x))
or
{ $map: { input: { "$split": ["136618,136620,136622", ","] }, in: { $toInt: "$$this" } } }
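For numeric ids, the two conversions above can be wrapped in small helpers. This is only a sketch: the helper names idsFromCsv and matchIdsStage are made up for illustration, and plain Number stands in for the shell's NumberInt so the snippet also runs outside the mongo shell.

```javascript
// Hypothetical helper: split a comma-separated string on the client and
// convert each piece to a number before the query is sent.
function idsFromCsv(csv) {
  return csv.split(",").map((s) => Number(s.trim()));
}

// Hypothetical helper: build a $match stage that does the same conversion
// on the server with $split, $map and $toInt inside $expr.
function matchIdsStage(field, csv) {
  return {
    $match: {
      $expr: {
        $in: [
          "$" + field,
          { $map: { input: { $split: [csv, ","] }, in: { $toInt: "$$this" } } },
        ],
      },
    },
  };
}

// Client-side variant: query $in operator with a literal array of numbers.
const clientStage = { $match: { myid: { $in: idsFromCsv("136618,136620,136622") } } };

// Server-side variant: aggregation $in operator inside $expr.
const serverStage = matchIdsStage("myid", "136618,136620,136622");

// Either stage could then be passed on, e.g. db.mydoc.aggregate([clientStage]).
```

The client-side variant keeps the stage a plain query, so it can use an index on myid; the $expr variant is mainly useful when the string is only available inside the pipeline.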

Related

what is syntactically wrong with this query in MongoDB? [duplicate]

{a: {b: 1, c: 2}}
db.getCollection("col").aggregate([
{ $match: { "a.b": { $or: [2, 3] } } },
])
It is complaining that it doesn't recognize the $or operator.
The documentation for the $match stage states that:
The query syntax is identical to the read operation query syntax
If we inspect the documentation for the $or operator, you need to pass it expressions, or more specifically, expression objects. Expression objects have the form { <field1>: <expression1>, ... }.
So the correct way to perform this query using $or would be to do:
db.collection.aggregate([
{
"$match": {
"$or": [
{
"a.b": 2
},
{
"a.b": 3
}
]
}
}
])
Or as the other answer suggested, if both expressions are inspecting the same field, you can use $in. The syntax you would use for $in is more like what you tried initially: { field: { $in: [<value1>, <value2>, ... <valueN> ] } }. Put together it might look like:
db.collection.aggregate([
{
"$match": {
"a.b": {
$in: [
2,
3
]
}
}
}
])

How to use a concatenation as the field name for the MongoDB $in operator?

Assume I have a list of names and want to match documents which are not part of it:
{ firstname: { $not: { $in: ["Alice", "Bob"] } } }
But now I have to match against first name + last name (i.e. the given list is ["Alice Smith", "Bob Jones"]).
I know I can concatenate the two fields easily like this:
{ $concat: ["$firstname", " ", "$lastname"] }
But how do I use this new "field" in the initial query like I used firstname there? Obviously, I can't just replace the object key with this expression.
This answer is pretty close, but unfortunately it's missing the last piece of information on how exactly one uses that solution in the $in context. And since I think this is a general usage question but couldn't find anything about it (at least with the search terms I used), I'm opening this separate question.
Edit: If possible, I want to avoid using an aggregation. The query I'm looking for should be used as the filter parameter of the Node driver's deleteMany method.
You can use $expr; to negate the match, wrap $in in an outer $not:
db.collection.aggregate([
{
$match: {
$expr: {
$not: {
$in: [
{ $concat: ["$firstname", " ", "$lastname"] },
["Alice In wonderland", "Bob Marley"]
]
}
}
}
}
])
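Since $expr is a query operator, the same expression can also serve as a plain filter without a pipeline, which fits the deleteMany requirement from the question. A minimal sketch, assuming a Node driver collection object named collection; the helper name notInFullNames is hypothetical:

```javascript
// Hypothetical helper: build a filter that matches documents whose
// "firstname lastname" concatenation is NOT in the given list.
function notInFullNames(fullNames) {
  return {
    $expr: {
      $not: {
        $in: [{ $concat: ["$firstname", " ", "$lastname"] }, fullNames],
      },
    },
  };
}

const deleteFilter = notInFullNames(["Alice Smith", "Bob Jones"]);

// Usage with the Node driver (not executed here):
// await collection.deleteMany(deleteFilter);
```

Note that $concat yields null when either field is missing or null, so such documents would also pass the negated test; it may be worth checking the filter with find before deleting.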
Indeed you are really close.
One way is to use an aggregation. It's a sequence of "stages" where each stage can transform the data and pass the result to the next stage.
Here is a solution:
With a $project, I create a new field full_name by using your $concat.
Then with a $match, I use your condition { firstname: { $not: { $in: ["Alice", "Bob"] } } }, but apply it to the newly created full_name instead.
You can remove the $match in the Mongo Playground and see what it does.
PS: there is a MongoDB operator $nin that combines $not and $in.
db.collection.aggregate([
{
"$project": {
"full_name": {
$concat: [
"$firstname",
" ",
"$lastname"
]
}
}
},
{
$match: {
full_name: {
$nin: [
"Alice In wonderland",
"Bob Marley"
]
}
}
}
])

Expected "[" or AggregationStage but "{" found.: MongoDB aggregation doesn't let me use $toLower inside a $eq

I wrote a MongoDB pipeline that has this code in it:
{
$eq: [
{
"$toLower": "HELLO"
},
"hello"
]
}
And here's a screenshot of it in Mongo Compass
I am expecting it to simply return true, and "$match" everything (for now).
Eventually I will swap "HELLO" with a field name etc.
Does anyone know why I am getting this error?
$match does not accept raw aggregation expressions. Instead, use a $expr query expression to include aggregation expression in $match.
https://docs.mongodb.com/manual/reference/operator/aggregation/match/index.html#pipe._S_match
$expr: {
$eq: [
{
$toLower: "HELLO"
},
"hello"
]
}
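Once the literal "HELLO" is replaced by a field reference, the stage could look like the following sketch. The field name greeting and the helper lowerCaseEquals are assumptions for illustration only.

```javascript
// Hypothetical helper: build a $match stage that compares a field to a
// value case-insensitively by lowercasing both sides.
function lowerCaseEquals(field, value) {
  return {
    $match: {
      $expr: {
        $eq: [{ $toLower: "$" + field }, value.toLowerCase()],
      },
    },
  };
}

const helloStage = lowerCaseEquals("greeting", "Hello");

// The stage can then be placed in a pipeline, e.g.
// db.collection.aggregate([helloStage])
```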

How to use $regex inside $or as an Aggregation Expression

I have a query which allows the user to filter by some string field using a format that looks like: "Where description of the latest inspection is any of: foo or bar". This works great with the following query:
db.getCollection('permits').find({
'$expr': {
'$let': {
vars: {
latestInspection: {
'$arrayElemAt': ['$inspections', {
'$indexOfArray': ['$inspections.inspectionDate', {
'$max': '$inspections.inspectionDate'
}]
}]
}
},
in: {
'$in': ['$$latestInspection.description', ['Fire inspection on property', 'Health inspection']]
}
}
}
})
What I want is for the user to be able to use wildcards which I turn into regular expressions: "Where description of the latest inspection is any of: Health inspection or Found a * at the property".
The regex I get, don't need help with that. The problem I'm facing is, apparently the aggregation $in operator does not support matching by regular expressions. So I thought I'd build this using $or since the docs don't say I can't use regex. This was my best attempt:
db.getCollection('permits').find({
'$expr': {
'$let': {
vars: {
latestInspection: {
'$arrayElemAt': ['$inspections', {
'$indexOfArray': ['$inspections.inspectionDate', {
'$max': '$inspections.inspectionDate'
}]
}]
}
},
in: {
'$or': [{
'$$latestInspection.description': {
'$regex': /^Found a .* at the property$/
}
}, {
'$$latestInspection.description': 'Health inspection'
}]
}
}
}
})
Except I'm getting the error:
"Unrecognized expression '$$latestInspection.description'"
I'm thinking I can't use $$latestInspection.description as an object key but I'm not sure (my knowledge here is limited) and I can't figure out another way to do what I want. So you see I wasn't even able to get far enough to see if I can use $regex in $or. I appreciate all the help I can get.
Everything inside $expr is an aggregation expression. The documentation may not explicitly say that you cannot use a regular expression there, but the lack of any named operator and the JIRA issue SERVER-11947 certainly say that. So if you need a regular expression then you really have no other option than using $where instead:
db.getCollection('permits').find({
"$where": function() {
var description = this.inspections
.sort((a,b) => b.inspectionDate.valueOf() - a.inspectionDate.valueOf())
.shift().description;
return /^Found a .* at the property$/.test(description) ||
description === "Health inspection";
}
})
You can still use $expr and aggregation expressions for an exact match, or just keep the comparison within the $where anyway. But at this time the only regular expression support MongoDB has is $regex within a "query" expression.
If you did actually "require" an aggregation pipeline expression that precludes you from using $where, then the only current valid approach is to first "project" the field separately from the array and then $match with the regular query expression:
db.getCollection('permits').aggregate([
{ "$addFields": {
"lastDescription": {
"$arrayElemAt": [
"$inspections.description",
{ "$indexOfArray": [
"$inspections.inspectionDate",
{ "$max": "$inspections.inspectionDate" }
]}
]
}
}},
{ "$match": {
"lastDescription": {
"$in": [/^Found a .* at the property$/, "Health inspection"]
}
}}
])
Which leads us to the fact that you appear to be looking for the item in the array with the maximum date value. The JavaScript syntax should be making it clear that the correct approach here is instead to $sort the array on "update". In that way the "first" item in the array can be the "latest". And this is something you can do with a regular query.
To maintain the order, ensure new items are added to the array with $push and $sort like this:
db.getCollection('permits').updateOne(
{ "_id": _idOfDocument },
{
"$push": {
"inspections": {
"$each": [{ /* Detail of inspection object */ }],
"$sort": { "inspectionDate": -1 }
}
}
}
)
In fact with an empty array argument to $each an updateMany() will update all your existing documents:
db.getCollection('permits').updateMany(
{ },
{
"$push": {
"inspections": {
"$each": [],
"$sort": { "inspectionDate": -1 }
}
}
}
)
These really only should be necessary when you in fact "alter" the date stored during updates, and those updates are best issued with bulkWrite() to effectively do "both" the update and the "sort" of the array:
db.getCollection('permits').bulkWrite([
{ "updateOne": {
"filter": { "_id": _idOfDocument, "inspections._id": identifierForArrayElement },
"update": {
"$set": { "inspections.$.inspectionDate": new Date() }
}
}},
{ "updateOne": {
"filter": { "_id": _idOfDocument },
"update": {
"$push": { "inspections": { "$each": [], "$sort": { "inspectionDate": -1 } } }
}
}}
])
However if you did not ever actually "alter" the date, then it probably makes more sense to simply use the $position modifier and "pre-pend" to the array instead of "appending", and avoiding any overhead of a $sort:
db.getCollection('permits').updateOne(
{ "_id": _idOfDocument },
{
"$push": {
"inspections": {
"$each": [{ /* Detail of inspection object */ }],
"$position": 0
}
}
}
)
With the array permanently sorted or at least constructed so the "latest" date is actually always the "first" entry, then you can simply use a regular query expression:
db.getCollection('permits').find({
"inspections.0.description": {
"$in": [/^Found a .* at the property$/, "Health inspection"]
}
})
So the lesson here is don't try and force calculated expressions upon your logic where you really don't need to. There should be no compelling reason why you cannot order the array content as "stored" to have the "latest date first", and even if you thought you needed the array in any other order then you probably should weigh up which usage case is more important.
Once reordered you can even take advantage of an index to some extent, as long as the regular expressions are either anchored to the beginning of the string or at least something else in the query expression does an exact match.
In the event you feel you really cannot reorder the array, then the $where query is your only present option until the JIRA issue resolves. Which is hopefully actually for the 4.1 release as currently targeted, but that is more than likely 6 months to a year at best estimate.
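The JIRA issue mentioned above has since been resolved: MongoDB 4.2 and later provide the aggregation operators $regexMatch, $regexFind and $regexFindAll. On such a server, the original $expr query could be written roughly as follows. This is a sketch under that version assumption, and the helper name latestDescriptionFilter is hypothetical:

```javascript
// Hypothetical helper: build a find() filter that tests the description of
// the latest inspection against regular expressions and exact values.
// Assumes a server that supports $regexMatch (MongoDB 4.2 or later).
function latestDescriptionFilter(patterns, exactValues) {
  const latest = "$$latestInspection.description";
  return {
    $expr: {
      $let: {
        vars: {
          // Same "latest inspection" lookup as in the question.
          latestInspection: {
            $arrayElemAt: [
              "$inspections",
              {
                $indexOfArray: [
                  "$inspections.inspectionDate",
                  { $max: "$inspections.inspectionDate" },
                ],
              },
            ],
          },
        },
        in: {
          $or: [
            // One $regexMatch expression per wildcard pattern.
            ...patterns.map((regex) => ({ $regexMatch: { input: latest, regex } })),
            // Exact matches keep using the aggregation $in operator.
            { $in: [latest, exactValues] },
          ],
        },
      },
    },
  };
}

const permitsFilter = latestDescriptionFilter(
  [/^Found a .* at the property$/],
  ["Health inspection"]
);

// Usage in the shell (not executed here):
// db.getCollection('permits').find(permitsFilter)
```

Like any $expr-based filter, this still evaluates the expression per document, so the advice above about storing the array in sorted order remains relevant where index use matters.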

MongoDB Aggregation Pipeline: $match with expression not possible?

I'm doing a rather complicated aggregation pipeline and have a rather strange phenomenon - I extracted a short example to visualize my problem here.
It seemed related to MongoDb $addFields and $match - but it doesn't contain any information for me to fix the problem at hand.
Note: Please note that my problem is not with the specific example of using date fields and or dealing with values, the problem is that I'm not able to $match using an expression - using a field that was added before with $addFields or not.
Given MongoDB: 3.6.3 (currently latest)
Let's insert some testdata:
db.testexample.insert({
"dateField": new ISODate("2016-05-18T16:00:00Z")
});
db.testexample.insert({
"dateField": new ISODate("2018-05-18T16:00:00Z")
});
Now let's make simple pipeline that computes only the year of the date and $matches on that:
db.testexample.aggregate([
{
"$addFields": {
"dateFieldYear": {"$year": "$dateField"}
}
},
{
"$match": {
"dateFieldYear": {"$eq": "$dateFieldYear"}
}
}
])
--> No matches
It should match as it's the same field? Maybe with more trickery (using an $add)?
db.testexample.aggregate([
{
"$addFields": {
"dateFieldYear": {"$year": "$dateField"}
}
},
{
"$match": {
"dateFieldYear": {"$eq": {"$add": ["$dateFieldYear", 0]}}
}
}
])
--> No matches
Still no dice. Next I thought that variables altogether are a problem, so let's fix the values:
db.testexample.aggregate([
{
"$addFields": {
"dateFieldYear": {"$year": "$dateField"}
}
},
{
"$match": {
"dateFieldYear": {"$eq": {"$add": [2016, 0]}}
}
}
])
--> No matches
Wait.. something is really wrong here.. Let's see with a static value:
db.testexample.aggregate([
{
"$addFields": {
"dateFieldYear": {"$year": "$dateField"}
}
},
{
"$match": {
"dateFieldYear": 2016
}
}
])
--> 1 record found!
So my conclusion seems to be that $match cannot take an expression on a field in an aggregate pipeline. But this doesn't seem possible - as the documentation states that $match follows the query syntax as described here.
Can anybody help with how to $match using the simple example "dateFieldYear": {"$eq": "$dateFieldYear"}, and explain why this doesn't work as expected?
Thanks so much for any help
You can use $expr (an operator introduced in MongoDB 3.6) to use aggregation functions in a regular query.
Compare query operators vs aggregation comparison operators.
In your case
db.testexample.find({$expr:{$eq:["$dateFieldYear", "$dateFieldYear"]}})
Regular Query:
db.testexample.find({$expr:{$eq:["$dateFieldYear", {"$year": "$dateField"}]}})
Aggregation Query:
db.testexample.aggregate({$match:{$expr:{$eq:["$dateFieldYear", {"$year": "$dateField"}]}}})
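Applied to the pipeline from the question, the computed field is added first and then compared inside $expr. A sketch only: the helper name yearPipeline is made up, and the $add with 0 merely mirrors the question's experiment.

```javascript
// Hypothetical helper: build a pipeline that computes the year of dateField
// and then matches on it with an aggregation expression inside $expr.
function yearPipeline(year) {
  return [
    { $addFields: { dateFieldYear: { $year: "$dateField" } } },
    { $match: { $expr: { $eq: ["$dateFieldYear", { $add: [year, 0] }] } } },
  ];
}

const pipeline2016 = yearPipeline(2016);

// Usage in the shell (not executed here):
// db.testexample.aggregate(pipeline2016)
```

The difference from the failing attempts is only the position of the operator: the query form "dateFieldYear": {"$eq": ...} compares against a literal value, while $expr evaluates both operands as expressions.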