MongoDB aggregation - operator to read in documents - mongodb

Since MongoDB only supports one $text expression per aggregation pipeline (inside the first $match stage), you can't perform a logical AND by combining the results of multiple $text searches with $and.
// Fails due to "too many text expressions"
db.Employees.aggregate([
{$match: {$and: [
{$text: {$search: "senior"}},
{$text: {$search: "manager"}}
]}}
])
Therefore I need to perform multiple separate $text searches, combine the results in my NodeJS code, and pass that result set back into an aggregation pipeline for further processing (e.g. $addFields, $match, $sort).
Is there a way to do something like...
let results1 = db.Employees.find({"$text":{"$search":"senior"}}, {"score":{"$meta":"textScore"}})
let results2 = db.Employees.find({"$text":{"$search":"manager"}}, {"score":{"$meta":"textScore"}})
let combinedResults = _.intersectionWith(results1, results2, _.isEqual)
let finalResults = /* pass combinedResults into aggregation pipeline and execute it */
Something like the opposite of the $out operator, where I'm reading in a result set instead.
I'm using NestJS and Mongoose if that helps.

As you already know, $text has restrictions.
If you only need to search a limited number of fields, one option is $regexMatch. I am not sure how many fields your text index covers, but with this operator you can combine match conditions across multiple fields with $and.
Example Data:
[
{ _id: 1, f1: "senior", f2: "manager" },
{ _id: 2, f1: "junior", f2: "manager" },
{ _id: 3, f1: "fresher", f2: "developer" },
{ _id: 4, f1: "manager", f2: "senior" }
]
Aggregation Query 1:
$addFields adds a new boolean field, matchResult, holding the match status.
db.collection.aggregate([
{
$addFields: {
matchResult: {
$and: [
// first $or: true if f1 or f2 matches "senior", otherwise false
{
$or: [
{ $regexMatch: { input: "$f1", regex: "senior", options: "x" } },
{ $regexMatch: { input: "$f2", regex: "senior", options: "x" } }
]
},
// second $or: true if f1 or f2 matches "manager", otherwise false
{
$or: [
{ $regexMatch: { input: "$f1", regex: "manager", options: "x" } },
{ $regexMatch: { input: "$f2", regex: "manager", options: "x" } }
]
}
]
}
}
},
// $match keeps only documents whose matchResult is true
{ $match: { matchResult: true } }
])
Aggregation Query 2:
If you are not using array fields, there is a shorter way: concatenate all fields into one. Here f1 and f2 are merged, separated by a space, into allField.
db.collection.aggregate([
{
$addFields: {
allField: { $concat: ["$f1", " ", "$f2"] }
}
},
// $and over both words: true only if both match, otherwise false
{
$addFields: {
matchResult: {
$and: [
{ $regexMatch: { input: "$allField", regex: "senior", options: "x" } },
{ $regexMatch: { input: "$allField", regex: "manager", options: "x" } }
]
}
}
},
// $match keeps only documents whose matchResult is true
{ $match: { matchResult: true } }
])
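Both queries above hard-code the fields and the search words. As a minimal sketch, the same $addFields + $match shape can be generated for any list of fields and terms. The helper name buildRegexAndPipeline and the default "i" (case-insensitive) option are assumptions for illustration, not part of the original answer; the terms are passed to $regexMatch as regex patterns without escaping.

```javascript
// Hypothetical helper: builds the $addFields + $match pipeline from
// Aggregation Query 1 for arbitrary fields and search terms.
function buildRegexAndPipeline(fields, terms, options = "i") {
  // One $or group per term: true when any of the fields matches that term.
  const perTerm = terms.map((term) => ({
    $or: fields.map((field) => ({
      $regexMatch: { input: "$" + field, regex: term, options: options },
    })),
  }));

  return [
    // matchResult is true only when every term matched in at least one field.
    { $addFields: { matchResult: { $and: perTerm } } },
    { $match: { matchResult: true } },
  ];
}

const pipeline = buildRegexAndPipeline(["f1", "f2"], ["senior", "manager"]);
console.log(JSON.stringify(pipeline, null, 2));
```

The resulting array can be passed to db.collection.aggregate(pipeline) as-is.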
Note: This is an alternative approach for a limited number of fields. With more than about 5 fields, expect it to affect the speed and performance of the query.
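For the flow described in the question (separate $text searches combined in application code), one possible sketch is to intersect the two result sets by _id and start the follow-up pipeline with a $match on those ids. Comparing whole documents with _.isEqual is likely to drop matches, since each result carries a textScore for a different search. The helper names intersectById and pipelineForIds, and the sample arrays standing in for the two find() results, are hypothetical.

```javascript
// Keep the documents from `a` whose _id also appears in `b`.
// Ids are compared as strings so ObjectId values compare by value.
function intersectById(a, b) {
  const idsInB = new Set(b.map((doc) => String(doc._id)));
  return a.filter((doc) => idsInB.has(String(doc._id)));
}

// Start a pipeline from a known set of ids, then append further stages.
function pipelineForIds(ids, extraStages = []) {
  return [{ $match: { _id: { $in: ids } } }, ...extraStages];
}

// Stand-ins for the results of the two separate $text searches.
const results1 = [{ _id: 1, score: 1.1 }, { _id: 4, score: 0.75 }, { _id: 7, score: 0.6 }];
const results2 = [{ _id: 1, score: 0.6 }, { _id: 2, score: 0.9 }, { _id: 4, score: 1.2 }];

const combined = intersectById(results1, results2);
const finalPipeline = pipelineForIds(
  combined.map((doc) => doc._id),
  [{ $sort: { name: 1 } }]
);
console.log(JSON.stringify(finalPipeline));
```

The text scores are not carried into the second pipeline in this sketch; if they are needed, they would have to be merged back in application code.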

Related

Find in nested array with compare on last field

I have a collection with documents like this one:
{
f1: {
firstArray: [
{
secondArray: [{status: "foo1"}, {status: "foo2"}, {status: "foo3"}]
}
]
}
}
I want the documents that have at least one item in firstArray whose last object in secondArray has a status included in an input array of values (e.g. ["foo3"]).
I don't have to use aggregate.
I tried:
{
"f1.firstArray": {
$elemMatch: {
"secondArray.status": {
$in: ["foo3"],
},
otherField: "bar",
},
},
}
You can use an aggregation pipeline with $match and $filter to keep only the documents in which the number of firstArray items whose last status matches is greater than zero:
db.collection.aggregate([
{$match: {
$expr: {
$gt: [
{$size: {
$filter: {
input: "$f1.firstArray",
cond: {$in: [{$last: "$$this.secondArray.status"}, ["foo3"]]}
}
}
},
0
]
}
}
}
])
See how it works on the playground example
If you know that secondArray always has 3 items, you can do:
db.collection.find({
"f1.firstArray": {
$elemMatch: {
"secondArray.2.status": {
$in: ["foo3"]
}
}
}
})
But otherwise I don't think you can check only the last item without an aggregation. The idea is that a regular find only lets you write a query that does not depend on values specific to each document. If the size of the array can differ between documents, or even between items in the same document, you need to use an aggregation pipeline.
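To check expectations on sample documents without a server, the $filter / $last condition can be mirrored in plain JavaScript. This is only an illustration of the logic, not something MongoDB runs; the function name hasMatchingLastStatus and the sample documents are assumptions. Unlike the pipeline, which assumes f1.firstArray is always an array, this sketch simply treats missing arrays as empty.

```javascript
// True when at least one firstArray item has a secondArray whose LAST
// element has a status contained in `allowed`.
function hasMatchingLastStatus(doc, allowed) {
  const firstArray = (doc.f1 && doc.f1.firstArray) || [];
  return firstArray.some((item) => {
    const second = item.secondArray || [];
    if (second.length === 0) return false;
    return allowed.includes(second[second.length - 1].status);
  });
}

const docs = [
  { f1: { firstArray: [{ secondArray: [{ status: "foo1" }, { status: "foo2" }, { status: "foo3" }] }] } },
  { f1: { firstArray: [{ secondArray: [{ status: "foo3" }, { status: "foo1" }] }] } },
  { f1: { firstArray: [{ secondArray: [{ status: "foo1" }] }, { secondArray: [{ status: "foo3" }] }] } },
];

const matches = docs.map((doc) => hasMatchingLastStatus(doc, ["foo3"]));
console.log(matches); // [ true, false, true ]
```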

Mongoose custom $match function

I have objects that look like this
{
name: 'Object 1',
fruitList: ['apple','pear','orange','grape']
},
{
name: 'Object 2',
fruitList: ['melon','pear','apple','kiwi']
}
I need to retrieve all the objects that have apple before pear in their fruitList; in this example that would mean Object 1 only. Can I write a custom match function that iterates over that list and checks if it matches my criteria?
You need a mechanism to compare the indexes of the fruits in question and use the comparison as a match condition with the $expr operator. Leverage the aggregation pipeline operators:
$indexOfArray - Searches an array for an occurrence of a specified value and returns the array index (zero-based) of the first occurrence.
$subtract - returns the difference between the two indexes. If the value is negative then apple appears before pear in the list.
$lt - the comparison operator to use in $expr query that compares two values and returns true when the first value is less than the second value.
To get a rough idea of these operators at play in an aggregation pipeline, check out the following Mongo Playground.
The actual query you need is as follows:
db.collection.find({
$expr: {
$lt: [
{
$subtract: [
{ $indexOfArray: [ '$fruitList', 'apple' ] },
{ $indexOfArray: [ '$fruitList', 'pear' ] }
]
},
0
]
}
})
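One edge case worth checking in the index comparison: $indexOfArray returns -1 when the value is not in the array, so a list that contains pear but no apple also gives a negative difference. A possible variation, sketched below under that assumption, adds a guard that apple is actually present. The names buildBeforeQuery and appearsBefore are hypothetical; appearsBefore is a plain-JavaScript mirror of the same condition for local checks.

```javascript
// Builds a find() filter: `first` must be present and occur before `second`.
function buildBeforeQuery(field, first, second) {
  const firstIdx = { $indexOfArray: ["$" + field, first] };
  const secondIdx = { $indexOfArray: ["$" + field, second] };
  return {
    $expr: {
      $and: [
        { $gte: [firstIdx, 0] },        // guard: `first` is in the array
        { $lt: [firstIdx, secondIdx] }, // and it comes before `second`
      ],
    },
  };
}

// Plain-JavaScript mirror of the guarded condition.
function appearsBefore(list, first, second) {
  const i = list.indexOf(first);
  const j = list.indexOf(second);
  return i >= 0 && i < j;
}

const query = buildBeforeQuery("fruitList", "apple", "pear");
console.log(JSON.stringify(query));
console.log(appearsBefore(["apple", "pear", "orange", "grape"], "apple", "pear")); // true
console.log(appearsBefore(["melon", "pear", "apple", "kiwi"], "apple", "pear"));   // false
console.log(appearsBefore(["pear", "kiwi"], "apple", "pear"));                     // false
```

Because the first index must be at least 0 and smaller than the second, the second value is necessarily present as well.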
For a generic regex-based solution, where the fruitList array may contain assorted fruits in different cases, for instance:
"fruitList" : [
"mango",
"Apples",
"Banana",
"strawberry",
"peach",
"Pears"
]
The following query can address this challenge:
const getMapExpression = (fruit) => {
return {
$map: {
input: '$fruitList',
as: 'fruit',
in: {
$cond: [
{ $regexMatch: { input: '$$fruit', regex: fruit, options: 'i' } },
{ $literal: fruit },
'$$fruit'
]
}
}
}
}
db.collection.find({
$expr: {
$lt: [
{
$subtract: [
{ $indexOfArray: [ getMapExpression('apple'), 'apple' ] },
{ $indexOfArray: [ getMapExpression('pear'), 'pear' ] }
]
},
0
]
}
})

Select a document with the begin of string value like a number

I have a field with a number and unit.
db.createCollection("test")
db.test.insertOne({"curVal":"100°"})
I would like to select documents with curVal > 50.
I found a solution but I'm not happy with it.
// 1. match records that have curVal
// 2. add field _double_curVal with the result of $regexFind
// 3. convert _double_curVal.match to double
// 4. filter on the converted value against 50
db.test.aggregate(
[
{"$match":{"curVal":{"$exists":true}}},
{"$addFields":
{"_double_curVal":
{"$regexFind":
{"input":"$curVal",
"regex":"[0-9]+"
}
}
}
},
{"$project":
{"_double_curVal":"$_double_curVal"
}
},
{"$project":
{"_double_curVal":
{"$convert":{"input":"$_double_curVal.match","to":"double"}
}
}
},
{ "$match":
{ "_double_curVal":{"$gte":50}
}
}
])
Can you propose a better solution?
I can't say this is a better solution, but you can try doing all the operations in a single $match stage with $expr:
$let declares a variable, curVal, holding the number found by $regexFind
$toDouble converts the curVal.match string to a number
$expr lets $match evaluate the $gte comparison
db.test.aggregate([
{
$match: {
$expr: {
$gte: [
{
$let: {
vars: {
curVal: {
"$regexFind": {
"input": "$curVal",
"regex": "[0-9]+"
}
}
},
in: { $toDouble: "$$curVal.match" }
}
},
50
]
}
}
}
])
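The single-stage $match can also be generated for any field and threshold. The sketch below makes two assumptions that are not in the answer above: the helper names buildNumericPrefixMatch and leadingNumber are hypothetical, and the regex is anchored with ^ so that only digits at the beginning of the string count, as the question title suggests. leadingNumber mirrors the extraction in plain JavaScript for quick local checks.

```javascript
// Builds a one-stage pipeline: keep documents whose leading digits in
// `field`, converted to a number, are >= `min`.
function buildNumericPrefixMatch(field, min) {
  return [
    {
      $match: {
        $expr: {
          $gte: [
            {
              $let: {
                vars: {
                  found: { $regexFind: { input: "$" + field, regex: "^[0-9]+" } },
                },
                in: { $toDouble: "$$found.match" },
              },
            },
            min,
          ],
        },
      },
    },
  ];
}

// Plain-JavaScript mirror of the extraction step.
function leadingNumber(value) {
  const m = /^[0-9]+/.exec(value);
  return m ? Number(m[0]) : null;
}

const numericPipeline = buildNumericPrefixMatch("curVal", 50);
console.log(JSON.stringify(numericPipeline));
console.log(leadingNumber("100°")); // 100
```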

MongoDB, How Do I combine a find and sort with the $cond in aggregation?

I have written a find query, which works. It returns records where name and level exist:
db.docs.find( { $and: [{name:{$exists:true}},{level:{ $exists:true}} ] },{_id:0, name:1}).sort({"name":1})
and now want to combine it with something like the code below which also works, but needs to be merged with the above to pull the correct data
db.docs.aggregate(
[
{
$project:
{
_id:0,
name: 1,
Honours:
{
$cond: { if: { $gte: [ "$level", 8 ] }, then: "True", else: "False" }
}
}
}
]
)
The find query returns records where name and level exist, but I need to enhance the result with a new column called Honours, showing True or False depending on whether the level is greater than or equal to 8.
So I am basically trying to combine the above find filter with the $cond operator (I found an example in the $cond documentation and modified it).
I tried the below and a few other permutations to combine find and sort with the $project and $cond aggregation, but it returned errors. I am just very new to how to construct MongoDB syntax to make it all fit together. Can anyone please help?
db.docs.aggregate(
[{{ $and: [{name:{$exists:true}},{level:{ $exists:true}} ] },{_id:0, name:1}).sort({"name":1}
{
$project:
{
_id:0,
name: 1,
Honours:
{
$cond: { if: { $gte: [ "$level", 8 ] }, then: "True", else: "False" }
}
}
}
}
]
)
Try the aggregation pipeline below:
db.docs.aggregate([
/** $match filters documents, much like .find(), and reduces the dataset size for later stages */
{
$match: {
$and: [{ name: { $exists: true } }, { level: { $exists: true } }]
}
},
/** $project works as a projection; it reduces the size of each document for later stages */
{
$project: {
_id: 0,
name: 1,
Honours: {
$cond: { if: { $gte: ["$level", 8] }, then: "True", else: "False" }
}
}
},
/** $sort should work as .sort() */
{ $sort: { name: 1 } }
]);

MongoDB - Autocomplete - Get all words starting with X

I have a collection (users) with the following structure:
{
propA: {
words: ["i", "have", "an","important", "question"]
}
}
I want to get autocomplete options from the db for some input on my website.
So first, I think that I need to create an index for propA.words.
Maybe something like this(?):
db.users.createIndex({ "propA.words" : 1 })
Second, how can I query this index to get all the words starting with X?
For example, for the string "i", the query will retrieve ["i", "important"].
Thanks!
EDIT:
This is the collection:
{
propA: {
words: ["aa","bb","cc","dd"]
}
}
{
propA: {
words: ["ab"]
}
}
{
propA: {
words: []
}
}
{
propB: []
}
Now, I want a query to get all the words that start with "a".
The query should return ["aa","ab"] on the above collection.
I want the query to use only the index so the search will be efficient.
You can use this aggregation, which iterates over the words array and matches the regex search string.
db.collection.aggregate( [
{
$addFields: {
matches: {
$filter: {
input: "$propA.words",
as: "w",
cond: {
$regexMatch: { input: "$$w" , regex: "^i" }
}
}
}
}
}
] )
The output:
{
"_id" : 1,
"propA" : {
"words" : [
"i",
"have",
"an",
"important",
"question"
]
},
"matches" : [
"i",
"important"
]
}
[ EDIT ADD ]
Now, i want a query to get all the words that starts with "a". The
query should return ["aa","ab"] on the above collection. I want the
query to use only the index so the search will be efficient.
The aggregation:
db.collection.aggregate( [
{
$match: { "propA.words": { $regex: "^a" } }
},
{
$unwind: "$propA.words"
},
{
$group: {
_id: null,
matchedWords: {
$addToSet: {
$cond: [ { $regexMatch: { input: "$propA.words", regex: "^a" } },
"$propA.words",
"$DUMMY" ]
}
}
}
},
{
$project: { _id: 0 }
}
] )
The result:
{ "matchedWords" : [ "ab", "aa" ] }
Index usage:
The index is created on the collection as follows:
db.collection.createIndex( { "propA.words": 1 } )
You can verify the index usage on the aggregation's $match stage by applying the explain and generating a query plan. For example:
db.collection.explain("executionStats").aggregate( [ ... ] )
Yes, you create an index on the field, which is an array, then use a regex query with the ^ symbol for 'starts with'. An index on an array field can create a big load, but because your query is a 'starts-with' match, it is an efficient design.
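Since an autocomplete prefix usually comes from user input, it may be worth escaping regex metacharacters before building the ^ pattern. The sketch below is one possible way to do that. The helper names escapeRegex and buildPrefixPipeline are hypothetical, and the pipeline is a variation on the aggregation above: it filters again with $match after $unwind instead of using $cond with a dummy field.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Builds a pipeline returning the distinct words that start with `prefix`.
function buildPrefixPipeline(prefix) {
  const pattern = "^" + escapeRegex(prefix);
  return [
    // First $match: narrows the documents to those with a matching word.
    { $match: { "propA.words": { $regex: pattern } } },
    { $unwind: "$propA.words" },
    // Second $match: keeps only the unwound words that match the prefix.
    { $match: { "propA.words": { $regex: pattern } } },
    { $group: { _id: null, matchedWords: { $addToSet: "$propA.words" } } },
    { $project: { _id: 0 } },
  ];
}

const prefixPipeline = buildPrefixPipeline("a");
console.log(JSON.stringify(prefixPipeline));
console.log(escapeRegex("a.b")); // a\.b
```

As with the aggregation above, explain("executionStats") can be used to confirm whether the first $match stage uses the index on propA.words.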