Aggregate and $or operator in MongoDB

This is my MongoDB query:
Booking.aggregate([
  { $match:
    { $and: [
      { $or: [
        { isDoubleRoom },
        { chosenRoom }
      ]},
      { month },
      { year },
    ] }},
  { $group: { _id: "$fullDate", count: { $sum: 1 } } }
])
In the first stage I would like to filter by month, year and, conditionally: if isDoubleRoom is set, filter only by double rooms; otherwise, filter by the chosenRoom property. The problem is that $or does not switch between the two filters: the query returns results that are not filtered by isDoubleRoom $or chosenRoom. The same condition worked when I used find instead of aggregate, but here I need aggregate in order to count the filtered results.

You should use $cond inside $expr (a bare $cond is not valid query syntax in $match, and the `{isDoubleRoom}` shorthand is an always-truthy object, not a condition):
{
  $match: {
    $and: [
      {
        $expr: {
          $cond: [
            isDoubleRoomBool,
            { $eq: ["$isDoubleRoom", isDoubleRoom] },
            { $eq: ["$chosenRoom", chosenRoom] }
          ]
        }
      },
      { month },
      { year },
    ]
  }
}
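To see why $or cannot act as a switch, here is a plain-JavaScript sketch of the difference between the two semantics (the sample bookings, field values, and the `isDoubleRoomBool` flag are invented for illustration):

```javascript
// $or keeps a document if EITHER predicate matches, while the
// conditional we want evaluates exactly ONE predicate, chosen by
// isDoubleRoomBool.
const bookings = [
  { fullDate: "2023-05-01", isDoubleRoom: true,  chosenRoom: 1 },
  { fullDate: "2023-05-02", isDoubleRoom: false, chosenRoom: 1 },
  { fullDate: "2023-05-03", isDoubleRoom: false, chosenRoom: 2 },
];

const isDoubleRoomBool = true;

// What $or does: either condition is enough, so both of the first
// two documents pass even though we only wanted double rooms.
const orMatched = bookings.filter(
  (b) => b.isDoubleRoom === true || b.chosenRoom === 1
);

// What $cond (inside $expr) does: pick one branch, then match on it.
const condMatched = bookings.filter((b) =>
  isDoubleRoomBool ? b.isDoubleRoom === true : b.chosenRoom === 1
);

console.log(orMatched.length);   // 2
console.log(condMatched.length); // 1
```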

Related

MongoDB optimizing big aggregation queries

I have a collection of documents in MongoDB representing some entity. For every entity, statistics are gathered on a daily basis; the statistics are stored as separate documents in different collections.
Entity collection schema:
{
_id: ObjectId,
filterField1: String, //indexed
filterField2: String, //indexed
}
Example schema of statistics collection:
{
_id: ObjectId,
entityId: ObjectId, //indexed
statisticsValue: Int32,
date: Date //indexed
}
There is a dashboard that needs to display some aggregated statistics based on the gathered data over some time period, e.g. average value, sum, count, etc. The dashboard enables filtering entities in/out and applying different date ranges, which makes precalculating those aggregated statistics impossible.
So far, I've been using an aggregation pipeline to:
apply the filters on the entities collection (using match stage)
make necessary lookups stages to acquire statistics for aggregation
make grouping and aggregation (avg, sum, count, etc.)
Here is the pipeline:
db.getCollection('entities').aggregate([
  { $match: { $expr: { $and: [
    // ENTITIES FILTERS based on filterField1 and filterField2 fields
  ] } } },
  { $lookup: {
    from: 'statistics',
    let: { entityId: '$_id' },
    pipeline: [{ $match: { $expr: { $and: [
      { $eq: ["$entityId", "$$entityId"] },
      { $gte: [ "$date", new ISODate("2022-06-01T00:00:00Z") ] },
      { $lte: [ "$date", new ISODate("2022-06-01T23:59:59Z") ] },
    ] } } }],
    as: 'stats_start_date_range',
  } },
  { $lookup: {
    from: 'statistics',
    let: { entityId: '$_id' },
    pipeline: [{ $match: { $expr: { $and: [
      { $eq: ["$entityId", "$$entityId"] },
      { $gte: [ "$date", new ISODate("2022-06-30T00:00:00Z") ] },
      { $lte: [ "$date", new ISODate("2022-06-30T23:59:59Z") ] },
    ] } } }],
    as: 'stats_end_date_range',
  } },
  { $addFields: {
    start_stats: { $first: "$stats_start_date_range" },
    end_stats: { $first: "$stats_end_date_range" }
  } },
  { $group: {
    _id: null,
    avg_start: { $avg: "$start_stats.statisticsValue" },
    avg_end: { $avg: "$end_stats.statisticsValue" }
  } }
])
For this query, the expected result is the average value of the statisticsValue field for the start and end date over all entities matching the filters.
I applied an index on the field used to left-join the collections in the lookup stage, as well as on the date field used for getting statistics for a specific date.
The problem is that the query takes about 1 second for the maximum number of documents after the match stage (about 1000 documents), and I need to perform 4 such queries. The statistics collection contains 800k documents and the number is growing every day.
I was wondering, if I can do anything to make the query execution faster, I considered:
time series collection
reorganizing collections structure (don't know how)
merging those 4 separate queries into 1, using facet stage
But I'm not sure if MongoDB is a suitable data source for such operations, and maybe I should consider another data source if I want to perform such queries.
It's hard to guess exactly what you would like to get. One approach could be this:
const entities = db.getCollection('entities').aggregate([
{ $match: { filterField1: "a" } }
]).toArray().map(x => x._id)
db.getCollection('statistics').aggregate([
{
$match: {
entityId: { $in: entities },
date: {
$gte: ISODate("2022-06-01T00:00:00Z"),
$lte: ISODate("2022-06-30T23:59:59Z")
}
}
},
{
$facet: {
stats_start_date_range: [
{
$match: {
date: {
$gte: ISODate("2022-06-01T00:00:00Z"),
$lte: ISODate("2022-06-01T23:59:59Z")
}
}
}
],
stats_end_date_range: [
{
$match: {
date: {
$gte: ISODate("2022-06-30T00:00:00Z"),
$lte: ISODate("2022-06-30T23:59:59Z")
}
}
}
]
}
},
{
$addFields: {
start_stats: { $first: "$stats_start_date_range" },
end_stats: { $first: "$stats_end_date_range" }
}
},
{
$group: {
_id: null,
avg_start: { $avg: "$start_stats.statisticsValue" },
avg_end: { $avg: "$end_stats.statisticsValue" }
}
}
]);
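The shape of this two-step approach can be sketched in plain JavaScript against in-memory arrays (the sample entities, statistics values, and dates below are invented for illustration):

```javascript
// Step 1 resolves the entity filter to a list of ids; step 2 makes one
// pass over the statistics and splits it by day, mimicking $facet.
const entities = [
  { _id: 1, filterField1: "a" },
  { _id: 2, filterField1: "b" },
];
const statistics = [
  { entityId: 1, statisticsValue: 10, date: new Date("2022-06-01T12:00:00Z") },
  { entityId: 1, statisticsValue: 30, date: new Date("2022-06-30T12:00:00Z") },
  { entityId: 2, statisticsValue: 99, date: new Date("2022-06-01T12:00:00Z") },
];

// Step 1: entity filter -> ids (the first aggregate + toArray().map).
const ids = entities.filter((e) => e.filterField1 === "a").map((e) => e._id);

// Step 2: one range scan, then per-day facets.
const inRange = statistics.filter((s) => ids.includes(s.entityId));
const sameDay = (s, day) => s.date.toISOString().startsWith(day);

const avg = (docs) =>
  docs.reduce((sum, d) => sum + d.statisticsValue, 0) / docs.length;

const avgStart = avg(inRange.filter((s) => sameDay(s, "2022-06-01")));
const avgEnd = avg(inRange.filter((s) => sameDay(s, "2022-06-30")));

console.log(avgStart, avgEnd); // 10 30
```

The point of the design is that the expensive per-entity $lookup disappears: the statistics collection is scanned once with an indexable `entityId` + `date` predicate.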

Mongo Compass filter within Projection

Recently I have been facing a challenge while creating a query in Mongo Compass. Below is the scenario.
I have a set of documents in MongoDB like below:
{
  _id: 1,
  'people': [
    {
      'grade': ['A','B'],
      'stream': [
        {
          'stream_id': 'CSE',
          'stream_name': 'COMPUTER'
        },
        {
          'stream_id': 'ECE',
          'stream_name': 'ELECTRONICS'
        }
      ]
    },
    {
      'grade': ['B'],
      'stream': [
        {
          'stream_id': 'IT',
          'stream_name': 'INFORMATION_TECH'
        }
      ]
    }
  ]
}
I need to find the 'people' element which has grade 'A' and stream_id 'CSE'. So basically I want this output:
{
  _id: 1,
  'people': [
    {
      'grade': ['A','B'],
      'stream': [
        {
          'stream_id': 'CSE',
          'stream_name': 'COMPUTER'
        }
      ]
    }
  ]
}
I have tried all the $elemMatch features, but it returns the whole document, not only that particular element of the array. If anyone is familiar with Mongo Compass, please let me know.
Mongo is fun it seems :)
You can use aggregations
$unwind to deconstruct the array
$match to get necessary documents, others will be eliminated
$filter to filter the stream: since $match returns every document that passes the condition, we still need to narrow the stream array to the objects that satisfy the condition
$group to reconstruct the array that we already deconstructed in first step
here is the code
db.collection.aggregate([
{ $unwind: "$people" },
{
$match: {
$expr: {
$and: [
{ $in: [ "A", "$people.grade" ] },
{ $in: [ "CSE", "$people.stream.stream_id" ] }
]
}
}
},
{
$addFields: {
"people.stream": {
$filter: {
input: "$people.stream",
cond: { $eq: [ "$$this.stream_id", "CSE" ] }
}
}
}
},
{
$group: {
_id: "$_id",
people: { $push: "$people" }
}
}
])
Working Mongo playground
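The four stages can be traced in plain JavaScript on the question's sample document, which makes it easier to see what each one contributes:

```javascript
// $unwind -> $match -> $filter -> $group, step by step.
const docs = [
  {
    _id: 1,
    people: [
      {
        grade: ["A", "B"],
        stream: [
          { stream_id: "CSE", stream_name: "COMPUTER" },
          { stream_id: "ECE", stream_name: "ELECTRONICS" },
        ],
      },
      {
        grade: ["B"],
        stream: [{ stream_id: "IT", stream_name: "INFORMATION_TECH" }],
      },
    ],
  },
];

// $unwind: one row per people element.
const unwound = docs.flatMap((d) =>
  d.people.map((p) => ({ _id: d._id, people: p }))
);

// $match: keep rows with grade "A" and a stream_id of "CSE".
const matched = unwound.filter(
  (r) =>
    r.people.grade.includes("A") &&
    r.people.stream.some((s) => s.stream_id === "CSE")
);

// $filter: keep only the matching stream entries.
matched.forEach((r) => {
  r.people.stream = r.people.stream.filter((s) => s.stream_id === "CSE");
});

// $group: push people back into one array per _id.
const grouped = {};
for (const r of matched) {
  if (!grouped[r._id]) grouped[r._id] = { _id: r._id, people: [] };
  grouped[r._id].people.push(r.people);
}

console.log(Object.values(grouped));
```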

Get documents given a range (up and down, not between)

Given a date, is it possible to get the documents closest to that date on both sides, i.e. both greater and lower than that date?
Current code:
db.collection.aggregate([
  {
    $match: {
      $or: [
        { "timestamp": { $gte: new Date("2021-05-27T14:40:46Z") } },
        { "timestamp": { $lt: new Date("2021-05-27T14:40:46Z") } }
      ]
    }
  },
  {
    $limit: 5
  }
])
Playground
I don't think there is any way to do this straight away; as a workaround, you can try the query below if it's really required:
$facet to separate the result for old date and new date as per conditions
old,
$match to check $lt condition
$sort by timestamp in descending order
$limit 2 documents
$sort by timestamp in ascending order
new,
$match to check $gte condition
$sort by timestamp in ascending order
$limit 3 documents
$project, $concatArrays to concat both arrays in single
The below process is optional if you want to format exactly what you need then use,
$unwind to deconstruct the above array
$replaceRoot to replace docs object to root
db.collection.aggregate([
{
$facet: {
old: [
{
$match: {
timestamp: {
$lt: new Date("2021-05-27T14:40:46Z")
}
}
},
{ $sort: { timestamp: -1 } },
{ $limit: 2 },
{ $sort: { timestamp: 1 } }
],
new: [
{
$match: {
timestamp: {
$gte: new Date("2021-05-27T14:40:46Z")
}
}
},
{ $sort: { timestamp: 1 } },
{ $limit: 3 }
]
}
},
{ $project: { docs: { $concatArrays: ["$old", "$new"] } } },
{ $unwind: "$docs" },
{ $replaceRoot: { newRoot: "$docs" } }
])
Playground
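The facet logic above can be sketched in plain JavaScript (the sample timestamps are invented; the pivot date is the one from the question):

```javascript
// Sort each side of the pivot date separately, take the nearest
// documents, then concatenate, mirroring the old/new facets.
const pivot = new Date("2021-05-27T14:40:46Z");
const docs = [
  { timestamp: new Date("2021-05-25T00:00:00Z") },
  { timestamp: new Date("2021-05-26T00:00:00Z") },
  { timestamp: new Date("2021-05-27T00:00:00Z") },
  { timestamp: new Date("2021-05-28T00:00:00Z") },
  { timestamp: new Date("2021-05-29T00:00:00Z") },
];

// "old" facet: the 2 closest documents below the pivot,
// re-sorted ascending so the final output stays in time order.
const older = docs
  .filter((d) => d.timestamp < pivot)
  .sort((a, b) => b.timestamp - a.timestamp)
  .slice(0, 2)
  .sort((a, b) => a.timestamp - b.timestamp);

// "new" facet: up to 3 closest documents at or above the pivot.
const newer = docs
  .filter((d) => d.timestamp >= pivot)
  .sort((a, b) => a.timestamp - b.timestamp)
  .slice(0, 3);

// $concatArrays equivalent.
const result = [...older, ...newer];
console.log(result.length); // 4
```

Note the descending-then-ascending double sort on the "old" side: the descending sort is what makes $limit pick the documents *nearest* the pivot rather than the oldest ones.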

Find({ example: { $elemMatch: { $eq: userId } } }).. in Aggregate - is it possible?

database:
[{to_match: [ userID_1, userId_2 ], data: [{...}] },
{to_match: [ userID_1, userId_2, userId_3 ], data: [{...}] },
{to_match: [ ], data: [{...}] }]
Find by an element in the array 'to_match'.
Current solution:
Replacement.find(
  { applicants: { $elemMatch: { $eq: userId_1 } } }
)
Aggregate $lookup on the result of 1.
a. Can I combine find and aggregate?
b. Should I first aggregate and then match? If so, how do I match on an element in the array?
I tried Aggregate:
$lookup // OK
{ $match: { applicants: { $in: { userId } } } } // issues
Thank you
Use $lookup and $match in aggregate
Instead of $in use $elemMatch like below:
{ $match: { applicants: { $elemMatch: { $eq: userId_1 } } } }
Doing an $elemMatch on just one field is equivalent to using find.
Generally, it is efficient to limit the data that the lookup stage will be working on.
So if I understand correctly, you want to filter the "to_match" array elements and then do a lookup on that result.
Here is what I would suggest:
aggregate([
{
$project : {
to_match: {
$filter: {
input: "$to_match",
as: "item",
cond: { $eq: [ "$$item", "userId_3" ] }
}
},
data : 1
}
},
{
$match : { "to_match" : {$ne : []}}
},
//// Lookup stage here
])
Based on the field that you want to do a lookup on, you may want to unwind this result.
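The $project/$filter/$match combination above can be sketched in plain JavaScript (the documents and the `wanted` user id are placeholders from the question):

```javascript
// Narrow each to_match array to the wanted user, then drop documents
// whose narrowed array is empty.
const docs = [
  { to_match: ["userID_1", "userId_2"], data: [{}] },
  { to_match: ["userID_1", "userId_2", "userId_3"], data: [{}] },
  { to_match: [], data: [{}] },
];

const wanted = "userId_3";

const result = docs
  // $project with $filter: keep only matching array elements.
  .map((d) => ({
    to_match: d.to_match.filter((id) => id === wanted),
    data: d.data,
  }))
  // $match with $ne []: discard documents left with no match.
  .filter((d) => d.to_match.length > 0);

console.log(result.length); // 1
```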

MongoDB - match multiple fields same value

So I am trying to do something where I can group MongoDB fields for a check.
Given I have the following data structure:
{
  //Some other data fields
  created: date,
  lastLogin: date,
  someSubObject: {
    anotherDate: date,
    evenAnotherDate: date
  }
}
On these I want to do a check like this:
collection.aggregate([
{
$match: {
"created": {
$lt: lastWeekDate
},
"someSubObject.anotherDate": {
$lt: lastWeekDate
},
"lastLogin": {
$lt ...
Is there a possibility to group the fields and do something like:
$match: {
[field1, field2, field3]: {
$lt: lastWeekDate
}
}
You need $expr to use $map to generate an array of boolean values, and then $allElementsTrue to apply an AND condition:
db.collection.find({
$expr: {
$allElementsTrue: {
$map: {
input: [ "$field1", "$field2", "$field3" ],
in: { $lt: [ "$$this", lastWeekDate ] }
}
}
}
})
EDIT: if you need that logic as part of an aggregation, you can use $match, which accepts the same query syntax as find:
db.collection.aggregate([
{
$match: {
$expr: {
$allElementsTrue: {
$map: {
input: [ "$field1", "$field2", "$field3" ],
in: { $lt: [ "$$this", lastWeekDate ] }
}
}
}
}
}
])
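The $map + $allElementsTrue idea can be traced in plain JavaScript (field names and dates below are placeholders standing in for `created`, `lastLogin`, etc.):

```javascript
// Map each field to a boolean, then require every element to be true,
// mirroring $map + $allElementsTrue inside $expr.
const lastWeekDate = new Date("2023-01-08T00:00:00Z");
const docs = [
  {
    created: new Date("2023-01-01T00:00:00Z"),
    lastLogin: new Date("2023-01-02T00:00:00Z"),
    anotherDate: new Date("2023-01-03T00:00:00Z"),
  },
  {
    created: new Date("2023-01-01T00:00:00Z"),
    lastLogin: new Date("2023-01-09T00:00:00Z"), // fails the $lt check
    anotherDate: new Date("2023-01-03T00:00:00Z"),
  },
];

const matched = docs.filter((d) =>
  // $map: one boolean per field; $allElementsTrue: AND them together.
  [d.created, d.lastLogin, d.anotherDate]
    .map((v) => v < lastWeekDate)
    .every(Boolean)
);

console.log(matched.length); // 1
```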