How to know that aggregated group has previous/next values? - mongodb

Suppose I have the following aggregation pipeline:
db.getCollection('posts').aggregate([
{ $match: { _id: { $gt: "some id" }, tag: 'some tag' } },
{ $limit: 5 },
{ $group: { _id: null, hasNextPage: {??}, hasPreviousPage: {??} } }
])
The $match and $limit stages return a subset of all the posts tagged 'some tag'. How can I tell whether there are posts before and after that subset?
One possible way, I guess, is to put an expression (with $let) inside hasPreviousPage and hasNextPage that searches for one post with _id less than "some id" and one with _id greater than $last: "$_id", respectively. But I'm not sure how to reference my group as a variable in $let. Also, maybe there are other, more efficient ways.

You can use the aggregation below:
db.posts.aggregate([
{ $match: { tag: 'some tag' } },
{ $sort: { _id: 1 } },
{
$facet: {
data: [
{ $match: { _id: { $gt: 'some id' } } },
{ $limit: 5 }
],
hasPreviousPage: [
{ $match: { _id: { $lte: 'some id' } } },
{ $count: "totalPrev" }
],
hasNextPage: [
{ $match: { _id: { $gt: 'some id' } } },
{ $skip: 5 },
{ $limit: 1 }, // just to check if there's any element
{ $count: "totalNext" }
]
}
},
{
$unwind: { path: "$hasPreviousPage", preserveNullAndEmptyArrays: true }
},
{
$unwind: { path: "$hasNextPage", preserveNullAndEmptyArrays: true }
},
{
$project: {
data: 1,
hasPreviousPage: { $gt: [ "$hasPreviousPage.totalPrev", 0 ] },
hasNextPage: { $gt: [ "$hasNextPage.totalNext", 0 ] }
}
}
])
To apply any paging you have to $sort the collection so that results come back in a deterministic order. On the set that is sorted and filtered by tag you can run $facet, which lets you apply multiple sub-aggregations to the same input. The pipelines that represent the previous and next pages can end with $count. Every sub-aggregation in $facet returns an array, so we run $unwind to get a nested document instead of an array for hasPreviousPage and hasNextPage. The preserveNullAndEmptyArrays option is required here because $count emits nothing for empty input, and otherwise MongoDB would remove the whole document from the pipeline when there are no previous or next documents. In the last step we just convert the sub-aggregation results to boolean values.
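To see the logic of the three $facet branches without a database, the same checks can be modelled on a plain array. This is only a sketch: paginate is a hypothetical helper, the sample posts are made up, and _id values are assumed to be comparable strings.

```javascript
// In-memory model of the three $facet branches; `paginate` is a hypothetical helper.
function paginate(posts, tag, afterId, pageSize) {
  // Equivalent of { $match: { tag } } followed by { $sort: { _id: 1 } }.
  const sorted = posts
    .filter((p) => p.tag === tag)
    .sort((a, b) => (a._id < b._id ? -1 : a._id > b._id ? 1 : 0));

  // Documents after the cursor feed both `data` and `hasNextPage`.
  const after = sorted.filter((p) => p._id > afterId);

  return {
    data: after.slice(0, pageSize),
    // Anything at or before the cursor means a previous page exists.
    hasPreviousPage: sorted.some((p) => p._id <= afterId),
    // Still finding a document after skipping one full page means a next page exists.
    hasNextPage: after.length > pageSize,
  };
}

// Made-up sample data: eight posts with ids "a" to "h".
const posts = ["a", "b", "c", "d", "e", "f", "g", "h"].map((id) => ({
  _id: id,
  tag: "some tag",
}));

const page = paginate(posts, "some tag", "b", 5);
console.log(page.data.map((p) => p._id), page.hasPreviousPage, page.hasNextPage);
```

The hasNextPage branch never counts the whole remainder; it only asks whether anything exists past the current page, which is what { $skip: 5 }, { $limit: 1 } does in the pipeline.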

Related

MongoDB: Combining query results of two collections as one

There are two collections (view and click) like the following:
# View collection
_id publisher_id created_at
617f8ea98e0f54f05e10e796 1 2021-11-01T00:00:00.000Z
617f8eab8e0f54f05e10e798 1 2021-11-01T00:00:00.000Z
617f8eac8e0f54f05e10e79a 1 2021-11-01T00:00:00.000Z
617f90cea187d30ebbecdee9 2 2021-11-01T00:00:00.000Z
# Click collection
_id publisher_id created_at
617f8ea98e0f54f05e10e796 1 2021-11-01T00:00:00.000Z
617f8eab8e0f54f05e10e798 2 2021-11-01T00:00:00.000Z
How can I get the following expected results with one query?
(or)
What is the best way for the following expected results?
# Expected For Publisher ID(1)
_id view_count click_count
2021/11/1 3 1
# Expected For Publisher ID(2)
_id view_count click_count
2021/11/1 1 1
Currently, I am using 2 queries for both collections and combining results as one in code.
For View
db.view.aggregate([
/*FirstStage*/
{
$match:
{
"$and":
[
{
"publisher_id": 1
},
{
"created_at": {$gte: new ISODate("2021-11-01"), $lt: new ISODate("2021-11-28")}
}
]
}
},
/*SecondStage*/
{
$group:
{
_id: {$dateToString: {format: '%Y/%m/%d', date: "$created_at"}},
count: {
$sum: 1
}
}
}
])
For Click
db.click.aggregate([
/*FirstStage*/
{
$match:
{
"$and":
[
{
"publisher_id": 1
},
{
"created_at": {$gte: new ISODate("2021-11-01"), $lt: new ISODate("2021-11-28")}
}
]
}
},
/*SecondStage*/
{
$group:
{
_id: {$dateToString: {format: '%Y/%m/%d', date: "$created_at"}},
count: {
$sum: 1
}
}
}
])
Because you are querying two different collections, there is no "good" way to merge this into one query. The only way I can think of is $facet, where the first sub-pipeline is the "normal" one and the other starts with a $lookup from the other collection.
This approach does add overhead, which is why I recommend that you keep doing the merge in code. For the sake of answering, though, here is a sample:
db.view.aggregate([
{
$facet: {
views: [
{
$match: {
"$and": [
{
"publisher_id": 1
},
{
"created_at": {
$gte: ISODate("2021-11-01"),
$lt: ISODate("2021-11-28")
}
}
]
}
},
],
clicks: [
{
$limit: 1
},
{
$lookup: {
from: "click",
let: {},
pipeline: [
{
$match: {
"$and": [
{
"publisher_id": 1
},
{
"created_at": {
$gte: ISODate("2021-11-01"),
$lt: ISODate("2021-11-28")
}
}
]
}
},
],
as: "clicks"
}
},
{
$unwind: "$clicks"
},
{
$replaceRoot: {
newRoot: "$clicks"
}
}
]
}
},
{
$project: {
merged: {
"$concatArrays": [
"$views",
"$clicks"
]
}
}
},
{
$unwind: "$merged"
},
{
$group: {
_id: {
$dateToString: {
format: "%Y/%m/%d",
date: "$merged.created_at"
}
},
count: {
$sum: 1
}
}
}
])
Mongo Playground
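Since the recommendation is to keep merging in code, here is a minimal sketch of that merge in plain JavaScript. It assumes each of the two queries returns rows shaped like { _id: "2021/11/01", count: 3 }, as the $group stages in the question do; mergeDailyCounts is a hypothetical name. Unlike the $facet sample, which yields one combined count per day, it keeps view_count and click_count separate.

```javascript
// Hypothetical helper: combines the outputs of the two per-collection $group queries.
function mergeDailyCounts(viewRows, clickRows) {
  const byDay = new Map();

  // Returns the accumulator for a day, creating it with zero counts if needed.
  const entry = (day) => {
    if (!byDay.has(day)) {
      byDay.set(day, { _id: day, view_count: 0, click_count: 0 });
    }
    return byDay.get(day);
  };

  for (const row of viewRows) entry(row._id).view_count += row.count;
  for (const row of clickRows) entry(row._id).click_count += row.count;

  // '%Y/%m/%d' is zero-padded, so plain string order is also date order.
  return [...byDay.values()].sort((a, b) =>
    a._id < b._id ? -1 : a._id > b._id ? 1 : 0
  );
}

// Shapes mirror what the two aggregations return for publisher 1.
const viewRows = [{ _id: "2021/11/01", count: 3 }];
const clickRows = [{ _id: "2021/11/01", count: 1 }];
console.log(mergeDailyCounts(viewRows, clickRows));
```

A day that appears in only one of the two inputs still gets a row, with the missing side reported as 0.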

aggregate with unwind, how to limit per document and not globally? (mongodb)

I have a collection with 300 documents. Each document has an array field called items (each item of the array is an object), something like this:
*DOCUMENT 1:*
_id: **********,
title: "test",
desc: "test desc",
items (array)
0: (object)
title: (string)
tags: (array of strings)
1: (object)
etc.
I need to retrieve items by tags, and this is the query I'm using. I have to $limit the results to something like 200 or the result set is too big. The problem is that if the first document has more than 200 items, only items from that document are returned. What I need is a limit PER document: for instance, 5 items from each document whose items match ($all) the provided tags.
const foundItems = await db.collection('store').aggregate([
{
$unwind: '$items'
},
{
$match: {
'items.tags': { $all : tagsArray }
}
},
{
$project: {
myitem: '$items',
desc: 1,
title: 1
}
},
{
$limit: 200
}
]).toArray()
To make it clearer and simpler, what I'd need in an ideal world would be something like:
{
$limit: 5,
$per: _id,
$totalLimit: 200
}
instead of $limit: 200. Is this achievable somehow? I didn't find anything about it in the official documentation.
I tried adding $sort right before $limit, which would make sense if it had the behaviour I'm looking for (and maybe not if placed AFTER the limit), but unfortunately it doesn't work that way: placing it before or after the limit makes no difference.
And I can't really use $sample, since the results are more than 5%.
Updated demo - https://mongoplayground.net/p/nM6T9XVa-XK
db.collection.aggregate([
{ $unwind: "$items" },
{
$match: {
"items.tags": {
$all: [ "a","b" ]
}
}
},
{
"$group": {
"_id": "$_id",
"myitem": { "$push": "$items" },
desc: { "$first": "$desc" },
title: { "$first": "$title" }
}
},
{
"$project": {
"_id": 1,
desc: 1,
title: 1,
"myitem": { $slice: [ "$myitem", 2 ]
}
}
},
{
$unwind: "$myitem"
}
])
Demo - https://mongoplayground.net/p/BESptnyUfSS
After matching the records you can $group them by _id, then $project them and limit each group with $slice:
db.collection.aggregate([
{ $unwind: "$items" },
{
$match: {
"items.tags": { $all: [ "a", "b" ]
}
}
},
{
$project: {
_id: 1, myitem: "$items", desc: 1,title: 1
}
},
{
"$group": {
"_id": "$_id",
"myitem": { "$push": "$myitem" }
}
},
{
"$project": {
"_id": 1,
"myitem": {
$slice: [ "$myitem", 1 ] // limit records here per group / id
}
}
}
])
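As an alternative sketch, the per-document cap can be applied with $filter and $slice inside $project, so that only the kept items are unwound and a total $limit can still follow. The stage layout below is an assumption and has not been run against a live server; buildPerDocumentLimitPipeline and its parameters are hypothetical names, while $elemMatch, $filter, $setIsSubset, $ifNull, $literal and $slice are standard MongoDB operators.

```javascript
// Hypothetical builder; the stage layout is an assumption, not the answer's pipeline.
function buildPerDocumentLimitPipeline(tags, perDocument, totalLimit) {
  return [
    // Keep only documents where at least one item carries every requested tag.
    { $match: { items: { $elemMatch: { tags: { $all: tags } } } } },
    {
      $project: {
        title: 1,
        desc: 1,
        // Filter the array in place, then keep the first `perDocument` matches.
        myitem: {
          $slice: [
            {
              $filter: {
                input: "$items",
                as: "item",
                // $literal keeps the tag strings from being read as field paths;
                // $ifNull guards against items that have no tags array.
                cond: {
                  $setIsSubset: [
                    { $literal: tags },
                    { $ifNull: ["$$item.tags", []] },
                  ],
                },
              },
            },
            perDocument,
          ],
        },
      },
    },
    // One output row per kept item, capped overall.
    { $unwind: "$myitem" },
    { $limit: totalLimit },
  ];
}

const perDocPipeline = buildPerDocumentLimitPipeline(["a", "b"], 5, 200);
console.log(JSON.stringify(perDocPipeline, null, 2));
```

With the Node.js driver from the question, the result would be passed as db.collection('store').aggregate(perDocPipeline).toArray().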

How to slice some fields in aggregation query MongoDB

I am trying to aggregate a poems collection. Each poem has a "lines" field, which is an array of lines like
lines: [
{
id: '123',
text: 'ABC'
},
{
id: '567',
text: 'AKA'
},
{
id: '890',
text: 'ZXZ'
}
...
]
db.getCollection('poems').aggregate([
{
$match: {
"languageId": "en",
"published": { $exists: true, $ne: false }
}
},
{
$group: {
_id: {
"userId": "$userId"
},
"lastPoem": {
$last: "$$ROOT" // take just last document alternatives $first or $push (all)
},
"count": {
$sum: 1
}
}
},
{ "$sort": { 'lastPoem.publishedDate': -1 } },
{ "$skip": 0 },
{ "$limit": 10 }
])
I need to slice number of "lines" to 5 for example.
How do I use slice in this case with aggregation?
I tried to put different places, but did not get it to work.
{ "lastPoem.lines": { "$slice": [ "$lines", 10 ] } }
Thank you!
The lines field is inside lastPoem, so the path should be $lastPoem.lines; you have used just $lines in $slice.
Add an $addFields stage after the $group stage and before the $sort stage:
{
$addFields: {
"lastPoem.lines": {
$slice: ["$lastPoem.lines", 5]
}
}
}
Playground
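For placement, here is the question's pipeline with that stage slotted in, written as a plain JavaScript array. This is a sketch only: poemPipeline is a hypothetical variable name, and everything except the $addFields stage is copied from the question.

```javascript
// The question's stages with the $addFields fix placed between $group and $sort.
const poemPipeline = [
  { $match: { languageId: "en", published: { $exists: true, $ne: false } } },
  {
    $group: {
      _id: { userId: "$userId" },
      lastPoem: { $last: "$$ROOT" },
      count: { $sum: 1 },
    },
  },
  // After $group the poem lives under lastPoem, so the path is "$lastPoem.lines".
  { $addFields: { "lastPoem.lines": { $slice: ["$lastPoem.lines", 5] } } },
  { $sort: { "lastPoem.publishedDate": -1 } },
  { $skip: 0 },
  { $limit: 10 },
];

console.log(poemPipeline.map((stage) => Object.keys(stage)[0]));
```

It would be passed to db.getCollection('poems').aggregate(poemPipeline) in the shell.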

Insert documents to MongoDB when count is equal to expected value?

Is it possible to insert/upsert multiple documents in MongoDB 4.2 only if the number of documents matching a particular query is of a particular size?
Example:
Let's say I have an items collection with the following 2 documents:
{ item: "ZZZ137", type: "type1" }
{ item: "ZZZ138", type: "type1" }
Now I want to insert these two documents:
{ item: "ZZZ139", type: "type1" }
{ item: "ZZZ140", type: "type1" }
but only if there are currently 2 items of type type1 in the collection (i.e. the count of type1 is equal to 2).
Is it possible to somehow do this in MongoDB with a single command?
Update
To further illustrate my question let's imagine that insertMany had support for conditions. Then I'd like to do something like this (pseudo code that doesn't work):
db.items.insertMany({ { $count: { type: "type" } } : { $eq : 2 } } , [{ item: "ZZZ139", type="type1"}, { item: "ZZZ140", type="type1"}])
Where { { $count: { type: "type" } } : { $eq : 2 } } would be the query that must be fulfilled in order to insert item ZZZ139 and ZZZ140.
This can be achieved using $out or $merge if you insist on doing this in 1 call, however it's very inefficient due to the logic and restriction of these 2 operators. I personally recommend splitting it into 2 calls:
let typeTwoCount = await db.collection.countDocuments({type: "2"})
if (typeTwoCount === 2) {
await db.collection.insertMany(newItems)
}
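The two-call approach can be wrapped in a small helper. This is a sketch under stated assumptions: insertIfCountEquals and fakeCollection are hypothetical names, and the helper accepts any object that exposes countDocuments and insertMany the way a Node.js driver collection does. The in-memory stand-in exists only so the example runs without a server and supports equality filters only. Note that the check and the insert are not atomic.

```javascript
// Hypothetical helper; `collection` is anything exposing countDocuments and insertMany.
async function insertIfCountEquals(collection, query, expectedCount, newItems) {
  const current = await collection.countDocuments(query);
  if (current !== expectedCount) {
    return { inserted: false, current };
  }
  // Not atomic: another writer could change the count between the two calls.
  await collection.insertMany(newItems);
  return { inserted: true, current };
}

// Tiny in-memory stand-in so the sketch runs without a server (equality filters only).
function fakeCollection(docs) {
  return {
    docs,
    async countDocuments(query) {
      return docs.filter((d) =>
        Object.entries(query).every(([key, value]) => d[key] === value)
      ).length;
    },
    async insertMany(items) {
      docs.push(...items);
    },
  };
}

const itemsCollection = fakeCollection([
  { item: "ZZZ137", type: "type1" },
  { item: "ZZZ138", type: "type1" },
]);

insertIfCountEquals(itemsCollection, { type: "type1" }, 2, [
  { item: "ZZZ139", type: "type1" },
  { item: "ZZZ140", type: "type1" },
]).then((result) => console.log(result, itemsCollection.docs.length));
```

If the race matters, the two calls would need to run inside a transaction or behind some other form of locking; the sketch does not attempt that.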
Now we can use $out but due to the fact that it re-writes the collection we'll have to carry the entire collection through the pipeline and into the $out stage, which is ridiculous:
db.collection.aggregate([
{
$facet: {
typeTwo: [
{
$match: {
type: "2"
}
},
{
$count: "doc_count"
},
{
$addFields: {
newDocs: {
$cond: [
{$eq: ["$doc_count", 2]},
items,
[]
]
}
}
},
{
$unwind: "$newDocs"
},
{
$replaceRoot: {
newRoot: "$newDocs"
}
},
],
all: [
{
$match: {}
}
]
}
},
{
$addFields: {
merged: { $concatArrays: ["$all", "$typeTwo"]}
}
},
{
$unwind: "$merged"
},
{
$replaceRoot: {
newRoot: "$merged"
}
},
{
$out: "collection"
}
])
Now the issue with $merge is the following restriction (it applies to MongoDB 4.2, the version in question; MongoDB 4.4 and later allow $merge to write to the collection being aggregated):
The output collection cannot be the same collection as the collection being aggregated.
So we can employ a similar tactic to the $out pipeline (using the typeTwo pipeline for the $merge), but we'll have to start the aggregation from a different, non-empty dummy collection:
db.any_other_none_empty_collection.aggregate([
{
$limit: 1
},
{
$lookup: {
from: "collection",
let: {},
pipeline: [
{
$match: {
type: "2"
}
}
],
as: "all"
}
},
{
$addFields: {
doc_count: {$size: "$all"}
}
},
{
$addFields: {
newDocs: {
$cond: [
{$eq: ["$doc_count", 2]},
items,
[]
]
}
}
},
{
$unwind: "$newDocs"
},
{
$replaceRoot: {
newRoot: "$newDocs"
}
},
{
$merge: {
into: "collection"
}
}
])

MongoDB multiple levels embedded array query

I have a document like this:
{
_id: 1,
data: [
{
_id: 2,
rows: [
{
myFormat: [1,2,3,4]
},
{
myFormat: [1,1,1,1]
}
]
},
{
_id: 3,
rows: [
{
myFormat: [1,2,7,8]
},
{
myFormat: [1,1,1,1]
}
]
}
]
},
I want to get distinct myFormat values as a complete array.
For example: I need the result as: [1,2,3,4], [1,1,1,1], [1,2,7,8]
How can I write mongoDB query for this?
Thanks for the help.
Please try this if every object in rows has only the one field myFormat:
db.getCollection('yourCollection').distinct('data.rows')
Ref : mongoDB Distinct Values for a field
Or, if you need the result in an array and the objects in rows have other fields as well, try this:
db.yourCollection.aggregate([
{ $project: { 'data.rows.myFormat': 1 } },
{ $unwind: '$data' },
{ $unwind: '$data.rows' },
{ $group: { _id: '$data.rows.myFormat' } },
{ $group: { _id: '', distinctValues: { $push: '$_id' } } },
{ $project: { distinctValues: 1, _id: 0 } }
])
Or else:
db.yourCollection.aggregate([
{ $project: { values: '$data.rows.myFormat' } },
{ $unwind: '$values' },
{ $unwind: '$values' },
{ $group: { _id: '', distinctValues: { $addToSet: '$values' } } },
{ $project: { distinctValues: 1, _id: 0 } }
])
The aggregation queries above should get what you want, but they can be slow on large datasets, so run them and check for any slowness. If this is a one-time job you can consider {allowDiskUse: true} if needed, and whether it is one-time or not, check whether you need preserveNullAndEmptyArrays: true on the $unwind stages.
Ref : allowDiskUse , $unwind preserveNullAndEmptyArrays
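To check what the pipelines are expected to return, the same flattening and de-duplication can be modelled on the sample document in plain JavaScript. This is only an illustration: distinctMyFormats is a hypothetical helper, and it works on in-memory objects rather than on a collection.

```javascript
// Hypothetical helper mirroring the two $unwind stages plus $addToSet on plain objects.
function distinctMyFormats(documents) {
  const seen = new Map();
  for (const doc of documents) {
    for (const entry of doc.data ?? []) {
      for (const row of entry.rows ?? []) {
        if (!Array.isArray(row.myFormat)) continue;
        // Arrays compare by reference in JavaScript, so key them by their JSON form.
        seen.set(JSON.stringify(row.myFormat), row.myFormat);
      }
    }
  }
  return [...seen.values()];
}

// The sample document from the question.
const sampleDocs = [
  {
    _id: 1,
    data: [
      { _id: 2, rows: [{ myFormat: [1, 2, 3, 4] }, { myFormat: [1, 1, 1, 1] }] },
      { _id: 3, rows: [{ myFormat: [1, 2, 7, 8] }, { myFormat: [1, 1, 1, 1] }] },
    ],
  },
];

console.log(distinctMyFormats(sampleDocs));
// → [ [ 1, 2, 3, 4 ], [ 1, 1, 1, 1 ], [ 1, 2, 7, 8 ] ]
```

The helper returns values in first-seen order; MongoDB does not guarantee any particular order for $addToSet, so compare the pipeline output as a set.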