Aggregation: adding fields to mongo document from another object - mongodb

I have an array which looks like this:
const posts = [{ _id: '1', viewsCount: 52 }, ...]
Its entries correspond to MongoDB documents in the posts collection:
{
_id: '1',
title: 'some title',
body: '....',
}, ...
I want to run an aggregation so that the documents fetched from the posts collection also have a viewsCount field. I'm not sure how I should form my aggregation pipeline:
[
{ $match: {} },
{ $addFields: { viewsCount: ?? } }
]
UPDATE
So far the following code almost does the trick:
[
{ $match: {} },
{ $addFields: { viewsCount: { $arrayElemAt: [posts, { $indexOfArray: [ posts, '$_id' ] } ] } } },
]
But viewsCount in this case turns out to be an object, so I guess I need to add $project.
UPDATE
I've found one possible solution, which is to use the $addFields stage twice, overriding the first viewsCount:
[
{ $match: {} },
{ $addFields: { viewsCount: { $arrayElemAt: [posts, { $indexOfArray: [ posts, '$_id' ] } ] } } },
{ $addFields: { viewsCount: '$viewsCount.viewsCount' } }
]
But is there a better/more concise solution?
UPDATE
This pipeline actually works correctly:
[
{ $match: {} },
{ $addFields: { viewsCount: { $arrayElemAt: [posts, { $indexOfArray: [ postsIds, '$_id' ] } ] } } },
{ $addFields: { viewsCount: '$viewsCount.viewsCount' } }
]
I updated the second stage by replacing posts with postsIds (an array holding only the _id values) inside $indexOfArray.

For a more concise (one-stage) solution, you can use the $let operator, which lets you define a temporary variable that can then be used inside your expression. Try:
// postsIds is assumed to hold only the _id values, e.g. posts.map(p => p._id)
db.posts.aggregate([
{ $addFields: {
viewsCount: {
$let: {
vars: { viewsCountObj: { $arrayElemAt: [posts, { $indexOfArray: [ postsIds, '$_id' ] } ] } },
in: "$$viewsCountObj.viewsCount"
}
} }
}
])
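One detail worth checking with this approach: $indexOfArray returns -1 when the _id is not in the list, and $arrayElemAt reads -1 as "last element", so an unmatched document would quietly receive the last entry's count. The sketch below shows one way to guard against that. buildViewsCountPipeline, postsIds and viewsCounts are illustrative names, and postsIds is assumed to be posts.map(p => p._id).

```javascript
// Hypothetical helper: builds a one-stage pipeline from the in-memory `posts` array.
function buildViewsCountPipeline(posts) {
  // Two parallel arrays: ids to search in, counts to read from.
  const postsIds = posts.map((p) => p._id);
  const viewsCounts = posts.map((p) => p.viewsCount);
  return [
    {
      $addFields: {
        viewsCount: {
          $let: {
            // idx is -1 when the document's _id is not in postsIds.
            vars: { idx: { $indexOfArray: [postsIds, "$_id"] } },
            in: {
              $cond: [
                { $gte: ["$$idx", 0] },
                { $arrayElemAt: [viewsCounts, "$$idx"] },
                null, // assumption: unmatched documents get null
              ],
            },
          },
        },
      },
    },
  ];
}

const viewsPipeline = buildViewsCountPipeline([
  { _id: "1", viewsCount: 52 },
  { _id: "2", viewsCount: 7 },
]);
console.log(JSON.stringify(viewsPipeline, null, 2));
```

It would be used as db.posts.aggregate(buildViewsCountPipeline(posts)). Reading from a plain array of counts also removes the need for the second "$viewsCount.viewsCount" step.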

Related

$filter inside $reduce or inside $map from array without unwind

I need some help:
I want to make this query faster. It needs to filter by events.eventType: "log" for all docs with server: "strong", but without separate unwind and filter stages. Maybe a $filter could be added somewhere inside the $reduce stage?
Example single document:
{
server: "strong",
events: [
{
eventType: "log",
createdAt: "2022-01-23T10:26:11.214Z",
visitorInfo: {
visitorId: "JohnID"
}
}
current aggregation query:
db.collection.aggregate([
{
$match: {
server: "strong"
}
},
{
$project: {
total: {
$reduce: {
input: "$events",
initialValue: {
visitor: [],
uniquevisitor: []
},
in: {
visitor: {
$concatArrays: [
"$$value.visitor",
[
"$$this.visitorInfo.visitorId"
]
]
},
uniquevisitor: {
$cond: [
{
$in: [
"$$this.visitorInfo.visitorId",
"$$value.uniquevisitor"
]
},
"$$value.uniquevisitor",
{
$concatArrays: [
"$$value.uniquevisitor",
[
"$$this.visitorInfo.visitorId"
]
]
}
]
}
}
}
}
}
}
])
Expected output: one list with the unique visitorId values and one list with all visitorId values:
[
{
"total": {
"uniquevisitor": [
"JohnID"
],
"visitor": [
"JohnID",
"JohnID"
]
}
}
]
In the example query no filter is added for events.eventType: "log". How can this be implemented without $unwind?
I am not sure this approach is more optimized than yours, but it may help:
$filter iterates over events and keeps only the elements with the given eventType
$let declares a variable events that stores the filtered result
the visitor array is returned using dot notation: $$events.visitorInfo.visitorId
the uniquevisitor array is returned by applying the $setUnion operator to $$events.visitorInfo.visitorId
db.collection.aggregate([
{ $match: { server: "strong" } },
{
$project: {
total: {
$let: {
vars: {
events: {
$filter: {
input: "$events",
cond: { $eq: ["$$this.eventType", "log"] }
}
}
},
in: {
visitor: "$$events.visitorInfo.visitorId",
uniquevisitor: {
$setUnion: "$$events.visitorInfo.visitorId"
}
}
}
}
}
}
])
Or a similar approach without $let, using two $project stages:
db.collection.aggregate([
{ $match: { server: "strong" } },
{
$project: {
events: {
$filter: {
input: "$events",
cond: { $eq: ["$$this.eventType", "log"] }
}
}
}
},
{
$project: {
total: {
visitor: "$events.visitorInfo.visitorId",
uniquevisitor: {
$setUnion: "$events.visitorInfo.visitorId"
}
}
}
}
])
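To sanity-check what these pipelines should return, the same logic can be mirrored in plain JavaScript. This is only a sketch: summarizeVisitors is a hypothetical helper, and the second and third events in the sample are invented, because the example document in the question is cut off after its first event.

```javascript
// Plain-JavaScript mirror of the $filter + dot notation + $setUnion logic.
function summarizeVisitors(doc, eventType) {
  // $filter: keep only events of the requested type.
  const events = (doc.events || []).filter((e) => e.eventType === eventType);
  // "$$events.visitorInfo.visitorId": collect the ids, skipping events without one.
  const visitor = events
    .map((e) => (e.visitorInfo ? e.visitorInfo.visitorId : undefined))
    .filter((id) => id !== undefined);
  // $setUnion: de-duplicate (MongoDB does not guarantee the order of the result).
  const uniquevisitor = [...new Set(visitor)];
  return { visitor, uniquevisitor };
}

// Sample data: only the first event comes from the question, the rest is made up.
const sampleDoc = {
  server: "strong",
  events: [
    { eventType: "log", createdAt: "2022-01-23T10:26:11.214Z", visitorInfo: { visitorId: "JohnID" } },
    { eventType: "log", createdAt: "2022-01-23T10:27:00.000Z", visitorInfo: { visitorId: "JohnID" } },
    { eventType: "click", createdAt: "2022-01-23T10:28:00.000Z", visitorInfo: { visitorId: "JaneID" } },
  ],
};

const visitorSummary = summarizeVisitors(sampleDoc, "log");
console.log(visitorSummary);
```

With this sample, the "click" event is dropped and the two "log" events collapse to a single unique visitor, which matches the shape of the expected output in the question.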

How to reference added field in $match?

Given this aggregation pipeline:
[
{
$addFields: {
_myVar: "x"
}
},
{
$match: {
array: "x"
}
}
]
How can the value x be set only once?
For example, this does not work; it times out:
[
{
$addFields: {
_myVar: "x"
}
},
{
$match: {
$expr: {
$in: [
"$_myVar", "$array"
]
}
}
}
]
The variable needs to be available throughout the pipeline, so only using the value in the $match stage as condition is not a solution.
What is the solution?
You can do something like this. Here I added two fields and check whether _myArray contains _myVar. This is just to show how the check works; in your case, replace _myArray with the actual array you want to match against.
[{
$addFields: {
_myVar: "x",
_myArray: ['X', 'Y', 'x']
}
}, {
$addFields: {
has: {
$in: ["$_myVar", "$_myArray"]
}
}
}, {
$match: {
has: true
}
}]
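If the value is known in application code, another option is to define it once there and reuse it in both stages, so that $match stays a plain query instead of an $expr. This is a minimal sketch rather than a fix for the timeout itself: buildPipeline and myVar are hypothetical names, and putting $match first assumes that no earlier stage changes the array field.

```javascript
// Hypothetical helper: the value is defined once and reused in both stages.
function buildPipeline(myVar) {
  return [
    // Plain query-style match: true when `array` contains myVar.
    { $match: { array: myVar } },
    // $literal keeps a value such as "$x" from being read as a field path.
    { $addFields: { _myVar: { $literal: myVar } } },
  ];
}

const matchPipeline = buildPipeline("x");
console.log(JSON.stringify(matchPipeline));
```

_myVar is still attached to every document, so later stages can keep referring to "$_myVar".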

How to use $addFields in mongo to add elements to just existing documents?

I am trying to add a new field to an existing document by using a combination of $ifNull and $cond, but an empty document is always added at the end.
Configuration:
[
{
line: "car",
number: "1",
category: {
FERRARI: {
color: "blue"
},
LAMBORGHINI: {
color: "red"
}
}
},
{
line: "car",
number: "2",
category: {
FERRARI: {
color: "blue"
}
}
}
]
Query approach:
db.collection.aggregate([
{
$match: {
$and: [
{ line: "car" },
{ number: { $in: ["1", "2"] } }
]
}
},
{
"$addFields": {
"category.LAMBORGHINI.number": {
$cond: [
{ "$ifNull": ["$category.LAMBORGHINI", false] },
"$number",
"$$REMOVE"
]
}
}
},
{
$group: {
_id: null,
CATEGORIES: {
$addToSet: "$category.LAMBORGHINI"
}
}
}
])
Here is the link to the mongo play ground:
https://mongoplayground.net/p/RUnu5BNdnrR
I tried the query above, but I still get that ugly empty document added at the end.
$$REMOVE only removes the last key in the path. In your field category.LAMBORGHINI.number the last key is number, so only number is removed and category.LAMBORGHINI itself is still created as an empty object. You can try another approach:
Specify just category.LAMBORGHINI. If the condition matches, return the current category.LAMBORGHINI object merged with the number field using $mergeObjects; otherwise remove it with $$REMOVE.
{
"$addFields": {
"category.LAMBORGHINI": {
$cond: [
{ "$ifNull": ["$category.LAMBORGHINI", false] },
{
$mergeObjects: [
"$category.LAMBORGHINI",
{ number: "$number" }
]
},
"$$REMOVE"
]
}
}
}
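For context, this is roughly how the replacement stage would sit in the full pipeline from the question. It is a sketch only: buildLamborghiniPipeline is a hypothetical helper, and the $and in the original $match is written as an implicit AND, which should behave the same way.

```javascript
// Hypothetical helper: assembles the question's pipeline with the $mergeObjects stage.
function buildLamborghiniPipeline(numbers) {
  return [
    // Same filter as the question, written as an implicit AND.
    { $match: { line: "car", number: { $in: numbers } } },
    {
      $addFields: {
        // Set the whole sub-document, so $$REMOVE drops LAMBORGHINI itself
        // instead of leaving an empty object behind.
        "category.LAMBORGHINI": {
          $cond: [
            { $ifNull: ["$category.LAMBORGHINI", false] },
            { $mergeObjects: ["$category.LAMBORGHINI", { number: "$number" }] },
            "$$REMOVE",
          ],
        },
      },
    },
    { $group: { _id: null, CATEGORIES: { $addToSet: "$category.LAMBORGHINI" } } },
  ];
}

const lamborghiniPipeline = buildLamborghiniPipeline(["1", "2"]);
console.log(JSON.stringify(lamborghiniPipeline, null, 2));
```

It would be run as db.collection.aggregate(buildLamborghiniPipeline(["1", "2"])).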

How to find prev/next document after sort in MongoDB

I want to find prev/next blog documents whose publish date is closest to the input document.
Below is the document structure.
Collection Examples (blog)
{
blogCode: "B0001",
publishDate: "2020-09-21"
},
{
blogCode: "B0002",
publishDate: "2020-09-22"
},
{
blogCode: "B0003",
publishDate: "2020-09-13"
},
{
blogCode: "B0004",
publishDate: "2020-09-24"
},
{
blogCode: "B0005",
publishDate: "2020-09-05"
}
If the input is blogCode = B0003
Expected output
{
blogCode: "B0005",
publishDate: "2020-09-05"
},
{
blogCode: "B0001",
publishDate: "2020-09-21"
}
How can I get this output? In SQL, ROW_NUMBER seems to solve the problem, but I can't find a way to get the same feature in MongoDB. An alternative may be the approach in this answer, but it seems inefficient. Would mapReduce be a better solution? I'm confused at the moment, so any help is appreciated.
You can do it as follows.
We need to compare the dates of the other documents with the date of the given document, so I used $facet to split them into origin and rest.
There should be only one original document (e.g. B0003), so I take the first element of the origin[] array to compare with the rest[] array.
$unwind flattens rest[].
$subtract gets the difference between the two dates.
$facet is used again to find the previous and next dates.
Then both are combined to get your expected result.
NOTE: The final array may have 0 < elements <= 2. With the expected result you gave, if there is only one element you cannot tell whether it is the prev or the next date. So my suggestion is to add another field that says which date it is, as the Mongo playground shows.
[{
$facet: {
origin: [{
$match: { blogCode: 'B0001' }
}],
rest: [{
$match: {
$expr: {
$ne: ['$blogCode','B0001']
}
}
}]
}
}, {
$project: {
origin: {
$arrayElemAt: ['$origin',0]
},
rest: 1
}
}, {
$unwind: {path: '$rest'}
}, {
$project: {
diff: {
$subtract: [{ $toDate: '$rest.publishDate' },{ $toDate: '$origin.publishDate'}]
},
rest: 1,
origin: 1
}
}, {
$facet: {
prev: [{
$sort: {diff: -1}
},
{
$match: {
diff: {$lt: 0 }
}
},
{
$limit: 1
},
{
$addFields:{"rest.type":"PREV"}
}
],
next: [{
$sort: { diff: 1 }
},
{
$match: {
diff: { $gt: 0 }
}
},
{
$limit: 1
},
{
$addFields:{"rest.type":"NEXT"}
}
]
}
}, {
$project: {
combined: {
$concatArrays: ["$prev", "$next"]
}
}
}, {
$unwind: {
path: "$combined"
}
}, {
$replaceRoot: {
newRoot: "$combined.rest"
}
}]
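The same steps can be traced in plain JavaScript against the sample collection, which makes it easier to see what each stage contributes. This is an illustration of the logic only, not a replacement for the pipeline; findPrevNext is a hypothetical helper.

```javascript
// Plain-JavaScript mirror of the $facet / $subtract / $sort / $limit steps.
function findPrevNext(blogs, blogCode) {
  // First $facet: split into the origin document and the rest.
  const origin = blogs.find((b) => b.blogCode === blogCode);
  if (!origin) return [];
  const originTime = Date.parse(origin.publishDate);
  // $subtract: difference in milliseconds between each date and the origin's date.
  const withDiff = blogs
    .filter((b) => b.blogCode !== blogCode)
    .map((b) => ({ doc: b, diff: Date.parse(b.publishDate) - originTime }));
  // Second $facet: closest negative diff is PREV, closest positive diff is NEXT.
  const prev = withDiff.filter((r) => r.diff < 0).sort((a, b) => b.diff - a.diff)[0];
  const next = withDiff.filter((r) => r.diff > 0).sort((a, b) => a.diff - b.diff)[0];
  const result = [];
  if (prev) result.push({ ...prev.doc, type: "PREV" });
  if (next) result.push({ ...next.doc, type: "NEXT" });
  return result;
}

const blogs = [
  { blogCode: "B0001", publishDate: "2020-09-21" },
  { blogCode: "B0002", publishDate: "2020-09-22" },
  { blogCode: "B0003", publishDate: "2020-09-13" },
  { blogCode: "B0004", publishDate: "2020-09-24" },
  { blogCode: "B0005", publishDate: "2020-09-05" },
];

const neighbours = findPrevNext(blogs, "B0003");
console.log(neighbours); // B0005 as PREV, B0001 as NEXT
```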
Inspired by the solution varman proposed, I also found another way to solve my problem, using includeArrayIndex.
[
{
$sort: {
"publishDate": 1
},
},
{
$group: {
_id: 1,
root: {
$push: "$$ROOT"
}
},
},
{
$unwind: {
path: "$root",
includeArrayIndex: "rownum"
}
},
{
$replaceRoot: {
newRoot: {
$mergeObjects: [
"$root",
{
rownum: "$rownum"
}
]
}
}
},
{
$facet: {
currRow: [
{
$match: {
blogCode: "B0004"
},
},
{
$project: {
rownum: 1
}
}
],
root: [
{
$match: {
blogCode: {
$exists: true
}
}
},
]
}
},
{
$project: {
currRow: {
$arrayElemAt: [
"$currRow",
0
]
},
root: 1
}
},
{
$project: {
rownum: {
prev: {
$add: [
"$currRow.rownum",
-1
]
},
next: {
$add: [
"$currRow.rownum",
1
]
}
},
root: 1
}
},
{
$unwind: "$root"
},
{
$facet: {
prev: [
{
$match: {
$expr: {
$eq: [
"$root.rownum",
"$rownum.prev"
]
}
}
},
{
$replaceRoot: {
newRoot: "$root"
}
}
],
next: [
{
$match: {
$expr: {
$eq: [
"$root.rownum",
"$rownum.next"
]
}
}
},
{
$replaceRoot: {
newRoot: "$root"
}
}
],
}
},
{
$project: {
prev: {
$arrayElemAt: [
"$prev",
0
]
},
next: {
$arrayElemAt: [
"$next",
0
]
},
}
},
]
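On MongoDB 5.0 or later, $setWindowFields with $shift is probably the closest analogue to the SQL window-function idea mentioned in the question, and it avoids the $group/$unwind round trip. This is a sketch that does not come from either answer above: buildPrevNextPipeline is a hypothetical helper, and it assumes publishDate sorts correctly as stored (the sample's YYYY-MM-DD strings do).

```javascript
// Hypothetical helper: prev/next via window functions (assumes MongoDB 5.0+).
function buildPrevNextPipeline(blogCode) {
  // Fields copied from the neighbouring document.
  const snapshot = { blogCode: "$blogCode", publishDate: "$publishDate" };
  return [
    {
      $setWindowFields: {
        sortBy: { publishDate: 1 },
        output: {
          // by: -1 reads the previous document in sort order, by: 1 the next one.
          prev: { $shift: { output: snapshot, by: -1, default: null } },
          next: { $shift: { output: snapshot, by: 1, default: null } },
        },
      },
    },
    // Filter only after the window stage, so the neighbours are still visible to it.
    { $match: { blogCode: blogCode } },
    { $project: { _id: 0, prev: 1, next: 1 } },
  ];
}

const windowPipeline = buildPrevNextPipeline("B0003");
console.log(JSON.stringify(windowPipeline, null, 2));
```

It would be run as db.blog.aggregate(buildPrevNextPipeline("B0003")), where the collection name blog is taken from the question. The stage still has to sort the whole collection, so an index on publishDate may be worth considering.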

Returning a document with two fields from the same array in MongoDB

Given documents such as
{
_id: 'abcd',
userId: '12345',
activities: [
{ status: 'login', timestamp: '10000001' },
{ status: 'logout', timestamp: '10000002' },
{ status: 'login', timestamp: '10000003' },
{ status: 'logout', timestamp: '10000004' },
]
}
I am trying to create a pipeline that returns all users together with their latest login/logout activities recorded between two timestamps. For example, if the two timestamp values are 10000002 and 10000003, the expected document should be
{
_id: 'abcd',
userId: '12345',
login: '10000003',
logout: '10000002'
}
Or, if the two timestamp values are -1 and 10000001, the expected document should be:
{
_id: 'abcd',
userId: '12345',
login: '10000001',
logout: null
}
Etc.
I know it has to do with aggregations, and I need to $unwind, etc., but I'm not sure about the rest, namely evaluating two fields from the same document array
You can try below aggregation:
db.col.aggregate([
{
$unwind: "$activities"
},
{
$match: {
$and: [
{ "activities.timestamp": { $gte: "10000001" } },
{ "activities.timestamp": { $lte: "10000002" } }
]
}
},
{
$sort: {
"activities.timestamp": -1
}
},
{
$group: {
_id: "$_id",
userId: { $first: "$userId" },
activities: { $push: "$activities" }
}
},
{
$addFields: {
login: { $arrayElemAt: [ { $filter: { input: "$activities", as: "a", cond: { $eq: [ "$$a.status", "login" ] } } } , 0 ] },
logout: { $arrayElemAt: [ { $filter: { input: "$activities", as: "a", cond: { $eq: [ "$$a.status", "logout" ] } } } , 0 ] }
}
},
{
$project: {
_id: 1,
userId: 1,
login: { $ifNull: [ "$login.timestamp", null ] },
logout: { $ifNull: [ "$logout.timestamp", null ] }
}
}
])
We need to use $unwind + $sort + $group to make sure that the activities are sorted by timestamp. After $unwind you can use $match to apply the filtering condition. Then you can use $filter with $arrayElemAt to get the first (latest) value of each filtered array. In the last $project you can explicitly use $ifNull (otherwise the JSON key will be skipped if there's no value).
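A plain-JavaScript mirror of those stages can help check the expected results against the sample document. It is a sketch for reasoning about the logic only; latestActivities is a hypothetical helper, and timestamps are compared as strings, as they are in the pipeline.

```javascript
// Plain-JavaScript mirror of $unwind + $match + $sort + $group + $filter.
function latestActivities(doc, from, to) {
  // $unwind + $match: keep activities inside the range (string comparison).
  const inRange = doc.activities.filter((a) => a.timestamp >= from && a.timestamp <= to);
  // Without matching activities, the pipeline would return nothing for this user.
  if (inRange.length === 0) return null;
  // $sort: newest first.
  inRange.sort((a, b) => (a.timestamp < b.timestamp ? 1 : a.timestamp > b.timestamp ? -1 : 0));
  // $filter + $arrayElemAt 0: first (latest) entry per status, or null.
  const latest = (status) => {
    const hit = inRange.find((a) => a.status === status);
    return hit ? hit.timestamp : null;
  };
  return { _id: doc._id, userId: doc.userId, login: latest("login"), logout: latest("logout") };
}

const userDoc = {
  _id: "abcd",
  userId: "12345",
  activities: [
    { status: "login", timestamp: "10000001" },
    { status: "logout", timestamp: "10000002" },
    { status: "login", timestamp: "10000003" },
    { status: "logout", timestamp: "10000004" },
  ],
};

console.log(latestActivities(userDoc, "10000002", "10000003"));
console.log(latestActivities(userDoc, "10000001", "10000001"));
```

For the range 10000002 to 10000003 this gives login '10000003' and logout '10000002', as in the question's first expected document.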
You can use the aggregation below.
Instead of $unwind, use $lte and $gte with the $filter aggregation operator.
db.collection.aggregate([
{ "$project": {
"userId": 1,
"login": {
"$max": {
"$filter": {
"input": "$activities",
"cond": {
"$and": [
{ "$gte": ["$$this.timestamp", "10000001"] },
{ "$lte": ["$$this.timestamp", "10000004"] },
{ "$eq": ["$$this.status", "login"] }
]
}
}
}
},
"logout": {
"$max": {
"$filter": {
"input": "$activities",
"cond": {
"$and": [
{ "$gte": ["$$this.timestamp", "10000001"] },
{ "$lte": ["$$this.timestamp", "10000004"] },
{ "$eq": ["$$this.status", "logout"] }
]
}
}
}
}
}}
])
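One thing to watch with this shape: $max applied directly to the filtered array should return the whole matching sub-document ({ status, timestamp }) rather than the timestamp string the question asks for. A possible variation, sketched below, maps the filtered entries to their timestamps before taking $max and uses $eq for the status check. buildLatestActivityPipeline is a hypothetical helper, and from and to are assumed to be strings of equal length so that string comparison orders them correctly.

```javascript
// Hypothetical helper: single $project stage, no $unwind.
function buildLatestActivityPipeline(from, to) {
  // Expression for the greatest timestamp with the given status inside [from, to].
  const latest = (status) => ({
    $max: {
      $map: {
        input: {
          $filter: {
            input: "$activities",
            cond: {
              $and: [
                { $gte: ["$$this.timestamp", from] },
                { $lte: ["$$this.timestamp", to] },
                { $eq: ["$$this.status", status] },
              ],
            },
          },
        },
        // Keep only the timestamp, so $max compares strings, not sub-documents.
        in: "$$this.timestamp",
      },
    },
  });
  return [
    { $project: { userId: 1, login: latest("login"), logout: latest("logout") } },
  ];
}

const activityPipeline = buildLatestActivityPipeline("10000002", "10000003");
console.log(JSON.stringify(activityPipeline, null, 2));
```

It would be run as db.collection.aggregate(buildLatestActivityPipeline("10000002", "10000003")). Unlike the $unwind version, this keeps users with no activity in the range, so an extra $match may be needed if those should be excluded.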