MongoDB: merging multiple rows based on a computed condition on row values

I have a sample data like this:
[
{ objectId: 1, user: 1, phones: [1, 2], emails: ['a'] },
{ objectId: 2, user: 1, phones: [1, 5], emails: ['a', 'f'] },
{ objectId: 3, user: 1, phones: [8, 9], emails: ['f', 'g'] },
{ objectId: 4, user: 1, phones: [10], emails: ['h'] },
{ objectId: 5, user: 2, phones: [1, 2, 3], emails: ['aa', 'bb', 'cc'] },
]
Now I need to merge all related rows into one when both of these conditions hold:
They have the same user
They share at least one phone or email
The output should look something like this:
[
{ objectId: 1, user: 1, phones: [1, 2, 5, 8, 9], emails: ['a', 'f', 'g'] },
{ objectId: 4, user: 1, phones: [10], emails: ['h'] },
{ objectId: 5, user: 2, phones: [1, 2, 3], emails: ['aa', 'bb', 'cc'] },
]
This is what I have come up with so far:
[
{
$unwind: {
path: "$phones",
preserveNullAndEmptyArrays: true
}
},
{
$group: {
_id: {
user: "$user",
phone: "$phones"
},
objectIds: {
$addToSet: "$_id"
},
emailsList: {
$push: "$emails"
},
user: { $first: "$user" },
phones: {
$first: "$phones"
}
}
},
{
"$addFields": {
"emails": {
"$reduce": {
"input": "$emailsList",
"initialValue": [],
"in": { "$setUnion": ["$$value", "$$this"] }
}
}
}
},
{
"$project": {
"emailsList": 0
}
},
{
$unwind: {
path: "$emails",
preserveNullAndEmptyArrays: true
}
},
{
$group: {
_id: {
user: "$user",
phone: "$emails"
},
objectIdsList: {
$push: "$objectIds"
}
}
},
{
"$project": {
"mergedObjectIds": {
"$reduce": {
"input": "$objectIdsList",
"initialValue": [],
"in": { "$setUnion": ["$$value", "$$this"] }
}
}
}
}
]
That leaves a list of objectIds that need to be merged, which I would then merge in application code. Is there any way to do this within the aggregation framework alone, or to pipe the result of this aggregation into the next one?

Unless I'm missing something, these are just the "sets" for each user. So simply unwind both arrays and accumulate via $addToSet for each of "phones" and "emails":
db.collection.aggregate([
{ "$unwind": "$phones" },
{ "$unwind": "$emails" },
{ "$group": {
"_id": "$user",
"phones": { "$addToSet": "$phones" },
"emails": { "$addToSet": "$emails" }
}}
])
Which returns:
{ "_id" : 2, "phones" : [ 3, 2, 1 ], "emails" : [ "cc", "bb", "aa" ] }
{ "_id" : 1, "phones" : [ 9, 1, 2, 5, 8 ], "emails" : [ "g", "f", "a" ] }
A "set" is not really considered to be "ordered", so if you expect a certain order then you need to sort elsewhere, and probably best in the client.
Any "unique" ids don't really apply here. If anything, you would use a different accumulator such as $min or $max, or maybe $first, depending on what you want. However, the only relevant details I see here are the "user" for grouping and the accumulated "set" values.
Even though unwinding multiple arrays produces a "cartesian product" of the other values, it really does not matter when everything being pulled out is a "distinct" value anyway. It typically only matters where you need to "count" elements, and that is not something the output in your question asks for.
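Grouping by user alone puts every row of a user into one set. If rows should only be merged when they also share at least one phone or email, as the expected output in the question suggests, a connected-components pass is needed. The following is a minimal sketch of that step in plain JavaScript (application code, not an aggregation stage). The helper name mergeRelated and the choice to keep the smallest objectId of each group are assumptions made for illustration.

```javascript
// Hypothetical helper: merges rows that share a user and at least one phone or email.
function mergeRelated(rows) {
  // Union-find over row indexes.
  const parent = rows.map((_, i) => i);
  const find = (i) => {
    while (parent[i] !== i) {
      parent[i] = parent[parent[i]]; // path compression
      i = parent[i];
    }
    return i;
  };
  const union = (a, b) => { parent[find(a)] = find(b); };

  // Remember the first row that carried each (user, kind, value) key
  // and link any later row with the same key to it.
  const seen = new Map();
  rows.forEach((row, i) => {
    const keys = [
      ...row.phones.map((p) => `${row.user}|phone|${p}`),
      ...row.emails.map((e) => `${row.user}|email|${e}`),
    ];
    for (const key of keys) {
      if (seen.has(key)) union(i, seen.get(key));
      else seen.set(key, i);
    }
  });

  // Accumulate every connected component into one merged row.
  const merged = new Map();
  rows.forEach((row, i) => {
    const root = find(i);
    if (!merged.has(root)) {
      merged.set(root, { objectId: row.objectId, user: row.user, phones: new Set(), emails: new Set() });
    }
    const m = merged.get(root);
    m.objectId = Math.min(m.objectId, row.objectId); // assumption: keep the smallest id
    row.phones.forEach((p) => m.phones.add(p));
    row.emails.forEach((e) => m.emails.add(e));
  });

  return [...merged.values()]
    .map((m) => ({ ...m, phones: [...m.phones].sort((a, b) => a - b), emails: [...m.emails].sort() }))
    .sort((a, b) => a.objectId - b.objectId);
}

const sample = [
  { objectId: 1, user: 1, phones: [1, 2], emails: ['a'] },
  { objectId: 2, user: 1, phones: [1, 5], emails: ['a', 'f'] },
  { objectId: 3, user: 1, phones: [8, 9], emails: ['f', 'g'] },
  { objectId: 4, user: 1, phones: [10], emails: ['h'] },
  { objectId: 5, user: 2, phones: [1, 2, 3], emails: ['aa', 'bb', 'cc'] },
];
const result = mergeRelated(sample);
console.log(JSON.stringify(result, null, 2));
```

Rows 1, 2 and 3 end up in one group because row 2 shares a phone with row 1 and an email with row 3; row 4 stays on its own.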

Related

How to get argmax/argmin of multiple fields simultaneously in mongodb?

Here's the data example I'm working with.
[
{
"uid": "111",
"a": 1,
"b": 3,
"c": 1,
},
{
"uid": "222",
"a": 2,
"b": 2,
"c": 2
},
{
"uid": "333",
"a": 3,
"b": 1,
"c": 3
}
]
Then I want to perform argmax on fields "a" and "b", and argmin on field "c" and return the "uid" as the result.
For example:
For "a", its maximum value is 3 and the corresponding "uid" is "333", so the argmax of "a" should be "uid" : "333".
The question is what query should be executed so that I can get the result as below?
[
{
"argmax_of_a": "333",
"argmax_of_b": "111",
"argmin_of_c": "111",
}
]
Here's the code snippet I'm playing with https://mongoplayground.net/p/gEDuHd-aCiZ
I can find some way to get the argmax/argmin of one specific field, but I have no idea how to work on multiple fields simultaneously.
Thanks in advance!
Give this aggregation pipeline a try:
db.collection.aggregate(
[
{
$group: {
_id: null,
a: { $push: { uid: '$uid', val: '$a' } },
b: { $push: { uid: '$uid', val: '$b' } },
c: { $push: { uid: '$uid', val: '$c' } }
}
},
{
$project: {
_id: 0,
max_of_a: { $arrayElemAt: ["$a", { $indexOfArray: ["$a.val", { $max: '$a.val' }] }] },
max_of_b: { $arrayElemAt: ["$b", { $indexOfArray: ["$b.val", { $max: '$b.val' }] }] },
// the question asks for the argmin of "c", so use $min here
min_of_c: { $arrayElemAt: ["$c", { $indexOfArray: ["$c.val", { $min: '$c.val' }] }] }
}
},
{
$project: {
argmax_of_a: '$max_of_a.uid',
argmax_of_b: '$max_of_b.uid',
argmin_of_c: '$min_of_c.uid'
}
}
])
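The idea behind the pipeline (find the extreme value, then take the uid of the first element that holds it) can be checked outside the database. Here is a minimal sketch in plain JavaScript using the three sample documents from the question; the helper name argBy is hypothetical.

```javascript
// Hypothetical helper: returns the uid of the document that wins under `better`.
// Strict comparison keeps the first document on ties.
function argBy(docs, field, better) {
  return docs.reduce((best, doc) => (better(doc[field], best[field]) ? doc : best)).uid;
}

const docs = [
  { uid: "111", a: 1, b: 3, c: 1 },
  { uid: "222", a: 2, b: 2, c: 2 },
  { uid: "333", a: 3, b: 1, c: 3 },
];

const summary = {
  argmax_of_a: argBy(docs, "a", (x, y) => x > y),
  argmax_of_b: argBy(docs, "b", (x, y) => x > y),
  argmin_of_c: argBy(docs, "c", (x, y) => x < y),
};
console.log(summary);
```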

Add field to $map entry in MongoDB

I have MongoDB collection items with following document:
{
"values": [
{ "number1": 5, "number2": 6, "anotherProp": "...", "anotherProp2": "..." },
{ "number1": 8, "number2": 1, "anotherProp": "...", "anotherProp2": "..." }
]
}
Is there any way to add a sum property to each item of values (sum = number1 + number2)? I would like to avoid naming all the other properties (number1, number2, anotherProp, anotherProp2, ...) and only add the new one (sum). My current solution is:
db.items.aggregate([{
$project: {
values: {
$map: {
input: "$values",
as: "v",
in: {
sum: {$add: ["$$v.number1", "$$v.number2"]},
number1: "$$v.number1", // This and next 3 lines I would like to omit.
number2: "$$v.number2",
anotherProp: "$$v.anotherProp",
anotherProp2: "$$v.anotherProp2"
}
}
}
}
}])
Desired result is:
{
"values": [
{ "number1": 5, "number2": 6, "anotherProp": "...", "anotherProp2": "...", "sum": 11 },
{ "number1": 8, "number2": 1, "anotherProp": "...", "anotherProp2": "...", "sum": 9 }
]
}
Is there any way to do this? I tried using $addFields instead of $project, but the result is the same.
Yes, you can use $mergeObjects
db.collection.aggregate([
{
$project: {
values: {
$map: {
input: "$values",
as: "v",
in: {
"$mergeObjects": [
{
sum: {
$add: [
"$$v.number1",
"$$v.number2"
]
}
},
"$$v"
]
}
}
}
}
}
])
MongoPlayground
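$mergeObjects behaves much like object spread in JavaScript: when the same key appears in more than one operand, the later operand wins. The sketch below shows the equivalent in-memory transformation on the sample document from the question; the variable names are illustrative only.

```javascript
const doc = {
  values: [
    { number1: 5, number2: 6, anotherProp: "...", anotherProp2: "..." },
    { number1: 8, number2: 1, anotherProp: "...", anotherProp2: "..." },
  ],
};

// Spread the original element first, then add the computed field,
// so no other property has to be named explicitly.
const withSum = {
  values: doc.values.map((v) => ({ ...v, sum: v.number1 + v.number2 })),
};
console.log(JSON.stringify(withSum, null, 2));
```

Here the original element comes first so that the computed sum would overwrite any sum already present on the element. In the pipeline that should correspond to listing "$$v" before the { sum: ... } object.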

Can we push object value into $project using mongodb

db.setting.aggregate([
{
$match: {
status: true,
deleted_at: 0,
_id: {
$in: [
ObjectId("5c4ee7eea4affa32face874b"),
ObjectId("5ebf891245aa27c290672325")
]
}
}
},
{
$lookup: {
from: "site",
localField: "_id",
foreignField: "admin_id",
as: "data"
}
},
{
$project: {
name: 1,
status: 1,
price: 1,
currency: 1,
numberOfRecord: {
$size: "$data"
}
}
},
{
$sort: {
numberOfRecord: 1
}
}
])
How can I push the currency into the price object using $project? I would also like to know the difference between $addToSet and $push, and whether it is better to do this in $project or with $addFields. Thanks a lot.
https://mongoplayground.net/p/RiWnnRtksb4
Output should be like this:
[
{
"_id": ObjectId("5ebf891245aa27c290672325"),
"currency": "USD",
"name": "Menz",
"numberOfRecord": 0,
"price": {
"numberDecimal": "20",
"currency": "USD",
},
"status": true
},
{
"_id": ObjectId("5c4ee7eea4affa32face874b"),
"currency": "USD",
"name": "Dave",
"numberOfRecord": 2,
"price": {
"numberDecimal": "10",
"currency": "USD"
},
"status": true
}
]
You can insert a field into an object directly with $project, like this (see the price field):
$project: {
name: 1,
status: 1,
price: {
numberDecimal: "$price.numberDecimal",
currency: "$currency"
},
numberOfRecord: {
$size: "$data"
}
}
By doing it with $project, there is no need to use $addFields.
For the difference between $addToSet and $push, read this great answer.
You can just set the object structure while projecting, so in this case there's no need for either $push or $addToSet.
{
$project: {
name: 1,
status: 1,
price: {
currency: "$currency",
numberDecimal: "$price.numberDecimal"
},
currency: 1,
numberOfRecord: {
$size: "$data",
}
}
}
The difference between $push and $addToSet is fairly simple and follows from the names: $push keeps every item, while $addToSet only builds a set of the distinct ones. For example:
input:
[
//doc1
{
item: 1
},
//doc2
{
item: 2
},
//doc3
{
item: 1
}
]
Now this:
{
$group: {
_id: null,
items: {$push: "$item"}
}
}
Will result in:
{_id: null, items: [1, 2, 1]}
While:
{
$group: {
_id: null,
items: {$addToSet: "$item"}
}
}
Will result in:
{_id: null, items: [1, 2]}
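The same contrast can be reproduced in plain JavaScript with an array and a Set. This is only an analogy for the two accumulators, not MongoDB code, and the variable names are made up for the example.

```javascript
const inputDocs = [{ item: 1 }, { item: 2 }, { item: 1 }];

// Like $push: every value is kept, duplicates included.
const pushed = inputDocs.map((d) => d.item);

// Like $addToSet: each distinct value appears once.
const asSet = [...new Set(pushed)];

console.log(pushed, asSet);
```

One difference to keep in mind: a JavaScript Set preserves insertion order, whereas MongoDB leaves the order of the array produced by $addToSet unspecified.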

MongoDB aggregating multiple arrays of objects based on shared key

I'm writing a query to calculate multiple metrics for each user in my DB.
I've calculated all of the metrics, and have a structure like this
{
"metric1": [{"user_id": 1, "val": 13},{"user_id": 2, "val": 100}],
"metric2": [{"user_id": 2, "val": 29},{"user_id": 1, "val": 123}],
"metric3": [{"user_id": 1, "val": 46},{"user_id": 2, "val": 111}]
}
I'm trying to convert the above into this structure
{
"user_id": [1,2],
"metric1": [13, 100],
"metric2": [123, 29],
"metric3": [46,111]
}
So that I can display a table showing each user and the three metrics (one metric per column, and one user per row).
Considering that your data is what you've said:
{
"metric1": [
{"id1": 1}, {"id2": 2}
],
"metric2": [
{"id2": 22}, {"id1": 11}
],
"metric3": [
{"id2": 222}, {"id1": 111}
]
}
all you have to do is use $unwind to break up the arrays and then $objectToArray to get access to the keys:
db.blah.aggregate([
{ $unwind: '$metric1' },
{ $unwind: '$metric2' },
{ $unwind: '$metric3' },
{ $project: {'metric1': { $objectToArray: '$metric1' }, 'metric2': { $objectToArray: '$metric2' }, 'metric3': { $objectToArray: '$metric3' }} },
{ $sort: { 'metric1.k' : -1} },
{ $sort: { 'metric2.k' : -1} },
{ $sort: { 'metric3.k' : -1} },
{ $unwind: '$metric1' },
{ $unwind: '$metric2' },
{ $unwind: '$metric3' },
{ $group: {
_id: null,
user_id: { $addToSet: '$metric1.k' },
metric1: { $addToSet: '$metric1.v' },
metric2: { $addToSet: '$metric2.v' },
metric3: { $addToSet: '$metric3.v' },
} },
{ $project: { _id: 0 } }
]).pretty()
which results in:
{
"user_id" : [
"id1",
"id2"
],
"metric1" : [
1,
2
],
"metric2" : [
11,
22
],
"metric3" : [
111,
222
]
}
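Because $addToSet does not guarantee the order of its output, the columns produced this way are not necessarily aligned per user. If alignment matters for the table, one option is to pivot on user_id explicitly. Below is a minimal sketch in plain JavaScript, assuming the structure from the question (arrays of { user_id, val }); the function name pivotByUser and the use of null for missing values are assumptions.

```javascript
// Hypothetical helper: turns { metricName: [{ user_id, val }, ...] } into
// parallel arrays where index i always refers to the same user.
function pivotByUser(metrics) {
  const names = Object.keys(metrics);
  // Collect every user_id once, sorted ascending for a stable row order.
  const userIds = [...new Set(names.flatMap((n) => metrics[n].map((e) => e.user_id)))]
    .sort((a, b) => a - b);
  const out = { user_id: userIds };
  for (const name of names) {
    const byUser = new Map(metrics[name].map((e) => [e.user_id, e.val]));
    // null marks a user that has no value for this metric.
    out[name] = userIds.map((id) => (byUser.has(id) ? byUser.get(id) : null));
  }
  return out;
}

const table = pivotByUser({
  metric1: [{ user_id: 1, val: 13 }, { user_id: 2, val: 100 }],
  metric2: [{ user_id: 2, val: 29 }, { user_id: 1, val: 123 }],
  metric3: [{ user_id: 1, val: 46 }, { user_id: 2, val: 111 }],
});
console.log(table);
```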

Unable to filter by date the array field of a Collection in MongoDB

I am struggling to get the desired result out of MongoDB.
My Collection looks like:
{
_id: ...
place: 1
city: 6
user: 306
createDate: 2014-08-10 12:20:21,
lastUpdate: 2014-08-14 10:11:01,
data: [
{
customId4: 4,
entryDate: 2014-07-12 12:01:11,
exitDate: 2014-07-12 13:12:12
},
{
customId4: 4,
entryDate: 2014-07-14 00:00:01,
},
{
customId4: 5,
entryDate: 2014-07-15 11:01:11,
exitDate: 2014-07-15 11:05:15
},
{
customId4: 5,
entryDate: 2014-07-22 21:01:11,
exitDate: 2014-07-22 21:23:22
},
{
customId4: 4,
entryDate: 2014-07-23 14:00:11,
},
{
customId4: 4,
entryDate: 2014-07-29 22:00:11,
exitDate: 2014-07-29 23:00:12
},
{
customId4: 5,
entryDate: 2014-08-12 12:01:11,
exitDate: 2014-08-12 13:12:12
},
]
}
What I would like to get is the data array restricted to the entries that fall within a given interval and that have both entryDate and exitDate set.
For example, if I filter by the interval "2014-07-23 00:00:00 to 2014-08-31 00:00:00" I would like the result like:
{
result: [
{
_id: {
place: 1,
user: 306
},
city: 6,
place: 1,
user: 306,
data: [
{
customMap: 4,
entryDate: 2014-07-22 21:01:11,
exitDate: 2014-07-22 21:23:22
},
{
customId4: ,
entryDate: 2014-07-29 22:00:11,
exitDate: 2014-07-29 23:00:12
},
]
}
],
ok: 1
}
My current MongoDB query looks like this (from, to and placeIds are properly configured variables):
db.myColl.aggregate(
{ $match: {
'user': 1,
'data.entryDate': { $gte: from, $lte: to },
'place': { $in: placeIds },
}},
{ $unwind : "$data" },
{ $project: {
'city': 1,
'place': 1,
'user': 1,
'lastUpdate': 1,
'data.entryDate': 1,
'data.exitDate': 1,
'data.custom': 1,
fromValid: { $gte: ["$'data.entryDate'", from]},
toValid: { $lte: ["$'data.entryDate'", to]}}
},
{ $group: {
'_id': {'place': '$place', 'user': '$user'},
'city': {'$first': '$city'},
'place': {'$first': '$place'},
'user': {'$first': '$user'},
'data': { '$push': '$data'}
}}
)
But this doesn't filter the way I want, because it outputs every document that meets the $match conditions. Inside the $project stage I am unable to define the condition (I don't know if this is how it has to be done in MongoDB).
Thanks in advance!
You were on the right track, but what you might be missing with the aggregation "pipeline" is that just like the "|" pipe operator in the unix shell you "chain" the pipeline stages together just as you would chain commands.
So in fact you can have a second $match pipeline stage that does the filtering for you:
db.myColl.aggregate([
{ "$match": {
"user": 1,
"data.entryDate": { "$gte": from, "$lte": to },
"place": { "$in": placeIds },
}},
{ "$unwind": "$data" },
{ "$match": {
"data.entryDate": { "$gte": from, "$lte": to },
}},
{ "$group": {
"_id": "$_id",
"place": { "$first": "$place" },
"city": { "$first": "$city" },
"user": { "$first": "$user" },
"data": { "$push": "$data" }
}}
])
This uses the actual _id of the document as the grouping key, presuming that you want the document back with just a filtered array.
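To see what each stage contributes, the $unwind, second $match and $group steps can be mimicked on an in-memory document. This is a plain JavaScript sketch, not driver code. The _id value 1 is a placeholder (the question elides the real one), and only three of the sample array entries are reproduced.

```javascript
const from = new Date("2014-07-23T00:00:00Z");
const to = new Date("2014-08-31T00:00:00Z");

const docs = [
  {
    _id: 1, // placeholder id, not from the original data
    place: 1, city: 6, user: 306,
    data: [
      { customId4: 5, entryDate: new Date("2014-07-22T21:01:11Z"), exitDate: new Date("2014-07-22T21:23:22Z") },
      { customId4: 4, entryDate: new Date("2014-07-23T14:00:11Z") },
      { customId4: 4, entryDate: new Date("2014-07-29T22:00:11Z"), exitDate: new Date("2014-07-29T23:00:12Z") },
    ],
  },
];

// $unwind: one record per array element.
const unwound = docs.flatMap((d) => d.data.map((el) => ({ ...d, data: el })));

// Second $match: keep only elements whose entryDate lies inside the interval.
const matched = unwound.filter((r) => r.data.entryDate >= from && r.data.entryDate <= to);

// $group on _id: rebuild each document with only the surviving elements.
const grouped = new Map();
for (const r of matched) {
  if (!grouped.has(r._id)) {
    grouped.set(r._id, { _id: r._id, place: r.place, city: r.city, user: r.user, data: [] });
  }
  grouped.get(r._id).data.push(r.data);
}
const filtered = [...grouped.values()];
console.log(JSON.stringify(filtered, null, 2));
```

The first $match only selects documents that contain at least one matching element; it is the filter applied after unwinding that removes the individual elements outside the interval.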
From MongoDB 2.6, as long as your matching array elements are unique, you could just do the same thing within $project using the $map and $setDifference operators:
db.myColl.aggregate([
{ "$match": {
"user": 1,
"data.entryDate": { "$gte": from, "$lte": to },
"place": { "$in": placeIds },
}},
{ "$project": {
"place": 1,
"city": 1,
"user": 1,
"data": {
"$setDifference": [
{ "$map": {
"input": "$data",
"as": "el",
"in": {"$cond": [
{ "$and": [
{ "$gte": [ "$$el.entryDate", from ] },
{ "$lte": [ "$$el.entryDate", to ] }
]},
"$$el",
false
]}
}},
[false]
]
}
}}
])
That does the same logical thing by processing each array element and evaluating whether it meets the conditions. If so, the element content is returned; if not, false is returned. The $setDifference then filters out all the false values so that only the matching elements remain.
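The map-then-strip-placeholders logic can be sketched in plain JavaScript as well. The sample elements are a subset of the question's data, and the final exitDate check is an assumption taken from the question's requirement that both dates be set; it is not part of the pipeline above.

```javascript
const from = new Date("2014-07-23T00:00:00Z");
const to = new Date("2014-08-31T00:00:00Z");

const data = [
  { customId4: 5, entryDate: new Date("2014-07-22T21:01:11Z"), exitDate: new Date("2014-07-22T21:23:22Z") },
  { customId4: 4, entryDate: new Date("2014-07-23T14:00:11Z") },
  { customId4: 4, entryDate: new Date("2014-07-29T22:00:11Z"), exitDate: new Date("2014-07-29T23:00:12Z") },
];

// $map + $cond: keep matching elements, replace the others with false.
const mapped = data.map((el) => (el.entryDate >= from && el.entryDate <= to ? el : false));

// $setDifference against [false]: drop the placeholders.
const kept = mapped.filter((el) => el !== false);

// Assumption from the question: additionally require exitDate to be set.
const keptWithExit = kept.filter((el) => el.exitDate !== undefined);

console.log(kept.length, keptWithExit.length);
```

Unlike this filter, $setDifference treats its inputs as sets and so also collapses duplicate elements, which is why the approach is only safe when the matching array elements are unique.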