Lookup and sort the foreign collection - mongodb

I have a collection users. Each document in it has, among other properties, an array of ids of documents in another collection: workouts.
Every document in the collection workouts has a property named date.
And here's what I want to get:
For a specific user, I want to get an array of {workoutId, workoutDate} for the workouts that belong to that user, sorted by date.
This is my attempt, which is working fine.
Users.aggregate([
{
$match : {
_id : ObjectId("whateverTheUserIdIs")
}
},
{
$unwind : {
path : "$workouts"
}
}, {
$lookup : {
from : "workouts",
localField : "workouts",
foreignField : "_id",
as : "workoutDocumentsArray"
}
}, {
$project : {
_id : false,
workoutData : {
$arrayElemAt : [
"$workoutDocumentsArray",
0
]
}
}
}, {
$project : {
date : "$workoutData.date",
id : "$workoutData._id"
}
}, {
$sort : {date : -1}
}
])
However I refuse to believe I need all this for what would be such a simple query in SQL!? I believe I must at least be able to merge the two $project stages into one? But I've not been able to figure out how looking at the docs.
Thanks in advance for taking the time! ;)
====
EDIT - This is some sample data
Collection users:
[{
_id:xxx,
workouts: [2,4,6]
},{
_id: yyy,
workouts: [1,3,5]
}]
Collection workouts:
[{
_id:1,
date: 1/1/1901
},{
_id:2,
date: 2/2/1902
},{
_id:3,
date: 3/3/1903
},{
_id:4,
date: 4/4/1904
},{
_id:5,
date: 5/5/1905
},{
_id:6,
date: 6/6/1906
}]
And after running my query, for example for user xxx, I would like to get only the workouts that belong to him (whose ids appear in his workouts array), so the result I want would look like:
[{
id:6,
date: 6/6/1906
},{
id:4,
date: 4/4/1904
},{
id:2,
date: 2/2/1902
}]

You don't need to $unwind the workouts array: $lookup accepts an array as localField and matches each of its elements against foreignField. Then use $replaceRoot instead of the two $project stages.
Users.aggregate([
{ "$match": { "_id" : ObjectId("whateverTheUserIdIs") }},
{ "$lookup": {
"from" : "workouts",
"localField" : "workouts",
"foreignField" : "_id",
"as" : "workoutDocumentsArray"
}},
{ "$unwind": "$workoutDocumentsArray" },
{ "$replaceRoot": { "newRoot": "$workoutDocumentsArray" }},
{ "$sort" : { "date" : -1 }}
])
Or, with the pipeline form of $lookup (available since MongoDB 3.6), you can sort inside the lookup:
Users.aggregate([
{ "$match" : { "_id": ObjectId("whateverTheUserIdIs") }},
{ "$lookup" : {
"from" : "workouts",
"let": { "workouts": "$workouts" },
"pipeline": [
{ "$match": { "$expr": { "$in": ["$_id", "$$workouts"] }}},
{ "$sort" : { "date" : -1 }}
],
"as" : "workoutDocumentsArray"
}},
{ "$unwind": "$workoutDocumentsArray" },
{ "$replaceRoot": { "newRoot": "$workoutDocumentsArray" }}
])
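To sanity-check the expected output without a database, the join-then-sort logic can be mimicked in plain JavaScript. This is only an illustrative sketch: workoutsForUser is a made-up helper, and the sample dates are rewritten as ISO strings (an assumption) so that string comparison orders them correctly.

```javascript
// In-memory stand-ins for the two collections from the question.
// Dates are ISO strings here (assumption) so they compare lexicographically.
const users = [
  { _id: "xxx", workouts: [2, 4, 6] },
  { _id: "yyy", workouts: [1, 3, 5] },
];
const workouts = [
  { _id: 1, date: "1901-01-01" },
  { _id: 2, date: "1902-02-02" },
  { _id: 3, date: "1903-03-03" },
  { _id: 4, date: "1904-04-04" },
  { _id: 5, date: "1905-05-05" },
  { _id: 6, date: "1906-06-06" },
];

// Hypothetical helper mirroring $match -> $lookup -> $sort -> reshape.
function workoutsForUser(userId) {
  const user = users.find((u) => u._id === userId);
  if (!user) return []; // no matching user: nothing flows down the pipeline
  const wanted = new Set(user.workouts);
  return workouts
    .filter((w) => wanted.has(w._id)) // the $lookup / $in step
    .sort((a, b) => b.date.localeCompare(a.date)) // $sort: { date: -1 }
    .map((w) => ({ id: w._id, date: w.date })); // final { id, date } shape
}

console.log(workoutsForUser("xxx"));
```

For user xxx this yields the workouts with ids 6, 4 and 2, in that order, which is the result the question asks for.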

Related

Mongo query how to retrieve the latest inserted array value?

I have a MongoDB collection whose documents contain an array called activities. Each array element has values such as ActivityType, Note and ActivityDate. I need to rename some fields, so I used aggregate and $project to rename some columns for the output. But I only need to return the element with the latest ActivityDate from the array.
My current query returns every element of the activities array:
db.test.aggregate([
{$match: {}
}, {$unwind: "$activities"},
{$match: {}},
{ "$project": {
"_id" : 0,
"Project Number": "$ProjectNumber" ,
"Activity Type": "$activities.activityTypeDesc" ,
"Date of Activity": {
"$dateToString": { "format": "%Y-%m-%d", "date": "$activities.dateOfActivity" }
}
}}
])
It is sort of like getting the top 1 order by in sql server. How do I do that in Mongodb? After some reading seems like I need to use $sort and $group, but I don't know how to fit in here.
I have some sample data below:
{
"_id" : ObjectId("5fd289a93f7cf02c36837ca7"),
"ProjectNumber" : "ABC1234567",
"activities" : [
{
"activityTypeDesc" : "Type1",
"dateOfActivity" : ISODate("2021-02-20T06:00:00.000Z"),
"activityNote" : ""
},
{
"activityTypeDesc" : "Type2",
"dateOfActivity" : ISODate("2021-03-04T06:00:00.000Z"),
"activityNote" : ""
},
{
"activityTypeDesc" : "Type3",
"dateOfActivity" : ISODate("2021-01-04T06:00:00.000Z"),
"activityNote" : ""
},
{
"activityTypeDesc" : "Type4",
"dateOfActivity" : ISODate("2021-04-15T05:00:00.000Z"),
"activityNote" : ""
}
]
}
{
"_id" : ObjectId("5fd2ca65d1a01d157c0179be"),
"ProjectNumber" : "12345",
"activities" : []
}
The result of the query should return two rows: one with the latest activity date, and one with no activity date (because its array is empty).
Any help will be appreciated!
$unwind deconstructs the activities array (preserveNullAndEmptyArrays keeps documents whose array is empty)
$sort by dateOfActivity in descending order
$group by _id and take the required fields from the first activity
db.collection.aggregate([
{
$unwind: {
path: "$activities",
preserveNullAndEmptyArrays: true
}
},
{ $sort: { "activities.dateOfActivity": -1 } },
{
$group: {
_id: "$_id",
"Project Number": { $first: "$ProjectNumber" },
"Activity Type": { $first: "$activities.activityTypeDesc" },
"Date Of Activity": {
$first: {
"$dateToString": {
"format": "%Y-%m-%d",
"date": "$activities.dateOfActivity"
}
}
}
}
}
])
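The same "top 1 per document" idea can be sketched in plain JavaScript to see what the two output rows should look like. This is an illustration only, not the aggregation itself: latestActivity is a hypothetical helper and the sample data is abbreviated from the question.

```javascript
// Abbreviated copy of the sample documents from the question.
const projects = [
  {
    ProjectNumber: "ABC1234567",
    activities: [
      { activityTypeDesc: "Type1", dateOfActivity: new Date("2021-02-20T06:00:00.000Z") },
      { activityTypeDesc: "Type2", dateOfActivity: new Date("2021-03-04T06:00:00.000Z") },
      { activityTypeDesc: "Type3", dateOfActivity: new Date("2021-01-04T06:00:00.000Z") },
      { activityTypeDesc: "Type4", dateOfActivity: new Date("2021-04-15T05:00:00.000Z") },
    ],
  },
  { ProjectNumber: "12345", activities: [] },
];

// Hypothetical helper: pick the most recent activity, or nulls when there is none.
function latestActivity(project) {
  const latest = project.activities.reduce(
    (best, a) => (best === null || a.dateOfActivity > best.dateOfActivity ? a : best),
    null
  );
  return {
    "Project Number": project.ProjectNumber,
    "Activity Type": latest ? latest.activityTypeDesc : null,
    // Same %Y-%m-%d shape as $dateToString, taken from the UTC ISO string.
    "Date Of Activity": latest ? latest.dateOfActivity.toISOString().slice(0, 10) : null,
  };
}

const rows = projects.map(latestActivity);
console.log(rows);
```

The project with an empty array still produces a row, just with null activity fields, which is the role preserveNullAndEmptyArrays plays in the pipeline.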

MongoDB aggregate return count of documents or 0

I have the following aggregate query:
db.user.aggregate()
.match({"business_account_id" : ObjectId("5e3377bcb1dbae5124e4b6bf")})
.lookup({
'localField': 'profile_id',
'from': 'profile',
'foreignField' : '_id',
'as': 'profile'
})
.unwind("$profile")
.match({"profile.type" : "consultant"})
.group({_id:"$business_account_id", count:{$sum:1}})
My goal is to count how many consultant users belong to a given company.
Using the query above, if there is at least one user belonging to the provided business_account_id I get a correct count value.
But if there are no users, the .match({"business_account_id" : ObjectId("5e3377bcb1dbae5124e4b6bf")}) will return an empty (0 documents) result.
How can I get a count: 0 if there are no users assigned to the company?
I tried many approaches based on other threads, but I couldn't get a count: 0.
UPDATE 1
A simple version of my problem:
user collection
{
"_id" : ObjectId("5e36beb7b1dbae5124e4b6dc"),
"business_account_id" : ObjectId("5e3377bcb1dbae5124e4b6bf"),
},
{
"_id" : ObjectId("5e36d83db1dbae5124e4b732"),
"business_account_id" : ObjectId("5e3377bcb1dbae5124e4b6bf"),
}
Using the following aggregate query:
db.getCollection("user").aggregate([
{ "$match" : {
"business_account_id" : ObjectId("5e3377bcb1dbae5124e4b6bf")
}
},
{ "$group" : {
"_id" : "$business_account_id",
"count" : { "$sum" : 1 }
}
}
]);
I get:
{
"_id" : ObjectId("5e3377bcb1dbae5124e4b6bf"),
"count" : 2
}
But if I query for an ObjectId that doesn't exist, such as:
db.getCollection("user").aggregate([
{ "$match" : {
"business_account_id" : ObjectId("5e335c873e8d40676928656d")
}
},
{ "$group" : {
"_id" : "$business_account_id",
"count" : { "$sum" : 1 }
}
}
]);
I get a completely empty result. I would expect to get:
{
"_id" : ObjectId("5e335c873e8d40676928656d"),
"count" : 0
}
The root of the problem is that if no document in the user collection satisfies the initial $match, there is nothing to pass to the next stage of the pipeline. If the business_account_id actually exists somewhere (perhaps in another collection?), run the aggregation against that collection so that the initial match finds at least one document. Then use $lookup to find the users. If you are using MongoDB 3.6+, you can combine the user and profile lookups. Lastly, use $size to count the elements in the users array.
(You will probably need to tweak the collection and field names)
db.businesses.aggregate([
{$match:{_id : ObjectId("5e3377bcb1dbae5124e4b6bf")}},
{$project: { _id:1 }},
{$lookup:{
from: "users",
let: {"busId":"$_id"},
as: "users",
pipeline: [
{$match: {$expr:{$eq:[
"$$busId",
"$business_account_id"
]}}},
{$lookup:{
localField: "profile_id",
from: "profile",
foreignField : "_id",
as: "profile"
}},
{$match: { "profile.type" : "consultant"}}
]
}},
{$project: {
_id: 0,
business_account_id: "$_id",
count:{$size:"$users"}
}}
])
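The reason this version can report zero is that the business document exists even when no user points at it. A minimal in-memory sketch of that idea follows; the collections, the string ids and the consultantCount helper are all hypothetical stand-ins.

```javascript
// Hypothetical in-memory data; plain strings stand in for ObjectIds.
const businesses = [{ _id: "b1" }, { _id: "b2" }];
const profiles = [
  { _id: "p1", type: "consultant" },
  { _id: "p2", type: "admin" },
];
const users = [
  { _id: "u1", business_account_id: "b1", profile_id: "p1" },
  { _id: "u2", business_account_id: "b1", profile_id: "p2" },
];

// Start from the business, then count its consultant users (possibly zero).
function consultantCount(businessId) {
  const business = businesses.find((b) => b._id === businessId);
  if (!business) return null; // unknown business: still nothing to report
  const consultantProfiles = new Set(
    profiles.filter((p) => p.type === "consultant").map((p) => p._id)
  );
  const count = users.filter(
    (u) => u.business_account_id === businessId && consultantProfiles.has(u.profile_id)
  ).length;
  return { business_account_id: businessId, count };
}

console.log(consultantCount("b1"), consultantCount("b2"));
```

Here b2 has no users at all, yet it still gets a row with count 0 because the starting point is the business rather than the user.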
Since you match on a business_account_id value that doesn't exist, no documents reach the later stages and the aggregation returns nothing.
Workaround: run two sub-pipelines in parallel with the $facet operator, so that a default value is available when the match has no result.
Note: Make sure the user collection has at least 1 document, otherwise this won't work.
db.user.aggregate([
{
$facet: {
not_found: [
{
$project: {
"_id": ObjectId("5e3377bcb1dbae5124e4b6bf"),
"count": { $literal: 0 }
}
},
{
$limit: 1
}
],
found: [
{
"$match": {
"business_account_id": ObjectId("5e3377bcb1dbae5124e4b6bf")
}
},
{
"$group": {
"_id": "$business_account_id",
"count": { "$sum": 1 }
}
}
]
}
},
{
$replaceRoot: {
newRoot: {
$mergeObjects: [
{
$arrayElemAt: ["$not_found", 0]
},
{
$arrayElemAt: ["$found", 0]
}
]
}
}
}
])
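The final $mergeObjects step is a "default, then override" merge. A rough plain-JavaScript equivalent, with a hypothetical countOrZero helper and string ids in place of ObjectIds, looks like this:

```javascript
// Hypothetical helper: a default row merged with the real row, if there is one.
function countOrZero(users, businessId) {
  const fallback = { _id: businessId, count: 0 }; // plays the "not_found" facet
  const matched = users.filter((u) => u.business_account_id === businessId);
  const found =
    matched.length > 0
      ? { _id: businessId, count: matched.length } // plays the "found" facet
      : undefined;
  // Later spreads win, and spreading undefined adds nothing (like $mergeObjects).
  return { ...fallback, ...found };
}

const sampleUsers = [
  { _id: "u1", business_account_id: "b1" },
  { _id: "u2", business_account_id: "b1" },
];
console.log(countOrZero(sampleUsers, "b1"), countOrZero(sampleUsers, "nope"));
```

When the found side is empty only the default survives, which is why the pipeline returns count 0 for an unknown id.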

Why mongodb upsert counter doesn't reflect actual changes?

Consider the following query:
var collectionName = "test";
db.createCollection(collectionName);
db.getCollection(collectionName).insert({
"_id" : 1, "char" : "Gandalf", "class" : "barbarian", "lvl" : 20
});
db.getCollection(collectionName).bulkWrite([
{
insertOne: {
"document": {
"_id" : 2, "char" : "Dithras", "class" : "barbarian", "lvl" : 4
}
}
},
{
updateOne: {
filter: { "_id": 1},
update: { $set: {"class" : "mage"} },
upsert: true
}
}
])
Which results in:
{
"acknowledged" : true,
"deletedCount" : 0.0,
"insertedCount" : 1.0,
"matchedCount" : 1.0,
"upsertedCount" : 0.0,
"insertedIds" : {
"0" : 2.0
},
"upsertedIds" : {}
}
My question is: why doesn't the update of the document with id:1 show up in upsertedIds? Isn't this document being updated with upsert? Or am I missing something?
According to the documentation, an entry is only added to upsertedIds if no matching document is found (so it's really more like an insert), but in this case I don't know which items got updated.
Is it possible to find out which documents were modified when executing a query?
To avoid the XY problem: I want to see which bulk operation items failed (e.g. when trying to update a non-existing document with upsert:false) and log the IDs that triggered the failures.
You can collect document ids when you prepare your bulk update, then check which of them don't exist after the update. E.g. for document ids 1, 2, and 3 that you updated with upsert:false
db.test.aggregate([
{ $group: { _id: null }},
{ $project: { _id: [ 1, 2, 3 ] }},
{ $unwind: "$_id" },
{ $lookup: {
from: "test",
localField: "_id",
foreignField: "_id",
as: "exists"
}},
{ $match: { exists: { $size:0 }}},
{ $project: { exists:0 }}
])
This returns the ids of the filters that didn't match any document, i.e. the ones that "failed" in your terminology.
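The check boils down to a set difference between the ids you asked to update and the ids that actually exist. A minimal sketch, where unmatchedIds is a hypothetical helper and the documents are in-memory stand-ins:

```javascript
// Hypothetical helper: which requested ids have no corresponding document?
function unmatchedIds(requestedIds, existingDocs) {
  const existing = new Set(existingDocs.map((d) => d._id));
  return requestedIds.filter((id) => !existing.has(id));
}

// Ids 1 and 2 exist (as in the question); 3 was never inserted.
const docsInCollection = [
  { _id: 1, char: "Gandalf" },
  { _id: 2, char: "Dithras" },
];
console.log(unmatchedIds([1, 2, 3], docsInCollection));
```

With these inputs only id 3 is reported, which is the id you would log as a failed update.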

Create unique ObjectId during $project pipeline

I have an aggregation framework query that is summarizing certain document data into a lookup set. Unfortunately, I can't provide the data since it's company-related. Here is the query and data fragment from the last stages of the pipeline:
...
{ $group: { _id: "$SectionId", "Questions": { $addToSet: "$Questions" } } },
{ $unwind: "$Questions" },
which returns data like this: Note that _id is not unique.
{
"_id" : "Tonometry",
"Questions" : {
"MappingId" : "Exophoria",
"PositiveLabel" : "Positive",
"NegativeLabel" : "Negative"
}
},
{
"_id" : "Tonometry",
"Questions" : {
"MappingId" : "Heterophoria",
"PositiveLabel" : "Positive",
"NegativeLabel" : "Negative"
}
},
The next stage in the pipeline is this:
{
$project: {
"_id": 1,
"id": ObjectId(),
"SectionId": "$_id",
"MappingId": "$Questions.MappingId",
"PositiveLabel": "$Questions.PositiveLabel",
"NegativeLabel": "$Questions.NegativeLabel",
}
},
which produces:
{
"_id" : "Tonometry",
"id" : ObjectId("5d1cf66cf526f23524f865c6"),
"SectionId" : "Tonometry",
"MappingId" : "Exophoria",
"PositiveLabel" : "Positive",
"NegativeLabel" : "Negative"
},
{
"_id" : "Tonometry",
"id" : ObjectId("5d1cf66cf526f23524f865c6"),
"SectionId" : "Tonometry",
"MappingId" : "Heterophoria",
"PositiveLabel" : "Positive",
"NegativeLabel" : "Negative"
},
I tried creating a new field id that holds a unique ObjectId, but unfortunately the same ObjectId is reused in all the nodes. This is important because when I attempt to use $out, it requires a unique _id.
How do I add a unique ObjectId to each node?
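$out does not require you to supply a unique _id: if the documents reaching it have no _id field, one is generated for each of them on insert. (Your ObjectId() is identical everywhere because the shell evaluates it once, while building the pipeline, not once per document.) So use $replaceRoot together with $mergeObjects before the $out stage. This merges the Questions sub-document into the desired shape without an _id field, and $out creates the _id for you in the new collection: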
[
....
{ "$group": { "_id": "$SectionId", "Questions": { "$addToSet": "$Questions" } } },
{ "$unwind": "$Questions" },
{ "$replaceRoot": {
"newRoot": {
"$mergeObjects": [
{ "Section": "$_id" },
"$Questions"
]
}
} },
{ "$out": "new-collection" }
]
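As for why the original $project produced the same ObjectId everywhere: the pipeline is an ordinary JavaScript object, so any function call inside it runs once, when the object is built. The sketch below shows the effect with makeId, a made-up stand-in for ObjectId(); nothing here talks to MongoDB.

```javascript
// Hypothetical stand-in for ObjectId(): returns a new value on every call.
let counter = 0;
const makeId = () => `id-${++counter}`;

const docs = [{ MappingId: "Exophoria" }, { MappingId: "Heterophoria" }];

// makeId() runs once, while the stage object is being built,
// so the stage carries a single constant value.
const projectStage = { id: makeId() };
const shared = docs.map((d) => ({ ...d, id: projectStage.id }));

// Calling the generator once per document yields distinct values.
const distinct = docs.map((d) => ({ ...d, id: makeId() }));

console.log(shared, distinct);
```

Letting the server assign _id on insert, as $out does for documents without one, is the per-document case.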

Removing duplicates in mongodb with aggregate query

db.games.aggregate([
{ $unwind : "$rounds"},
{ $match: {
"rounds.round_values.gameStage": "River",
"rounds.round_values.decision": "BetPlus" }
},
{ $project: {"FinalFundsChange":1, "GameID":1}
}])
The resulting output is:
{ "_id" : ObjectId("57cbce66e281af12e4d0731f"), "GameID" : "229327202", "FinalFundsChange" : 0.8199999999999998 }
{ "_id" : ObjectId("57cbe2fce281af0f34020901"), "FinalFundsChange" : -0.1599999999999997, "GameID" : "755030199" }
{ "_id" : ObjectId("57cbea3ae281af0f340209bc"), "FinalFundsChange" : 0.10000000000000009, "GameID" : "231534683" }
{ "_id" : ObjectId("57cbee43e281af0f34020a25"), "FinalFundsChange" : 1.7000000000000002, "GameID" : "509975754" }
{ "_id" : ObjectId("57cbee43e281af0f34020a25"), "FinalFundsChange" : 1.7000000000000002, "GameID" : "509975754" }
As you can see the last element is a duplicate, that's because the unwind creates two elements of it, which it should. How can I (while keeping the aggregate structure of the query) keep the first element of the duplicate or keep the last element of the duplicate only?
I have seen that the ways to do it seem to be related to either $addToSet or $setUnion (any details how this works exactly are appreciated as well), but I don't understand how I can choose the 'subset' by which I want to identify the duplicates (in my case that's the 'GameID', other values are allowed to be different) and how I can select whether I want the first or the last element.
You could group by _id via $group and then use the $last and $first operator respectively to keep the last or first values.
db.games.aggregate([
{ $unwind : "$rounds"},
{ $match: {
"rounds.round_values.gameStage": "River",
"rounds.round_values.decision": "BetPlus" }
},
{ $group: {
_id: "$_id",
"FinalFundsChange": { $first: "$FinalFundsChange" },
"GameID": { $last: "$GameID" }
}
}
])
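To see how $first and $last pick between duplicates, and how the grouping key decides what counts as a duplicate, here is a small in-memory sketch. dedupeBy is a hypothetical helper, and the second sample row has a changed FinalFundsChange purely so that the two choices are distinguishable.

```javascript
// Hypothetical helper: keep the first or last document seen for each key.
function dedupeBy(docs, keyFn, keep = "first") {
  const byKey = new Map();
  for (const doc of docs) {
    const key = keyFn(doc);
    // "first": only store a key once; "last": overwrite on every repeat.
    if (keep === "last" || !byKey.has(key)) byKey.set(key, doc);
  }
  return [...byKey.values()];
}

// Illustrative rows; the duplicate GameID carries different values on purpose.
const unwound = [
  { GameID: "231534683", FinalFundsChange: 0.1 },
  { GameID: "509975754", FinalFundsChange: 1.7 },
  { GameID: "509975754", FinalFundsChange: 2.5 },
];

console.log(dedupeBy(unwound, (d) => d.GameID, "first"));
console.log(dedupeBy(unwound, (d) => d.GameID, "last"));
```

Using GameID as the key here corresponds to putting "$GameID" in the _id of the $group stage instead of "$_id".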
My problem was to find all users who purchased the same product, where a user can purchase a product multiple times.
https://mongoplayground.net/p/UTuT4e_N6gn
db.payments.aggregate([
{
"$lookup": {
"from": "user",
"localField": "user",
"foreignField": "_id",
"as": "user_docs"
}
},
{
"$unwind": "$user_docs",
},
{
"$group": {
"_id": "$user_docs._id",
"name": {
"$first": "$user_docs.name"
},
}
},
{
"$project": {
"_id": 0,
"id": "$_id",
"name": "$name"
}
}
])