MongoDB: How to get rid of duplicates in the response

I need to get a user's chats by user id. Each chat in the response should include its last message, selected as follows: take the single most recent message of the chat whose creation time ($last_message.create_date) is greater than or equal to chatsusers.start_message_id.
db.chat.aggregate([
{
$match: {
participants: "63ce54460aeee5e72c778d90"
}
},
{
$lookup: {
from: "chatsusers",
localField: "id",
foreignField: "chat_id",
as: "chat_user"
}
},
{
$unwind: {
path: "$chat_user"
}
},
{
$lookup: {
from: "message",
localField: "id",
foreignField: "chat_id",
let: {
smid: "$chat_user.start_message_id"
},
pipeline: [
{
$match: {
$expr: {
$gte: [
"$create_date",
"$$smid"
]
}
}
},
{
$sort: {
create_date: -1
}
},
{
$limit: 1
}
],
as: "last_message",
}
},
{
$unwind: {
path: "$last_message",
preserveNullAndEmptyArrays: true
}
},
{
$project: {
id: 1,
title: 1,
create_date: 1,
type: 1,
participants: 1,
owner_id: 1,
last_message: "$last_message",
unread: 1
}
}
])
Problem: the response contains duplicates, in two forms:
First: duplicate objects in the response to my query; the first object is identical to the second.
Second: the first object in the response is correct, but the second is a duplicate for which the selection did not work: it has a last message even though it should not, because $last_message.create_date must be greater than or equal to chatsusers.start_message_id.
How do I get rid of the duplication in the response to my request?
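One possible cause, assuming chatsusers holds one document per participant of each chat: the first $lookup returns every participant's row, so $unwind emits the chat once per participant, each with its own start_message_id. A minimal sketch of a fix under that assumption restricts the chatsusers lookup to the requesting user. The user_id field name and the buildChatPipeline helper are hypothetical; everything else mirrors the pipeline above.

```javascript
// Builds the aggregation pipeline for one user's chats.
// Assumption: chatsusers documents carry a `user_id` field (hypothetical name)
// identifying the participant the row belongs to.
function buildChatPipeline(userId) {
  return [
    { $match: { participants: userId } },
    {
      $lookup: {
        from: "chatsusers",
        let: { cid: "$id" },
        pipeline: [
          {
            $match: {
              $expr: {
                $and: [
                  { $eq: ["$chat_id", "$$cid"] },
                  { $eq: ["$user_id", userId] }, // keep only this user's row
                ],
              },
            },
          },
        ],
        as: "chat_user",
      },
    },
    // One row per chat, provided chatsusers has one document per (chat, user) pair.
    { $unwind: { path: "$chat_user" } },
    {
      $lookup: {
        from: "message",
        let: { cid: "$id", smid: "$chat_user.start_message_id" },
        pipeline: [
          {
            $match: {
              $expr: {
                $and: [
                  { $eq: ["$chat_id", "$$cid"] },
                  { $gte: ["$create_date", "$$smid"] },
                ],
              },
            },
          },
          { $sort: { create_date: -1 } },
          { $limit: 1 },
        ],
        as: "last_message",
      },
    },
    { $unwind: { path: "$last_message", preserveNullAndEmptyArrays: true } },
  ];
}

// Usage in mongosh: db.chat.aggregate(buildChatPipeline("63ce54460aeee5e72c778d90"))
```

The original $project stage can follow unchanged. The sketch uses only the let/pipeline form of $lookup; combining localField/foreignField with pipeline in the same stage, as in the question, requires MongoDB 5.0 or later.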

Related

MongoDB Aggregation - How to keep only docs that has related value in foreign collection

In the lookup part in the aggregate method, how can I keep only documents that have a value in the foreign collection?
For instance, I have this collection users:
[
{ _id: 1, name: 'John', basketId: 4 },
{ _id: 2, name: 'mari', basketId: 9 },
{ _id: 3, name: 'tedd', basketId: 32 },
{ _id: 4, name: 'sara', basketId: 14 },
{ _id: 5, name: 'jane', basketId: 3 },
.
.
.
]
And another collection named baskets
[
{ _id: 1, items: 0 },
{ _id: 2, items: 2 },
{ _id: 3, items: 0 },
{ _id: 4, items: 0 },
{ _id: 5, items: 7 },
.
.
.
]
Now if I want to get users with basket items greater than 0, I use aggregate and lookup:
UserModel.aggregate([
{ $lookup:
{
from: 'baskets',
localField: 'basketId',
foreignField: '_id',
pipeline: [{ $match: { items: { $gt: 0 } } }],
as: 'basket'
}
}
])
It brings up ALL users with their basket data. For those users whose basket items are 0, it shows basket: [].
But I need to get ONLY users that have basket items greater than 0. How can it be done?
You shouldn't place the $match stage in the pipeline of $lookup, because there it only filters the documents returned in the basket array; it does not remove any users.
Instead, add a $match stage after the $lookup that filters the documents by comparing the items value of the first document in the basket array.
UserModel.aggregate([
{
$lookup: {
from: "baskets",
localField: "basketId",
foreignField: "_id",
as: "basket"
}
},
{
$match: {
$expr: {
$gt: [
{
$first: "$basket.items"
},
0
]
}
}
}
])
Demo 1 # Mongo Playground
The question is ambiguous. You may be looking for the query below instead (it returns the same result as Demo 1):
UserModel.aggregate([
{
$lookup: {
from: "baskets",
localField: "basketId",
foreignField: "_id",
pipeline: [
{
$match: {
items: {
$gt: 0
}
}
}
],
as: "basket"
}
},
{
$match: {
$expr: {
$gt: [
{
$size: "$basket"
},
0
]
}
}
}
])
Or check that basket is not an empty array:
{
$ne: [
"$basket",
[]
]
}
Demo 2 # Mongo Playground
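To make the difference between the two filter positions concrete, here is a small plain-JavaScript emulation (not MongoDB code). The two-user data set and the lookupBaskets helper are illustrative assumptions, not part of the original collections.

```javascript
// Illustrative data (not the full collections from the question).
const users = [
  { _id: 1, name: "John", basketId: 4 },
  { _id: 2, name: "mari", basketId: 5 },
];
const baskets = [
  { _id: 4, items: 0 },
  { _id: 5, items: 7 },
];

// Emulates $lookup on basketId/_id; innerMatch plays the role of a $match
// placed inside the $lookup pipeline.
function lookupBaskets(userDocs, basketDocs, innerMatch = () => true) {
  return userDocs.map((u) => ({
    ...u,
    basket: basketDocs.filter((b) => b._id === u.basketId && innerMatch(b)),
  }));
}

// Inner filter only: every user survives, John just gets basket: [].
const innerOnly = lookupBaskets(users, baskets, (b) => b.items > 0);

// Extra outer filter (the $size > 0 check): users with an empty basket are dropped.
const withOuterMatch = innerOnly.filter((u) => u.basket.length > 0);

console.log(innerOnly.map((u) => u.name));      // [ 'John', 'mari' ]
console.log(withOuterMatch.map((u) => u.name)); // [ 'mari' ]
```

The inner filter decides what goes into the joined array; only a stage after the $lookup can drop the parent document.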

Boost search score from data in another collection

I use Atlas Search to return a list of documents (using Mongoose):
const searchResults = await Resource.aggregate()
.search({
text: {
query: searchQuery,
path: ["title", "tags", "link", "creatorName"],
},
}
)
.match({ approved: true })
.addFields({
score: { $meta: "searchScore" }
})
.exec();
These resources can be up- and downvoted by users (like questions on Stack Overflow). I want to boost the search score depending on these votes.
I can use the boost operator for that.
Problem: The votes are not a property of the Resource document. Instead, they are stored in a separate collection:
const resourceVoteSchema = mongoose.Schema({
_id: { type: String },
userId: { type: mongoose.Types.ObjectId, required: true },
resourceId: { type: mongoose.Types.ObjectId, required: true },
upDown: { type: String, required: true },
After I get my search results above, I fetch the votes separately and add them to each search result:
for (const resource of searchResults) {
const resourceVotes = await ResourceVote.find({ resourceId: resource._id }).exec();
resource.votes = resourceVotes
}
I then subtract the downvotes from the upvotes on the client and show the final number in the UI.
How can I incorporate this vote points value into the score of the search results? Do I have to reorder them on the client?
Edit:
Here is my updated code. The only part that's missing is letting the resource votes boost the search score, while at the same time keeping all resource-votes documents in the votes field so that I can access them later. I'm using Mongoose syntax but an answer with normal MongoDB syntax will work for me:
const searchResults = await Resource.aggregate()
.search({
compound: {
should: [
{
wildcard: {
query: queryStringSegmented,
path: ["title", "link", "creatorName"],
allowAnalyzedField: true,
}
},
{
wildcard: {
query: queryStringSegmented,
path: ["topics"],
allowAnalyzedField: true,
score: { boost: { value: 2 } },
}
}
,
{
wildcard: {
query: queryStringSegmented,
path: ["description"],
allowAnalyzedField: true,
score: { boost: { value: .2 } },
}
}
]
}
}
)
.lookup({
from: "resourcevotes",
localField: "_id",
foreignField: "resourceId",
as: "votes",
})
.addFields({
searchScore: { $meta: "searchScore" },
})
.facet({
approved: [
{ $match: matchFilter },
{ $skip: (page - 1) * pageSize },
{ $limit: pageSize },
],
resultCount: [
{ $match: matchFilter },
{ $group: { _id: null, count: { $sum: 1 } } }
],
uniqueLanguages: [{ $group: { _id: null, all: { $addToSet: "$language" } } }],
})
.exec();
This can be done in a single query, along these lines:
Resource.aggregate([
{
$search: {
text: {
query: "searchQuery",
path: ["title", "tags", "link", "creatorName"]
}
}
},
{$match: {approved: true}},
{$addFields: {score: {$meta: "searchScore"}}},
{
$lookup: {
from: "resourcevotes",
localField: "_id",
foreignField: "resourceId",
as: "votes"
}
}
])
The $lookup stage fetches the votes from the resourcevotes collection.
If you want to use the votes to boost the score, you can replace the above $lookup step with something like:
{
$lookup: {
from: "resourcevotes",
let: {resourceId: "$_id"},
pipeline: [
{
$match: {$expr: {$eq: ["$resourceId", "$$resourceId"]}}
},
{
$group: {
_id: 0,
sum: {$sum: {$cond: [{$eq: ["$upDown", "up"]}, 1, -1]}}
}
}
],
as: "votes"
}
},
{$addFields: { votes: {$arrayElemAt: ["$votes", 0]}}},
{
$project: {
"wScore": {
$ifNull: [
{$multiply: ["$score", "$votes.sum"]},
"$score"
]
},
createdAt: 1,
score: 1
}
}
As you can see on this playground example
EDIT: If you want to keep the votes on the results, you can do something like:
db.searchResults.aggregate([
{
$lookup: {
from: "resourcevotes",
localField: "_id",
foreignField: "resourceId",
as: "votes"
}
},
{
"$addFields": {
"votesCount": {
$reduce: {
input: "$votes",
initialValue: 0,
in: {$add: ["$$value", {$cond: [{$eq: ["$$this.upDown", "up"]}, 1, -1]}]}
}
}
}
},
{
$addFields: {
"wScore": {
$add: [{$multiply: ["$votesCount", 0.1]}, "$score"]
}
}
}
])
As can be seen here
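On the question of reordering on the client: once wScore exists, a $sort stage on it could be appended so the server returns results already ordered. The sketch below mirrors the $reduce and $add logic in plain JavaScript for illustration; the netVotes and weightedScore helpers and the sample results are hypothetical, and the 0.1 weight is simply the value used above.

```javascript
// Plain-JavaScript mirror of the $reduce / $add stages above, for illustration only.
function netVotes(votes) {
  // +1 for each "up" vote, -1 for anything else, like the $cond in the pipeline.
  return votes.reduce((sum, v) => sum + (v.upDown === "up" ? 1 : -1), 0);
}

function weightedScore(score, votes, weight = 0.1) {
  return score + weight * netVotes(votes);
}

// Stage that could be appended to the pipeline to order by the boosted score.
const sortByWeightedScore = { $sort: { wScore: -1 } };

// Hypothetical search results with their looked-up votes.
const results = [
  { title: "a", score: 1.0, votes: [{ upDown: "down" }] },
  { title: "b", score: 0.9, votes: [{ upDown: "up" }, { upDown: "up" }, { upDown: "up" }] },
];

// Same ordering the $sort stage would produce: highest wScore first.
const ranked = results
  .map((r) => ({ ...r, wScore: weightedScore(r.score, r.votes) }))
  .sort((x, y) => y.wScore - x.wScore);

console.log(ranked.map((r) => r.title)); // "b" ranks first despite its lower search score
```

Because the votes array is kept on each document, the individual vote documents stay available after sorting.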

MongoDB aggregate performance

I have two collections, bids and auctions. I am able to get the customer-wise count of bids from the bids collection, which has almost one million records; the auctions collection has 500k records. I need to fetch this same result quickly in MongoDB.
The query below takes almost 29 seconds to respond, but I need a much faster response time.
{ $match: { customer: '00000000823026' } },
{ $group: {
_id: '$auctioncode',
data: {
$last: '$$ROOT'
}
}
},
{
$lookup: {
from: 'auctions',
let: { auctioncode: '$_id' },
pipeline: [
{
$match: {
$expr: {
$and: [{ $eq: ['$_id', '$$auctioncode'] }],
},
},
},
],
as: 'auction',
},
},
{ $match: { auction: { $exists: true, $not: { $size: 0 } } } },
{
$addFields: {
_id: '$data._id',
auctioncode: '$data.auctioncode',
amount: '$data.amount',
customer: '$data.customer',
customerName: '$data.customerName',
maxBid: '$data.maxBid',
stockcode: '$data.stockcode',
watchlistHidden: '$data.watchlistHidden',
winner: '$data.winner'
}
},
{
$match: {
$and: [
{
// to get only RUNNING auctions in bid history
'auction.status': { $ne: 'RUNNING'},
// to filter auctions based on status
// tslint:disable-next-line:max-line-length
},
],
},
},
{ $sort: { 'auction.enddate': -1 } },
{ $count: 'totalCount'}
The current result is totalCount: 2640.
How can I optimize this query to improve its performance in MongoDB?
If all you need is the count of results, the code below is more efficient.
Index the customer key for an even better execution time.
Note: On MongoDB 5.0 or later you can keep the pipeline form of $lookup, since it can make use of indexes there, unlike in earlier versions.
db.collection.aggregate([
{
$match: {
customer: '00000000823026'
}
},
{
$group: {
_id: '$auctioncode',
data: {
$last: '$$ROOT'
}
}
},
{
$lookup: {
from: 'auctions',
localField: '_id',
foreignField: '_id',
as: 'auction',
},
},
{
$match: {
// auction: {$ne: []},
// to get only RUNNING auctions in bid history
'auction.status': { $ne: 'RUNNING'},
// to filter auctions based on status
// tslint:disable-next-line:max-line-length
}
},
// {
// $addFields: { <-- Not Required
// _id: '$data._id',
// auctioncode: '$data.auctioncode',
// amount: '$data.amount',
// customer: '$data.customer',
// customerName: '$data.customerName',
// maxBid: '$data.maxBid',
// stockcode: '$data.stockcode',
// watchlistHidden: '$data.watchlistHidden',
// winner: '$data.winner'
// }
// },
// { $sort: { 'auction.enddate': -1 } }, <-- Not Required
{ $count: 'totalCount'}
], {
allowDiskUse: true
})
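The index suggestion above could look like the following with the official Node.js driver. This is a sketch: the ensureBidIndexes helper is hypothetical, and it assumes that db is a connected Db handle and that the bids are stored in a collection named bids.

```javascript
// Assumption: `db` is a connected Db handle from the official Node.js driver,
// and the bids live in a collection named "bids".
async function ensureBidIndexes(db) {
  // Single-field index supporting the initial $match on customer.
  return db.collection("bids").createIndex({ customer: 1 });
}

// Equivalent call in mongosh: db.bids.createIndex({ customer: 1 })
// The plan can then be inspected with:
//   db.bids.explain("executionStats").aggregate([ /* pipeline */ ])
```

No extra index is needed on the auctions side for this $lookup, because foreignField is _id and every collection has a default index on _id.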

MongoDB use aggregate to format data from multiple collections

I have data in MongoDB collections as below:
Users:
{ name: String, email: String }
Books:
{ name: String, author: ref -> Users }
Chapters:
{ name: String, book: ref -> Books }
Paragraphs:
{ text: String, chapter: ref -> Chapters, created: Date, updated: Date, isRemoved: boolean }
I am trying to get some kind of statistical data in the following format:
[
{
book: { _id, name },
chaptersCount: 10,
paragraphs: {
count, mostRecent: { updated, created }}
author: { name, email },
},
{
...
},
...
]
So far, I have been able to get some data using the aggregate pipeline, but I am lost at this point. I have no idea how to convert it into the format I wish it to be. I can do it programmatically, but filters/sorting need to be applied on each of the final fields and it will be difficult to do that for a huge number of records (say a million).
Here is what I have so far:
const data = await ParagraphsDB.aggregate([
{ $match: { isRemoved: { $exists: false }, project: { $exists: true } } },
{ $lookup: { from: 'chapters', localField: 'chapter', foreignField: '_id', as: 'chapterDoc' }},
{ $unwind: '$chapterDoc' },
{ $lookup: { from: 'books', localField: 'chapterDoc.book', foreignField: '_id', as: 'bookDoc' }},
{ $unwind: '$bookDoc' },
{
$facet: {
paragraphCount: [
{ $count: 'value' },
],
pipelineResults: [
{ $project: { _id: 1, 'chapterDoc._id': 1, 'chapterDoc.name': 1, 'bookDoc._id': 1, 'bookDoc.name': 1 } },
],
},
},
{ $unwind: '$pipelineResults' },
{ $unwind: '$paragraphCount' },
{
$replaceRoot: {
newRoot: {
$mergeObjects: [ '$pipelineResults', { paragraphCount: '$paragraphCount.value' } ],
},
},
},
]);
I started with the Paragraph data because it is the smallest unit that could be sorted upon. How do I achieve the desired result?
Also, once I have formatted the data in the desired format, how can I sort by one of those fields?
Any help will be highly appreciated. Thanks.
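One way to approach this, sketched under assumptions, is to start from the books collection rather than from paragraphs, so that each output document is already one book. The collection names users and paragraphs are guesses (chapters and books appear in the pipeline above), and bookStatsPipeline is a hypothetical name; this has not been run against the real data.

```javascript
// Sketch only. Assumptions: collections are named "books", "users", "chapters"
// and "paragraphs", and the pipeline runs on the books collection.
const bookStatsPipeline = [
  // Author of each book.
  { $lookup: { from: "users", localField: "author", foreignField: "_id", as: "authorDoc" } },
  { $unwind: "$authorDoc" },
  // All chapters of the book.
  { $lookup: { from: "chapters", localField: "_id", foreignField: "book", as: "chapterDocs" } },
  // Paragraph statistics across those chapters, computed inside the sub-pipeline.
  {
    $lookup: {
      from: "paragraphs",
      let: { chapterIds: "$chapterDocs._id" },
      pipeline: [
        { $match: { isRemoved: { $exists: false }, $expr: { $in: ["$chapter", "$$chapterIds"] } } },
        { $sort: { updated: -1 } },
        {
          $group: {
            _id: null,
            count: { $sum: 1 },
            // After the sort, the first document is the most recently updated one.
            mostRecent: { $first: { updated: "$updated", created: "$created" } },
          },
        },
      ],
      as: "paragraphStats",
    },
  },
  // A book without paragraphs yields an empty paragraphStats array; keep it anyway.
  { $unwind: { path: "$paragraphStats", preserveNullAndEmptyArrays: true } },
  {
    $project: {
      _id: 0,
      book: { _id: "$_id", name: "$name" },
      chaptersCount: { $size: "$chapterDocs" },
      paragraphs: {
        count: { $ifNull: ["$paragraphStats.count", 0] },
        mostRecent: "$paragraphStats.mostRecent",
      },
      author: { name: "$authorDoc.name", email: "$authorDoc.email" },
    },
  },
  // Sorting on a computed field works once it exists, e.g. by paragraph count.
  { $sort: { "paragraphs.count": -1 } },
];

// Usage in mongosh: db.books.aggregate(bookStatsPipeline)
```

Any of the final fields can be used in the last $sort stage, or in a $match placed after the $project. With around a million paragraphs the $in-based sub-pipeline may be slow, so it would be worth checking the plan with explain before relying on it.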

When MongoDB aggregate can't find any result, it returns an array with an empty object

I want to get the list of bids for a single project within the project's aggregate.
But when I run my MongoDB aggregate and it can't find any bids for the project, it returns this result:
[ { _id: 5b69f6afa1ad1827cc9e1dc6,
projectID: 100029,
bidsArray: [ { freelanceArray: {} } ]
} ]
But I want to return empty bidsArray if it can't find related bids, (something like this):
[ { _id: 5b69f6afa1ad1827cc9e1dc6,
projectID: 100029,
bidsArray: []
} ]
Here is my aggregate:
[
{
$match: {
projectID: projectID
}
},
{
$lookup: {
from: "bids",
localField: "projectID",
foreignField: "projectID",
as: "bidsArray"
}
},
{
$unwind: {
path: "$bidsArray",
preserveNullAndEmptyArrays: true
}
},
{
$lookup: {
from: "users",
localField: "bidsArray.freelanceID",
foreignField: "userID",
as: "freelanceArray"
}
},
{
$unwind: {
path: "$freelanceArray",
preserveNullAndEmptyArrays: true
}
},
{
$group: {
_id: "$_id",
projectID: { $first: "$projectID" },
bidsArray: {
$addToSet: {
bidID: "$bidsArray.bidID",
daysToDone: "$bidsArray.daysToDone",
freelanceArray: {
userID: "$freelanceArray.userID",
username: "$freelanceArray.username",
publicName: "$freelanceArray.publicName",
}
}
}
}
},
{
$project: {
projectID: 1,
bidsArray: {
bidID: 1,
daysToDone: 1,
freelanceArray: {
userID: 1,
username: 1,
publicName: 1,
}
}
}
}
]
Since MongoDB 3.6, you can use the $$REMOVE variable in aggregation expressions to conditionally exclude a field.
Remove the last $project stage and update the freelanceArray expression inside $group as below. The $cond expression checks whether freelanceArray is present (not null): if so, it outputs the fields; otherwise it removes freelanceArray.
"freelanceArray":{
"$cond":[
{"$gt":["$freelanceArray",0]},
{"userID":"$freelanceArray.userID","username":"$freelanceArray.username","publicName":"$freelanceArray.publicName"},
"$$REMOVE"]
}
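If the grouped array can still contain an element without a bidID (for a project that has no bids), a further stage after $group could filter such elements out so that bidsArray ends up empty. The following is a sketch of that idea rather than part of the answer above; dropEmptyBids and keepRealBids are hypothetical names, and the second helper only emulates the filter in plain JavaScript.

```javascript
// Stage that could follow the $group: keep only elements that carry a bidID.
// A missing bidID is not greater than null, so the placeholder element is dropped.
const dropEmptyBids = {
  $addFields: {
    bidsArray: {
      $filter: {
        input: "$bidsArray",
        as: "bid",
        cond: { $gt: ["$$bid.bidID", null] },
      },
    },
  },
};

// Plain-JavaScript emulation of the same filter, for illustration only.
function keepRealBids(bids) {
  return bids.filter((b) => b.bidID !== undefined && b.bidID !== null);
}

console.log(keepRealBids([{ freelanceArray: {} }])); // []
```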