Sorting aggregation addToSet result - mongodb

Is there a way to get the result of $addToSet as a sorted array?
I tried to expand the pipeline: $unwind the array, sort it, and group it again,
but the result still isn't sorted.
The arrays are pretty big, and I'd like to avoid sorting them in the application.
Document Example :
{
"_id" : ObjectId("52a84825cc8391ab188b4567"),
"id" : 129624,
"message" : "Sample",
"date" : "12-09-2013,17:34:34",
"dt" : ISODate("2013-12-09T17:34:34.000Z"),
}
Query :
db.uEvents.aggregate(
[
{$match : {dt : {$gte : new Date(2014,01,01) , $lt : new Date(2015,01,17)}}}
,{$sort : {dt : 1}}
, {$group : {
_id : {
id : "$id"
, year : {'$year' : "$dt"}
, month : {'$month' : "$dt"}
, day : {'$dayOfMonth' : "$dt"}
}
,dt : {$addToSet : "$dt"}
}}
]
);

Yes, it is possible, but you need to approach it differently. I'll use my own data for this, but you'll get the concept.
My Sample:
{ "array" : [ 2, 4, 3, 5, 2, 6, 8, 1, 2, 1, 3, 5, 9, 5 ] }
To "semi-quote" the CTO on this: sets are considered to be unordered.
There is an actual JIRA issue and a Google Groups statement that says something like that. So let's take it from "Elliot" and accept that this will be the case.
So if you want an ordered result, you have to massage it that way with stages like this:
db.collection.aggregate([
// Initial unwind
{"$unwind": "$array"},
// Do your $addToSet part
{"$group": {"_id": null, "array": {"$addToSet": "$array" }}},
// Unwind it again
{"$unwind": "$array"},
// Sort how you want to
{"$sort": { "array": 1} },
// Use $push for a regular array
{"$group": { "_id": null, "array": {"$push": "$array" }}}
])
Add whatever further stages you need after that; the array is now sorted.
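To check what the pipeline above should return, the same dedupe-then-sort logic can be sketched in plain JavaScript. This is only an illustration of the expected result, not MongoDB code; the helper name sortedUnique is made up for this example.

```javascript
// Plain-JavaScript illustration of what the pipeline computes (not a MongoDB API).
// sortedUnique is a hypothetical helper name used only in this sketch.
function sortedUnique(values) {
  // A Set drops duplicates (like $addToSet); the numeric comparator
  // avoids JavaScript's default string-based sort order.
  return [...new Set(values)].sort((a, b) => a - b);
}

const sample = [2, 4, 3, 5, 2, 6, 8, 1, 2, 1, 3, 5, 9, 5];
const result = sortedUnique(sample);
console.log(result); // [ 1, 2, 3, 4, 5, 6, 8, 9 ]
```

Comparing this against the "array" field of the pipeline's single output document is a quick sanity check on small inputs.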

Since MongoDB version 5.2 you can do it without $unwind.
$setIntersection creates a set out of the array and $sortArray sorts it:
db.collection.aggregate([
{$set: {array: {$setIntersection: ["$array"]}}},
{$set: {array: {$sortArray: {input: "$array", sortBy: 1}}}}
])

Related

MongoDb: How to aggregate linked documents?

I have two collections, Product and Stock
Below are example values
Product
{
"_id" : ObjectId("63513c705f31b4bcb75b80ce"),
"name" : "Coca-cola",
"stocks" : [
ObjectId("63513c705f31b4bcb75b80d0"),
ObjectId("63513c705f31b4bcb75b80d1")
]
}
Stock
[{
"_id" : ObjectId("63513c705f31b4bcb75b80d0"),
"count" : 9,
"remaining" : 6,
"costPerItem" : 10,
"createdAt" : ISODate("2022-10-20T12:17:52.985+0000"),
},
{
"_id" : ObjectId("63513c705f31b4bcb75b80d1"),
"count" : 10,
"remaining" : 3,
"costPerItem" : 10,
"createdAt" : ISODate("2022-10-20T12:17:52.985+0000"),
}]
How do I query products whose total remaining stock (the sum of the remaining field of their stocks) is less than, for example, 100?
One option is to use:
$lookup with pipeline to get all the remaining stock count per stock
Sum it up using $sum
$match the relevant products
db.products.aggregate([
{$lookup: {
from: "stock",
let: {stocks: "$stocks"},
pipeline: [
{$match: {$expr: {$in: ["$_id", "$$stocks"]}}},
{$project: {_id: 0, remaining: 1}}
],
as: "remaining"
}},
{$set: {remaining: {$sum: "$remaining.remaining"}}},
{$match: {remaining: {$lt: 100}}}
])
See how it works on the playground example
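The three steps above (look up the linked stocks, sum their remaining values, filter on the sum) can be mirrored in plain JavaScript to see what the pipeline is expected to return. This is a sketch on in-memory arrays, not MongoDB code; the short string ids and the function name productsBelow are assumptions made for the example.

```javascript
// In-memory illustration of the $lookup + $sum + $match pipeline (not a MongoDB API).
// The ids "s1"/"s2" stand in for the ObjectIds of the sample documents.
const products = [{ _id: "p1", name: "Coca-cola", stocks: ["s1", "s2"] }];
const stock = [
  { _id: "s1", count: 9, remaining: 6 },
  { _id: "s2", count: 10, remaining: 3 },
];

// productsBelow is a hypothetical helper name.
function productsBelow(productDocs, stockDocs, limit) {
  // Index stock by _id, which plays the role of the $lookup join.
  const remainingById = new Map(stockDocs.map((s) => [s._id, s.remaining]));
  return productDocs
    .map((p) => ({
      ...p,
      // Sum the remaining values of the linked stocks (the $sum step).
      remaining: p.stocks.reduce((sum, id) => sum + (remainingById.get(id) ?? 0), 0),
    }))
    // Keep only products under the limit (the final $match step).
    .filter((p) => p.remaining < limit);
}

const matches = productsBelow(products, stock, 100);
console.log(matches.map((p) => [p.name, p.remaining])); // [ [ 'Coca-cola', 9 ] ]
```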

MongoDB Closest Match on properties

Let's say I have a Users collection in MongoDB whose schema looks like this:
{
name: String,
sport: String,
favoriteColor: String
}
And let's say I pass in values like this to match a user on:
{ name: "Thomas", sport: "Tennis", favoriteColor:"blue" }
What I would like to do is match the user based off all those properties. However, if no user comes back, I would like to match a user on just these properties:
{sport: "Tennis", favoriteColor:"blue" }
And if no user comes back, I would like to match a user on just this property:
{ favoriteColor: "blue" }
Is it possible to do something like this in one query with Mongo? I saw the $switch condition in Mongo that will match on a case and then immediately return, but the problem is that I can't access the document it would have retrieved in the then block. It looks like you can only write strings in there.
Any suggestions on how to accomplish what I'm looking for?
Is the best thing (and only way) to just execute multiple User.find({...}) queries?
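This is a good case for a MongoDB text index.
First, create a text index on those fields (createIndex replaces the deprecated ensureIndex):
db.users.createIndex({ name: "text", sport: "text", favoriteColor: "text" });
Then search with $text and sort by the text score so the best matches come first, limited to the number of results you want:
db.users.find( { $text: { $search: "Tennis blue Thomas" } }, { score: { $meta: "textScore" } } ).sort( { score: { $meta: "textScore" } } ).limit(10)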
Another approach: give each field a weight, compute a rank for every document in an aggregation pipeline by summing the weights of the fields that match, and $sort descending so the best-matching documents come first.
name -> 1
sport -> 2
favoriteColor -> 4
With these weights, a match on favoriteColor always outranks a match on sport, on name, or on both combined (4 > 2 + 1).
Aggregation pipeline:
db.col.aggregate([
{$match : {
$or : [
{"name" : {$eq :"Thomas"}},
{"sport" : {$eq : "Tennis"}},
{"favoriteColor" : {$eq : "blue"}}
]
}},
{$addFields : {
rank : {$sum : [
{$cond : [{$eq : ["$name", "Thomas"]}, 1, 0]},
{$cond : [{$eq : ["$sport", "Tennis"]}, 2, 0]},
{$cond : [{$eq : ["$favoriteColor", "blue"]}, 4, 0]}
]}
}},
{$match : {rank :{$gt : 0}}},
{$sort : {rank : -1}}
])
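The weighting idea can be checked outside the database with a small plain-JavaScript sketch. It reuses the weights from this answer; the sample users and the function name rankUsers are invented for illustration only.

```javascript
// Plain-JavaScript illustration of the weighted ranking (not a MongoDB API).
const WEIGHTS = { name: 1, sport: 2, favoriteColor: 4 };

// rankUsers is a hypothetical helper name.
function rankUsers(users, wanted) {
  return users
    .map((u) => ({
      ...u,
      // Add the weight of every field whose value matches the wanted value.
      rank: Object.keys(WEIGHTS).reduce(
        (sum, field) => sum + (u[field] === wanted[field] ? WEIGHTS[field] : 0),
        0
      ),
    }))
    .filter((u) => u.rank > 0) // drop documents that match nothing
    .sort((a, b) => b.rank - a.rank); // best match first
}

// Made-up sample data.
const users = [
  { name: "Thomas", sport: "Tennis", favoriteColor: "red" },
  { name: "Bob", sport: "Golf", favoriteColor: "blue" },
  { name: "Zed", sport: "Golf", favoriteColor: "red" },
  { name: "Anna", sport: "Tennis", favoriteColor: "blue" },
  { name: "Thomas", sport: "Tennis", favoriteColor: "blue" },
];
const ranked = rankUsers(users, { name: "Thomas", sport: "Tennis", favoriteColor: "blue" });
console.log(ranked.map((u) => u.rank)); // [ 7, 6, 4, 3 ]
```

Note that the user matching only favoriteColor (rank 4) sorts ahead of the user matching both name and sport (rank 3), which is the behaviour the weights are meant to produce.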
This query should meet your requirement and return the relevant results in a single database hit, all within one aggregation pipeline:
db.collection.aggregate([
{$match : {
$or : [
{"name" : {$eq :"Thomas"}},
{"sport" : {$eq : "Tennis"}},
{"favoriteColor" : {$eq : "blue"}}
]
}},
{$addFields : {
rank : {$sum : [
{$cond : [{$and:[{$eq : ["$name", "Thomas"]},{$eq : ["$sport", "Tennis"]},{$eq : ["$favoriteColor", "blue"]}] } , 1, 0]},
{$cond : [{$and:[{$eq : ["$name", "Thomas"]},{$eq : ["$sport", "Tennis"]}] } , 2, 0]},
{$cond : [{$and:[{$eq : ["$name", "Thomas"]}] } , 3, 0]},
]}
}},
{$group:{
_id:null,
doc:{$push:'$$ROOT'},
rank:{$max:'$rank'}
}},
{$unwind:'$doc'},
{$redact: {
$cond: {
if: { $eq: [ "$doc.rank", '$rank' ] },
then: "$$KEEP",
else: "$$PRUNE"
}
}},
{
$project:{
name:'$doc.name',
sport:'$doc.sport',
favoriteColor:'$doc.favoriteColor',
}}
])
Alternatively, build the query dynamically: create a JavaScript object and add a condition for each value that was supplied. The same object works for find() or for a $match stage in an aggregation pipeline.
var query={};
if(name!=null){
query['name']={ '$eq': name};
}
if(sport!=null){
query['sport']={ '$eq': sport};
}
if(favoriteColor!=null){
query['favoriteColor']={ '$eq': favoriteColor};
}
db.collection.find(query)
This returns only the documents that exactly match every criterion that was supplied.
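The query builder above produces one exact-match query. For the fallback behaviour the question describes (all three fields, then sport and favoriteColor, then favoriteColor alone), one possible sketch is to generate the queries in order and run them one at a time until a document comes back. The function name fallbackQueries and the dropOrder parameter are assumptions made for this example.

```javascript
// Builds the list of progressively looser queries described in the question.
// fallbackQueries and dropOrder are hypothetical names for this sketch.
function fallbackQueries(criteria, dropOrder) {
  const remaining = { ...criteria };
  const queries = [{ ...remaining }]; // strictest query first
  for (const field of dropOrder) {
    delete remaining[field];
    if (Object.keys(remaining).length === 0) break; // never emit an empty query
    queries.push({ ...remaining });
  }
  return queries;
}

const queries = fallbackQueries(
  { name: "Thomas", sport: "Tennis", favoriteColor: "blue" },
  ["name", "sport"]
);
console.log(queries);
// [
//   { name: 'Thomas', sport: 'Tennis', favoriteColor: 'blue' },
//   { sport: 'Tennis', favoriteColor: 'blue' },
//   { favoriteColor: 'blue' }
// ]
```

Each generated object can be passed to db.collection.find() in turn, stopping at the first query that returns a document. That still means up to three round trips, which is the trade-off against the single-query ranking approaches.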
Did you try $or? https://docs.mongodb.com/manual/reference/operator/query/or/
I used it when I wanted to check whether a username or an email already exists.

How to sort sub-documents in the array field?

I'm using the MongoDB shell to fetch some results, ordered. Here's a sampler,
{
"_id" : "32022",
"topics" : [
{
"weight" : 281.58551703724993,
"words" : "some words"
},
{
"weight" : 286.6695125796183,
"words" : "some more words"
},
{
"weight" : 289.8354232846977,
"words" : "wowz even more wordz"
},
{
"weight" : 305.70093587160807,
"words" : "WORDZ"
}]
}
What I want to get is the same structure, but with the "topics" array ordered by weight, descending:
{
"_id" : "32022",
"topics" : [
{
"weight" : 305.70093587160807,
"words" : "WORDZ"
},
{
"weight" : 289.8354232846977,
"words" : "wowz even more wordz"
},
{
"weight" : 286.6695125796183,
"words" : "some more words"
},
{
"weight" : 281.58551703724993,
"words" : "some words"
},
]
}
I managed to get some ordered results, but had no luck grouping them by the id field. Is there a way to do this?
MongoDB doesn't provide a way to do this out of the box, but there is a workaround: update your documents and use the $sort update modifier to sort the array in place (shown here with PyMongo's update_many; in the mongo shell the method is updateMany).
db.collection.update_many({}, {"$push": {"topics": {"$each": [], "$sort": {"weight": -1}}}})
You can still use the .aggregate() method like this:
db.collection.aggregate([
{"$unwind": "$topics"},
{"$sort": {"_id": 1, "topics.weight": -1}},
{"$group": {"_id": "$_id", "topics": {"$push": "$topics"}}}
])
But this is less efficient if all you want is to sort the array, so it is best avoided.
You could also do this client-side with your language's sort function (for example .sort or sorted).
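As a minimal sketch of the client-side option in JavaScript, assuming the document has already been fetched into memory (the function name sortTopicsByWeightDesc is made up for this example):

```javascript
// Client-side sort of the embedded "topics" array, heaviest first.
// sortTopicsByWeightDesc is a hypothetical helper name.
function sortTopicsByWeightDesc(doc) {
  // Copy the array before sorting so the fetched document is left untouched.
  return { ...doc, topics: [...doc.topics].sort((a, b) => b.weight - a.weight) };
}

const doc = {
  _id: "32022",
  topics: [
    { weight: 281.58551703724993, words: "some words" },
    { weight: 286.6695125796183, words: "some more words" },
    { weight: 289.8354232846977, words: "wowz even more wordz" },
    { weight: 305.70093587160807, words: "WORDZ" },
  ],
};

const sortedDoc = sortTopicsByWeightDesc(doc);
console.log(sortedDoc.topics.map((t) => t.words));
// [ 'WORDZ', 'wowz even more wordz', 'some more words', 'some words' ]
```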
If you don't want to update the documents but only read them, you can use the following query:
db.test.aggregate(
[
{$unwind : "$topics"},
{$sort : {"topics.weight":-1}},
{"$group": {"_id": "$_id", "topics": {"$push": "$topics"}}}
]
)
It works for me:
db.getCollection('mycollection').aggregate(
{$project:{topics:1}},
{$unwind:"$topics"},
{$sort :{"topics.words":1}})

MongoDB nested grouping

I have the following MongoDB data model:
{
"_id" : ObjectId("53725814740fd6d2ee0ca2bb"),
"date" : "2014-01-01",
"establishmentId" : 1,
"products" : [
{
"productId" : 1,
"price" : 7.03,
"someOtherInfo" : 325,
"somethingElse" : 6878
},
{
"productId" : 2,
"price" : 4.6,
"someOtherInfo" : 243,
"somethingElse" : 1757
},
{
"productId" : 3,
"price" : 2.14,
"someOtherInfo" : 610,
"somethingElse" : 5435
},
{
"productId" : 4,
"price" : 1.45,
"someOtherInfo" : 627,
"somethingElse" : 5762
},
{
"productId" : 5,
"price" : 3.9,
"someOtherInfo" : 989,
"somethingElse" : 3752
}
]
}
What is the fastest way to get the average price across all establishments? Is there a better data model to achieve this?
An aggregation operation should handle this well. I'd suggest looking into the $unwind operation.
Something along these lines should work (just as an example):
db.collection.aggregate([
{$match: {<query parameters>}},
{$unwind: "$products"},
{
$group: {
_id: <null, or the field(s) to group by before averaging>,
avgPrice: {$avg: "$products.price"}
}
}
]);
An aggregation built in this style returns documents containing the average you want.
A more direct version, with no placeholders to fill in, is:
db.collection.aggregate([
{ "$unwind": "$products" },
{ "$group": {
"_id": null,
"avgprice": { "$avg": "$products.price" }
}}
])
The usage of the aggregation framework here is to first $unwind the array, which is a way to "de-normalize" the content in the array into separate documents.
Then in the $group stage you pass in a value of null to the _id which means "group everything" and pass your $products.price ( note the dot notation ) in to the $avg operator to return the total average value across all of the sub-document entries in all of your documents in the collection.
See the full operator reference for more information.
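To see what the $unwind and $group stages compute, the same calculation can be sketched in plain JavaScript on the sample document. This is only an in-memory illustration; the function name averageProductPrice is an assumption, and returning null for empty input is a choice made for this sketch.

```javascript
// In-memory illustration of $unwind followed by $group with $avg (not a MongoDB API).
// averageProductPrice is a hypothetical helper name.
function averageProductPrice(docs) {
  // flatMap plays the role of $unwind: one price per embedded product.
  const prices = docs.flatMap((d) => d.products.map((p) => p.price));
  if (prices.length === 0) return null; // nothing to average
  return prices.reduce((sum, price) => sum + price, 0) / prices.length;
}

const docs = [
  {
    date: "2014-01-01",
    establishmentId: 1,
    products: [
      { productId: 1, price: 7.03 },
      { productId: 2, price: 4.6 },
      { productId: 3, price: 2.14 },
      { productId: 4, price: 1.45 },
      { productId: 5, price: 3.9 },
    ],
  },
];

const avgPrice = averageProductPrice(docs);
console.log(avgPrice.toFixed(3)); // "3.824"
```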
The best solution I found was:
db.collection.aggregate([
{$match:{date:{$gte:"2014-01-01",$lte:"2014-01-31"},establishmentId:{$in:[1,2,3,4,5,6]}}},
{ "$unwind": "$products" },
{ "$group": {
"_id": {date:"$date",product:"$products.productId"},
"avgprice": { "$avg": "$products.price" }
}}
])
I also found that it is much better to $match first and then $unwind, so there are fewer items to unwind. This results in a faster overall process.

MongoDB: Sort on an array and return the same document multiple times in the cursor

Say I have a collection that has these documents in it:
{sort:[1,2,4,6], fruit:'apple'}
{sort:[3], fruit:'cherry'}
{sort:[5], fruit:'orange'}
And I want to run a query similar to this:
db.collection.find().sort({sort: -1})
But have it return the documents without deduplicating them first, like this:
{sort:[1,2,4,6], fruit:'apple'}
{sort:[1,2,4,6], fruit:'apple'}
{sort:[3], fruit:'cherry'}
{sort:[1,2,4,6], fruit:'apple'}
{sort:[5], fruit:'orange'}
{sort:[1,2,4,6], fruit:'apple'}
Instead of this:
{sort:[1,2,4,6], fruit:'apple'}
{sort:[3], fruit:'cherry'}
{sort:[5], fruit:'orange'}
Is there any way to achieve this in the current MongoDB?
You can use the aggregation framework's $unwind operator to unwind an array into multiple documents.
db.fruits.aggregate(
{$unwind: "$sort"},
{$sort: {sort: 1}}
)
The $unwind operation "unwinds" an array on a document by creating multiple documents, one for each value in that array. Then, we just sort on the criteria given. So for the inputs:
> db.fruits.insert({sort:[1,2,4,6], fruit:'apple'})
> db.fruits.insert({sort:[3], fruit:'cherry'})
> db.fruits.insert({sort:[5], fruit:'orange'})
We get the resulting output:
> db.fruits.aggregate({$unwind: "$sort"}, {$sort: {sort: 1}})
{
"result" : [
{
"_id" : ObjectId("51f0592c8a542caf3f07fa66"),
"sort" : 1,
"fruit" : "apple"
},
{
"_id" : ObjectId("51f0592c8a542caf3f07fa66"),
"sort" : 2,
"fruit" : "apple"
},
{
"_id" : ObjectId("51f059358a542caf3f07fa67"),
"sort" : 3,
"fruit" : "cherry"
},
{
"_id" : ObjectId("51f0592c8a542caf3f07fa66"),
"sort" : 4,
"fruit" : "apple"
},
{
"_id" : ObjectId("51f0593b8a542caf3f07fa68"),
"sort" : 5,
"fruit" : "orange"
},
{
"_id" : ObjectId("51f0592c8a542caf3f07fa66"),
"sort" : 6,
"fruit" : "apple"
}
],
"ok" : 1
}
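The behaviour of $unwind followed by $sort can also be reproduced in plain JavaScript, which may help when reasoning about the output above. This is an in-memory sketch, not MongoDB code; the function name unwind is simply borrowed from the stage it imitates.

```javascript
// In-memory illustration of $unwind: one output document per array element.
function unwind(docs, field) {
  return docs.flatMap((doc) =>
    doc[field].map((value) => ({ ...doc, [field]: value }))
  );
}

const fruits = [
  { sort: [1, 2, 4, 6], fruit: "apple" },
  { sort: [3], fruit: "cherry" },
  { sort: [5], fruit: "orange" },
];

// Unwind the array, then sort ascending on the unwound value.
const rows = unwind(fruits, "sort").sort((a, b) => a.sort - b.sort);
console.log(rows.map((r) => `${r.sort}:${r.fruit}`));
// [ '1:apple', '2:apple', '3:cherry', '4:apple', '5:orange', '6:apple' ]
```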
If you need to maintain the original sort array, you can use $project to create a projection of the original document that copies the sort field into an expanded_sort field, which is then unwound, like so:
db.fruits.aggregate(
{$project: {sort: 1, fruit: 1, expanded_sort: "$sort"}},
{$unwind: "$expanded_sort"},
{$sort: {expanded_sort: 1}}
)
This gets you results like so:
"result" : [
{
"_id" : ObjectId("51f0592c8a542caf3f07fa66"),
"sort" : [ 1, 2, 4, 6 ],
"fruit" : "apple",
"expanded_sort" : 1
},