Sort results based on a field from $lookup in MongoDB

I have to sort the final results based on the lookup result. Below is my aggregate query:
{ $match : { status: 'active' } },
{ $limit : 10 },
{ $lookup:
{
from : "metas",
localField : "_id",
foreignField: "post_id",
as : "meta"
}
}
This query produces results like:
{
"_id": "594b6adc2a8c4f294025e46e",
"title": "Test 1",
"created_at": "2017-06-22T06:59:40.809Z",
"meta": [
{
"_id": "594b6b072a8c4f294025e46f",
"post_id": "594b6adc2a8c4f294025e46e",
"views": 1,
},
{
"_id": "594b6b1c2a8c4f294025e471",
"post_id": "594b6adc2a8c4f294025e46e",
}
],
},
{
"_id": "594b6adc2a8c4f29402f465",
"title": "Test 2",
"created_at": "2017-06-22T06:59:40.809Z",
"meta": [
{
"_id": "594b6b072a8c4f294025e46f",
"post_id": "594b6adc2a8c4f29402f465",
"views": 0,
},
{
"_id": "594b6b1c2a8c4f294025e471",
"post_id": "594b6adc2a8c4f29402f465",
}
],
},
{
"_id": "594b6adc2a8c4f29856d442",
"title": "Test 3",
"created_at": "2017-06-22T06:59:40.809Z",
"meta": [
{
"_id": "594b6b072a8c4f294025e46f",
"post_id": "594b6adc2a8c4f29856d442",
"views": 3,
},
{
"_id": "594b6b1c2a8c4f294025e471",
"post_id": "594b6adc2a8c4f29856d442",
}
],
}
Now I want to sort these results by 'views' under 'meta', in descending order of 'meta.views': the first result should be the one with views=3, then views=1, then views=0.

The $unwind operator splits an array into separate documents, one for each element of the array.
For example:
db.collection.aggregate(
// Pipeline
[
// Stage 1
{
$unwind: {
path : "$meta"
}
},
// Stage 2
{
$sort: {
'meta.views':-1
}
},
]
);
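To make the effect of these two stages concrete, here is a minimal plain-JavaScript sketch that emulates them on an abbreviated in-memory copy of the sample output. The helpers `unwind` and `sortByViewsDesc` are hypothetical names and only approximate the server's behaviour; elements without a `views` field are assumed to sort last in descending order.

```javascript
// Sample data shaped like the $lookup output in the question (ids omitted).
const posts = [
  { title: "Test 1", meta: [{ views: 1 }, {}] },
  { title: "Test 2", meta: [{ views: 0 }, {}] },
  { title: "Test 3", meta: [{ views: 3 }, {}] },
];

// Emulates { $unwind: { path: "$meta" } }: one output document per array element.
function unwind(docs, field) {
  return docs.flatMap((doc) =>
    (doc[field] || []).map((element) => ({ ...doc, [field]: element }))
  );
}

// Emulates { $sort: { "meta.views": -1 } }; a missing value is treated as lowest.
function sortByViewsDesc(docs) {
  const key = (doc) =>
    doc.meta.views === undefined ? -Infinity : doc.meta.views;
  return [...docs].sort((a, b) =>
    key(a) < key(b) ? 1 : key(a) > key(b) ? -1 : 0
  );
}

const sorted = sortByViewsDesc(unwind(posts, "meta"));
console.log(sorted.map((doc) => `${doc.title}: ${doc.meta.views}`));
```

Because $unwind emits one document per meta element, each post appears twice in this output, once for each of its meta entries.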

Although $lookup does not support sorting, I think the easiest solution, and probably also the fastest, is to create a suitable index on the related collection.
In this case, that is an index on the metas collection covering the foreign field post_id and the field to sort on, views. Make sure the index is created with the sort direction you want.
Not only is the result now sorted, the query is probably also faster because it can use the index.
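As a sketch of the index described here: in the mongo shell it would be a compound index on post_id and views, with views descending. The snippet keeps the shell call in a comment so that it runs as plain JavaScript, and `compareByIndex` is a hypothetical helper that only illustrates the order in which such an index would store its entries. As far as I know, the documentation does not promise any ordering for the array produced by $lookup, so treat the sorted result as an observed side effect and not as a contract.

```javascript
// Key pattern for the compound index; in the shell it would be created with:
//   db.metas.createIndex({ post_id: 1, views: -1 })
const indexKeys = { post_id: 1, views: -1 };

// Hypothetical helper: compares two documents the way the index orders entries.
function compareByIndex(a, b) {
  for (const [field, direction] of Object.entries(indexKeys)) {
    if (a[field] < b[field]) return -direction;
    if (a[field] > b[field]) return direction;
  }
  return 0;
}

// Made-up sample rows from the metas collection.
const metas = [
  { post_id: "p2", views: 0 },
  { post_id: "p1", views: 1 },
  { post_id: "p1", views: 7 },
];

const inIndexOrder = [...metas].sort(compareByIndex);
console.log(inIndexOrder);
```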

Related

Mongo db - How perform unwind and match with condition

Let's say my products collection contains products, each of which has an items array, as below.
[
{
"_id": "1",
"score": 200,
"items": [
{
"_id": "1",
"title": "title1",
"category": "sport"
},
{
"_id": "2",
"title": "title2",
"category": "sport"
},
{
"_id": "3",
"title": "title3",
"category": "tv"
},
{
"_id": "4",
"title": "title4",
"category": "movies"
}
]
},
{
"_id": "2",
"score": 1000000000,
"items": [
{
"_id": "9",
"title": "titleBoo",
"category": "food"
},
{
"title": "title4",
"category": "movies"
},
{
"title": "titlexx",
"category": "food"
},
{
"title": "titl113",
"category": "sport"
}
]
},
{
"_id": "3",
"score": 500,
"items": [
{
"title": "title3",
"category": "movies"
},
{
"title": "title3",
"category": "food"
},
{
"title": "title3",
"category": "sport"
},
{
"title": "title3",
"category": "sport"
}
]
}
]
I want to return a single item of a given category, taken from the highest-scored product that has an item in that category. If no product matches the category, just return a random/first item from the product with the max score.
Example for category "food", the result should be:
{
"_id" : "9",
"title": "titleBoo",
"category": "food"
}
because its product has the max score of 1000000000.
For a non-existent category such as "Foo", the result should be some random item from the highest-scored product, let's say:
{
"title": "titlexx",
"category": "food"
},
Here is what I did using the Java Spring Data aggregation pipeline:
Aggregation agg1 = newAggregation(
unwind("items"),
match(Criteria.where("items.category").is(category)),
group().max("score").as("score")
);
BasicDBObject result = mongoTemplate.aggregate(
agg1, "products", BasicDBObject.class).getUniqueMappedResult();
if (result == null) { // no matching category found, so run again without the match step
Aggregation agg2 = newAggregation(
unwind("items"),
group().max("score").as("score")
);
// take some item inside max "score"
BasicDBObject res2 = mongoTemplate.aggregate(
agg2, "products", BasicDBObject.class).getUniqueMappedResult();
System.out.print(res2);
}
This code is not ideal, because when nothing matches I need to perform the "unwind" a second time. I know there is a $cond / $switch operator; I'm wondering whether I can apply some switch-case operation after the unwind, like this:
Aggregation agg = newAggregation(
unwind("items"),
// switch-case {
a. match(Criteria.where("items.category").is(category)),
if (result or size > 0) {
group().max("score").as("score") // max on matched result by category
}
b. group().max("score").as("score"). // max on random unwind score
}
);
BasicDBObject result = mongoTemplate.aggregate(
agg, "products", BasicDBObject.class).getUniqueMappedResult();
Any hints?
Following the advice by @user20042973, one option is to use $setWindowFields and $facet. It can also be done without steps 1-3, but since $unwind is considered a less efficient step and $facet does not use indexes, adding steps 1-3 can filter out a large part of the documents before these operations and leave you with only two documents.
After the $match step, we are left with only the document with the best score and the documents that contain the wanted category (if there are any), sorted by score (from the $setWindowFields step). This means we only need the first document (best score) or, if it exists, the second document, which is the highest-scored one guaranteed to contain the category. So we can limit the rest of our search to these 2 documents:
db.collection.aggregate([
{$setWindowFields: {
sortBy: {score: -1},
output: {bestScore: {$max: "$score"}}
}},
{$match: {$expr: {
$or: [
{$eq: ["$score", "$bestScore"]},
{$in: [category, "$items.category"]}
]
}}},
{$limit: 2},
{$unwind: "$items"},
{$facet: {
category: [{$match: {"items.category": category}}, {$limit: 1}],
other: [{$limit: 1}]
}},
{$replaceRoot: {newRoot: {
$cond: [
{$eq: [{$size: "$category"}, 1]},
{$first: "$category"},
{$first: "$other"}
]
}}}
])
See how it works on the playground example.
You can also use $reduce to avoid the $unwind step altogether, but at this point it should have a minor effect.
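For comparison, the selection rule that the pipeline implements can be written in a few lines of plain JavaScript. This is only an in-memory sketch of the logic, not a replacement for the aggregation; `pickItem` is a hypothetical helper and the sample data is abbreviated from the question.

```javascript
const products = [
  { _id: "1", score: 200, items: [
    { _id: "1", title: "title1", category: "sport" },
    { _id: "3", title: "title3", category: "tv" },
  ] },
  { _id: "2", score: 1000000000, items: [
    { _id: "9", title: "titleBoo", category: "food" },
    { title: "title4", category: "movies" },
  ] },
  { _id: "3", score: 500, items: [
    { title: "title3", category: "movies" },
    { title: "title3", category: "food" },
  ] },
];

// Hypothetical helper: returns the first item of the wanted category from the
// highest-scored product that has one, else the first item of the best product.
function pickItem(docs, category) {
  const byScoreDesc = [...docs].sort((a, b) => b.score - a.score);
  for (const doc of byScoreDesc) {
    const hit = doc.items.find((item) => item.category === category);
    if (hit) return hit;
  }
  return byScoreDesc.length > 0 ? byScoreDesc[0].items[0] : undefined;
}

console.log(pickItem(products, "food")); // item from product "2"
console.log(pickItem(products, "tv"));   // item from product "1"
console.log(pickItem(products, "Foo"));  // falls back to product "2"
```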

Why doesn't mongoose aggregate method return all fields of a document?

I have the following document
[
{
"_id": "624713340a3d2901f2f5a9c0",
"username": "fotis",
"exercises": [
{
"_id": "624713530a3d2901f2f5a9c3",
"description": "Sitting",
"duration": 60,
"date": "2022-03-24T00:00:00.000Z"
},
{
"_id": "6247136a0a3d2901f2f5a9c6",
"description": "Coding",
"duration": 999,
"date": "2022-03-31T00:00:00.000Z"
},
{
"_id": "624713a00a3d2901f2f5a9ca",
"description": "Sitting",
"duration": 999,
"date": "2022-03-30T00:00:00.000Z"
}
],
"__v": 3
}
]
And I am executing the following aggregation (on mongoplayground.net)
db.collection.aggregate([
{
$match: {
"_id": "624713340a3d2901f2f5a9c0"
}
},
{
$project: {
exercises: {
$filter: {
input: "$exercises",
as: "exercise",
cond: {
$eq: [
"$$exercise.description",
"Sitting"
]
}
}
},
limit: 1
}
}
])
And the result is the following
[
{
"_id": "624713340a3d2901f2f5a9c0",
"exercises": [
{
"_id": "624713530a3d2901f2f5a9c3",
"date": "2022-03-24T00:00:00.000Z",
"description": "Sitting",
"duration": 60
},
{
"_id": "624713a00a3d2901f2f5a9ca",
"date": "2022-03-30T00:00:00.000Z",
"description": "Sitting",
"duration": 999
}
]
}
]
So my first question is why is the username field not included in the result?
And the second one is why aren't the exercises limited to 1? Is the limit currently applied to the whole user document? If so, is it possible to apply it only on exercises subdocument?
Thank you!
First Question
When you use a $project stage, only the properties that you specify in the stage are returned. You only specified the exercises property, so only that one is returned. NOTE that the _id property is returned by default, even if you didn't specify it.
Second Question
$limit is also a stage, like $project. It applies to the whole array of resulting documents, not to a nested array property of one document.
Solution
In the $project stage, you can specify the username field as well, so it will also be returned. Instead of $limit, you can use $slice to specify the number of elements that you want returned from an array property.
db.collection.aggregate([
{
"$match": {
"_id": "624713340a3d2901f2f5a9c0"
}
},
{
"$project": {
"username": 1,
"exercises": {
"$slice": [
{
"$filter": {
"input": "$exercises",
"as": "exercise",
"cond": {
"$eq": [
"$$exercise.description",
"Sitting"
]
}
}
},
1
]
}
}
}
])
Working example
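What the $project stage in this solution computes can be mimicked in plain JavaScript, which may help when reasoning about it: `filter` plays the role of $filter and `slice(0, limit)` the role of $slice with a count. This is a sketch on an in-memory copy of the sample document (exercise ids omitted), and `projectUser` is a hypothetical name.

```javascript
const user = {
  _id: "624713340a3d2901f2f5a9c0",
  username: "fotis",
  exercises: [
    { description: "Sitting", duration: 60, date: "2022-03-24T00:00:00.000Z" },
    { description: "Coding", duration: 999, date: "2022-03-31T00:00:00.000Z" },
    { description: "Sitting", duration: 999, date: "2022-03-30T00:00:00.000Z" },
  ],
  __v: 3,
};

// Hypothetical helper: keeps only _id, username and the first `limit`
// exercises whose description matches, like the $project stage does.
function projectUser(doc, description, limit) {
  return {
    _id: doc._id,
    username: doc.username,
    exercises: doc.exercises
      .filter((exercise) => exercise.description === description)
      .slice(0, limit),
  };
}

const projected = projectUser(user, "Sitting", 1);
console.log(projected);
```

Fields that are not listed, such as __v, are dropped, which mirrors why username disappeared from the original result.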

aggregate group distinct on array of objects after querying

I have an array of products where a product looks like this:
{
"invNumber":445,
"attributes": [
{
"id": "GR1",
"value": "4",
"description": "Re/Rek"
},
{
"id": "WEBAKKUNDE",
"value": "2",
"description": "NO"
},
{
"id": "WEBAKKUNDK",
"value": "1",
"description": "YES"
},
{
"id": "WEBAKMONTO",
"value": "2",
"description": "NO"
},
{
"id": "WEBPAKFTTH",
"value": "2",
"description": "NO"
}
]
}
What I want to do is get all products that have {"id":"WEBAKKUNDE","value":"1"} or {"id":"WEBPAKFTTH","value":"1"}, and from these products return only the distinct
{"id": "GR1"} objects.
I am trying to do something like this:
db.getCollection('products').aggregate([
{$unwind:'$attributes'},
{$match:{$or:[{$and:[{"attributes.id":"WEBAKKUNDE"},
{"attributes.value":"1"}]},{$and:[{"attributes.id":"WEBPAKFTTH"},
{"attributes.value":"1"}]}]}},
])
but I don't know how to get the distinct objects from the returned products.
You can use the aggregation query below.
$match checks whether the array meets the input criteria, followed by $filter with $arrayElemAt to project the GR1 element.
$group on the GR1 element outputs the distinct values.
Note: you will need to add a GR1 criterion to the $match if some matching products may lack a GR1 element.
db.products.aggregate([
{"$match":{
"attributes":{
"$elemMatch":{
"$or":[
{"id":"WEBAKKUNDE","value":"1"},
{"id":"WEBPAKFTTH","value":"1"}
]
}
}
}},
{"$group":{
"_id":{
"$arrayElemAt":[
{"$filter":{"input":"$attributes","cond":{"$eq":["$$this.id","GR1"]}}},
0
]
}
}}
])
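The same two steps can be traced in plain JavaScript on in-memory data: `some` stands in for $elemMatch, and a `Map` keyed by the serialized GR1 element stands in for $group. The helper `distinctGr1` and the three sample products are made up for illustration.

```javascript
const products = [
  { invNumber: 445, attributes: [
    { id: "GR1", value: "4", description: "Re/Rek" },
    { id: "WEBAKKUNDE", value: "1", description: "YES" },
  ] },
  { invNumber: 446, attributes: [
    { id: "GR1", value: "4", description: "Re/Rek" },
    { id: "WEBPAKFTTH", value: "1", description: "YES" },
  ] },
  { invNumber: 447, attributes: [
    { id: "GR1", value: "5", description: "Other" },
    { id: "WEBAKKUNDE", value: "2", description: "NO" },
  ] },
];

const wantedIds = ["WEBAKKUNDE", "WEBPAKFTTH"];

function distinctGr1(docs) {
  const groups = new Map();
  for (const doc of docs) {
    // $elemMatch: a single element must satisfy both the id and the value.
    const matches = doc.attributes.some(
      (attr) => wantedIds.includes(attr.id) && attr.value === "1"
    );
    if (!matches) continue;
    // $filter + $arrayElemAt 0: the first GR1 element (undefined if absent).
    const gr1 = doc.attributes.find((attr) => attr.id === "GR1");
    // $group: identical GR1 elements collapse into one entry.
    groups.set(JSON.stringify(gr1), gr1);
  }
  return [...groups.values()];
}

const distinct = distinctGr1(products);
console.log(distinct);
```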
Try the following query:
db.test.aggregate([
{ $match:{ "attributes.id" : "WEBAKKUNDE", "attributes.value":"1" } },
{ $unwind: "$attributes" },
{ $match: { "attributes.id": "GR1" } },
])
Let's walk through it:
$match:{ "attributes.id" : "WEBAKKUNDE", "attributes.value":"1" } finds all documents whose attributes match the given id and value.
$unwind: "$attributes" gives us one document per array item, so in your example we end up with 5 documents.
$match: { "attributes.id": "GR1" } filters the remaining documents down to those whose id is GR1.
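One caveat about the first $match: with dot notation the two conditions may be satisfied by different array elements, whereas $elemMatch requires a single element to satisfy both. The plain-JavaScript sketch below emulates both semantics on an abbreviated copy of the product from the question, where WEBAKKUNDE has value "2" but another attribute has value "1". The function names are hypothetical.

```javascript
const product = {
  invNumber: 445,
  attributes: [
    { id: "GR1", value: "4" },
    { id: "WEBAKKUNDE", value: "2" },
    { id: "WEBAKKUNDK", value: "1" },
  ],
};

// Emulates { "attributes.id": id, "attributes.value": value }:
// each condition may be met by a different element.
function matchesDotNotation(doc, id, value) {
  return (
    doc.attributes.some((attr) => attr.id === id) &&
    doc.attributes.some((attr) => attr.value === value)
  );
}

// Emulates { attributes: { $elemMatch: { id: id, value: value } } }:
// one element must meet both conditions.
function matchesElemMatch(doc, id, value) {
  return doc.attributes.some((attr) => attr.id === id && attr.value === value);
}

console.log(matchesDotNotation(product, "WEBAKKUNDE", "1")); // true
console.log(matchesElemMatch(product, "WEBAKKUNDE", "1"));   // false
```

So this product would pass the dot-notation $match even though its WEBAKKUNDE attribute does not have value "1".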
More reading:
https://docs.mongodb.com/manual/reference/operator/aggregation/match
https://docs.mongodb.com/manual/reference/operator/aggregation/unwind

Exclude fields in matching element retrieved by positional operator

I have a texts collection with documents looking like this:
{
title: 'A title',
author: 'Author Name',
published: 1944,
languages: [
{
code: 'en',
text: 'A long english text by the author...'
},
{
code: 'da',
text: 'En lang dansk tekst skrevet af forfatteren...'
}
// + many more languages
]
}
and would like to make a query that retrieves the title, author and published date, and the text in a given language, so I do this:
texts.findOne(
{ title: titleArg, 'languages.code': languageArg },
{ 'title': 1, 'author': 1, 'published': 1, 'languages.$': 1 } ...
but I would like to return the matching language element WITHOUT mongodb's _id field.
If I do this in the projection:
{ '_id': 0, 'title': 1, 'author': 1, 'published': 1, 'languages.$': 1 }
I get the document back without its main _id, but if I do this:
{ 'languages.$._id': 0, 'title': 1, 'author': 1, 'published': 1, 'languages.$': 1 }
or this:
{ 'languages._id': 0, 'title': 1, 'author': 1, 'published': 1, 'languages.$': 1 }
nothing at all is returned.
Does anyone know how to create a projection that returns an element in an array AND excludes some fields in that element?
You seem to be actually saying that your documents really look like this:
{
"_id": ObjectId("53b25ad420edfc7d0df16a0c"),
"title": "A title",
"author": "Author Name",
"published": 1944,
"languages": [
{
"_id": ObjectId("53b25af720edfc7d0df16a0d"),
"code": "en",
"text": "A long english text by the author..."
},
{
"_id": ObjectId("53b25b0720edfc7d0df16a0e"),
"code": "da",
"text": "En lang dansk tekst skrevet af forfatteren..."
}
]
}
Just to clear up the point, MongoDB does not insert _id values into array elements. This is something that certain Object Document Mappers (ODMs) do; one such ODM is mongoose. It is that other software that does this, as MongoDB and the default drivers only place this field at the "top" level of the documents in a collection, unless another value is specified in its place.
Being specific or precise about the fields within an array that you wish to "project" is beyond the scope of what you can do with .find(). You actually need the .aggregate() method in order to "re-shape" the document and remove all the _id fields the way you want:
db.collection.aggregate([
// Match the document(s) that meet the conditions
{ "$match": {
"title": "A title",
"languages.code": "en"
}},
// Unwind the array to de-normalize for processing
{ "$unwind": "$languages" },
// Match to "filter" the actual array documents
{ "$match": { "languages.code": "en" } },
// Group back the array per document and keep only the wanted fields
{ "$group": {
"_id": "$_id",
"title": { "$first": "$title" },
"author": { "$first": "$author" },
"published": { "$first": "$published" },
"languages": {
"$push": {
"code": "$languages.code",
"text": "$languages.text"
}
}
}},
// Finally project to remove the "root" _id field
{ "$project": {
"_id": 0,
"title": 1,
"author": 1,
"published": 1,
"languages": 1
}}
])
MongoDB 2.6 introduces some new operators to make this process possible in a single $project stage:
db.collection.aggregate([
{ "$match": {
"title": "A title",
"languages.code": "en"
}},
{ "$project": {
"_id": 0,
"title": 1,
"author": 1,
"published": 1,
"languages": {
"$setDifference": [
{ "$map": {
"input": "$languages",
"as": "el",
"in": {
"$cond": [
{ "$eq": [ "$$el.code", "en" ] },
{ "code": "$$el.code", "text": "$$el.text" },
false
]
}
}},
[false]
]
}
}}
])
The general intent of such operations is usually for more involved document re-shaping than what you are doing, including the need to match multiple array elements which is something you cannot do with positional projection.
But this is also the only present way to "alter" the fields returned from within the array elements as you wish to happen.
Also look at the full list of aggregation operators for a more detailed explanation of each one.
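To see what the $map/$setDifference projection produces, the same reshaping can be emulated in plain JavaScript. This is an in-memory sketch with a hypothetical helper `projectLanguage` and placeholder _id strings; it is not how the server evaluates the pipeline.

```javascript
const textDoc = {
  _id: "doc-id",
  title: "A title",
  author: "Author Name",
  published: 1944,
  languages: [
    { _id: "lang-id-1", code: "en", text: "A long english text by the author..." },
    { _id: "lang-id-2", code: "da", text: "En lang dansk tekst skrevet af forfatteren..." },
  ],
};

// Hypothetical helper mirroring the $project stage: drops the root _id and
// keeps only the matching language elements, reduced to code and text.
function projectLanguage(doc, code) {
  return {
    title: doc.title,
    author: doc.author,
    published: doc.published,
    languages: doc.languages
      .filter((el) => el.code === code)
      .map((el) => ({ code: el.code, text: el.text })),
  };
}

const reshaped = projectLanguage(textDoc, "en");
console.log(reshaped);
```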

MongoDB find subdocument and sort the results

I have a collection in MongoDB with a complex structure and subdocuments.
The documents have a structure like this:
doc1 = {
'_id': '12345678',
'url': "http//myurl/...",
'nlp':{
"status": "OK",
"entities": {
"0": {
"type" : "Person",
"relevance": "0.877245",
"text" : "Neelie Kroes"
},
"1": {
"type": "Company",
"relevance": "0.36242",
"text": "ICANN"
},
"2": {
"type": "Company",
"relevance": "0.265175",
"text": "IANA"
}
}
}
}
doc2 = {
'_id': '987456321',
'url': "http//myurl2/...",
'nlp':{
"status": "OK",
"entities": {
"0": {
"type": "Company",
"relevance": "0.96",
"text": "ICANN"
},
"1": {
"type" : "Person",
"relevance": "0.36242",
"text" : "Neelie Kroes"
},
"2": {
"type": "Company",
"relevance": "0.265175",
"text": "IANA"
}
}
}
}
My task is to search for "type" AND "text" inside the subdocument, then sort by "relevance".
With the $elemMatch operator I'm able to perform the query:
db.resource.find({
'nlp.entities': {
'$elemMatch': {'text': 'Neelie Kroes', 'type': 'Person'}
}
});
Perfect, now I have to sort all the records with entities of type "Person" and value "Neelie Kroes" by relevance descending.
I tried a normal "sort", but, as the manual says about sort() with $elemMatch, the result may not reflect the sort order because the sort() is applied to the elements of the array before the $elemMatch projection.
In fact, _id:987456321 will come first (with a relevance of 0.96, but that value belongs to ICANN).
How can I sort my documents by the matched subdocument's relevance?
P.S.: I can't change the document structure.
As noted I hope your documents actually do have an array, but if $elemMatch is working for you then they should.
At any rate, you cannot sort by an element in an array using find. But there is a case where you can do this using .aggregate():
db.collection.aggregate([
// Match the documents that you want, containing the array
{ "$match": {
"nlp.entities": {
"$elemMatch": {
"text": "Neelie Kroes",
"type": "Person"
}
}
}},
// Project to "store" the whole document for later, duplicating the array
{ "$project": {
"_id": {
"_id": "$_id",
"url": "$url",
"nlp": "$nlp"
},
"entities": "$nlp.entities"
}},
// Unwind the array to de-normalize
{ "$unwind": "$entities" },
// Match "only" the relevant entities
{ "$match": {
"entities.text": "Neelie Kroes",
"entities.type": "Person"
}},
// Sort on the relevance
{ "$sort": { "entities.relevance": -1 } },
// Restore the original document form
{ "$project": {
"_id": "$_id._id",
"url": "$_id.url",
"nlp": "$_id.nlp"
}}
])
So essentially, after applying the $match condition for documents that contain the relevant match, you then use $project to "store" the original document in the _id field and $unwind a "copy" of the "entities" array.
The next $match "filters" the array contents to only those ones that are relevant. Then you apply the $sort to the "matched" documents.
As the "original" document was stored under _id, you use $project to "restore" the structure that the document actually had to begin with.
That is how you "sort" on your matched element of an array.
Note that if you had multiple "matches" within an array for a parent document, then you would have to employ an additional $group stage to get the $max value for the "relevance" field in order to complete your sort.
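The closing note can be illustrated with a plain-JavaScript sketch: for each document, take the highest relevance among its matching entities and sort on that. The helper names are hypothetical, `Object.values` is used so the sketch accepts either an array or the keyed object shown in the question, and the relevance strings are assumed to parse as numbers.

```javascript
const docs = [
  { _id: "12345678", nlp: { entities: {
    0: { type: "Person", relevance: "0.877245", text: "Neelie Kroes" },
    1: { type: "Company", relevance: "0.36242", text: "ICANN" },
  } } },
  { _id: "987456321", nlp: { entities: {
    0: { type: "Company", relevance: "0.96", text: "ICANN" },
    1: { type: "Person", relevance: "0.36242", text: "Neelie Kroes" },
  } } },
];

// Highest relevance among entities matching type and text, or null if none match.
function bestRelevance(doc, type, text) {
  const scores = Object.values(doc.nlp.entities)
    .filter((entity) => entity.type === type && entity.text === text)
    .map((entity) => Number.parseFloat(entity.relevance));
  return scores.length > 0 ? Math.max(...scores) : null;
}

// Keeps matching documents only, ordered by their best matching relevance.
function sortByMatchedRelevance(allDocs, type, text) {
  return allDocs
    .map((doc) => ({ doc, score: bestRelevance(doc, type, text) }))
    .filter((entry) => entry.score !== null)
    .sort((a, b) => b.score - a.score)
    .map((entry) => entry.doc);
}

const ordered = sortByMatchedRelevance(docs, "Person", "Neelie Kroes");
console.log(ordered.map((doc) => doc._id));
```

Here the 0.96 relevance of ICANN in the second document is ignored, because only entities that match both type and text contribute to the sort key.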