MongoDB aggregate - filter by subdocument

I have a MongoDB collection with a structure like this:
[
{
name: "name1",
instances: [{value:1, score:2}, {value:2, score:5}, {value:2.5, score:9}]
},
{
name: "name2",
instances: [{value:6, score:3}, {value:1, score:6}, {value:3.7, score:5.2}]
}
]
When I want to get all the data from a document, I use aggregate because I want each instance returned as a separate document:
db.myCollection.aggregate([{$match:{name:"name1"}}, {$unwind:"$instances"}, {$project:{name:1, value:"$instances.value", score:"$instances.score"}}])
And everything works like I want it to.
Now for my question: I want to filter the returned data by score or by value. For example, I want an array of all the subdocuments of name1 which have a value greater than or equal to 2.
I tried to add to the $match object 'instances.value':{$gte:2}, but it didn't filter anything, and I still get all 3 documents for this query.
Any ideas?

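A $match placed before $unwind selects whole documents, so a document passes as soon as any one of its instances matches. After unwinding instances, use $match again, as below: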
db.collectionName.aggregate({
"$match": {
"name": "name1"
}
}, {
"$unwind": "$instances"
}, {
"$match": {
"instances.value": {
"$gte": 2
}
}
}, {
$project: {
name: 1,
value: "$instances.value",
score: "$instances.score"
}
})
Or, if you want to $match after $project, filter on the projected field name, as below:
db.collectionName.aggregate([{
$match: {
name: "name1"
}
}, {
$unwind: "$instances"
}, {
$project: {
name: 1,
value: "$instances.value",
score: "$instances.score"
}
}, {
"$match": {
"value": {
"$gte": 2
}
}
}])
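
To see why adding 'instances.value': {$gte: 2} to the first $match did not filter anything, here is a rough in-memory imitation in plain JavaScript. It is not MongoDB code; unwindInstances is an illustrative helper name and the data is the sample from the question. It contrasts matching before the unwind with matching after it.

```javascript
// In-memory imitation of the pipeline stages; these are not MongoDB API calls.
const docs = [
  { name: "name1", instances: [{ value: 1, score: 2 }, { value: 2, score: 5 }, { value: 2.5, score: 9 }] },
  { name: "name2", instances: [{ value: 6, score: 3 }, { value: 1, score: 6 }, { value: 3.7, score: 5.2 }] },
];

// Imitates {$unwind: "$instances"} followed by the $project stage.
function unwindInstances(documents) {
  return documents.flatMap((doc) =>
    doc.instances.map((inst) => ({ name: doc.name, value: inst.value, score: inst.score }))
  );
}

// $match BEFORE $unwind: the whole document is kept if ANY element matches.
const matchedFirst = unwindInstances(
  docs.filter((doc) => doc.name === "name1" && doc.instances.some((inst) => inst.value >= 2))
);

// $match AFTER $unwind: each unwound element is tested on its own.
const matchedAfter = unwindInstances(docs.filter((doc) => doc.name === "name1"))
  .filter((row) => row.value >= 2);

console.log(matchedFirst.length); // 3
console.log(matchedAfter.length); // 2
```

The first variant returns all three instances of name1 because the document as a whole satisfied the condition; only the second variant drops the instance with value 1.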

Related

get document with same 3 fields in a collection

I have a collection with more than 1000 documents, and some documents have the same values in some fields. I need to find those.
The collection is:
[{_id,fields1,fields2,fields3,etc...}]
What query can I use to get all the elements that have the same 3 fields? For example:
[
{_id:1,fields1:'a',fields2:1,fields3:'z'},
{_id:2,fields1:'a',fields2:1,fields3:'z'},
{_id:3,fields1:'f',fields2:2,fields3:'g'},
{_id:4,fields1:'f',fields2:2,fields3:'g'},
{_id:5,fields1:'j',fields2:3,fields3:'g'},
]
I need to get:
[
{_id:2,fields1:'a',fields2:1,fields3:'z'},
{_id:4,fields1:'f',fields2:2,fields3:'g'},
]
This way I can easily get a list of "duplicates" that I can delete if needed. It doesn't really matter whether I get ids 2 and 4 or 1 and 3,
but 5 must never be included, as it is not duplicated.
EDIT:
Sorry, I forgot to mention that some documents have null values; I need to exclude those.
This is a perfect use case for window fields. You can use $setWindowFields (available since MongoDB 5.0) to compute $rank within the grouping/partition you want. Then keep the documents whose rank is not equal to 1 to get the duplicates.
db.collection.aggregate([
{
$match: {
fields1: {
$ne: null
},
fields2: {
$ne: null
},
fields3: {
$ne: null
}
}
},
{
"$setWindowFields": {
"partitionBy": {
fields1: "$fields1",
fields2: "$fields2",
fields3: "$fields3"
},
"sortBy": {
"_id": 1
},
"output": {
"duplicateRank": {
"$rank": {}
}
}
}
},
{
$match: {
duplicateRank: {
$ne: 1
}
}
},
{
$unset: "duplicateRank"
}
])
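
As a rough illustration of what the partition-and-rank steps compute, here is an in-memory JavaScript sketch. It is not MongoDB code; findDuplicates is a made-up helper name, and the row with _id 6 is an extra sample added here to show the null exclusion. It drops documents with null fields, partitions by the three fields, sorts each partition by _id, and keeps everything except the first document of each partition.

```javascript
// Hypothetical helper imitating $match + $setWindowFields($rank) + $match in memory.
function findDuplicates(documents) {
  const partitions = new Map();
  for (const doc of documents) {
    // Same effect as the {$ne: null} conditions: null or missing fields are skipped.
    if (doc.fields1 == null || doc.fields2 == null || doc.fields3 == null) continue;
    const key = JSON.stringify([doc.fields1, doc.fields2, doc.fields3]);
    if (!partitions.has(key)) partitions.set(key, []);
    partitions.get(key).push(doc);
  }
  const duplicates = [];
  for (const group of partitions.values()) {
    group.sort((a, b) => a._id - b._id); // sortBy {_id: 1}; sample _ids are numbers
    duplicates.push(...group.slice(1));  // the first document has rank 1 and is kept
  }
  return duplicates;
}

const sample = [
  { _id: 1, fields1: "a", fields2: 1, fields3: "z" },
  { _id: 2, fields1: "a", fields2: 1, fields3: "z" },
  { _id: 3, fields1: "f", fields2: 2, fields3: "g" },
  { _id: 4, fields1: "f", fields2: 2, fields3: "g" },
  { _id: 5, fields1: "j", fields2: 3, fields3: "g" },
  { _id: 6, fields1: null, fields2: 1, fields3: "z" }, // extra row, added for illustration
];

console.log(findDuplicates(sample).map((doc) => doc._id)); // [ 2, 4 ]
```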
I think you can try this aggregation query:
First, group by the fields you want to check for repeated values.
This creates an array with the _ids that share those values.
Then keep only the groups with more than one _id ($match).
Finally, $project to get the desired output. I've used the first _id found.
db.collection.aggregate([
{
"$group": {
"_id": {
"fields1": "$fields1",
"fields2": "$fields2",
"fields3": "$fields3"
},
"duplicatesIds": {
"$push": "$_id"
}
}
},
{
"$match": {
"$expr": {
"$gt": [
{
"$size": "$duplicatesIds"
},
1
]
}
}
},
{
"$project": {
"_id": {
"$arrayElemAt": [
"$duplicatesIds",
0
]
},
"fields1": "$_id.fields1",
"fields2": "$_id.fields2",
"fields3": "$_id.fields3"
}
}
])
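
If the goal is to delete the extra copies, the duplicatesIds array built by $group already contains what is needed: keep one _id per group and remove the rest. A minimal sketch, assuming the final $project stage is left out so that duplicatesIds is still present in the result; idsToDelete and the sample groups are illustrative, and the deleteMany call is shown only as a comment.

```javascript
// `groups` stands for the output of the $group + $match stages above
// (without the final $project); the sample values are made up.
const groups = [
  { _id: { fields1: "a", fields2: 1, fields3: "z" }, duplicatesIds: [1, 2] },
  { _id: { fields1: "f", fields2: 2, fields3: "g" }, duplicatesIds: [3, 4] },
];

// Keep the first _id of every group and collect the remaining ones.
function idsToDelete(groupedDocs) {
  return groupedDocs.flatMap((group) => group.duplicatesIds.slice(1));
}

const ids = idsToDelete(groups);
console.log(ids); // [ 2, 4 ]

// In the shell these could then be removed with something like:
// db.collection.deleteMany({ _id: { $in: ids } })
```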

MongoDB - How to write a nested group aggregation query

I have a collection in this format:
{
"place":"land",
"animal":"Tiger",
"name":"xxx"
},
{
"place":"land",
"animal":"Lion",
"name":"yyy"
}
I want the result to be something like this:
{
"place":"land",
"animals":{"Lion":"yyy", "Tiger":"xxx"}
}
I wrote the query below. I think it needs another group stage, but I am not able to write it.
db.collection.aggregate({
'$group': {
'_id':{'place':'$place', 'animal':'$animal'},
'animalNames': {'$addToSet':'$name'}
}
})
What changes need to be made to get the required result?
$group - Group by place. Push objects of the form { k: animal, v: name } into the animals array.
$project - Decorate the output document. Convert the animals array to key-value pairs via $arrayToObject.
db.collection.aggregate([
{
"$group": {
"_id": "$place",
"animals": {
"$push": {
k: "$animal",
v: "$name"
}
}
}
},
{
$project: {
_id: 0,
place: "$_id",
animals: {
"$arrayToObject": "$animals"
}
}
}
])
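
The two stages can be pictured in plain JavaScript as follows. This is an in-memory sketch, not MongoDB code, using the two sample documents from the question; if the same animal appeared twice for one place, the later pair would win in this sketch.

```javascript
const docs = [
  { place: "land", animal: "Tiger", name: "xxx" },
  { place: "land", animal: "Lion", name: "yyy" },
];

// Imitates $group by place with $push of {k, v} pairs.
const grouped = new Map();
for (const { place, animal, name } of docs) {
  if (!grouped.has(place)) grouped.set(place, []);
  grouped.get(place).push({ k: animal, v: name });
}

// Imitates $project with $arrayToObject: [{k, v}, ...] becomes {k: v, ...}.
const result = [...grouped].map(([place, pairs]) => ({
  place,
  animals: Object.fromEntries(pairs.map(({ k, v }) => [k, v])),
}));

console.log(JSON.stringify(result));
// [{"place":"land","animals":{"Tiger":"xxx","Lion":"yyy"}}]
```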
If you are on version >=4.4, a reasonable alternative is to use the $function operator:
db.foo.aggregate([
{$project: {
'income_statement.annual': {
$function: {
body: function(arr) {
return arr.sort().reverse();
},
args: [ "$income_statement.annual" ],
lang: "js"
}}
}}
]);

Add number field in $project mongodb

I need to insert an index number when getting data. For example, I have this data:
[
{
_id: ObjectId("616efd7e56c9530018e318ac"),
student: {
name: "Alpha",
email: null,
nisn: "0408210001",
gender: "female"
}
},
{
_id: ObjectId("616efd7e56c9530018e318af"),
student: {
name: "Beta",
email: null,
nisn: "0408210002",
gender: "male"
}
}
]
and I need output like this:
[
{
no:1,
id:616efd7e56c9530018e318ac,
name: "Alpha",
nisn: "0408210001"
},
{
no:2,
id:616efd7e56c9530018e318af,
name: "Beta",
nisn: "0408210002"
}
]
I have tried this code, which gets almost what I expect:
{
'$project': {
'_id': 0,
'id': '$_id',
'name': '$student.name',
'nisn': '$student.nisn'
}
}
but I am still confused about how to add the index number. Is it possible to do this in $project, or do I have to do it another way? Thank you for taking the time to answer.
You can use $unwind which can return an index, like this:
db.collection.aggregate([
{
$group: {
_id: 0,
data: {
$push: {
_id: "$_id",
student: "$student"
}
}
}
},
{
$unwind: {path: "$data", includeArrayIndex: "no"}
},
{
"$project": {
"_id": 0,
"id": "$data._id",
"name": "$data.student.name",
"nisn": "$data.student.nisn",
"no": {"$add": ["$no", 1] }
}
}
])
I strongly suggest using a $match step before these steps; otherwise you will group your entire collection into one document.
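
The idea behind this pipeline, sketched in plain JavaScript (an in-memory imitation, not MongoDB code; the sample values are adapted from the question): $group collects every document into one array, and $unwind with includeArrayIndex hands back each element together with its zero-based position, which the $project turns into a one-based number.

```javascript
const docs = [
  { _id: "616efd7e56c9530018e318ac", student: { name: "Alpha", email: null, nisn: "0408210001", gender: "female" } },
  { _id: "616efd7e56c9530018e318af", student: { name: "Beta", email: null, nisn: "0408210002", gender: "male" } },
];

// $group with a constant _id: everything ends up in a single `data` array.
const grouped = { _id: 0, data: docs.map(({ _id, student }) => ({ _id, student })) };

// $unwind with includeArrayIndex plus $project: the index is zero-based, so add 1.
const numbered = grouped.data.map((item, index) => ({
  id: item._id,
  name: item.student.name,
  nisn: item.student.nisn,
  no: index + 1,
}));

console.log(numbered.map((row) => `${row.no}:${row.name}`)); // [ '1:Alpha', '2:Beta' ]
```

Because the whole input is held in one array here, the same memory concern applies as in the pipeline, which is why narrowing the input first matters.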
You need to run a pipeline with a $setWindowFields stage that allows you to add a new field which returns the position of a document (known as the document number) within a partition. The position number creation is made possible by the $documentNumber operator only available in the $setWindowFields stage.
The partition could be an extra field (which is constant) that can act as the window partition.
The final stage in the pipeline is the $replaceWith step which will promote the student embedded document to the top-level as well as replacing all input documents with the specified document.
Running the following aggregation will yield the desired results:
db.collection.aggregate([
{ $addFields: { _partition: 'students' }},
{ $setWindowFields: {
partitionBy: '$_partition',
sortBy: { _id: 1 },
output: { no: { $documentNumber: {} } }
} },
{ $replaceWith: {
$mergeObjects: [
{ id: '$_id', no: '$no' },
'$student'
]
} }
])

How to query mongodb to fetch results based on values nested parameters?

I am working with MongoDB for the first time.
I have a collection whose each document is roughly of the following form in MongoDB:
{
"name":[
{
"value":"abc",
"created_on":"2020-02-06 06:11:21.340611+00:00"
},
{
"value":"xyz",
"created_on":"2020-02-07 06:11:21.340611+00:00"
}
],
"score":[
{
"value":12,
"created_on":"2020-02-06 06:11:21.340611+00:00"
},
{
"value":13,
"created_on":"2020-02-07 06:11:21.340611+00:00"
}
]
}
How can I form a query that returns the latest value of each field in a given document? I went through Query Embedded Documents, but I wasn't able to figure out how to do it.
My expected output is:
{
"name": "xyz",
"score": "13"
}
If you always push new values to the end of the name and score arrays, you can try the query below. It takes the last element of each array, since the newest value will always be the last element:
db.collection.aggregate([
{ $addFields: { name: { $arrayElemAt: ['$name', -1] }, score: { $arrayElemAt: ['$score', -1] } } },
{ $addFields: { name: '$name.value', score: '$score.value' } }])
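
A plain JavaScript sketch of what the two $addFields stages do, plus a variant for the case where the newest entry is not guaranteed to be last. Both helpers (lastValue, newestValue) are illustrative names, both assume non-empty arrays, and the variant assumes every created_on string uses the same format and UTC offset, so that string comparison matches chronological order.

```javascript
const doc = {
  name: [
    { value: "abc", created_on: "2020-02-06 06:11:21.340611+00:00" },
    { value: "xyz", created_on: "2020-02-07 06:11:21.340611+00:00" },
  ],
  score: [
    { value: 12, created_on: "2020-02-06 06:11:21.340611+00:00" },
    { value: 13, created_on: "2020-02-07 06:11:21.340611+00:00" },
  ],
};

// Same idea as {$arrayElemAt: [arr, -1]} followed by taking `.value`.
// Assumes the array is not empty.
const lastValue = (entries) => entries[entries.length - 1].value;

// Variant: pick the entry with the greatest created_on instead of the last one.
// Relies on all timestamps sharing one format and offset (an assumption).
const newestValue = (entries) =>
  entries.reduce((best, entry) => (entry.created_on > best.created_on ? entry : best)).value;

console.log({ name: lastValue(doc.name), score: lastValue(doc.score) });
// { name: 'xyz', score: 13 }
```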

How do I query a mongo document containing subset of nested array

Here is a doc I have:
var docIHave = {
_id: "someId",
things: [
{
name: "thing1",
stuff: [1,2,3,4,5,6,7,8,9]
},
{
name: "thing2",
stuff: [4,5,6,7,8,9,10,11,12,13,14]
},
{
name: "thing3",
stuff: [1,4,6,8,11,21,23,30]
}
]
}
This is the doc I want:
var docIWant = {
_id: "someId",
things: [
{
name: "thing1",
stuff: [5,6,7,8,9]
},
{
name: "thing2",
stuff: [5,6,7,8,9,10,11]
},
{
name: "thing3",
stuff: [6,8,11]
}
]
}
The stuff arrays of docIWant should only contain items greater than min=4
and smaller than max=12.
Background:
I have a meteor app and I subscribe to a collection giving me docIHave. Based on parameters min and max I need the docIWant "on the fly". The original document should not be modified. I need a query or procedure that returns me docIWant with the subset of stuff.
A practical code example would be greatly appreciated.
Use the aggregation framework for this. In the aggregation pipeline, consider the $match operator as your first pipeline stage. This is quite necessary to optimize your aggregation as you would need to filter documents that match the given criteria first before passing them on further down the pipeline.
Next use the $unwind operator. This deconstructs the things array field from the input documents to output a document for each element. Each output document is the input document with the value of the array field replaced by the element.
Another $unwind operation would be needed on the things.stuff array as well.
The next pipeline stage then filters the documents whose deconstructed things.stuff value matches the given min and max criteria. Use a $match operator for this.
A $group operator is then required to group the input documents by a specified identifier expression and apply the accumulator expression $push to each group. This creates an array for each group.
Your aggregation should end up looking like this (I haven't actually tested it, but it should get you going in the right direction):
db.collection.aggregate([
{
"$match": {
"things.stuff": { "$gt": 4, "$lte": 11 }
}
},
{
"$unwind": "$things"
},
{
"$unwind": "$things.stuff"
},
{
"$match": {
"things.stuff": { "$gt": 4, "$lte": 11 }
}
},
{
"$group": {
"_id": {
"_id": "$_id",
"name": "$things.name"
},
"stuff": {
"$push": "$things.stuff"
}
}
},
{
"$group": {
"_id": "$_id._id",
"things": {
"$push": {
"name": "$_id.name",
"stuff": "$stuff"
}
}
}
}
])
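
Since the pipeline above is untested, it can help to check the intended stage-by-stage logic on the sample document in plain JavaScript first. The sketch below is an in-memory imitation (not MongoDB code; the variable names are illustrative) of unwinding both arrays, keeping values strictly between min and max, and regrouping by document _id and thing name.

```javascript
const docIHave = {
  _id: "someId",
  things: [
    { name: "thing1", stuff: [1, 2, 3, 4, 5, 6, 7, 8, 9] },
    { name: "thing2", stuff: [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14] },
    { name: "thing3", stuff: [1, 4, 6, 8, 11, 21, 23, 30] },
  ],
};
const min = 4;
const max = 12;

// Two $unwind stages: one row per (document, thing, stuff value).
const rows = [docIHave].flatMap((doc) =>
  doc.things.flatMap((thing) =>
    thing.stuff.map((value) => ({ _id: doc._id, name: thing.name, value }))
  )
);

// $match on the unwound value: strictly greater than min, strictly less than max.
const kept = rows.filter((row) => row.value > min && row.value < max);

// Two $group stages: first per (_id, name), then per _id.
const byDoc = new Map();
for (const { _id, name, value } of kept) {
  if (!byDoc.has(_id)) byDoc.set(_id, new Map());
  const things = byDoc.get(_id);
  if (!things.has(name)) things.set(name, []);
  things.get(name).push(value);
}
const result = [...byDoc].map(([_id, things]) => ({
  _id,
  things: [...things].map(([name, stuff]) => ({ name, stuff })),
}));

console.log(JSON.stringify(result[0].things.map((t) => t.stuff)));
// [[5,6,7,8,9],[5,6,7,8,9,10,11],[6,8,11]]
```

Note that a thing whose stuff array has no value left inside the range disappears from the result entirely, both in this sketch and in an unwind-based pipeline.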
If you need to transform the document on the client for display purposes, you could do something like this:
Template.myTemplate.helpers({
transformedDoc: function() {
// get the bounds - maybe these are stored in session vars
var min = Session.get('min');
var max = Session.get('max');
// fetch the doc somehow that needs to be transformed
var doc = SomeCollection.findOne();
// transform the thing.stuff arrays
_.each(doc.things, function(thing) {
thing.stuff = _.reject(thing.stuff, function(n) {
return (n <= min) || (n >= max);
});
});
// return the transformed doc
return doc;
}
});
Then in your template: {{#each transformedDoc.things}}...{{/each}}
Use MongoDB aggregation as follows:
First use $unwind on things and on things.stuff, then use $match to keep the elements greater than 4 and less than 12. After that, $group the data by things.name and add the required fields in $project.
The query is as follows:
db.collection.aggregate([
{
$unwind: "$things"
}, {
$unwind: "$things.stuff"
}, {
$match: {
"things.stuff": {
$gt: 4,
$lt:12
}
}
}, {
$group: {
"_id": "$things.name",
"stuff": {
$push: "$things.stuff"
}
}
}, {
$project: {
"thingName": "$_id",
"stuff": 1
}
}])