MongoDB group by array subfield - mongodb

Hello, I am new to MongoDB and I hope you can help me with this question.
My collection will look like this:
{
"_id": { "$oid": "5f1fd47..." },
"email":"c#c.com",
"materials": [
{
"_id": { "$oid": "5f1fda2..." },
"title": "MDF 18mm Blanco",
"id": "mdf18blanco",
"thickness": "18",
"family": "MDF",
"color": ""
}, ...
// other materials with different families
],
}
I did an aggregate like this:
{ "$match" : { "email" : "c#c.com" } },
{ "$unwind" : "$materials" },
{ "$group" : { "_id" : "$_id", "list" : { "$push" : "$materials.family" } } }
and it returns this:
{
"_id" : ObjectId("5f1fd47d502e00051c673dd1"),
"list" : [
"MDF",
"MDF",
"MDF",
"Melamina",
"Melamina",
"Melamina",
"Melamina",
"MDF",
"Melamina",
"Aglomerado",
"Aglomerado"
]
}
but I need to get this:
{
"_id" : ObjectId("5f1fd47d502e00051c673dd1"),
"list" : [
"MDF",
"Melamina",
"Aglomerado"
]
}
I hope you understand my question and can help me, thank you very much.

All you need to do is use $addToSet instead of $push in your group stage:
{ "$group" : { "_id" : "$_id", "list" : { "$addToSet" : "$materials.family" } } }
One thing to note: unlike $push, $addToSet does not guarantee any particular order of the elements, in case that matters to you.
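To make the difference concrete, here is a small plain-Python sketch (not MongoDB code) that mimics what the two accumulators collect. The `families` list is copied from the output shown in the question; the final `sorted` call is only an assumption about how you might get a stable order if you need one.

```python
# Values of materials.family after $unwind, as listed in the question's output.
families = ["MDF", "MDF", "MDF", "Melamina", "Melamina", "Melamina",
            "Melamina", "MDF", "Melamina", "Aglomerado", "Aglomerado"]

# $push keeps every value, duplicates included, in arrival order.
pushed = list(families)

# $addToSet keeps each distinct value once; like a Python set,
# the order of the result is not guaranteed.
added_to_set = set(families)

# Sorting afterwards is one way to get a deterministic order.
stable = sorted(added_to_set)
print(len(pushed), stable)
```

If the order of the families matters, sort the result explicitly rather than relying on whatever order the set comes back in.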

You only need to change $push to $addToSet.
A set does not contain repeated values, so it works.
db.collection.aggregate([
{
"$match": {
"email": "c#c.com"
}
},
{
"$unwind": "$materials"
},
{
"$group": {
"_id": "$_id",
"list": {
"$addToSet": "$materials.family"
}
}
}
])

Related

Count the objects inside of an array on each document MongoDB

My documents are organized this way:
{
"_id" : ObjectId("5ea899d7e7da54cabbc022e7"),
"date" : ISODate("2018-01-27T00:00:00Z"),
"vehicleid" : 32028,
"points" : [
{
"direction" : 225,
"location" : {
"type" : "Point",
"coordinates" : [
-3.801898,
-38.501078
]
},
"odometer" : 134746396,
"routecode" : 0,
"speed" : 0,
"deviceid" : 148590,
"metrictimestamp" : ISODate("2018-01-27T23:32:03Z")
}
Where points is an array of objects. I need to group these documents and return the number of elements inside each array. I guess it is something like:
pipe = [
{
'$project':{
"_id":0
}
},
{
'$group':{
"_id":{
"vehicleid":"$vehicleid",
"date":"$date"
},'count':{'$size':'points'}
}
}
]
Detail: I need to run this on pymongo.
You have to use $sum to add up the size of each array, like this:
{
"$group": {
"_id": {
"vehicleid": "$vehicleid",
"date": "$date"
},
"count": { "$sum": { "$size": "$points" } }
}
}
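As a rough illustration of what that stage computes, here is a plain-Python sketch (not MongoDB or PyMongo code). The sample documents and the `count_points` helper are hypothetical; only the field names come from the question.

```python
from collections import defaultdict

# Hypothetical sample documents; only the fields used by the $group stage are kept.
docs = [
    {"vehicleid": 32028, "date": "2018-01-27", "points": [{"speed": 0}, {"speed": 5}]},
    {"vehicleid": 32028, "date": "2018-01-27", "points": [{"speed": 7}]},
    {"vehicleid": 40001, "date": "2018-01-27", "points": []},
]

def count_points(documents):
    """Mimic {"$sum": {"$size": "$points"}} grouped by (vehicleid, date)."""
    totals = defaultdict(int)
    for doc in documents:
        # $size gives the array length; $sum adds it to the group's total.
        totals[(doc["vehicleid"], doc["date"])] += len(doc["points"])
    return dict(totals)

counts = count_points(docs)
print(counts)
```

Note that a group with only empty arrays still appears here with a count of 0. A pipeline that counts by running $unwind first would drop such documents, because $unwind by default emits nothing for an empty array.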
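You can use either of the following aggregation pipelines to get the size of the points array field. Each pipeline uses a different approach and the shape of the output differs (the first returns one total per vehicleid and date, the second adds a count field to every document), but the size information is the same.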
The code runs with PyMongo:
pipeline = [
{
"$unwind": "$points"
},
{
"$group": {
"_id": { "vehicleid": "$vehicleid", "date": "$date" },
"count": { "$sum": 1 }
}
}
]
pipeline = [
{
"$addFields": { "count": { "$size": "$points" } }
}
]
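You can follow this code. Note that { $sum: 1 } counts one per input document, so put { $unwind: "$points" } before this stage if you want to count array elements:
$group : {
_id : {
"vehicleid":"$vehicleid",
"date":"$date"
},
count: { $sum: 1 }
}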

$elemMatch for filtering out referenced($ref) array objects in mongodb is not working

I have 2 collections student_details and subject_details where each student can have multiple subjects which I am storing in student_details collection as reference array.
Now I need to fetch Student details along with the filtered subjects where subject_details.status=ACTIVE.
How can I achieve this using $elemMatch for $ref objects?
I was using something like below but it is not returning any records.
db.getCollection('student_details').find( { subjects: { $elemMatch: { $ref: "subject_details", status: 'ACTIVE' }}})
student_details
================
{
"_id" : "STD-1",
"name" : "XYZ",
"subjects" : [
{
"$ref" : "subject_details",
"$id" : "SUB-1"
},
{
"$ref" : "subject_details",
"$id" : "SUB-2"
},
{
"$ref" : "subject_details",
"$id" : "SUB-3"
}
]
}
subject_details
===============
{
"_id" : "SUB-1",
"name" : "MATHEMATICS",
"status" : "ACTIVE"
}
{
"_id" : "SUB-2",
"name" : "PHYSICS",
"status" : "ACTIVE"
}
{
"_id" : "SUB-3",
"name" : "CHEMISTRY",
"status" : "INACTIVE"
}
DBRefs are troublesome when used in lookups, but you can work around that with the following aggregation pipeline:
db.student_details.aggregate([
{
$unwind: "$subjects"
},
{
$set: {
"fk": {
$arrayElemAt: [{
$objectToArray: "$subjects"
}, 1]
}
}
},
{
$lookup: {
"from": "subject_details",
"localField": "fk.v",
"foreignField": "_id",
"as": "subject"
}
},
{
$match: {
"subject.status": "ACTIVE"
}
},
{
$group: {
"_id": "$_id",
"name": {
$first: "$name"
},
"subjects": {
$push: {
$arrayElemAt: ["$subject", 0]
}
}
}
}
])
The resulting object would look like this:
{
"_id": "STD-1",
"name": "XYZ",
"subjects": [
{
"_id": "SUB-1",
"name": "MATHEMATICS",
"status": "ACTIVE"
},
{
"_id": "SUB-2",
"name": "PHYSICS",
"status": "ACTIVE"
}
]
}
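For intuition only, the effect of that pipeline (resolve each reference, keep the ACTIVE subjects, regroup per student) can be mimicked in plain Python. This is not MongoDB code; the data is copied from the question and the variable names are made up for the sketch.

```python
# subject_details, keyed by _id so a reference can be resolved (the $lookup step).
subject_details = {
    "SUB-1": {"_id": "SUB-1", "name": "MATHEMATICS", "status": "ACTIVE"},
    "SUB-2": {"_id": "SUB-2", "name": "PHYSICS", "status": "ACTIVE"},
    "SUB-3": {"_id": "SUB-3", "name": "CHEMISTRY", "status": "INACTIVE"},
}

student = {
    "_id": "STD-1",
    "name": "XYZ",
    "subjects": [
        {"$ref": "subject_details", "$id": "SUB-1"},
        {"$ref": "subject_details", "$id": "SUB-2"},
        {"$ref": "subject_details", "$id": "SUB-3"},
    ],
}

# Walk the references one by one ($unwind), resolve them ($lookup),
# keep only ACTIVE subjects ($match) and collect them again ($group/$push).
resolved = (subject_details[ref["$id"]] for ref in student["subjects"])
result = {
    "_id": student["_id"],
    "name": student["name"],
    "subjects": [subject for subject in resolved if subject["status"] == "ACTIVE"],
}
print([subject["name"] for subject in result["subjects"]])
```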
Because they are in two collections you need $lookup to bring them together. Before that, I believe you need to $unwind the subjects array. This is kind of air code, so it isn't so much an answer as general advice: the aggregation pipeline is used to do these things in stages.
I am assuming you abbreviated the schema for the post, because if subject_details really has just 3 fields, you are better served in the NoSQL world by putting that info into student_details and using one collection rather than a normalized relational approach.

Combine results based on condition during group by

Mongo query generated from Java code:
{
"pipeline": [{
"$match": {
"Id": "09cd9a5a-85c5-4948-808b-20a52d92381a"
}
},
{
"$group": {
"_id": "$result",
"id": {
"$first": "$result"
},
"labelKey": {
"$first": {
"$ifNull": ["$result",
"$result"]
}
},
"value": {
"$sum": 1
}
}
}]
}
The field 'result' can have values like Approved, Rejected, null and "" (empty string). What I am trying to achieve is to combine the counts of null and the empty string,
so that the empty-string id carries the count of both null and "", which equals 4.
I'm sure there's a more "proper" way, but this is what I could quickly come up with:
[
{
"$group" : {
"_id" : "$result",
"id" : {
"$first" : "$result"
},
"labelKey" : {
"$first" : {
"$ifNull" : [
"$result",
"$result"
]
}
},
"value" : {
"$sum" : 1.0
}
}
},
{
"$group" : {
"_id" : {
"$cond" : [{
$or: [
{"$eq": ["$_id", "Approved"]},
{"$eq": ["$_id", "Rejected"]},
]},
"$_id",
""
]
},
"temp" : {
"$push" : {
"_id" : "$_id",
"labelKey" : "$labelKey"
}
},
"count" : {
"$sum" : "$value"
}
}
},
{
"$unwind" : "$temp"
},
{
"$project" : {
"_id" : "$temp._id",
"labelKey": "$temp.labelKey",
"count" : "$count"
}
}
]
Since the second group runs on only 4 documents at most, I don't feel too bad about doing this.
I used $facet.
The $facet stage lets you run several independent sub-pipelines within a single stage, all on the same input documents. This means you can share the preliminary stages (here the $match) across several aggregations and then continue with further stages on their combined output.
var queries = [{
"$match": {
"Id": "09cd9a5a-85c5-4948-808b-20a52d92381a"
}
},{
$facet: {
"empty": [
{
$match : {
result : { $in : ['',null]}
}
},{
"$group" : {
"_id" : null,
value : { $sum : 1}
}
}
],
"non_empty": [
{
$match : {
result : { $nin : ['',null]}
}
},{
"$group" : {
"_id" : '$result',
value : { $sum : 1}
}
}
]
}
},
{
$project: {
results: {
$concatArrays: [ "$empty", "$non_empty" ]
}
}
}];
Output :
{
"results": [{
"_id": null,
"value": 52 // count of both '' and null.
}, {
"_id": "Approved",
"value": 83
}, {
"_id": "Rejected",
"value": 3661
}]
}
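The idea behind the two facets can be sketched in plain Python (not MongoDB code): both branches read the same input, each filters and counts on its own, and the two result lists are concatenated at the end. The `results` values below are hypothetical, with `None` standing in for null.

```python
from collections import Counter

# Hypothetical values of the "result" field after the initial $match.
results = ["Approved", None, "", "Rejected", None, "", "Approved"]

# Facet "empty": match '' and null, then count them under a single null key.
empty = [{"_id": None, "value": sum(1 for r in results if r in ("", None))}]

# Facet "non_empty": match everything else, then count per distinct value.
per_value = Counter(r for r in results if r not in ("", None))
non_empty = [{"_id": key, "value": count} for key, count in per_value.items()]

# $concatArrays: one list holding the output of both facets.
combined = empty + non_empty
print(combined)
```

One difference from the real stage: if no document matched the `empty` facet, $group would emit nothing for it, whereas this sketch would still emit a zero count.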
Changing the $group stage as shown below solved the problem:
{
"$group": {
"_id": {
"$ifNull": ["$result", ""]
},
"id": {
"$first": "$result"
},
"labelKey": {
"$first": {
"$ifNull": ["$result",
"$result"]
}
},
"value": {
"$sum": 1
}
}
}
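What the `$ifNull` group key does can be illustrated with a short plain-Python sketch (not MongoDB code). The `results` list is hypothetical sample data, with `None` standing in for a null or missing field.

```python
from collections import Counter

# Hypothetical values of the "result" field.
results = ["Approved", None, "", "Rejected", None, "", "Approved"]

# Mimic {"$ifNull": ["$result", ""]} as the group key: null is replaced by "",
# so null and "" fall into the same group and are counted together.
counts = Counter("" if value is None else value for value in results)
print(counts[""])
```

Because the normalisation happens in the group key itself, no second $group or $facet stage is needed.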

How to get a nested 3-level array object in a Mongo query?

Basically the structure is :
{
"_id" : ObjectId("123123"),
"stores" : [
{
"messages" : [
{
"updated_time" : "2018-05-15T05:12:25+0000",
"message_count" : 4,
"thread_id" : "123",
"messages" : [
{
"message" : "Hi User ",
"created_time" : "2018-05-15T05:12:25+0000",
"message_id" : "111",
},
{
"message" : "This is tes",
"created_time" : "2018-05-15T05:12:21+0000",
"message_id" : "222",
}
]
},
],
"store_id" : "123"
}
]
}
I have the values below and want to get the object with message_id 111. How can I get this object? Any idea or help will be appreciated. Thanks.
store_id: 123,
thread_id:123,
message_id:111
The simplest way would be to $unwind all the nested arrays and then use $match to get a single document. You can also add $replaceRoot to return only the nested document. Try:
db.collection.aggregate([
{ $unwind: "$stores" },
{ $unwind: "$stores.messages" },
{ $unwind: "$stores.messages.messages" },
{ $match: { "stores.store_id": "123", "stores.messages.thread_id": "123", "stores.messages.messages.message_id": "111" } },
{ $replaceRoot: { newRoot: "$stores.messages.messages" } }
])
Prints:
{
"created_time": "2018-05-15T05:12:25+0000",
"message": "Hi User ",
"message_id": "111"
}
To improve performance, you can use $match after every $unwind to filter out unnecessary data as early as possible. Try:
db.collection.aggregate([
{ $unwind: "$stores" },
{ $match: { "stores.store_id": "123" } },
{ $unwind: "$stores.messages" },
{ $match: { "stores.messages.thread_id": "123" } },
{ $unwind: "$stores.messages.messages" },
{ $match: { "stores.messages.messages.message_id": "111" } },
{ $replaceRoot: { newRoot: "$stores.messages.messages" } }
])
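The "filter after every level" idea is the same as skipping non-matching branches early in nested loops. Here is a plain-Python sketch of that (not MongoDB code); `find_message` is a hypothetical helper and the document is a trimmed copy of the one in the question.

```python
def find_message(doc, store_id, thread_id, message_id):
    """Return the first innermost message matching all three ids, or None."""
    for store in doc["stores"]:                      # $unwind: "$stores"
        if store["store_id"] != store_id:            # $match right after it
            continue
        for thread in store["messages"]:             # $unwind: "$stores.messages"
            if thread["thread_id"] != thread_id:
                continue
            for message in thread["messages"]:       # innermost $unwind
                if message["message_id"] == message_id:
                    return message                   # $replaceRoot
    return None

doc = {
    "stores": [{
        "store_id": "123",
        "messages": [{
            "thread_id": "123",
            "messages": [
                {"message": "Hi User ", "message_id": "111"},
                {"message": "This is tes", "message_id": "222"},
            ],
        }],
    }]
}

found = find_message(doc, "123", "123", "111")
print(found)
```

Stores and threads that do not match are skipped before their inner arrays are ever expanded, which is what the early $match stages achieve.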

Project an array with MongoDB

I'm using MongoDB's aggregation pipeline to get my documents into the form that I want. As the last step of the aggregation, I use $project to put the documents into their final form.
But I'm having trouble projecting an array of sub-documents. Here is what I currently get from the aggregation:
{
"_id": "581c8c3df1325f68ffd23386",
"count": 14,
"authors": [
{
"author": {
"author": "57f246b9e01e6c6f08e1d99a",
"post": "581c8c3df1325f68ffd23386"
},
"count": 13
},
{
"author": {
"author": "5824382511f16d0f3fd5aaf2",
"post": "581c8c3df1325f68ffd23386"
},
"count": 1
}
]
}
I want to $project the authors array so that the return would be this:
{
"_id": "581c8c3df1325f68ffd23386",
"count": 14,
"authors": [
{
"_id": "57f246b9e01e6c6f08e1d99a",
"count": 13
},
{
"_id": "5824382511f16d0f3fd5aaf2",
"count": 1
}
]
}
How would I go about achieving that?
You can unwind the array and wind it up again after projecting.
Something like this:
db.collectionName.aggregate([
{$unwind:'$authors'},
{$project:{_id:1,count:1,'author.id':'$authors.author.author','author.count':'$authors.count'}},
{$group:{_id:{_id:'$_id',count:'$count'},author:{$push:{id:'$author.id',count:'$author.count'}}}},
{$project:{_id:'$_id._id',count:'$_id.count',author:1}}
])
The output of the above will be:
{
"_id" : "581c8c3df1325f68ffd23386",
"author" : [
{
"id" : "57f246b9e01e6c6f08e1d99a",
"count" : 13.0
},
{
"id" : "5824382511f16d0f3fd5aaf2",
"count" : 1.0
}
],
"count" : 14.0
}
I had the same problem and just found a simple and elegant solution that has not been mentioned yet, so I thought I'd share it here:
You can iterate over the array using $map and project each author. With the given structure, the aggregation should look somewhat like this:
db.collectionName.aggregate([
{
$project: {
_id: 1,
count: 1,
authors: {
$map: {
input: "$authors",
as: "author",
in: {
id: "$$author.author.author",
count: "$$author.count"
}
}
}
}
}
])
Hope this helps anyone who is looking, like me :)
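In plain Python terms, $map over the authors array behaves like a list comprehension. The sketch below (not MongoDB code) reshapes the document from the question that way; it names the inner field `_id` to match the output the question asks for, which is an assumption on my part since the pipeline above uses `id`.

```python
doc = {
    "_id": "581c8c3df1325f68ffd23386",
    "count": 14,
    "authors": [
        {"author": {"author": "57f246b9e01e6c6f08e1d99a",
                    "post": "581c8c3df1325f68ffd23386"}, "count": 13},
        {"author": {"author": "5824382511f16d0f3fd5aaf2",
                    "post": "581c8c3df1325f68ffd23386"}, "count": 1},
    ],
}

projected = {
    "_id": doc["_id"],
    "count": doc["count"],
    # Each element plays the role of "$$author" in the $map expression.
    "authors": [
        {"_id": item["author"]["author"], "count": item["count"]}
        for item in doc["authors"]
    ],
}
print(projected["authors"])
```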
Question:
"customFields" : [
{
"index" : "1",
"value" : "true",
"label" : "isOffline",
"dataType" : "check_box",
"placeholder" : "cf_isoffline",
"valueFormatted" : "true"
},
{
"index" : "2",
"value" : "false",
"label" : "tenure_extended",
"dataType" : "check_box",
"placeholder" : "cf_tenure_extended",
"valueFormatted" : "false"
}
],
Answer:
db.subscription.aggregate([
{$match:{"autoCollect" : false,"remainingBillingCycles" : -1,"customFields.value":"false", "customFields.label" : "isOffline"}},
{$project: {first: { $arrayElemAt: [ "$customFields", 1 ] }}}
])