Create unique ObjectId during $project pipeline - mongodb

I have an aggregation framework query that is summarizing certain document data into a lookup set. Unfortunately, I can't provide the data since it's company-related. Here is the query and data fragment from the last stages of the pipeline:
...
{ $group: { _id: "$SectionId", "Questions": { $addToSet: "$Questions" } } },
{ $unwind: "$Questions" },
which returns data like this (note that _id is not unique):
{
"_id" : "Tonometry",
"Questions" : {
"MappingId" : "Exophoria",
"PositiveLabel" : "Positive",
"NegativeLabel" : "Negative"
}
},
{
"_id" : "Tonometry",
"Questions" : {
"MappingId" : "Heterophoria",
"PositiveLabel" : "Positive",
"NegativeLabel" : "Negative"
}
},
The next stage in the pipeline is this:
{
$project: {
"_id": 1,
"id": ObjectId(),
"SectionId": "$_id",
"MappingId": "$Questions.MappingId",
"PositiveLabel": "$Questions.PositiveLabel",
"NegativeLabel": "$Questions.NegativeLabel",
}
},
which produces:
{
"_id" : "Tonometry",
"id" : ObjectId("5d1cf66cf526f23524f865c6"),
"SectionId" : "Tonometry",
"MappingId" : "Exophoria",
"PositiveLabel" : "Positive",
"NegativeLabel" : "Negative"
},
{
"_id" : "Tonometry",
"id" : ObjectId("5d1cf66cf526f23524f865c6"),
"SectionId" : "Tonometry",
"MappingId" : "Heterophoria",
"PositiveLabel" : "Positive",
"NegativeLabel" : "Negative"
},
I tried adding a new field, id, holding a unique ObjectId, but the pipeline just reuses the same ObjectId in every document. This matters because $out requires each document's _id to be unique.
How do I add a unique ObjectId to each node?

You don't need to generate a unique _id yourself in order to use $out. Use $replaceRoot together with $mergeObjects before the $out stage: this merges the Questions subdocument into the desired document without an _id field, and an _id is then generated for each document written to the new collection:
[
....
{ "$group": { "_id": "$SectionId", "Questions": { "$addToSet": "$Questions" } } },
{ "$unwind": "$Questions" },
{ "$replaceRoot": {
"newRoot": {
"$mergeObjects": [
{ "Section": "$_id" },
"$Questions"
]
}
} },
{ "$out": "new-collection" }
]
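As for why the original $project repeated a single id: in the shell, ObjectId() is an ordinary JavaScript call, so it runs once on the client while the pipeline object is being built, and the server only ever receives that one constant. The sketch below imitates this in plain JavaScript. makeId and the sample rows are hypothetical stand-ins for illustration, not MongoDB APIs.

```javascript
// Hypothetical stand-in for ObjectId(): returns a new value on every call.
let counter = 0;
const makeId = () => `id-${++counter}`;

// Sample rows shaped like the output of the $unwind stage above.
const unwound = [
  { _id: "Tonometry", Questions: { MappingId: "Exophoria" } },
  { _id: "Tonometry", Questions: { MappingId: "Heterophoria" } },
];

// The stage spec is an object literal, so makeId() runs once, right here,
// before any document is processed (as ObjectId() does in the shell).
const projectSpec = { id: makeId() };
const projected = unwound.map((doc) => ({ ...doc, ...projectSpec }));

// Rough equivalent of $replaceRoot + $mergeObjects: build each new root
// per document and leave _id out, so the target collection can assign it.
const replaced = unwound.map((doc) => ({ Section: doc._id, ...doc.Questions }));

console.log(projected.map((d) => d.id)); // both entries carry the same id
console.log(replaced);
```

Because the id is a constant by the time the pipeline is sent, no change inside $project makes it vary per document. Omitting _id and letting the insert assign it avoids the problem entirely.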

Related

MongoDB group by array subfield

Hello, I am new to MongoDB and I hope you can help me with this question.
My collection will look like this:
{
"_id": { "$oid": "5f1fd47..." },
"email":"c#c.com",
"materials": [
{
"_id": { "$oid": "5f1fda2..." },
"title": "MDF 18mm Blanco",
"id": "mdf18blanco",
"thickness": "18",
"family": "MDF",
"color": ""
}, ...
// other materials with different families
],
}
I did an aggregate like this:
{ "$match" : { "email" : "c#c.com" } },
{ "$unwind" : "$materials" },
{ "$group" : { "_id" : "$_id", "list" : { "$push" : "$materials.family" } } }
and it returns this:
{
"_id" : ObjectId("5f1fd47d502e00051c673dd1"),
"list" : [
"MDF",
"MDF",
"MDF",
"Melamina",
"Melamina",
"Melamina",
"Melamina",
"MDF",
"Melamina",
"Aglomerado",
"Aglomerado"
]
}
but I need to get this:
{
"_id" : ObjectId("5f1fd47d502e00051c673dd1"),
"list" : [
"MDF",
"Melamina",
"Aglomerado"
]
}
I hope you understand my question and can help me, thank you very much.
All you need to do is use $addToSet instead of $push in your group stage:
{ "$group" : { "_id" : "$_id", "list" : { "$addToSet" : "$materials.family" } } }
One thing to note: unlike $push, $addToSet does not guarantee any particular order, in case that matters to you.
You only need to change $push to $addToSet.
A set does not contain repeated values, so this works.
db.collection.aggregate([
{
"$match": {
"email": "c#c.com"
}
},
{
"$unwind": "$materials"
},
{
"$group": {
"_id": "$_id",
"list": {
"$addToSet": "$materials.family"
}
}
}
])
Mongo Playground example
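To make the difference between the two accumulators concrete, here is a small plain-JavaScript sketch (not a MongoDB call) that treats $push as "keep every value" and $addToSet as "keep each distinct value once". The input is the list from the question's output.

```javascript
// Values of materials.family for one document, as returned by $push in the question.
const families = [
  "MDF", "MDF", "MDF", "Melamina", "Melamina", "Melamina",
  "Melamina", "MDF", "Melamina", "Aglomerado", "Aglomerado",
];

// $push semantics: every value is kept, duplicates included.
const pushed = [...families];

// $addToSet semantics: each distinct value is kept once.
// A JavaScript Set happens to preserve insertion order; MongoDB makes no such promise.
const asSet = [...new Set(families)];

console.log(pushed.length); // 11
console.log(asSet.length);  // 3
```

If a stable order matters, sort the resulting array afterwards rather than relying on the order $addToSet happens to return.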

$elemMatch for filtering out referenced($ref) array objects in mongodb is not working

I have 2 collections, student_details and subject_details. Each student can have multiple subjects, which I store in the student_details collection as an array of references.
Now I need to fetch the student details along with only those subjects where subject_details.status is ACTIVE.
How can I achieve this using $elemMatch on $ref objects?
I was using something like the query below, but it does not return any records.
db.getCollection('student_details').find( { subjects: { $elemMatch: { $ref: "subject_details", status: 'ACTIVE' }}})
student_details
================
{
"_id" : "STD-1",
"name" : "XYZ",
"subjects" : [
{
"$ref" : "subject_details",
"$id" : "SUB-1"
},
{
"$ref" : "subject_details",
"$id" : "SUB-2"
},
{
"$ref" : "subject_details",
"$id" : "SUB-3"
}
]
}
subject_details
===============
{
"_id" : "SUB-1",
"name" : "MATHEMATICS",
"status" : "ACTIVE"
}
{
"_id" : "SUB-2",
"name" : "PHYSICS",
"status" : "ACTIVE"
}
{
"_id" : "SUB-3",
"name" : "CHEMISTRY",
"status" : "INACTIVE"
}
DBRefs are troublesome when used in lookups, but you can work around that with the following aggregation pipeline:
db.student_details.aggregate([
{
$unwind: "$subjects"
},
{
$set: {
"fk": {
$arrayElemAt: [{
$objectToArray: "$subjects"
}, 1]
}
}
},
{
$lookup: {
"from": "subject_details",
"localField": "fk.v",
"foreignField": "_id",
"as": "subject"
}
},
{
$match: {
"subject.status": "ACTIVE"
}
},
{
$group: {
"_id": "$_id",
"name": {
$first: "$name"
},
"subjects": {
$push: {
$arrayElemAt: ["$subject", 0]
}
}
}
}
])
The resulting object would look like this:
{
"_id": "STD-1",
"name": "XYZ",
"subjects": [
{
"_id": "SUB-1",
"name": "MATHEMATICS",
"status": "ACTIVE"
},
{
"_id": "SUB-2",
"name": "PHYSICS",
"status": "ACTIVE"
}
]
}
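The least obvious step in that pipeline is the $set stage, which pulls the referenced id out of each DBRef. A plain-JavaScript sketch of what $objectToArray followed by $arrayElemAt does to one array element is shown below. It assumes, as the pipeline does, that $ref is stored before $id; the name-based lookup at the end is an illustrative alternative, not part of the answer.

```javascript
// One element of the subjects array; key order ($ref, then $id) matches the stored documents.
const dbref = { $ref: "subject_details", $id: "SUB-1" };

// $objectToArray turns a document into [{ k, v }, ...] in field order.
const asPairs = Object.entries(dbref).map(([k, v]) => ({ k, v }));

// $arrayElemAt: [..., 1] picks the second pair, which is the $id entry here.
const fk = asPairs[1];
console.log(fk.k, fk.v); // $id SUB-1

// Illustrative alternative: look the pair up by key instead of by position.
const fkByName = asPairs.find((pair) => pair.k === "$id");
console.log(fkByName.v); // SUB-1
```

The $lookup stage then joins on fk.v, which is why localField is "fk.v" rather than "subjects.$id".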
Because they are in 2 collections, you need $lookup to bring them together, and before that I believe you need to $unwind the subjects array. This is air code, so treat it as general advice rather than a full answer; the aggregation pipeline is designed to do this kind of thing in stages.
I am assuming you abbreviated the schema for the post. If subject_details really has just 3 fields, you would be better served in the NoSQL world by embedding that information in student_details and using 1 collection, rather than taking a normalized relational approach.

Fetch distinct values from Mongo DB nested array and output to a single array

Given below is my data in MongoDB. I want to fetch all the unique ids from the articles field, which is nested under the jnlc_subjects field. The result should contain only the articles array with distinct ObjectIds.
Mongo Data
{
"_id" : ObjectId("5c9216f1a21a4a31e0c7fa56"),
"jnlc_journal_category" : "Biology",
"jnlc_subjects" : [
{
"subject" : "Conservation Biology",
"views" : "123",
"articles" : [
ObjectId("5c4e93d0135edb6812200d5f"),
ObjectId("5c4e9365135edb6a12200d60"),
ObjectId("5c4e93a8135edb6912200d61")
]
},
{
"subject" : "Micro Biology",
"views" : "20",
"articles" : [
ObjectId("5c4e9365135edb6a12200d60"),
ObjectId("5c4e93d0135edb6812200d5f"),
ObjectId("5c76323fbaaccf5e0bae7600"),
ObjectId("5ca33ce19d677bf780fc4995")
]
},
{
"subject" : "Marine Biology",
"views" : "8",
"articles" : [
ObjectId("5c4e93d0135edb6812200d5f")
]
}
]
}
Required result
I want to get output in following format
articles : [
ObjectId("5c4e9365135edb6a12200d60"),
ObjectId("5c4e93a8135edb6912200d61"),
ObjectId("5c76323fbaaccf5e0bae7600"),
ObjectId("5ca33ce19d677bf780fc4995"),
ObjectId("5c4e93d0135edb6812200d5f")
]
Try as below:
db.collection.aggregate([
{
$unwind: "$jnlc_subjects"
},
{
$unwind: "$jnlc_subjects.articles"
},
{ $group: {_id: null, uniqueValues: { $addToSet: "$jnlc_subjects.articles"}} }
])
Result:
{
"_id" : null,
"uniqueValues" : [
ObjectId("5ca33ce19d677bf780fc4995"),
ObjectId("5c4e9365135edb6a12200d60"),
ObjectId("5c4e93a8135edb6912200d61"),
ObjectId("5c4e93d0135edb6812200d5f"),
ObjectId("5c76323fbaaccf5e0bae7600")
]
}
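The shape of this pipeline (flatten two array levels, then collect distinct values) can be sketched in plain JavaScript as follows. The short labels a1 to a5 are hypothetical stand-ins for the five distinct ObjectIds in the sample document. Strings are used because a JavaScript Set compares objects by reference, so it would not deduplicate real ObjectId instances.

```javascript
// Shortened ids standing in for the ObjectIds in the sample document (hypothetical labels).
const journal = {
  jnlc_subjects: [
    { subject: "Conservation Biology", articles: ["a1", "a2", "a3"] },
    { subject: "Micro Biology", articles: ["a2", "a1", "a4", "a5"] },
    { subject: "Marine Biology", articles: ["a1"] },
  ],
};

// Two $unwind stages: one output row per (subject, article) pair.
const rows = journal.jnlc_subjects.flatMap((s) =>
  s.articles.map((article) => ({ subject: s.subject, article }))
);

// $group with _id: null and $addToSet: one set across all rows.
const uniqueValues = [...new Set(rows.map((r) => r.article))];

console.log(rows.length);         // 8
console.log(uniqueValues.length); // 5
```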
Try with this
db.collection.aggregate([
{
$unwind:{
path:"$jnlc_subjects",
preserveNullAndEmptyArrays:true
}
},
{
$unwind:{
path:"$jnlc_subjects.articles",
preserveNullAndEmptyArrays:true
}
},
{
$group:{
_id:"$_id",
articles:{
$addToSet:"$jnlc_subjects.articles"
}
}
}
])
If you don't want to $group by _id, you can use null instead of "$_id".
Based on the description in the question, try the following aggregate operation:
db.collection.aggregate(
// Pipeline
[
// Stage 1
{
$match: {
"_id": ObjectId("5c9216f1a21a4a31e0c7fa56")
}
},
// Stage 2
{
$unwind: {
path: "$jnlc_subjects",
}
},
// Stage 3
{
$unwind: {
path: "$jnlc_subjects.articles"
}
},
// Stage 4
{
$group: {
_id: null,
articles: {
$addToSet: '$jnlc_subjects.articles'
}
}
},
// Stage 5
{
$project: {
articles: 1,
_id: 0
}
},
]
);

How to get an object nested 3 levels deep in arrays with a Mongo query?

Basically the structure is :
{
"_id" : ObjectId("123123"),
"stores" : [
{
"messages" : [
{
"updated_time" : "2018-05-15T05:12:25+0000",
"message_count" : 4,
"thread_id" : "123",
"messages" : [
{
"message" : "Hi User ",
"created_time" : "2018-05-15T05:12:25+0000",
"message_id" : "111",
},
{
"message" : "This is tes",
"created_time" : "2018-05-15T05:12:21+0000",
"message_id" : "222",
}
]
},
],
"store_id" : "123"
}
]
}
I have the values below and want to get the object with message_id 111. How can I get this object? Any idea or help will be appreciated. Thanks.
store_id: 123,
thread_id:123,
message_id:111
The simplest way would be to $unwind all the nested arrays and then use $match to get single document. You can also add $replaceRoot to get only nested document. Try:
db.collection.aggregate([
{ $unwind: "$stores" },
{ $unwind: "$stores.messages" },
{ $unwind: "$stores.messages.messages" },
{ $match: { "stores.store_id": "123", "stores.messages.thread_id": "123", "stores.messages.messages.message_id": "111" } },
{ $replaceRoot: { newRoot: "$stores.messages.messages" } }
])
Prints:
{
"created_time": "2018-05-15T05:12:25+0000",
"message": "Hi User ",
"message_id": "111"
}
To improve the performance you can use $match after every $unwind to filter out unnecessary data as soon as possible, try:
db.collection.aggregate([
{ $unwind: "$stores" },
{ $match: { "stores.store_id": "123" } },
{ $unwind: "$stores.messages" },
{ $match: { "stores.messages.thread_id": "123" } },
{ $unwind: "$stores.messages.messages" },
{ $match: { "stores.messages.messages.message_id": "111" } },
{ $replaceRoot: { newRoot: "$stores.messages.messages" } }
])
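The "filter as early as possible" idea in the second pipeline can be sketched in plain JavaScript: narrow the candidates at each level before descending to the next, so inner arrays of non-matching stores and threads are never expanded. The document below is a trimmed copy of the one in the question.

```javascript
// Trimmed copy of the document from the question.
const storeDoc = {
  stores: [
    {
      store_id: "123",
      messages: [
        {
          thread_id: "123",
          messages: [
            { message: "Hi User ", message_id: "111" },
            { message: "This is tes", message_id: "222" },
          ],
        },
      ],
    },
  ],
};

// Filter at each level before descending, like a $match after each $unwind.
const found = storeDoc.stores
  .filter((store) => store.store_id === "123")
  .flatMap((store) => store.messages)
  .filter((thread) => thread.thread_id === "123")
  .flatMap((thread) => thread.messages)
  .filter((msg) => msg.message_id === "111");

console.log(found); // a single message object, the one with message_id "111"
```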

Selecting Distinct values from Array in MongoDB

I have a collection named Alpha_Num with the following structure. I am trying to find out which Alphabet-Numerals pair appears the maximum number of times.
Going by the data below, the pair abcd-123 appears twice, and so does the pair efgh-10001, but the second is not a valid case for me because both occurrences are in the same document.
{
"_id" : 12345,
"Alphabet" : "abcd",
"Numerals" : [
"123",
"456",
"2345"
]
}
{
"_id" : 123456,
"Alphabet" : "efgh",
"Numerals" : [
"10001",
"10001",
"1002"
]
}
{
"_id" : 123456567,
"Alphabet" : "abcd",
"Numerals" : [
"123"
]
}
I tried to use the aggregation framework, something like below:
db.Alpha_Num.aggregate([
{"$unwind":"$Numerals"},
{"$group":
{"_id":{"Alpha":"$Alphabet","Num":"$Numerals"},
"count":{$sum:1}}
},
{"$sort":{"count":-1}}
])
The problem with this query is that it counts the pair efgh-10001 twice.
Question: How do I select distinct values from the array "Numerals" in the above condition?
Problem solved.
db.Alpha_Num.aggregate([{
"$unwind": "$Numerals"
}, {
"$group": {
_id: {
"_id": "$_id",
"Alpha": "$Alphabet"
},
Num: {
$addToSet: "$Numerals"
}
}
}, {
"$unwind": "$Num"
}, {
"$group": {
_id: {
"Alpha": "$_id.Alpha",
"Num": "$Num"
},
count: {
"$sum": 1
}
}
}])
Grouping with $addToSet and then unwinding again did the trick. I got the answer from one of the 10gen online courses.
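The two-pass idea (deduplicate within each document first, then count across documents) can be checked with a plain-JavaScript sketch over the three sample documents. The final sort is an addition for illustration; it corresponds to a trailing { $sort: { count: -1 } } stage, which the pipeline above does not include.

```javascript
// The three sample documents from the question.
const alphaNum = [
  { _id: 12345, Alphabet: "abcd", Numerals: ["123", "456", "2345"] },
  { _id: 123456, Alphabet: "efgh", Numerals: ["10001", "10001", "1002"] },
  { _id: 123456567, Alphabet: "abcd", Numerals: ["123"] },
];

const counts = new Map();
for (const doc of alphaNum) {
  // First $unwind + $group with $addToSet: distinct numerals within one document.
  for (const num of new Set(doc.Numerals)) {
    // Second $unwind + $group: each pair is counted once per document.
    const key = `${doc.Alphabet}-${num}`;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
}

// Illustrative ranking step, highest count first.
const ranked = [...counts.entries()].sort((a, b) => b[1] - a[1]);
console.log(ranked[0]); // [ 'abcd-123', 2 ]
```

With the duplicate inside the second document collapsed, efgh-10001 counts once and abcd-123 is the only pair that appears in two documents.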