How to merge mongo inline documents - mongodb

[{
"field1" : "1",
"field2" : [
{
"f1" : "a",
"f2" : "b"
},
{
"f1" : "aa",
"f2" : "bb"
}
]
},
{
"field1" : "2",
"field2" : [
{
"f1" : "c",
"f2" : "d"
},
{
"f1" : "cc",
"f2" : "dd"
}
]
}]
I want to extract the field2 arrays from all documents and merge their elements into a single flat list of documents, in the following format:
{
"f1" : "a",
"f2" : "b"
},
{
"f1" : "aa",
"f2" : "bb"
},
{
"f1" : "c",
"f2" : "d"
},
{
"f1" : "cc",
"f2" : "dd"
}

For input data:
[
{
"field1" : "1",
"field2" : [
{
"f1" : "a",
"f2" : "b"
},
{
"f1" : "aa",
"f2" : "bb"
}
]
}
,
{
"field1" : "2",
"field2" : [
{
"f1" : "c",
"f2" : "d"
},
{
"f1" : "cc",
"f2" : "dd"
}
]
}
]
use aggregation:
[
{
"$unwind": "$field2"
},
{
"$group": {
"_id": "$field2"
}
},
{
"$replaceRoot": { "newRoot": "$_id" }
}
]
to produce:
[
{
"f1": "c",
"f2": "d"
},
{
"f1": "cc",
"f2": "dd"
},
{
"f1": "aa",
"f2": "bb"
},
{
"f1": "a",
"f2": "b"
}
]

I think there is no need to group. Just use $unwind and $replaceRoot:
db.collection.aggregate([
{"$unwind": "$field2"},
{"$replaceRoot": { "newRoot": "$field2" }}
])
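The two pipelines differ in one respect: grouping on the whole subdocument also collapses identical elements, whereas $unwind followed by $replaceRoot keeps every element. A minimal plain-JavaScript sketch of that difference, run outside MongoDB, is shown here. The helper names unwindField2 and dedupe are hypothetical, the sample data deliberately contains one duplicate, and JSON.stringify is only an approximation of how $group compares subdocuments.

```javascript
// Sample data with one duplicated element, to make the difference visible.
const docs = [
  { field1: "1", field2: [{ f1: "a", f2: "b" }, { f1: "aa", f2: "bb" }] },
  { field1: "2", field2: [{ f1: "a", f2: "b" }, { f1: "cc", f2: "dd" }] }
];

// Models $unwind + $replaceRoot: one output document per array element.
function unwindField2(input) {
  return input.flatMap(doc => doc.field2);
}

// Models the extra $group stage: keep the first occurrence of each distinct element.
function dedupe(elements) {
  const seen = new Set();
  return elements.filter(el => {
    const key = JSON.stringify(el);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

console.log(unwindField2(docs).length);         // 4: the duplicate is kept
console.log(dedupe(unwindField2(docs)).length); // 3: the duplicate is removed
```

If duplicates must be preserved, the shorter pipeline is the one to use; if they must be removed, the $group stage is doing real work.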

let fakeArray = [
{
"field1" : "1",
"field2" : [
{
"f1" : "a",
"f2" : "b"
},
{
"f1" : "aa",
"f2" : "bb"
}
]
},
{
"field1" : "2",
"field2" : [
{
"f1" : "c",
"f2" : "d"
},
{
"f1" : "cc",
"f2" : "dd"
}
]
}
];
let arr = fakeArray.map(el => {
return el.field2;
});
console.log(arr.flat(1));
You can easily do this on the client with JavaScript features like map and flat.

You can try using a $unwind and a $project within an aggregation query:
db.test.insertMany([
{ "field1" : "1", "field2" : [ { "f1" : "a", "f2" : "b" }, { "f1" : "aa", "f2" : "bb" } ] },
{ "field1" : "2", "field2" : [ { "f1" : "c", "f2" : "d" }, { "f1" : "cc", "f2" : "dd" } ] }
]);
db.test.aggregate([
{$unwind: "$field2"},
{$project: {
"f1": "$field2.f1",
"f2": "$field2.f2"
}}
]);
This will give you the following output:
{ "_id" : ObjectId("5c8b70474c52159bf9357567"), "f1" : "a", "f2" : "b" }
{ "_id" : ObjectId("5c8b70474c52159bf9357567"), "f1" : "aa", "f2" : "bb" }
{ "_id" : ObjectId("5c8b70474c52159bf9357568"), "f1" : "c", "f2" : "d" }
{ "_id" : ObjectId("5c8b70474c52159bf9357568"), "f1" : "cc", "f2" : "dd" }
https://play.db-ai.co/m/XItxEAgjhgAB8iV6

Related

Remove a value from a nested array in MongoDB

How can I remove "None" from the c arrays?
{
"_id" : ObjectId("someOBjectId"),
"key1" : "some Value",
"key2" : [
{
"a" : "A",
"b" : 5,
"c" : ["None", "some Value"]
},
{
"a" : "dsf",
"b" : 6,
"c" : ["None"]
},
{
"a" : "sf",
"b" : 7,
"c" : [ "some Value", "None"]
},
]
}
How can I remove "None" while updating this document, so that it ends up like this?
{
"_id" : ObjectId("someOBjectId"),
"key1" : "some Value",
"key2" : [
{
"a" : "A",
"b" : 5,
"c" : ["some Value"]
},
{
"a" : "dsf",
"b" : 6,
"c" : []
},
{
"a" : "sf",
"b" : 7,
"c" : [ "some Value"]
},
]
}
Use $[] to select all elements of the key2 array and $pull to remove the matching value from each c array ($[] requires MongoDB 3.6 or later):
db.collection.updateMany({
"key2.c": "None"
},
{
$pull: {
"key2.$[].c": "None"
}
})
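To make the effect of the update concrete, here is a plain-JavaScript sketch of what $pull combined with the all-positional operator does to a single document. It runs outside MongoDB, the function name pullFromAllC is hypothetical, and the _id field is left out for brevity.

```javascript
// Returns a copy of the document with `value` removed from every key2[i].c array.
function pullFromAllC(doc, value) {
  return {
    ...doc,
    key2: doc.key2.map(item => ({
      ...item,
      c: item.c.filter(entry => entry !== value)
    }))
  };
}

const before = {
  key1: "some Value",
  key2: [
    { a: "A", b: 5, c: ["None", "some Value"] },
    { a: "dsf", b: 6, c: ["None"] },
    { a: "sf", b: 7, c: ["some Value", "None"] }
  ]
};

const after = pullFromAllC(before, "None");
// The three c arrays become ["some Value"], [] and ["some Value"].
console.log(JSON.stringify(after.key2.map(item => item.c)));
```

Note that an element whose c array contained only "None" keeps an empty c array; it is not removed from key2.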

MongoDB data grouping with aggregation

I have data collection like this:
> db.LogBuff.find()
{ "_id" : ObjectId("578899d5d2b76f77d083f16c"), "SUBJECT" : "DD", "SYS" : "A" }
{ "_id" : ObjectId("578899d5d2b76f77d083f16d"), "SUBJECT" : "AA", "SYS" : "B" }
{ "_id" : ObjectId("578899d5d2b76f77d083f16e"), "SUBJECT" : "BB", "SYS" : "A" }
{ "_id" : ObjectId("578899d5d2b76f77d083f16f"), "SUBJECT" : "AA", "SYS" : "C" }
{ "_id" : ObjectId("578899d5d2b76f77d083f170"), "SUBJECT" : "BB", "SYS" : "A" }
{ "_id" : ObjectId("578899d5d2b76f77d083f171"), "SUBJECT" : "BB", "SYS" : "A" }
{ "_id" : ObjectId("578899d5d2b76f77d083f172"), "SUBJECT" : "CC", "SYS" : "B" }
And I want to extract output as below (with distinct "SYS" values)
{"SUBJECT" : "AA", "SYS" : ["A","B","C","D"]}
{"SUBJECT" : "BB", "SYS" : ["A","B","C","D"]}
{"SUBJECT" : "CC", "SYS" : ["A","B","C"]}
Here is my code, but I am stuck halfway. Please help me sort this out:
db.LogBuff.aggregate([{
"$unwind": "$SYS"
}, {
"$group": {
_id: {
"_id": "$SUBJECT"
},
SYST: {
$addToSet: "$SYS"
}
}
}, {
"$unwind": "$SYST"
}, {
"$group": {
_id: {
"SUBJECT": "$_id",
"SYST":"$SYST"
}
}
}])
Just group by SUBJECT and use $addToSet to collect the SYS values:
db.LogBuff.aggregate([
{
"$group": {
_id: {
"_id": "$SUBJECT"
},
SYST: {
$addToSet: "$SYS"
}
}
}
])
There is no need for $unwind, because SYS is not an array; a single $group should get you the desired result.
Result of the group aggregation on your example data:
{ "_id" : { "_id" : "CC" }, "SYST" : [ "B" ] }
{ "_id" : { "_id" : "BB" }, "SYST" : [ "A" ] }
{ "_id" : { "_id" : "AA" }, "SYST" : [ "C", "B" ] }
{ "_id" : { "_id" : "DD" }, "SYST" : [ "A" ] }
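For readers who want to check the grouping logic outside the database, the same idea can be sketched in plain JavaScript with a Map of Sets. The function name groupSysBySubject is hypothetical, and the output uses the SUBJECT/SYS field names from the question rather than the _id/SYST shape the pipeline returns. The SYS values are sorted here only to make the output deterministic; $addToSet itself does not guarantee an order.

```javascript
// Collects the distinct SYS values for each SUBJECT, like $group with $addToSet.
function groupSysBySubject(rows) {
  const bySubject = new Map();
  for (const { SUBJECT, SYS } of rows) {
    if (!bySubject.has(SUBJECT)) bySubject.set(SUBJECT, new Set());
    bySubject.get(SUBJECT).add(SYS); // a Set ignores repeated values
  }
  return [...bySubject].map(([SUBJECT, sys]) => ({ SUBJECT, SYS: [...sys].sort() }));
}

// The sample rows from the question, without the _id fields.
const rows = [
  { SUBJECT: "DD", SYS: "A" },
  { SUBJECT: "AA", SYS: "B" },
  { SUBJECT: "BB", SYS: "A" },
  { SUBJECT: "AA", SYS: "C" },
  { SUBJECT: "BB", SYS: "A" },
  { SUBJECT: "BB", SYS: "A" },
  { SUBJECT: "CC", SYS: "B" }
];

console.log(groupSysBySubject(rows));
```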

Aggregation on the basis of the set of nested docs

Let's say I have the next 5 docs:
{ "_id" : "1", "student" : "Oscar", "courses" : [ "A", "B" ] }
{ "_id" : "2", "student" : "Alan", "courses" : [ "A", "B", "C" ] }
{ "_id" : "3", "student" : "Kate", "courses" : [ "A", "B", "D" ] }
{ "_id" : "4", "student" : "John", "courses" : [ "A", "B", "C" ] }
{ "_id" : "5", "student" : "Bema", "courses" : [ "A", "B" ] }
I want to manipulate the collection so that it will return a group of students (with their _id) by set (combination) of courses they take and calculate how many students in each set.
In the example above I have 3 set (combination) of courses and number of students as below:
1 - [ "A", "B" ] <- 2 students take this combination
2 - [ "A", "B", "C" ] <- 2 students
3 - [ "A", "B", "D" ] <- 1 student
I feel like this is more like MapReduce task rather than Aggregation...not sure...
UPDATE 1
Thanks a lot to @ExplosionPills.
So the following aggregation command:
db.students.aggregate([{
$group: {
_id: "$courses",
count: {$sum: 1},
students: {$push: "$_id"}
}
}])
gives me the following output:
{ "_id" : [ "A", "B", "D" ], "count" : 1, "students" : [ "3" ] }
{ "_id" : [ "A", "B", "C" ], "count" : 2, "students" : [ "2", "4" ] }
{ "_id" : [ "A", "B" ], "count" : 2, "students" : [ "1", "5" ] }
It groups by set of courses and returns, for each set, the number of students who take it and their _ids.
UPDATE 2
I found out that the aggregation above treats the combination [ "C", "A", "B" ] as different from [ "A", "B", "C" ], but I need these two to count as the same.
So let's look at the following documents:
{ "_id" : "1", "student" : "Oscar", "courses" : [ "A", "B" ] }
{ "_id" : "2", "student" : "Alan", "courses" : [ "A", "B", "C" ] }
{ "_id" : "3", "student" : "Kate", "courses" : [ "A", "B", "D" ] }
{ "_id" : "4", "student" : "John", "courses" : [ "A", "B", "C" ] }
{ "_id" : "5", "student" : "Bema", "courses" : [ "A", "B" ] }
{ "_id" : "6", "student" : "Alex", "courses" : [ "C", "A", "B" ] }
Let's see this in output:
{ "_id" : [ "C", "A", "B" ], "count" : 1, "students" : [ "6" ] }
{ "_id" : [ "A", "B", "D" ], "count" : 1, "students" : [ "3" ] }
{ "_id" : [ "A", "B", "C" ], "count" : 2, "students" : [ "2", "4" ] }
{ "_id" : [ "A", "B" ], "count" : 2, "students" : [ "1", "5" ] }
See lines 1 and 3: this is not what I wanted.
So, to treat [ "C", "A", "B" ] and [ "A", "B", "C" ] as the same combination, I changed the aggregation as follows:
db.students.aggregate([
{$unwind: "$courses" },
{$sort : {"courses": 1}},
{$group: {_id: "$_id", courses: {$push: "$courses"}}},
{$group: {_id: "$courses", count: {$sum:1}, students: {$push: "$_id"}}}
])
Output:
{ "_id" : [ "A", "B", "D" ], "count" : 1, "students" : [ "3" ] }
{ "_id" : [ "A", "B" ], "count" : 2, "students" : [ "5", "1" ] }
{ "_id" : [ "A", "B", "C" ], "count" : 3, "students" : [ "6", "4", "2" ] }
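The key step in the pipeline above is normalizing each courses array (by sorting it) before grouping on it. A plain-JavaScript sketch of that normalization, run outside MongoDB, might look like this. The function name groupByCourseSet is hypothetical, and joining the sorted names with "|" assumes that no course name contains that character.

```javascript
// Groups students by their set of courses, ignoring the order of the courses.
function groupByCourseSet(students) {
  const groups = new Map();
  for (const { _id, courses } of students) {
    const sorted = [...courses].sort(); // copy first, so the input is not mutated
    const key = sorted.join("|");       // assumes no course name contains "|"
    if (!groups.has(key)) groups.set(key, { _id: sorted, count: 0, students: [] });
    const group = groups.get(key);
    group.count += 1;
    group.students.push(_id);
  }
  return [...groups.values()];
}

// The six sample documents from the question.
const students = [
  { _id: "1", student: "Oscar", courses: ["A", "B"] },
  { _id: "2", student: "Alan", courses: ["A", "B", "C"] },
  { _id: "3", student: "Kate", courses: ["A", "B", "D"] },
  { _id: "4", student: "John", courses: ["A", "B", "C"] },
  { _id: "5", student: "Bema", courses: ["A", "B"] },
  { _id: "6", student: "Alex", courses: ["C", "A", "B"] }
];

console.log(groupByCourseSet(students));
```

With the sample data, Alex (courses C, A, B) lands in the same group as Alan and John, which is the behavior asked for in UPDATE 2.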
This is an aggregate operation using grouping.
db.students.aggregate([{
$group: {
// Group key: documents with an identical courses array fall into one group.
// The "$courses" syntax refers to the value of that field.
_id: "$courses",
// Add 1 for each document in the group (effectively a counter)
count: {$sum: 1}
}
}]);
EDIT:
If the courses can be in any order, you can $unwind, $sort, and $group again as suggested in the edited question. It's also possible to do this via mapReduce, but I'm not sure which is faster.
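db.students.mapReduce(
function () {
// Use the sorted courses as the key.
// Emit a value with the same shape that reduce returns.
emit(this.courses.sort(), {students: [this._id], count: 1});
},
function (key, values) {
// Merge partial results; reduce may run several times per key,
// and is not called at all for a key with a single value.
var merged = {students: [], count: 0};
values.forEach(function (v) {
merged.students = merged.students.concat(v.students);
merged.count += v.count;
});
return merged;
},
{out: {inline: 1}}
)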

Count and group by with mongo db

I'm facing a problem with MongoDB.
I need to display some statistics:
- A treatment is a record that contains a date, the user who handled it, and a list of anomalies.
Can you help me with the query to get:
"The number of anomalies per user"?
Thanks!
db.treatment.aggregate(
{
$group : {_id : "$anomalies", totalUser : { $sum : 1 }}
}
);
Note: change the collection and field names if I got them wrong.
Source : http://www.mkyong.com/mongodb/mongodb-aggregate-and-group-example/
So, if your collection had the following documents:
> db.treatments.find()
{ "_id" : 1, "date" : ISODate("2014-08-29T15:44:45.843Z"), "user" : "A", "anomalies" : [ "a", "b", "c" ] }
{ "_id" : 2, "date" : ISODate("2014-08-29T15:45:01.782Z"), "user" : "A", "anomalies" : [ "e", "f", "g" ] }
{ "_id" : 3, "date" : ISODate("2014-08-29T15:45:34.889Z"), "user" : "B", "anomalies" : [ "a", "b", "c", "e", "f", "g" ] }
{ "_id" : 4, "date" : ISODate("2014-08-29T15:48:01.860Z"), "user" : "B", "anomalies" : [ "a", "b", "c", "e", "f", "g" ] }
{ "_id" : 5, "date" : ISODate("2014-08-29T15:48:28.937Z"), "user" : "A", "anomalies" : [ "x", "y", "z" ] }
You can use a $group stage to $sum the $size of the anomalies array for each user:
> db.treatments.aggregate([ { $group: { _id: "$user", allAnomalies: { $sum: { $size: "$anomalies" } } } } ] )
{ "_id" : "B", "allAnomalies" : 12 }
{ "_id" : "A", "allAnomalies" : 9 }
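The arithmetic behind those totals can be checked with a short plain-JavaScript sketch that mirrors $group with $sum of $size. It runs outside MongoDB, the function name countAnomaliesByUser is hypothetical, and the date fields are omitted because they do not affect the count.

```javascript
// Sums the length of the anomalies array per user.
function countAnomaliesByUser(treatments) {
  const totals = {};
  for (const { user, anomalies } of treatments) {
    totals[user] = (totals[user] || 0) + anomalies.length;
  }
  return totals;
}

// The five sample treatments from above.
const treatments = [
  { _id: 1, user: "A", anomalies: ["a", "b", "c"] },
  { _id: 2, user: "A", anomalies: ["e", "f", "g"] },
  { _id: 3, user: "B", anomalies: ["a", "b", "c", "e", "f", "g"] },
  { _id: 4, user: "B", anomalies: ["a", "b", "c", "e", "f", "g"] },
  { _id: 5, user: "A", anomalies: ["x", "y", "z"] }
];

console.log(countAnomaliesByUser(treatments)); // { A: 9, B: 12 }
```

This counts every anomaly entry, so an anomaly that appears in two treatments of the same user is counted twice. Counting distinct anomalies per user would need a different approach.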

Projections and indexes on 2D nested arrays in MongoDB

> db.foo.save({'foo': [{f0: 'a', f1: 'b'}, {f0: 'c', f1: 'd'}]})
> db.foo.save({'foo': [{f0: 'a', f1: 'e'}, {f0: 'f', f1: 'g'}]})
> db.foo.save({'foo': [['a', 'b'], ['c', 'd']]})
> db.foo.save({'foo': [['a', 'e'], ['f', 'g']]})
> db.foo.find({}, {'foo.f1': 1})
{ "_id" : ObjectId("52dddf7cbeb971f4081ea48a"), "foo" : [ { "f1" : "b" }, { "f1" : "d" } ] }
{ "_id" : ObjectId("52dddf83beb971f4081ea48b"), "foo" : [ { "f1" : "e" }, { "f1" : "g" } ] }
{ "_id" : ObjectId("52dddf88beb971f4081ea48c"), "foo" : [ [ ], [ ] ] }
{ "_id" : ObjectId("52dddf8dbeb971f4081ea48d"), "foo" : [ [ ], [ ] ] }
> db.foo.find({}, {'foo.1': 1})
{ "_id" : ObjectId("52dddf7cbeb971f4081ea48a"), "foo" : [ { }, { } ] }
{ "_id" : ObjectId("52dddf83beb971f4081ea48b"), "foo" : [ { }, { } ] }
{ "_id" : ObjectId("52dddf88beb971f4081ea48c"), "foo" : [ [ ], [ ] ] }
{ "_id" : ObjectId("52dddf8dbeb971f4081ea48d"), "foo" : [ [ ], [ ] ] }
I have a couple of questions related to nested arrays like this. Note that virtually all the SO questions with "nested array" in the title actually refer to single arrays nested in the root document, not 2D nested arrays, so as far as I can tell this isn't a duplicate.
Is there any way to perform a projection, as above, on 2D nested arrays?
How would I create an index on the 2nd element of the arrays in the foo array? Again, presumably foo.1 wouldn't work.
I know the Right Answer (TM) is to Not Do That And Use An Array Of Subdocuments, Dummy (NDTAUAAOSD) but a) curiosity - I can't seem to find an answer and b) unfortunately, circumstances beyond my control dictate the document structure.
UPDATE: Clarification of what I'd want to see from the projections:
db.foo.find({}, {'foo.1': 1})
{ "_id" : ObjectId("52dddf88beb971f4081ea48c"), "foo" : [ ['b'], ['d'] ] }
{ "_id" : ObjectId("52dddf8dbeb971f4081ea48d"), "foo" : [ ['e'], ['g'] ] }
Basically slicing across the inner arrays.
You can use the positional $ operator for projections:
http://docs.mongodb.org/manual/reference/operator/projection/positional/
While I'm not entirely sure what your query is trying to project, here's an example:
> db.foo.find()
{ "_id" : ObjectId("5321ac073ac852396029fb90"), "foo" : [ { "f0" : "a", "f1" : "b" }, { "f0" : "c", "f1" : "d" } ] }
{ "_id" : ObjectId("5321ac073ac852396029fb91"), "foo" : [ { "f0" : "a", "f1" : "e" }, { "f0" : "f", "f1" : "g" } ] }
{ "_id" : ObjectId("5321ac073ac852396029fb92"), "foo" : [ { "f0" : "a", "f1" : "b" }, { "f0" : "c", "f1" : "d" }, { "f0" : "a", "f1" : "z" } ] }
{ "_id" : ObjectId("5321ac073ac852396029fb93"), "foo" : [ [ "a", "b" ], [ "c", "d" ] ] }
{ "_id" : ObjectId("5321ac073ac852396029fb94"), "foo" : [ [ "a", "e" ], [ "f", "g" ] ] }
{ "_id" : ObjectId("5321ac073ac852396029fb95"), "foo" : [ [ "a", "e" ], [ "f", "g" ], [ "a", "z" ] ] }
// get the first array element whose "f0" field matches "a"
> db.foo.find({ "foo.f0" : "a" }, { "foo.$" : 1 })
{ "_id" : ObjectId("5321ac073ac852396029fb90"), "foo" : [ { "f0" : "a", "f1" : "b" } ] }
{ "_id" : ObjectId("5321ac073ac852396029fb91"), "foo" : [ { "f0" : "a", "f1" : "e" } ] }
{ "_id" : ObjectId("5321ac073ac852396029fb92"), "foo" : [ { "f0" : "a", "f1" : "b" } ] }
// ^ see that the last return document only returns the first array element match
// get the first inner array of foo that contains "a"
> db.foo.find({ "foo" : { "$elemMatch" : { "$in" : [ "a" ] } } }, { "foo.$" : 1 })
{ "_id" : ObjectId("5321ac073ac852396029fb93"), "foo" : [ [ "a", "b" ] ] }
{ "_id" : ObjectId("5321ac073ac852396029fb94"), "foo" : [ [ "a", "e" ] ] }
{ "_id" : ObjectId("5321ac073ac852396029fb95"), "foo" : [ [ "a", "e" ] ] }
// ^ see that the last return document only returns the first array element match
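The positional projection returns one matching element per document rather than the slice across inner arrays described in the question's update. One fallback, sketched here in plain JavaScript as client-side post-processing of documents already fetched with find(), is to map over the outer array and pick the wanted index from each inner array. The function name sliceInner is hypothetical, and the sketch assumes every element of foo is itself an array.

```javascript
// Keeps only the element at `index` from each inner array of doc.foo.
function sliceInner(doc, index) {
  return { ...doc, foo: doc.foo.map(inner => [inner[index]]) };
}

const sample = { foo: [["a", "b"], ["c", "d"]] };
console.log(JSON.stringify(sliceInner(sample, 1))); // {"foo":[["b"],["d"]]}
```

This does the slicing in the application rather than in the database, so the full arrays are still transferred; it does not address the indexing part of the question.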