I'm reviewing my MongoDB documents using Robo 3T, and I'd like to sort the keys in the document by their name.
My document might look like
{"Y":3,"X":"Example","A":{"complex_obj":{}}
and at the end I'd like the returned document to look like when I run a find query and apply a sort to it. {"A":{"complex_obj":{},"X":"Example","Y":3}
Is there a way to sort the returned keys / fields of a document? All the examples I see are for applying sort based on the value of a field, rather than the name of the key.
I'm not sure why the order of fields matters in a JSON document, but you can try the aggregation query below:
db.collection.aggregate([
{
$project: { data: { $objectToArray: "$$ROOT" } }
},
{
$unwind: "$data"
},
{
$sort: { "data.k": 1 }
},
{
$group: { _id: "_id", data: { $push: "$$ROOT.data" } }
},
{
$replaceRoot: { newRoot: { $arrayToObject: "$data" } }
},
{
$project: { _id: 0 }
}
])
Test : mongoplayground
There is a way but you won't like it. Technically you can do it with aggregation by converting objects to arrays, unwinding, sorting, grouping it back and converting the group to the object:
db.collection.aggregate([
{
$project: {
o: {
$objectToArray: "$$ROOT"
}
}
},
{
$unwind: "$o"
},
{
$sort: {
"o.k": 1
}
},
{
$group: {
_id: "$_id",
o: {
$push: "$o"
}
}
},
{
$replaceRoot: {
newRoot: {
$arrayToObject: "$o"
}
}
}
])
but you don't want to do it: too much hassle, too expensive, too little benefit.
Mongo by design preserves the order of keys as they were inserted, apart from _id and a few other edge cases.
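If the goal is just nicer display in Robo 3T, a simpler option is to reorder keys client-side after fetching, rather than in the server pipeline. A minimal sketch in plain JavaScript (the sample document is the one from the question):

```javascript
// Rebuild a document with its top-level keys in alphabetical order.
// Runs client-side (shell / Node), not inside the aggregation pipeline.
function sortKeys(doc) {
  return Object.keys(doc)
    .sort()
    .reduce(function (acc, k) {
      acc[k] = doc[k];
      return acc;
    }, {});
}

var doc = { Y: 3, X: "Example", A: { complex_obj: {} } };
var sorted = sortKeys(doc);
// Object.keys(sorted) → ["A", "X", "Y"]
```

Note this only reorders the top level; nested objects would need a recursive variant.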
Related
I would like to merge several documents. Most of the fields have the same values but there might be one or two fields that have different values. These fields are unknown beforehand. Ideally I would like to merge all the documents keeping the fields that are the same as is but creating an array of values only for those fields that have some variation.
For my first approach I grouped by a field common to my documents and kept the first document; this, however, discards information that varies in other fields.
group_documents = {
"$group": {
"_id": "$0020000E.Value",
"doc": {
"$first": "$$ROOT"
}
}
}
merge_documents = {
"$replaceRoot": {
"newRoot": "$doc"
}
}
write_collection = { "$out": { "db": "database", "coll": "records_nd" } }
pipeline = [group_documents, merge_documents, write_collection]
objects = coll.aggregate(pipeline)
If the fields that had different values were known, I would have done something like this:
merge_sol1
or
merge_sol2
or
merge_sol3
The third solution is actually very close to my desired output and I could tweak it a bit. But these answers assume a-priori knowledge of the fields to be merged.
You can first convert $$ROOT to an array of k-v tuples with $objectToArray. Then $group all fields, using $addToSet to collect the distinct values of each field into an array. Next, check the size of the resulting array and conditionally pick the first item if the size is 1 (i.e. the value is the same in every document); otherwise, keep the array. Finally, convert back to the original document form with $arrayToObject.
db.collection.aggregate([
{
$project: {
_id: "$key",
arr: {
"$objectToArray": "$$ROOT"
}
}
},
{
"$unwind": "$arr"
},
{
$match: {
"arr.k": {
$nin: [
"key",
"_id"
]
}
}
},
{
$group: {
_id: {
id: "$_id",
k: "$arr.k"
},
v: {
"$addToSet": "$arr.v"
}
}
},
{
$project: {
_id: "$_id.id",
arr: [
{
k: "$_id.k",
v: {
"$cond": {
"if": {
$gt: [
{
$size: "$v"
},
1
]
},
"then": "$v",
"else": {
$first: "$v"
}
}
}
}
]
}
},
{
"$project": {
doc: {
"$arrayToObject": "$arr"
}
}
},
{
"$replaceRoot": {
"newRoot": {
"$mergeObjects": [
{
_id: "$_id"
},
"$doc"
]
}
}
}
])
Mongo Playground
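To sanity-check what the pipeline computes, here is a rough plain-JavaScript equivalent of the merge logic (the sample field names are hypothetical, and nested values are compared by JSON serialization):

```javascript
// Merge documents: keep a field as-is when its value is identical in
// every document; collect an array of distinct values where it differs.
function mergeDocuments(docs) {
  var distinct = {};
  for (var doc of docs) {
    for (var k of Object.keys(doc)) {
      if (k === "_id") continue; // the pipeline filters out _id / key
      (distinct[k] = distinct[k] || new Set()).add(JSON.stringify(doc[k]));
    }
  }
  var out = {};
  for (var k of Object.keys(distinct)) {
    var values = [...distinct[k]].map(function (s) { return JSON.parse(s); });
    // mirrors the $cond on $size: unwrap when there is one distinct value
    out[k] = values.length === 1 ? values[0] : values;
  }
  return out;
}

var docs = [
  { _id: 1, name: "a", price: 10 },
  { _id: 2, name: "a", price: 12 },
];
// mergeDocuments(docs) → { name: "a", price: [10, 12] }
```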
I need to analyze some MongoDB collections. What I need is to extract the key names and values of a collection.
Here's how far I got:
db.collection(coll.name)
.aggregate([
{ $project: { arrayofkeyvalue: { $objectToArray: '$$ROOT' } } },
{ $unwind: '$arrayofkeyvalue' },
{
$group: {
_id: null,
allkeys: { $addToSet: '$arrayofkeyvalue.k' },
},
},
])
.toArray();
This works quite nicely. I get all the keys. However I'd like to get the values too.
So, I thought "piece o' cake" and replaced the allkeys section with the allkeysandvalues section, which is supposed to create a map with key and value pairs.
Like this:
db.collection(coll.name)
.aggregate([
{ $project: { arrayofkeyvalue: { $objectToArray: '$$ROOT' } } },
{ $unwind: '$arrayofkeyvalue' },
{
$group: {
_id: null,
allkeysandvalues: {
$map: {
input: '$arrayofkeyvalue',
as: 'kv',
in: {
k: '$$kv.k',
v: '$$kv.v',
},
},
},
},
},
])
.toArray();
But that's not working. I get the error message
MongoError: unknown group operator '$map'
Does anyone know how to solve this?
The $group pipeline stage requires an accumulator expression, so you have to use $push instead of $map:
{
$group: {
_id: null,
allkeysandvalues: {
$push: "$arrayofkeyvalue"
}
}
}
or
{
$group: {
_id: null,
allkeysandvalues: {
$push: {
k: "$arrayofkeyvalue.k",
v: "$arrayofkeyvalue.v"
}
}
}
}
which returns the same result.
Please note that arrayofkeyvalue is a single object, not an array, since you ran $unwind before $group.
Mongo Playground
MongoError: unknown group operator '$map'
You cannot use the $map operator directly at the root level of a $group stage. Instead, you can add one more group stage:
$group by k (key) and take the $first v (value)
$group by null and construct the array of key-value pairs
$arrayToObject to convert the key-value pair array back to an object
db.collection(coll.name).aggregate([
{ $project: { arrayofkeyvalue: { $objectToArray: "$$ROOT" } } },
{ $unwind: "$arrayofkeyvalue" },
{
$group: {
_id: "$arrayofkeyvalue.k",
value: { $first: "$arrayofkeyvalue.v" }
}
},
{
$group: {
_id: null,
allkeysandvalues: { $push: { k: "$_id", v: "$value" } }
}
},
{ $project: { allkeysandvalues: { $arrayToObject: "$allkeysandvalues" } } }
])
Playground
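As a sketch of what the two $group stages compute, here is the same "first value per key" logic in plain JavaScript (the sample documents are hypothetical):

```javascript
// Collect every distinct key with the first value seen for it across
// documents, mirroring $group by key + $first, then $arrayToObject.
function keysAndValues(docs) {
  var result = {};
  for (var doc of docs) {
    for (var k of Object.keys(doc)) {
      if (!(k in result)) result[k] = doc[k]; // $first
    }
  }
  return result;
}

var docs = [
  { _id: 1, a: 10, b: "x" },
  { _id: 2, b: "y", c: true },
];
// keysAndValues(docs) → { _id: 1, a: 10, b: "x", c: true }
```

Note that, as in the pipeline, "first" is only deterministic if the input order is fixed (e.g. by a preceding $sort).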
I have a filter + group operation on a bunch of documents (books). The grouping is meant to return only the latest version of each set of books that share the same book_id (name). The code below works, but it's untidy since it returns redundant information:
return Book.aggregate([
{ $match: generateMLabQuery(rawQuery) },
{
$sort: {
"published_date": -1
}
},
{
$group: {
_id: "$book_id",
books: {
$first: "$$ROOT"
}
}
}
])
I end up with an array of objects that looks like this:
[{ _id: "aedrtgt6854earg864", books: { singleBookObject } }, {...}, {...}]
Essentially I only need the singleBookObject part, which is the original document (and what I'd be getting if I had done only the $match operation). Is there a way to get rid of the redundant _id and books parts within the aggregation pipeline?
You can use $replaceRoot:
Book.aggregate([
{ "$match": generateMLabQuery(rawQuery) },
{ "$sort": { "published_date": -1 }},
{ "$group": {
"_id": "$book_id",
"books": { "$first": "$$ROOT" }
}},
{ "$replaceRoot": { "newRoot": "$books" } }
])
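For intuition, the three stages amount to "sort by published_date descending, keep the first document per book_id, then promote it to the root". A plain-JavaScript sketch of that logic (field names taken from the question, sample data hypothetical):

```javascript
// Keep only the latest version of each book, mirroring
// $sort (published_date: -1) + $group with $first + $replaceRoot.
function latestPerBook(books) {
  var byId = new Map();
  var sorted = [...books].sort(function (a, b) {
    return b.published_date - a.published_date; // newest first
  });
  for (var book of sorted) {
    if (!byId.has(book.book_id)) byId.set(book.book_id, book); // $first
  }
  return [...byId.values()];
}

var books = [
  { book_id: "a", published_date: 1, title: "A v1" },
  { book_id: "a", published_date: 2, title: "A v2" },
  { book_id: "b", published_date: 1, title: "B v1" },
];
// latestPerBook(books) keeps "A v2" and "B v1"
```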
I'm doing a $group within my aggregation pipeline, where I $push one property to an array, and for all the remaining properties I simply take the $first:
{ $group: {
'_id': '$_id',
property1: { $push: '$property1' },
property2: { $first: '$property2' },
property3: { $first: '$property3' },
property4: { $first: '$property4' },
property5: { $first: '$property5' },
property6: { $first: '$property6' },
property7: { $first: '$property7' },
// …
}},
Is there a possibility to specify this in a more concise way? I am hoping for something like the following (which is not working), to say “use $push for property1, and $first for anything else”:
{ $group: {
'_id': '$_id',
property1: { $push: '$property1' },
'*': { $first: '$*' }
}},
No, there is no other way; you have to specify each field with the $first accumulator in the $group stage.
But you can avoid writing $first for every field by nesting them, something like this:
{ "$group": {
"_id": "$_id",
"property1": { "$push": "$property1" },
"data": {
"$first": {
"property2": "$property2",
"property3": "$property3",
"property4": "$property4",
"property5": "$property5",
"property6": "$property6",
"property7": "$property7"
}
}
}}
As stated above by Anthony Winzlet, there's no way to achieve that through the $group operator. However, I used the following workaround, which saved me from having to list all those properties explicitly. It's probably not worth the hassle for a small number of properties, or if you don't need the flexibility of handling fields added later, but in my case it made sense.
Here’s the idea of the aggregation pipeline:
Use $addFields to add the entire root document as a temporary copy.
Specify the group stage, where you add the $push for the desired property, and for the entire copied sub-document from before, use the $first operator.
Use $replaceRoot and $mergeObjects to take the copied root document, and replace the property with the $push aggregation.
Here’s an example:
db.getCollection('test').aggregate([
{ $addFields: { tempRoot: '$$ROOT' } },
{ $group: { '_id': '$property1', 'property2': { $push: '$property2' }, 'tempRoot': { $first: '$tempRoot' } } },
{ $replaceRoot: { newRoot: { $mergeObjects: [ '$tempRoot', { property2: '$property2' } ] } } }
]);
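The final $replaceRoot/$mergeObjects stage is essentially a shallow merge where later operands win, so the pushed array overwrites the stale copy inside tempRoot. In plain-JavaScript terms:

```javascript
// $mergeObjects is a shallow merge; the last operand wins on conflicts,
// so the grouped property2 array replaces the copy kept in tempRoot.
var tempRoot = { _id: 1, property1: "p1", property2: "old" };
var grouped = { property2: ["a", "b"] };
var newRoot = Object.assign({}, tempRoot, grouped);
// newRoot → { _id: 1, property1: "p1", property2: ["a", "b"] }
```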
I use the aggregation framework to group by multiple fields like this:
{
_id:{_id:"$_id",feature_type:"$feature_type",feature_name:"$feature_name"},
features: { $push: "$features" }
}
It gives a result like
{_id:
{_id:1,feature_type:"Test",feature_name:"Tests"},
features:["23423","32423","2342342"]
}
but I want a result like
{_id:1,feature_type:"Test",feature_name:"Tests",
features:["23423","32423","2342342"]
}
How can I achieve this using the aggregation framework?
You need to use $replaceRoot to change your root document:
db.collection.aggregate([
{
"$addFields": {
"_id.features": "$features"
}
},
{
"$replaceRoot": {
"newRoot": "$_id"
}
}
])
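In plain-JavaScript terms, $addFields copies features into the compound _id, and $replaceRoot then promotes _id to the top level, flattening the group key:

```javascript
// Flatten the compound group key: copy features into _id ($addFields),
// then promote _id to the root ($replaceRoot).
var grouped = {
  _id: { _id: 1, feature_type: "Test", feature_name: "Tests" },
  features: ["23423", "32423", "2342342"],
};
var newRoot = Object.assign({}, grouped._id, { features: grouped.features });
// newRoot → { _id: 1, feature_type: "Test", feature_name: "Tests",
//             features: ["23423", "32423", "2342342"] }
```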
Alternatively, you can use $project to promote the grouped _id fields back to the top level:
db.collection.aggregate([
{
$project: {
_id: "$_id._id",
feature_type:"$_id.feature_type",
feature_name:"$_id.feature_name",
features:1
}
}
])