Query nested doc's field with dot in the name - mongodb

Consider such a collection:
> db.test.find({})
{ "_id" : ObjectId("5f969419d40c1580f2d4aa31"), "users" : { "foo#bar.com" : "baz" } }
{ "_id" : ObjectId("5f9694d4d40c1580f2d4aa33"), "users" : { "baz#test.com" : "foo" } }
I want to find documents where users contains field foo#bar.com.
Firstly I tried
> db.test.find({"users.foo#bar.com": { $exists: true } })
But it returned nothing. Because of the dot (.) in the field's name, it was looking for a nested field, users > foo#bar > com, which does not exist.
I learned that the dot in a key's name can be escaped with \u002e, so I tried
> db.test.find({"users.foo#bar\u002ecom": { $exists: true } })
But it also returns nothing. I guess that I am not escaping properly. How should I do this?
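A small plain-JavaScript sketch of why both attempts behave the same. It assumes the query is typed in a JavaScript shell, where the \u002e escape is resolved by the string literal itself before the query is sent, and it uses split(".") only as a stand-in for how dot notation divides a path into segments.

```javascript
// Dot notation treats every "." in the path as a separator between levels.
const path = "users.foo#bar.com";
const segments = path.split(".");
console.log(segments); // three segments: users, foo#bar, com

// "\u002e" is just another way to write "." inside a JavaScript string,
// so the "escaped" path is the very same string as the original one.
const escaped = "users.foo#bar\u002ecom";
console.log(escaped === path); // true
```

So the second query is, character for character, the first query again.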

You can do it using aggregation. Try this query.
db.test.aggregate([
{
"$project": {
"users": {
"$objectToArray": "$users"
}
}
},
{
"$match": {
"users.k": "foo#bar.com"
}
},
{
"$project": {
"users": {
"$arrayToObject": "$users"
}
}
}
])
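For intuition, here is a plain-JavaScript analogue of what that pipeline does (not MongoDB code). The helper name hasUserKey is hypothetical, and the sample documents mirror the ones in the question without their _id values.

```javascript
// Sample documents, mirroring the question.
const docs = [
  { users: { "foo#bar.com": "baz" } },
  { users: { "baz#test.com": "foo" } },
];

// Hypothetical helper: true when `key` is a literal key of doc.users.
function hasUserKey(doc, key) {
  // Object.entries plays the role of $objectToArray: it yields [k, v] pairs,
  // so the key is compared as a whole string and its dot is never parsed.
  return Object.entries(doc.users ?? {}).some(([k]) => k === key);
}

// Equivalent of the $match on "users.k".
const matches = docs.filter((doc) => hasUserKey(doc, "foo#bar.com"));
console.log(JSON.stringify(matches));
```

The point of the technique is that the key becomes a value (k), and values may contain dots freely.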

Try this one:
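db.test.find({users:{$type:"object"},$expr:{$ne:[{$type:{$getField:{field:"foo#bar.com",input:"$users"}}},"missing"]}})
Explanation
$expr allows us to use aggregation operators in MQL
$getField (MongoDB 5.0 and later) reads "foo#bar.com" as a single field name of users (the dot is not parsed)
$type returns "missing" when that field is absent, so $ne keeps only the documents that have it
Wrapping the whole condition in $literal does not work: $literal returns its argument unevaluated, so the inner $exists is never applied and every document matches.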

Related

Select latest document after grouping them by a field in MongoDB

I got a question that I would expect to be pretty simple, but I cannot figure it out. What I want to do is this:
Find all documents in a collection and:
sort the documents by a certain date field
apply distinct on one of its other fields, but return the whole document
Best shown in an example.
This is a mock input:
[
{
"commandName" : "migration_a",
"executionDate" : ISODate("1998-11-04T18:46:14.000Z")
},
{
"commandName" : "migration_a",
"executionDate" : ISODate("1970-05-09T20:16:37.000Z")
},
{
"commandName" : "migration_a",
"executionDate" : ISODate("2005-11-08T11:58:52.000Z")
},
{
"commandName" : "migration_b",
"executionDate" : ISODate("2016-06-02T19:48:34.000Z")
}
]
The expected output is:
[
{
"commandName" : "migration_a",
"executionDate" : ISODate("2005-11-08T11:58:52.000Z")
},
{
"commandName" : "migration_b",
"executionDate" : ISODate("2016-06-02T19:48:34.000Z")
}
]
Or, in other words:
Group the input data by the commandName field
Inside each group sort the documents
Return the newest document from each group
My attempts to write this query have failed:
The distinct() function will only return the value of the field I am distinct-ing on, not the whole document. That makes it unsuitable for my case.
I tried writing an aggregate query, but ran into the problem of how to sort and select a single document inside each group. The $sort aggregation stage sorts the groups relative to one another, which is not what I want.
I am not too well-versed in Mongo and this is where I hit a wall. Any ideas on how to continue?
For reference, this is the work-in-progress aggregation query I am trying to expand on:
db.getCollection('some_collection').aggregate([
{ $group: { '_id': '$commandName', 'docs': {$addToSet: '$$ROOT'} } },
{ $sort: {'_id.docs.???': 1}}
])
Post-resolved edit
Thank you for the answers. I got what I needed. For future reference, this is the full query that will do what was requested and also return a list of the filtered documents, not groups.
db.getCollection('some_collection').aggregate([
{ $sort: {'executionDate': 1}},
{ $group: { '_id': '$commandName', 'result': { $last: '$$ROOT'} } },
{ $replaceRoot: {newRoot: '$result'} }
])
The query result without the $replaceRoot stage would be:
[
{
"_id": "migration_a",
"result": {
"commandName" : "migration_a",
"executionDate" : ISODate("2005-11-08T11:58:52.000Z")
}
},
{
"_id": "migration_b",
"result": {
"commandName" : "migration_b",
"executionDate" : ISODate("2016-06-02T19:48:34.000Z")
}
}
]
The outer _id and result are just "group-wrappers" around the actual document I want, which is nested under the result key. Moving the nested document to the root of the result is done using the $replaceRoot stage. The query result when using that stage is:
[
{
"commandName" : "migration_a",
"executionDate" : ISODate("2005-11-08T11:58:52.000Z")
},
{
"commandName" : "migration_b",
"executionDate" : ISODate("2016-06-02T19:48:34.000Z")
}
]
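For readers who want to check the logic outside MongoDB, here is a plain-JavaScript analogue of the three stages. The function name latestPerCommand is hypothetical, and JavaScript Date objects stand in for ISODate values.

```javascript
// Mock input from the question, with Date in place of ISODate.
const input = [
  { commandName: "migration_a", executionDate: new Date("1998-11-04T18:46:14.000Z") },
  { commandName: "migration_a", executionDate: new Date("1970-05-09T20:16:37.000Z") },
  { commandName: "migration_a", executionDate: new Date("2005-11-08T11:58:52.000Z") },
  { commandName: "migration_b", executionDate: new Date("2016-06-02T19:48:34.000Z") },
];

// Hypothetical helper: returns the newest document for each commandName.
function latestPerCommand(documents) {
  // $sort: { executionDate: 1 } -> oldest first (copy, so the input is untouched)
  const sorted = [...documents].sort((a, b) => a.executionDate - b.executionDate);
  // $group with $last -> a later document overwrites an earlier one with the same key
  const byCommand = new Map();
  for (const doc of sorted) byCommand.set(doc.commandName, doc);
  // $replaceRoot -> return the documents themselves rather than the group wrappers
  return [...byCommand.values()];
}

const newest = latestPerCommand(input);
console.log(newest.map((d) => `${d.commandName} ${d.executionDate.toISOString()}`));
```

Unlike this sketch, MongoDB does not promise any particular order for the groups that $group emits, so add a final $sort if the order of the result matters.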
Try this:
db.getCollection('some_collection').aggregate([
{ $sort: {'executionDate': -1}},
{ $group: { '_id': '$commandName', 'doc': {$first: '$$ROOT'} } }
])
I believe this will result in what you're looking for:
db.collection.aggregate([
{
$group: {
"_id": "$commandName",
"executionDate": {
"$last": "$executionDate"
}
}
}
])
Of course, if you want to match your expected output exactly, you can add a sort (this may not be necessary since your goal is to simply return the newest document from each group):
{
$sort: {
"executionDate": 1
}
}
The use-case the question presents is nearly covered in the $last aggregation operator documentation.
In summary: the $group stage should follow a $sort stage so that the input documents are in a defined order, since $last simply picks the last document from each group.
db.collection.aggregate([
{
$sort: {
executionDate: 1
}
},
{
$group: {
_id: "$commandName",
executionDate: {
$last: "$executionDate"
}
}
}
]);

MongoDB Aggregation: $Project (how to use a field on the other field of the same projection pipeline)

This is what I want my aggregation pipeline to look like; I just don't know how to write it properly:
db.Collection.aggregate([
{
$project: {
all_bills: '$all_count',
settled_bills: { $size: '$settled' },
overdue_bills: { $size: '$overdue' },
settled_percentage: { $divide: ['$settled_bills', '$overdue_bills'] }
}
}
])
I want to use the "settled_bills" and "overdue_bills" fields inside the "settled_percentage" field in the same projection stage. How can I do that?
From what I can see, I think you want $let.
You can create local variable which can be used inside the $let expression.
Try this:
db.Collection.aggregate([
{
$project: {
all_bills: '$all_count',
settled_bills: { $size: '$settled' },
overdue_bills: { $size: '$overdue' },
settled_percentage: {
$let : {
vars : {
local_settled_bills : { $size : "$settled"},
local_overdue_bills : { $size : "$overdue"}
},
in : {
$divide : ["$$local_settled_bills","$$local_overdue_bills"]
}
}
}
}
}
])
Here, you create local variables in the vars expression, which can be used inside (and only inside) the in expression. I have created local_settled_bills and local_overdue_bills, which can be used in the in expression with $$ as a prefix.
I hope this helps you out.
Read MongoDb $let documentation for detailed information on $let.
Alternatively, you can do this as well :
db.Collection.aggregate([
{
$project: {
all_bills: '$all_count',
settled_bills: { $size: '$settled' },
overdue_bills: { $size: '$overdue' },
settled_percentage: {
$divide : [{"$size" : "$settled"},{"$size":"$overdue"}]
}
}
}
])
So I guess there is no way to use fields inside other fields that are defined in the same projection stage.
(Assume that settled_bills and overdue_bills consist of long query operators, not just $size.)
I'll just do this instead, so I don't repeat the code in the $divide.
db.Collection.aggregate([
{
$project: {
all_bills: '$all_count',
settled_bills: { $size: '$settled' },
overdue_bills: { $size: '$overdue' }
}
},
{
$project: {
settled_percentage: {
$divide : ['$settled_bills','$overdue_bills']
}
}
}
])
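A hedged variation on the two-stage idea, written as plain pipeline data. It assumes you also want to keep the three counts in the output (hence $addFields in the second stage rather than a second $project, which would drop them), and it assumes null is an acceptable result when overdue_bills is 0, since $divide raises an error on a zero divisor.

```javascript
// Pipeline as plain data; in the shell it would be passed to
// db.Collection.aggregate(pipeline). Field names come from the question.
const pipeline = [
  {
    // Stage 1: compute the counts once.
    $project: {
      all_bills: "$all_count",
      settled_bills: { $size: "$settled" },
      overdue_bills: { $size: "$overdue" },
    },
  },
  {
    // Stage 2: the fields from stage 1 now exist, so they can be referenced.
    // $addFields keeps them and adds settled_percentage alongside.
    $addFields: {
      settled_percentage: {
        $cond: [
          { $eq: ["$overdue_bills", 0] },
          null, // assumption: report null when there is nothing to divide by
          { $divide: ["$settled_bills", "$overdue_bills"] },
        ],
      },
    },
  },
];

console.log(JSON.stringify(pipeline, null, 2));
```

Each stage only sees the output of the stage before it, which is why the computed fields become usable one stage later.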

Mongo - finding records with keys containing dots

Mongo does not allow documents to have dots in their keys (see MongoDB dot (.) in key name or https://softwareengineering.stackexchange.com/questions/286922/inserting-json-document-with-in-key-to-mongodb ).
However we have a huge mongo database where some documents do contain dots in their keys. These documents are of the form:
{
"_id" : NumberLong(2761632),
"data" : {
"field.with.dots" : { ... }
}
}
I don't know how these records got inserted. I suspect that we must have had the check_keys mongod option set to false at some point.
My goal is to find the offending documents, to update them and remove the dots. I haven't found how to perform the search query. Here is what I tried so far:
db.collection.find({"data.field.with.dots" : { $exists : true }})
db.collection.find({"data.field\uff0ewith\uff0edots" : { $exists : true}})
You can use $objectToArray to get your data in the form of keys and values. Then you can use $filter with $indexOfBytes to check whether any key contains a dot (.). In the next step you can use $size to filter out the documents whose remaining array is empty (no fields with dots). Try:
db.col.aggregate([
{
$addFields: {
dataKv: {
$filter: {
input: { $objectToArray: "$data" },
cond: {
$ne: [ { $indexOfBytes: [ "$$this.k", "." ] } , -1 ]
}
}
}
}
},
{
$match: {
$expr: {
$ne: [ { $size: "$dataKv" }, 0 ]
}
}
},
{
$project: {
dataKv: 0
}
}
])
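Once the offending documents are found, the keys still have to be renamed. Below is a minimal plain-JavaScript sketch of that step. The helper renameDottedKeys is hypothetical, "_" is an assumed replacement character, and only the top-level keys of data are handled.

```javascript
// Hypothetical helper: returns a copy of `data` in which every "." in a
// top-level key is replaced by `replacement` (assumed here to be "_").
function renameDottedKeys(data, replacement = "_") {
  return Object.fromEntries(
    Object.entries(data).map(([key, value]) => [key.split(".").join(replacement), value])
  );
}

// Example document shaped like the one in the question.
const doc = { _id: 2761632, data: { "field.with.dots": { a: 1 }, plain: 2 } };
const fixed = { ...doc, data: renameDottedKeys(doc.data) };

console.log(JSON.stringify(fixed));
// {"_id":2761632,"data":{"field_with_dots":{"a":1},"plain":2}}
```

Writing the result back could then be a per-document update that replaces the whole data subdocument. Note that the sketch does not detect collisions, for example when both "a.b" and "a_b" already exist as keys.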

MongoDB: Create Object in Aggregation result

I want to return Object as a field in my Aggregation result similar to the solution in this question. However in the solution mentioned above, the Aggregation results in an Array of Objects with just one item in that array, not a standalone Object. For example, a query like the following with a $push operation
$group:{
_id: "$publisherId",
'values' : { $push:{
newCount: { $sum: "$newField" },
oldCount: { $sum: "$oldField" } }
}
}
returns a result like this
{
"_id" : 2,
"values" : [
{
"newCount" : 100,
"oldCount" : 200
}
]
}
not one like this
{
"_id" : 2,
"values" : {
"newCount" : 100,
"oldCount" : 200
}
}
The latter is the result that I require. So how do I rewrite the query to get a result like that? Is it possible or is the former result the best I can get?
You don't need the $push operator, just add a final $project pipeline that will create the embedded document. Follow this guideline:
var pipeline = [
{
"$group": {
"_id": "$publisherId",
"newCount": { "$sum": "$newField" },
"oldCount": { "$sum": "$oldField" }
}
},
{
"$project": {
"values": {
"newCount": "$newCount",
"oldCount": "$oldCount"
}
}
}
];
db.collection.aggregate(pipeline);

MongoDB Aggregation: Counting distinct fields

I am trying to write an aggregation to identify accounts that use multiple payment sources. Typical data would be.
{
account:"abc",
vendor:"amazon",
}
...
{
account:"abc",
vendor:"overstock",
}
Now, I'd like to produce a list of accounts similar to this
{
account:"abc",
vendorCount:2
}
How would I write this in Mongo's aggregation framework?
I figured this out by using the $addToSet and $unwind operators.
Mongodb Aggregation count array/set size
db.collection.aggregate([
{
$group: { _id: { account: '$account' }, vendors: { $addToSet: '$vendor'} }
},
{
$unwind:"$vendors"
},
{
$group: { _id: "$_id", vendorCount: { $sum:1} }
}
]);
Hope it helps someone
I think it's better to run a query like the following, which avoids $unwind:
db.t2.insert({_id:1,account:"abc",vendor:"amazon"});
db.t2.insert({_id:2,account:"abc",vendor:"overstock"});
db.t2.aggregate([
{ $group : { _id : { "account" : "$account", "vendor" : "$vendor" }, number : { $sum : 1 } } },
{ $group : { _id : "$_id.account", number : { $sum : 1 } } }
]);
Which will show you following result which is expected.
{ "_id" : "abc", "number" : 2 }
You can use sets
db.test.aggregate([
{$group: {
_id: "$account",
uniqueVendors: {$addToSet: "$vendor"}
}},
{$project: {
_id: 1,
vendorsCount: {$size: "$uniqueVendors"}
}}
]);
I do not see why somebody would have to use $group twice
db.t2.aggregate([ { $group: {"_id":"$account" , "number":{$sum:1}} } ])
This will work perfectly fine.
This approach doesn't make use of $unwind and other extra operations. Plus, this won't affect anything if new things are added into the aggregation. There's a flaw in the accepted answer. If you have other accumulated fields in the $group, it would cause issues in the $unwind stage of the accepted answer.
db.collection.aggregate([{
"$group": {
"_id": "$account",
"vendors": {"$addToSet": "$vendor"}
}
},
{
"$addFields": {
"vendorCount": {
"$size": "$vendors"
}
}
}])
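To see why the set matters, here is a plain-JavaScript analogue of $addToSet followed by $size. The function name vendorCounts is hypothetical, and the sample data extends the question's with one repeated vendor and a second account, both added purely for illustration.

```javascript
const payments = [
  { account: "abc", vendor: "amazon" },
  { account: "abc", vendor: "overstock" },
  { account: "abc", vendor: "amazon" }, // repeated vendor, added for illustration
  { account: "xyz", vendor: "amazon" }, // second account, added for illustration
];

// Hypothetical helper: number of distinct vendors per account.
function vendorCounts(documents) {
  const vendorsByAccount = new Map();
  for (const { account, vendor } of documents) {
    // $addToSet: a Set keeps each vendor once, however often it occurs
    if (!vendorsByAccount.has(account)) vendorsByAccount.set(account, new Set());
    vendorsByAccount.get(account).add(vendor);
  }
  // $size: the number of elements in each set
  return [...vendorsByAccount].map(([account, vendors]) => ({
    account,
    vendorCount: vendors.size,
  }));
}

console.log(JSON.stringify(vendorCounts(payments)));
// [{"account":"abc","vendorCount":2},{"account":"xyz","vendorCount":1}]
```

A plain per-account count of documents would report 3 for "abc" here, because it counts payments rather than distinct vendors.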
To identify accounts that use multiple payment sources:
Use $group to count the records for each account
Use $match to keep only the accounts with more than one record
db.payment_collection.aggregate([
{
"$group": { "_id": "$account", "number": { "$sum": 1 } }
},
{
"$match": {
"number": { "$gt": 1 }
}
}
])
This will work perfectly fine,
db.UserModule.aggregate(
{ $group : { _id : { "companyauthemail" : "$companyauthemail", "email" : "$email" }, number : { $sum : 1 } } },
{ $group : { _id : "$_id.companyauthemail", number : { $sum : 1 } } }
);
An example
db.collection.distinct("example.item").forEach( function(docs) {
print(docs + "==>>" + db.collection.count({"example.item":docs}))
});