How to retrieve data from related documents by ID? - mongodb

I'm trying to understand how to set up basic relations in MongoDB. I've read a bit about it in the documentation but it's a little terse.
This should be pretty simple: I'm trying to log a list of impressions and the users who are responsible for the impressions. Here are some examples of log documents:
{type: '1', userId:'xxx-12345'}
{type: '1', userId:'xxx-12345'}
{type: '1', userId:'xxx-12345'}
{type: '2', userId:'zzz-84638'}
{type: '2', userId:'xxx-12345'}
Here's an example of a user document:
{userId: 'xxx-12345', location: 'US'}
Is there a way to count the total number of documents which "belong" to a userId of xxx-12345, where type is 1?
In the above case, I'd want to see a result like { '1':3, '2':1 }.
Also, is the above an acceptable way of creating the relationships?

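For your first question ("Is there a way to count the total number of documents which "belong" to a userId of xxx-12345, where type is 1?"), the solution is below. Note that type is stored as a string in your sample documents, so the $match has to compare against the string '1', not the number 1:
db.impressions.aggregate([
    {
        $match: {
            userId: 'xxx-12345',
            type: '1'
        }
    },
    {
        $group: { _id: null, count: { $sum: 1 } }
    }
]);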
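To get closer to the format you specified ("In the above case, I'd want to see a result like { '1':3, '2':1 }."), group by type. This returns one document per type (for the sample data, type '1' with a count of 3 and type '2' with a count of 1) rather than a single object:
db.impressions.aggregate([
    {
        $match: {
            userId: 'xxx-12345'
        }
    },
    {
        $group: { _id: '$type', totalImpressions: { $sum: 1 } }
    }
]);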
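If a single { '1': 3, '2': 1 } object is really needed, the per-type documents can be folded together on the client. The sketch below is plain JavaScript rather than a MongoDB API: perType is a hypothetical variable holding the documents the grouping pipeline would return for the sample data, and toCountsObject is just an illustrative helper name.

```javascript
// Hypothetical result of the $match + $group pipeline for user 'xxx-12345',
// written out by hand from the sample log documents.
const perType = [
  { _id: '1', totalImpressions: 3 },
  { _id: '2', totalImpressions: 1 }
];

// Fold the per-type documents into one object keyed by type.
function toCountsObject(docs) {
  const counts = {};
  for (const doc of docs) {
    counts[doc._id] = doc.totalImpressions;
  }
  return counts;
}

const counts = toCountsObject(perType);
console.log(counts);
```

In a real application the array would come from the driver's aggregate call instead of being hard-coded.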

You can use the Aggregation Pipeline introduced in version 2.2:
db.a.aggregate([
{ $match: { userId: 'xxx-12345' } },
{ $group: { _id: "$type", total: { $sum: 1 } } }
])
This will output:
{
"result" : [
{
"_id" : "2",
"total" : 1
},
{
"_id" : "1",
"total" : 3
}
],
"ok" : 1
}
where "_id" is the type and "total" is the number of documents of that type for user "xxx-12345".
However, if you only want the total number of documents which belong to "xxx-12345" where the type is "1", you can do it like this:
db.a.aggregate([
{ $match: { userId: 'xxx-12345', type: "1" } },
{ $group: { _id: null, count: { $sum: 1} } }
])
which will output the following:
{ "result" : [ { "_id" : null, "count" : 3 } ], "ok" : 1 }
where "count" is what you're looking for.

Related

Mongodb Query Aggregation and Groupby complex filter , sum , percent query

I have a complex group query.
Data is as follows:
The aggregation should work as follows:
1. Match by doc_id.
2. Group by name.
3. Project name, name_count, amount, and desc as { value: identified by the max sum of amount in that list of desc, count: sum of (percent*100)^2, percent: its percentage of the amount in that list }.
4. Do the same with L1 and L2. But L1 and L2 are referenced fields ({_id, name}) from another collection, so I need to project both _id and name along with what I do in point 3 above.
So after execution, the result would be, say:
...
},
"_id" : {
"name" : "abc"
},
"amount" : 45.0,
"count" : 4.0,
"desc" : {
"value" : "Laptop", // based on highest sum amount in group:'abc' i.e. 25.0 for laptop
"count" : 5061.72, // (56*100)^2 + (44*100)^2
"percent" : 25.0*100/45.0 = 56.0
},
...
Test data link: MongoDB Playground
Updated: 07/11/2019
Added example for calculating count
Hope I was clear. Kindly help.
I don't understand the calculation you need for count. However, here's a query you can adapt to your needs:
db.collection.aggregate([
{
$match: {
"doc_id": 1
}
},
{
$group: {
_id: {
name: "$name",
desc: "$desc"
},
amount: {
$sum: "$amount"
},
count: {
$sum: 1
},
}
},
{
$sort: {
"_id.name": 1,
"amount": -1
}
},
{
$group: {
_id: "$_id.name",
amount: {
$sum: "$amount"
},
count: {
$sum: "$count"
},
desc: {
$first: {
value: "$_id.desc",
descAmount: "$amount"
}
}
},
},
{
$addFields: {
"desc.percent": {
$multiply: [
{
$divide: [
"$desc.descAmount",
"$amount"
]
},
100
]
}
}
}
])
The trick is to group twice, with a sort in between: the first $group computes a sub-total per (name, desc) pair, and after sorting, $first in the second $group picks the desc with the biggest sub-total for each name.
Now you can adapt your count calculation as needed.
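To make the group, sort, group idea concrete, here is a plain JavaScript sketch of the same steps run over an in-memory array. The input documents are hypothetical (the question's data is not shown above); they are chosen so that the 'abc' group totals 45 with Laptop at 25, as in the expected result. The function name summarize is only illustrative.

```javascript
// Hypothetical input documents, not taken from the question.
const docs = [
  { doc_id: 1, name: 'abc', desc: 'Laptop', amount: 10 },
  { doc_id: 1, name: 'abc', desc: 'Laptop', amount: 15 },
  { doc_id: 1, name: 'abc', desc: 'Mouse', amount: 12 },
  { doc_id: 1, name: 'abc', desc: 'Mouse', amount: 8 },
  { doc_id: 2, name: 'abc', desc: 'Mouse', amount: 99 }
];

function summarize(allDocs, docId) {
  // First grouping: sub-totals per (name, desc), like the first $group.
  const subTotals = new Map();
  for (const d of allDocs.filter((x) => x.doc_id === docId)) {
    const key = JSON.stringify([d.name, d.desc]);
    const entry = subTotals.get(key) ||
      { name: d.name, desc: d.desc, amount: 0, count: 0 };
    entry.amount += d.amount;
    entry.count += 1;
    subTotals.set(key, entry);
  }

  // Sort by name, then by sub-total descending, like the $sort stage.
  const sorted = [...subTotals.values()].sort(
    (a, b) => a.name.localeCompare(b.name) || b.amount - a.amount
  );

  // Second grouping: totals per name; the first entry seen for a name is
  // the desc with the biggest sub-total, which is what $first picks up.
  const byName = new Map();
  for (const s of sorted) {
    if (!byName.has(s.name)) {
      byName.set(s.name, {
        _id: s.name,
        amount: 0,
        count: 0,
        desc: { value: s.desc, descAmount: s.amount }
      });
    }
    const group = byName.get(s.name);
    group.amount += s.amount;
    group.count += s.count;
  }

  // Like $addFields: percentage of the top desc within its group total.
  return [...byName.values()].map((g) => ({
    ...g,
    desc: { ...g.desc, percent: (g.desc.descAmount / g.amount) * 100 }
  }));
}

const result = summarize(docs, 1);
console.log(JSON.stringify(result, null, 2));
```

The percent comes out unrounded here; rounding it would be an extra step in either the sketch or the pipeline.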

How to use nested grouping in MongoDB

I need to find the total count of duplicate profiles at the organization level. I have documents as shown below:
{
    "OrganizationId" : 10,
    "Profile" : {
        "_id" : "75"
    },
    "_id" : "1"
},
{
    "OrganizationId" : 10,
    "Profile" : {
        "_id" : "75"
    },
    "_id" : "2"
},
{
    "OrganizationId" : 10,
    "Profile" : {
        "_id" : "77"
    },
    "_id" : "3"
},
{
    "OrganizationId" : 10,
    "Profile" : {
        "_id" : "77"
    },
    "_id" : "4"
}
I have written a query that groups by ProfileId and OrganizationId. The results I am getting are shown below:
Organization Total
10 2
10 2
But I want the sum of the totals per organization, meaning Org 10 should have one row with a sum of 4.
The query I am using is shown below:
db.getSiblingDB("dbName").OrgProfile.aggregate(
{ $project: { _id: 1, P: "$Profile._id", O: "$OrganizationId" } },
{ $group: {_id: { p: "$P", o: "$O"}, c: { $sum: 1 }} },
{ $match: { c: { $gt: 1 } } });
Any ideas? Please help.
The following pipeline should give you the desired output. The final $project stage is purely cosmetic (it renames _id to Organization) and is not needed for the computation itself, so you may omit it.
db.getCollection('yourCollection').aggregate([
{
$group: {
_id: { org: "$OrganizationId", profile: "$Profile._id" },
count: { $sum: 1 }
}
},
{
$group: {
_id: "$_id.org",
Total: {
$sum: {
$cond: {
if: { $gte: ["$count", 2] },
then: "$count",
else: 0
}
}
}
}
},
{
$project: {
_id: 0,
Organization: "$_id",
Total: 1
}
}
])
This gives the following output:
{
"Total" : 4.0,
"Organization" : 10
}
To filter out organizations without duplicates, you can use $match, which also simplifies the second $group stage:
...aggregate([
{
$group: {
_id: { org: "$OrganizationId", profile: "$Profile._id" },
count: { $sum: 1 }
}
},
{
$match: {
count: { $gte: 2 }
}
},
{
$group: {
_id: "$_id.org",
Total: { $sum: "$count" }
}
},
{
$project: {
_id: 0,
Organization: "$_id",
Total: 1
}
}
])
I think I have a solution for you. In the last step, instead of matching, I think you want another $group:
.aggregate([
{ $project: { _id: 1, P: "$Profile._id", O: "$OrganizationId" } }
,{ $group: {_id: { p: "$P", o: "$O"}, c: { $sum: 1 }} }
,{ $group: { _id: "$_id.o" , c: { $sum: "$c" } }}
]);
You can probably figure out what's happening in that last step by reading it, but just in case, I'll explain. The last step groups all documents that have the same organization ID and sums the counts held in the previous c field. After the first group, you had two documents that both had a count c of 2 but different profile IDs. The next group ignores the profile ID, groups them by organization ID, and adds their counts. Note that with the $match removed, profiles that occur only once are included in the sum as well.
When I ran this query, I got the following result, which I think is what you're looking for:
{
"_id" : 10,
"c" : 4
}
Hope this helps. Let me know if you have any questions.
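The two answers above differ in one detail: whether profiles that occur only once are counted. A small plain JavaScript sketch of both variants may make that visible. It uses the four sample documents from the question plus one hypothetical extra document (profile "79", occurring once) that is not in the original data; the function names are illustrative only.

```javascript
// Sample documents from the question, plus one hypothetical non-duplicated
// profile ("79") added to show where the two pipelines differ.
const docs = [
  { _id: '1', OrganizationId: 10, Profile: { _id: '75' } },
  { _id: '2', OrganizationId: 10, Profile: { _id: '75' } },
  { _id: '3', OrganizationId: 10, Profile: { _id: '77' } },
  { _id: '4', OrganizationId: 10, Profile: { _id: '77' } },
  { _id: '5', OrganizationId: 10, Profile: { _id: '79' } }
];

// First $group: count documents per (organization, profile) pair.
function countPerPair(allDocs) {
  const pairs = new Map();
  for (const d of allDocs) {
    const key = JSON.stringify([d.OrganizationId, d.Profile._id]);
    const entry = pairs.get(key) || { org: d.OrganizationId, count: 0 };
    entry.count += 1;
    pairs.set(key, entry);
  }
  return [...pairs.values()];
}

// Second $group: sum the pair counts per organization, keeping only pairs
// with at least minCount documents (minCount = 1 means no filtering).
function totalPerOrg(pairs, minCount) {
  const totals = {};
  for (const p of pairs) {
    if (p.count >= minCount) {
      totals[p.org] = (totals[p.org] || 0) + p.count;
    }
  }
  return totals;
}

const pairs = countPerPair(docs);
const duplicatesOnly = totalPerOrg(pairs, 2); // pairs seen at least twice
const everything = totalPerOrg(pairs, 1);     // no duplicate filter
console.log(duplicatesOnly, everything);
```

With the extra document, the filtered variant still reports 4 for organization 10, while the unfiltered one reports 5.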

MongoDB aggregate and count

A document in collection called 'myCollection' looks like this:
{
_id : 57b4b4e028108d801738a472,
updatedAt : 2016-08-17T19:03:01.831+0000,
createdAt : 2016-08-17T19:02:56.887+0000,
from : 57b1c2fc4bf55ba009b36c84,
to : 57b1c75e4bf55ba009b36c85,
}
I need to count the occurrences of 'from' and 'to' and end up with collection of documents like this:
{
"_id" : 7b1c2fc4bf55ba009b36c84,
"occurredInFrom" : 12,
"occurredInTo" : 16
}
where _id comes from either '$from' or '$to'.
The incorrect aggregate query I've written is this:
{
$group: {
_id: "$from",
occurredInFrom: { $sum: 1 },
occurredInTo: { $sum: 1}
}
}
I can definitely see that _id: "$from" is not sufficient. Can you please show me the correct way?
Note: The structure of 'myCollection' is not final, if you think there is a better structure, please suggest it.
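Try this. It turns each document into two entries, one for 'from' and one for 'to', unwinds them, and then groups by ID: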
db.myCollection.aggregate([
{ $project:
{ _id: 0,
dir: [
{id:"$from", from:{"$sum":1}, to:{"$sum":0}},
{id:"$to", from:{"$sum":0}, to:{"$sum":1}}
]
}
},
{ $unwind : "$dir" },
{ $group:
{
_id: "$dir.id",
occurredInFrom: { $sum: "$dir.from" },
occurredInTo: { $sum: "$dir.to" }
}
}
])
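For readers who find the $project / $unwind / $group combination hard to follow, here is the same idea as a plain JavaScript sketch over an in-memory array. The documents are hypothetical, with short strings standing in for the ObjectId values, and countOccurrences is an illustrative name.

```javascript
// Hypothetical documents; short strings stand in for the ObjectId values.
const docs = [
  { from: 'A', to: 'B' },
  { from: 'A', to: 'C' },
  { from: 'B', to: 'A' }
];

function countOccurrences(allDocs) {
  // $project + $unwind: each document yields one "from" and one "to" entry.
  const entries = allDocs.flatMap((d) => [
    { id: d.from, from: 1, to: 0 },
    { id: d.to, from: 0, to: 1 }
  ]);

  // $group: sum both counters per id.
  const byId = new Map();
  for (const e of entries) {
    const group = byId.get(e.id) ||
      { _id: e.id, occurredInFrom: 0, occurredInTo: 0 };
    group.occurredInFrom += e.from;
    group.occurredInTo += e.to;
    byId.set(e.id, group);
  }
  return [...byId.values()];
}

const result = countOccurrences(docs);
console.log(result);
```

An ID that only ever appears in 'to' (like 'C' here) still gets a row, with occurredInFrom equal to 0.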

"Structured" grouping query in MongoDB

I have the following items collection :
[{
"_id": 1,
"manufactureId": 1,
"itemTypeId": "Type1"
},
{
"_id": 2,
"manufactureId": 1,
"itemTypeId": "Type2"
},
{
"_id": 3,
"manufactureId": 2,
"itemTypeId": "Type1"
}]
I would like to create a query that returns the number of items of each item type that each manufacturer has, in the following structure (or something similar):
[
{
_id:1, //this would be the manufactureId
itemsCount:{
"Type1":1, //Type1 items count
"Type2":1 //...
}
},
{
_id:2,
itemsCount:{
"Type1":1
}
}
]
I have tried to use the aggregation framework, but I couldn't figure out whether there is a way to create "structured" group-by queries with it.
I can easily achieve the desired result by post-processing this simple aggregation query result :
db.items.aggregate([{$group:{_id:{itemTypeId:"$itemTypeId",manufactureId:"$manufactureId"},count:{$sum:1}}}])
but if possible I prefer not to post-process the result.
Data stays data
I would rather use this query which, I believe, will give you the closest data structure to what you want, without post-processing.
Query
db.items.aggregate(
{
$group:
{
_id:
{
itemTypeId: "$itemTypeId",
manufactureId: "$manufactureId"
},
count:
{
$sum: 1
}
},
},
{
$group:
{
_id: "$_id.manufactureId",
itemCounts:
{
"$push":
{
itemTypeId: "$_id.itemTypeId",
count: "$count"
}
}
}
})
Output
{
"_id" : 1,
"itemCounts" : [
{
"itemTypeId" : "Type1",
"count" : 1
},
{
"itemTypeId" : "Type2",
"count" : 1
}
]
},
{
"_id" : 2,
"itemCounts" : [
{
"itemTypeId" : "Type1",
"count" : 1
}
]
}
Data transformed to object fields
This is actually an approach that I wouldn't advise in general. It is harder to manage in your application, because the field names will be inconsistent between objects and you won't know in advance which fields to expect. This is a crucial point if you use a strongly typed language: automatic data binding to your domain objects becomes impossible.
Anyway, with the $group and $push stages used here, getting the exact data structure you want requires post-processing.
Query
db.items.aggregate(
{
$group:
{
_id:
{
itemTypeId: "$itemTypeId",
manufactureId: "$manufactureId"
},
count:
{
$sum: 1
}
},
},
{
$group:
{
_id: "$_id.manufactureId",
itemCounts:
{
"$push":
{
itemTypeId: "$_id.itemTypeId",
count: "$count"
}
}
}
}).forEach(function(doc) {
var obj = {
_id: doc._id,
itemCounts: {}
};
doc.itemCounts.forEach(function(typeCount) {
obj.itemCounts[typeCount.itemTypeId] = typeCount.count;
});
printjson(obj);
})
Output
{ "_id" : 1, "itemCounts" : { "Type1" : 1, "Type2" : 1 } }
{ "_id" : 2, "itemCounts" : { "Type1" : 1 } }
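On newer servers the reshaping can probably be kept inside the pipeline. As far as I know, MongoDB 3.4.4 added the $arrayToObject operator, which turns an array of { k, v } pairs into an object. The sketch below only builds the pipeline as a JavaScript array; it assumes a 3.4.4+ server and that every itemTypeId is a string. The field names pairs and itemsCount and the local arrayToObject helper are my own choices for illustration.

```javascript
// Sketch only: assumes MongoDB 3.4.4+ (for $arrayToObject) and that every
// itemTypeId is a string, since the operator needs string keys.
const pipeline = [
  // Count items per (manufacturer, item type) pair.
  {
    $group: {
      _id: { itemTypeId: '$itemTypeId', manufactureId: '$manufactureId' },
      count: { $sum: 1 }
    }
  },
  // Collect the per-type counts as { k, v } pairs per manufacturer.
  {
    $group: {
      _id: '$_id.manufactureId',
      pairs: { $push: { k: '$_id.itemTypeId', v: '$count' } }
    }
  },
  // Turn the { k, v } pairs into an object with one field per item type.
  { $project: { itemsCount: { $arrayToObject: '$pairs' } } }
];

// Local stand-in showing what the last stage does to one array of pairs.
const arrayToObject = (pairs) =>
  Object.fromEntries(pairs.map(({ k, v }) => [k, v]));

const example = arrayToObject([{ k: 'Type1', v: 1 }, { k: 'Type2', v: 1 }]);
console.log(example);

// In the mongo shell the pipeline would be run as: db.items.aggregate(pipeline)
```

The caveat about inconsistent field names applies to this variant just as much as to the forEach version.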

Mongodb : Search for entries having one to many mapping between 2 fields

I have a MongoDB database containing ECommerceProducts entities. There are two fields, "productId" and "skuId". Many of the records are duplicated, i.e., two entries may have the same "productId" as well as the same "skuId".
I want to find the set of productIds that have multiple (distinct) skuIds present.
This is what I have so far:
db.urls.aggregate([
{ $group: {
_id: { productId: "$productId" },
count: { $sum: 1 }
} },
{ $match: {
count: { $gte: 2 }
} },
{ $sort : { count : -1} },
{ $limit : 10 }
]);
This code gives me the list of duplicate productIds and how many times they occur. How can I also get the list of different skuIds they contain?
You can use the $addToSet accumulator:
db.urls.aggregate([
{ $group: {
_id: { productId: "$productId" },
skuId: {$addToSet: "$skuId"},
count: { $sum: 1 }
} },
{ $match: {
count: { $gte: 2 }
} },
{ $sort : { count : -1} },
{ $limit : 10 }
]);
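This will return all product IDs that appear in more than one document, each with the distinct set of skuId values used by them. Note that count is the number of documents, not the number of distinct skuIds, so a productId whose single skuId is duplicated will also be returned.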
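Since the question asks for productIds with multiple distinct skuIds, it may be worth filtering on the size of the set rather than on the document count. The plain JavaScript sketch below contrasts the two filters on hypothetical data; none of the IDs come from the question. In the pipeline itself, one way to express the second filter would be a $match on { "skuId.1": { $exists: true } } placed after the $group, but treat that as a suggestion to check against your own data.

```javascript
// Hypothetical product entries, including an exact duplicate (p1/s1 twice).
const urls = [
  { productId: 'p1', skuId: 's1' },
  { productId: 'p1', skuId: 's1' },
  { productId: 'p2', skuId: 's1' },
  { productId: 'p2', skuId: 's2' },
  { productId: 'p3', skuId: 's9' }
];

// Like $group with $addToSet and $sum: collect distinct skuIds and count
// documents per productId.
function groupByProduct(docs) {
  const groups = new Map();
  for (const d of docs) {
    const group = groups.get(d.productId) ||
      { productId: d.productId, skuIds: new Set(), count: 0 };
    group.skuIds.add(d.skuId);
    group.count += 1;
    groups.set(d.productId, group);
  }
  return [...groups.values()];
}

const groups = groupByProduct(urls);

// Filtering on the document count keeps p1, whose single skuId is duplicated.
const byDocCount = groups
  .filter((g) => g.count >= 2)
  .map((g) => g.productId);

// Filtering on the set size keeps only products with several distinct skuIds.
const byDistinctSkus = groups
  .filter((g) => g.skuIds.size >= 2)
  .map((g) => g.productId);

console.log(byDocCount, byDistinctSkus);
```

Here the count-based filter returns p1 and p2, while the set-size filter returns only p2.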