Here is the example of the JSON. I want to count the total number of customers by unique "email_address", filtered by "transactions_status" equal to "S".
Expected output:
{
  _id: null,
  count: total number of customers with a unique email address, filtered by status equal to "S"
}
You can try this aggregation pipeline:
1. $match: filter by "transaction_info.transaction_status" equal to "S".
2. $group: collect the distinct emails with $addToSet.
3. $project: set the count field to the size of the emails array.
db.collection.aggregate([
  {
    $match: {
      "transaction_info.transaction_status": "S"
    }
  },
  {
    $group: {
      _id: null,
      emails: {
        $addToSet: "$payer_info.email_address"
      }
    }
  },
  {
    $project: {
      count: {
        $size: "$emails"
      }
    }
  }
]);
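To make the three stages easier to follow, here is a minimal plain JavaScript sketch of the same logic run over an in-memory array instead of a MongoDB collection. The sample documents and the function name countDistinctSuccessfulPayers are hypothetical; only the two field paths come from the pipeline above.

```javascript
// Hypothetical sample documents; only the two fields used by the pipeline are shown.
const payments = [
  { payer_info: { email_address: "a@example.com" }, transaction_info: { transaction_status: "S" } },
  { payer_info: { email_address: "a@example.com" }, transaction_info: { transaction_status: "S" } },
  { payer_info: { email_address: "b@example.com" }, transaction_info: { transaction_status: "S" } },
  { payer_info: { email_address: "c@example.com" }, transaction_info: { transaction_status: "P" } },
];

function countDistinctSuccessfulPayers(docs) {
  // $match: keep only documents whose status is "S".
  const matched = docs.filter((d) => d.transaction_info.transaction_status === "S");
  // $group with $addToSet: collect each email only once.
  const emails = new Set(matched.map((d) => d.payer_info.email_address));
  // $project with $size: report how many distinct emails were collected.
  return { _id: null, count: emails.size };
}

const distinctResult = countDistinctSuccessfulPayers(payments);
console.log(distinctResult); // { _id: null, count: 2 }
```

The duplicate a@example.com is counted once and c@example.com is dropped by the status filter, which is why the count is 2.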
I have this collection (some irrelevant fields were omitted for brevity):
clients: {
  userId: ObjectId,
  clientSalesValue: Number,
  currentDebt: Number,
}
Then I have this query that matches all the clients for a specific user, calculates the sum of all debts and the sum of all sales, and puts each result in its own field:
await clientsCollection.aggregate([
  {
    $match: { userId: new ObjectId(userId) }
  },
  {
    $group: {
      _id: null,
      totalSalesValue: { $sum: '$clientSalesValue' },
      totalDebts: { $sum: '$currentDebt' },
    }
  },
  {
    $unset: ['_id']
  }
]).exec();
This works as expected: it returns an array with a single object. Now I also need that object to include a field with the number of debtors, that is, the number of clients that have currentDebt > 0. How can I do that in the same query? Is it possible?
PS: I cannot modify the $match condition; it always needs to return all the clients for the corresponding user.
To include a count of how many matching documents have a positive currentDebt, you can use the $sum and $cond operators like so:
await clientsCollection.aggregate([
  {
    $match: { userId: new ObjectId(userId) }
  },
  {
    $group: {
      _id: null,
      totalSalesValue: { $sum: '$clientSalesValue' },
      totalDebts: { $sum: '$currentDebt' },
      numDebtors: {
        $sum: {
          $cond: [{ $gt: ['$currentDebt', 0] }, 1, 0]
        }
      },
    }
  },
  {
    $unset: ['_id']
  }
]).exec();
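The $sum over $cond trick is just a conditional counter: every document contributes 1 if it is a debtor and 0 otherwise. A rough plain JavaScript equivalent, using a hypothetical in-memory clients array that stands in for the documents left after the $match stage, could look like this:

```javascript
// Hypothetical clients that already passed the $match on userId.
const clients = [
  { clientSalesValue: 100, currentDebt: 0 },
  { clientSalesValue: 250, currentDebt: 40 },
  { clientSalesValue: 75, currentDebt: 10 },
];

const totals = clients.reduce(
  (acc, c) => ({
    totalSalesValue: acc.totalSalesValue + c.clientSalesValue,
    totalDebts: acc.totalDebts + c.currentDebt,
    // Same idea as $sum over $cond: add 1 for a debtor, 0 otherwise.
    numDebtors: acc.numDebtors + (c.currentDebt > 0 ? 1 : 0),
  }),
  { totalSalesValue: 0, totalDebts: 0, numDebtors: 0 }
);

console.log(totals); // { totalSalesValue: 425, totalDebts: 50, numDebtors: 2 }
```

All three clients still count toward the sums, so the $match condition does not need to change.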
I have the following collection in MongoDB:
IDcustomer   idServicerequired   ...
001          13
002          15
002          19
002          10
003          null
From this, I want to get the average number of services required by each customer (in this case, the output should be (1+3+0)/3 = 1.33).
I tried as follows, but this way each customer that has required no service is counted as 1, as if they had required one service, so the average is higher than expected (in this case it would be (1+3+1)/3 = 1.67).
In the first $group, use $cond so that a null idServicerequired counts as 0 and any other value counts as 1.
In the second $group, group by null and take the average of those per-customer counts.
db.collection.aggregate([
  {
    $group: {
      _id: "$idCustomer",
      count: {
        $sum: {
          $cond: [{ $eq: ["$idServicerequired", null] }, 0, 1]
        }
      }
    }
  },
  {
    $group: {
      _id: null,
      count: { $avg: "$count" }
    }
  }
])
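As a sanity check on the arithmetic, the two grouping steps can be reproduced in plain JavaScript with the five rows from the question. This is only a sketch of the logic, not the pipeline itself, and it assumes the field is named idCustomer as in the pipeline above.

```javascript
// Rows from the question's table; null means the customer required no service.
const rows = [
  { idCustomer: "001", idServicerequired: 13 },
  { idCustomer: "002", idServicerequired: 15 },
  { idCustomer: "002", idServicerequired: 19 },
  { idCustomer: "002", idServicerequired: 10 },
  { idCustomer: "003", idServicerequired: null },
];

// First $group: per customer, count only the rows with a non-null service.
const perCustomer = new Map();
for (const row of rows) {
  const previous = perCustomer.get(row.idCustomer) ?? 0;
  perCustomer.set(row.idCustomer, previous + (row.idServicerequired === null ? 0 : 1));
}

// Second $group: average the per-customer counts.
const counts = [...perCustomer.values()];
const average = counts.reduce((a, b) => a + b, 0) / counts.length;

console.log(counts); // [ 1, 3, 0 ]
console.log(average.toFixed(2)); // 1.33
```

Customer 003 stays in the denominator but contributes 0, which gives (1+3+0)/3 rather than (1+3+1)/3.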
Assume I have a collection with millions of documents. Below is a sample of what the documents look like:
[
{ _id:"1a1", points:[2,3,5,6] },
{ _id:"1a2", points:[2,6] },
{ _id:"1a3", points:[3,5,6] },
{ _id:"1b1", points:[1,5,6] },
{ _id:"1c1", points:[5,6] },
// ... more documents
]
I want to query a document by _id and return a document that looks like below:
{
_id:"1a1",
totalPoints: 16,
rank: 29
}
I know I can query the whole collection, sort it in descending order, then find the index of the document I want by _id and add one to get its rank. But I have worries about this method.
If there are millions of documents, won't this be overdoing it? Querying a whole collection just to get one document? Is there a way to achieve this without querying the whole collection, or does the whole collection have to be involved because of the ranking?
I cannot store them ranked because the points keep changing. The actual code is more complex, but the takeaway is that I cannot store them ranked.
Total points is the sum of the values in the points array. The rank is calculated by sorting all documents by total points in descending order: the first document gets rank 1, and so on.
An aggregation pipeline like the following can get the result you want, but how it performs on a collection of millions of documents remains to be seen. The example looks up the document with _id '1a3'; replace that value with the _id you need.
db.collection.aggregate(
  [
    {
      $group: {
        _id: null,
        docs: {
          $push: { _id: '$_id', totalPoints: { $sum: '$points' } }
        }
      }
    },
    {
      $unwind: '$docs'
    },
    {
      $replaceWith: '$docs'
    },
    {
      $sort: { totalPoints: -1 }
    },
    {
      $group: {
        _id: null,
        docs: { $push: '$$ROOT' }
      }
    },
    {
      $set: {
        docs: {
          $map: {
            input: {
              $filter: {
                input: '$docs',
                as: 'x',
                cond: { $eq: ['$$x._id', '1a3'] }
              }
            },
            as: 'xx',
            in: {
              _id: '$$xx._id',
              totalPoints: '$$xx.totalPoints',
              rank: {
                $add: [{ $indexOfArray: ['$docs._id', '1a3'] }, 1]
              }
            }
          }
        }
      }
    },
    {
      $unwind: '$docs'
    },
    {
      $replaceWith: '$docs'
    }
  ])
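For reference, the result this pipeline computes can be sketched in plain JavaScript over the five sample documents from the question: sum each points array, sort by the total in descending order, then take the position of the wanted _id plus one. The function name rankOf is hypothetical, and the sketch says nothing about how the pipeline performs on a large collection.

```javascript
// The five sample documents from the question.
const docs = [
  { _id: "1a1", points: [2, 3, 5, 6] },
  { _id: "1a2", points: [2, 6] },
  { _id: "1a3", points: [3, 5, 6] },
  { _id: "1b1", points: [1, 5, 6] },
  { _id: "1c1", points: [5, 6] },
];

function rankOf(id, collection) {
  // Sum each points array, as { $sum: '$points' } does.
  const totals = collection.map((d) => ({
    _id: d._id,
    totalPoints: d.points.reduce((a, b) => a + b, 0),
  }));
  // Sort by total in descending order, as the $sort stage does.
  totals.sort((a, b) => b.totalPoints - a.totalPoints);
  // Zero-based position plus one, as $indexOfArray followed by $add does.
  const index = totals.findIndex((d) => d._id === id);
  return index === -1 ? null : { ...totals[index], rank: index + 1 };
}

const ranked = rankOf("1a3", docs);
console.log(ranked); // { _id: '1a3', totalPoints: 14, rank: 2 }
```

With only these five documents, '1a1' has 16 points and rank 1, so '1a3' with 14 points comes second.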
I have documents like these:
{
data: [{"channel":"712064846325219432","message":1019},{"channel":"712064884812021801","message":4}],
user: '290494169783205888',
},
{
data: [{"channel":"712064846325219432","message":2000},{"channel":"712064884812021801","message":500}],
user: '534099893979971584',
}
So how can I sum the message values in data for each document and sort the documents by that total in descending order?
Use the aggregation pipeline stages $unwind and $group to sum the messages for each user, then sort by the total number of messages. Check the example.
db.collection.aggregate([
  {
    $unwind: {
      path: "$data"
    }
  },
  {
    $group: {
      _id: "$user",
      total_message: {
        $sum: "$data.message"
      }
    }
  },
  {
    $sort: {
      total_message: -1
    }
  }
])
Results:
[
  {
    "_id": "534099893979971584",
    "total_message": 2500
  },
  {
    "_id": "290494169783205888",
    "total_message": 1023
  }
]
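The results above can be double-checked with a small plain JavaScript sketch that mimics the three stages on the two documents from the question. It runs on an in-memory array, so it only illustrates the logic, not the MongoDB call itself.

```javascript
// The two documents from the question.
const users = [
  {
    data: [
      { channel: "712064846325219432", message: 1019 },
      { channel: "712064884812021801", message: 4 },
    ],
    user: "290494169783205888",
  },
  {
    data: [
      { channel: "712064846325219432", message: 2000 },
      { channel: "712064884812021801", message: 500 },
    ],
    user: "534099893979971584",
  },
];

// $unwind + $group: visit every data element and add its message to the user's total.
const totalsByUser = new Map();
for (const doc of users) {
  for (const entry of doc.data) {
    totalsByUser.set(doc.user, (totalsByUser.get(doc.user) ?? 0) + entry.message);
  }
}

// $sort: highest total first.
const sortedTotals = [...totalsByUser]
  .map(([_id, total_message]) => ({ _id, total_message }))
  .sort((a, b) => b.total_message - a.total_message);

console.log(sortedTotals);
// [
//   { _id: '534099893979971584', total_message: 2500 },
//   { _id: '290494169783205888', total_message: 1023 }
// ]
```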
You can use Query.sort().
For descending order you can use -1, desc, or descending:
Query.sort({ message: -1 })
How can I implement this SQL query in MongoDB?
SELECT TOP 100 * FROM Tracks
WHERE ID IN (SELECT MAX(ID) FROM Tracks WHERE UserID IN ([UserIDs...]) GROUP BY UserID)
Tracks structure:
Tracks[{_id, userId, {lat, lon}, dateCreate, ...}, ...]
Thanks!
You'd want to use the aggregation framework for this:
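db.Tracks.aggregate([
  // UserIDs is an array of the wanted user IDs. The field name must match
  // the one stored in the documents (the structure above lists it as userId).
  { $match: { 'UserID': { $in: UserIDs } } },
  { $group: { _id: '$UserID', max: { $max: '$_id' } } },
  { $sort: { max: -1 } },
  { $limit: 100 }
]);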
First we match against the wanted UserIDs, then we group by UserID and store the maximum _id value in the new max field. Then we sort by max in descending order to get the highest values first, and finally we limit the output to the top 100.
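The four steps described above can be sketched in plain JavaScript as follows. The sample tracks, the numeric _id values, and the names wantedUserIds and latestTrackIds are all assumptions made for illustration; real documents would carry whatever _id type the collection uses.

```javascript
// Hypothetical tracks; numeric _id values are an assumption for illustration.
const tracks = [
  { _id: 1, userId: "u1" },
  { _id: 2, userId: "u2" },
  { _id: 3, userId: "u1" },
  { _id: 4, userId: "u3" },
  { _id: 5, userId: "u2" },
];
const wantedUserIds = ["u1", "u2"];

function latestTrackIds(allTracks, userIds, limit = 100) {
  const maxByUser = new Map();
  for (const t of allTracks) {
    // $match with $in: skip tracks of users that were not requested.
    if (!userIds.includes(t.userId)) continue;
    // $group with $max: remember the highest _id seen for each user.
    const current = maxByUser.get(t.userId);
    if (current === undefined || t._id > current) maxByUser.set(t.userId, t._id);
  }
  return [...maxByUser]
    .map(([userId, max]) => ({ _id: userId, max }))
    .sort((a, b) => b.max - a.max) // $sort by max, descending
    .slice(0, limit); // $limit
}

const latest = latestTrackIds(tracks, wantedUserIds);
console.log(latest); // [ { _id: 'u2', max: 5 }, { _id: 'u1', max: 3 } ]
```

As in the pipeline, each result holds only the user ID and that user's highest track _id, not the full track document.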