Count of a nested value of all entries in mongodb collection - mongodb

I have a collection named outbox which has this kind of structure
"_id" :ObjectId("5a94e02bb0445b1cc742d795"),
"track" : {
"added" : {
"date" : ISODate("2020-12-03T08:48:51.000Z")
}
},
"provider_status" : {
"job_number" : "",
"count" : {
"total" : 1,
"sent" : 0,
"delivered" : 0,
"failed" : 0
},
"delivery" : []
}
I have two tasks. First, I want the sum of "total", "sent", "delivered" and "failed" across all the entries in the collection, regardless of their ObjectId. Second, I want the same sums only for a given ObjectId, restricted to entries between a start and an end date.
I am trying to find total using this query
db.outbox.aggregate(
{ $group: { _id : null, sum : { $sum: "$provider_status.count.total" } } });
But I am getting this error as shown
Since I do not have much experience in mongodb I don't have any idea how to do these two tasks. Need help here.

It looks like you are running this in Robo 3T.
You need to wrap the pipeline stages in an array, like this:
db.test.aggregate([ //See here
{
$group: {
_id: null,
sum: {
$sum: "$provider_status.count.total"
}
}
}
])//See here
The array is not needed in the playground, because it processes the stages before submitting them to the server.
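The answer fixes the syntax error but does not show the two sums the question asks for. A minimal sketch of how such a pipeline might be built follows, written in Python as plain data so it runs without a server. The names build_totals_pipeline, COUNT_FIELDS and extra_match are hypothetical, and the sketch assumes the date to filter on is track.added.date, as in the sample document.

```python
from datetime import datetime, timezone

# Counter names taken from the sample document in the question.
COUNT_FIELDS = ("total", "sent", "delivered", "failed")


def build_totals_pipeline(start=None, end=None, extra_match=None):
    """Return an aggregation pipeline (a list of stages) that sums each counter.

    start/end: optional datetime bounds applied to track.added.date (assumed field).
    extra_match: optional dict of further filter conditions (hypothetical parameter).
    """
    match = dict(extra_match or {})
    date_range = {}
    if start is not None:
        date_range["$gte"] = start
    if end is not None:
        date_range["$lte"] = end
    if date_range:
        match["track.added.date"] = date_range

    pipeline = []
    if match:
        # Filter first so that $group only sees the wanted documents.
        pipeline.append({"$match": match})

    # _id: None puts every remaining document into a single group.
    group = {"_id": None}
    for field in COUNT_FIELDS:
        group[field] = {"$sum": "$provider_status.count." + field}
    pipeline.append({"$group": group})
    return pipeline


all_totals = build_totals_pipeline()
ranged_totals = build_totals_pipeline(
    start=datetime(2020, 12, 1, tzinfo=timezone.utc),
    end=datetime(2020, 12, 31, tzinfo=timezone.utc),
)
```

Either list could then be passed to pymongo's collection.aggregate(...). A condition on another field, such as an id, would go in extra_match.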

Related

MongoDB $divide on aggregate output

Is there a possibility to calculate mathematical operation on already aggregated computed fields?
I have something like this:
([
{
"$unwind" : {
"path" : "$users"
}
},
{
"$match" : {
"users.r" : {
"$exists" : true
}
}
},
{
"$group" : {
"_id" : "$users.r",
"count" : {
"$sum" : 1
}
}
},
])
Which gives an output as:
{ "_id" : "A", "count" : 7 }
{ "_id" : "B", "count" : 49 }
Now I want to divide 7 by 49 or vice versa.
Is there a possibility to do that? I tried $project and $divide but had no luck.
Any help would be really appreciated.
Thank you,
From your question, it looks like you assume the result always contains exactly 2 documents. In that case I can assume users.r can have only 2 values (apart from null).
The simplest approach is to do this arithmetic in JavaScript (if you're using the mongo console) or, if you're calling it from a program, in whatever language you use to access mongo, e.g.
var results = db.collection.aggregate([theAggregatePipelineQuery]).toArray();
print(results[0].count/results[1].count);
EDIT: Here is an alternative to the approach above, since the OP commented that JavaScript code cannot be used and the division must be done within the query itself:
([
{ /**your existing aggregation stages that results in two rows as described in the question with a count field **/ },
{ $group: { "_id": 1, firstCount: { $first: "$count" }, lastCount: { $last: "$count" } } },
{ $project: { finalResult: { $divide: ['$firstCount','$lastCount']} } }
])
//The returned document has your answer under `finalResult` field
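The client-side variant can also be sketched in Python. This is a hypothetical helper (count_ratio is not part of any driver) that looks the two counts up by their _id instead of relying on the order in which the grouped documents come back. The results list is the sample output from the question, written out by hand.

```python
def count_ratio(results, numerator_id, denominator_id):
    """Divide the count of one group by the count of another.

    results: list of documents shaped like {"_id": ..., "count": ...},
    e.g. the list obtained from an aggregate() cursor.
    """
    counts = {row["_id"]: row["count"] for row in results}
    return counts[numerator_id] / counts[denominator_id]


# Sample output copied from the question.
results = [{"_id": "A", "count": 7}, {"_id": "B", "count": 49}]
ratio = count_ratio(results, "A", "B")
```

Looking the values up by key means the result does not change if the server returns the groups in a different order.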

Querying date range in aggregate query returns nothing or ignores dates

In my aggregate query I'm trying to add conditions in the $match statement to return only records within given date range. Without converting to ISOString, I get a set of records that ignores the date range completely. When I convert to ISOString, I get nothing (returns empty set). I've tried using the $and operator, still nothing.
I've tried all the solutions on stack to no avail. Here's my code:
$match: {
$and: [
{'author.id': { $ne: req.user._id }},
{'blurtDate': { $gte: test1.toISOString() }},
{'blurtDate': { $lte: test2.toISOString() }}
]
}
test1 and test2 are correct; I checked them with console.log and they show as follows:
2019-06-02T12:44:39.000Z -- 2019-07-02T12:44:39.928Z
I also tried without the $and operator like so:
$match: {
'author.id': { $ne: req.user._id },
'blurtDate': { $gte: test1.toISOString() },
'blurtDate': { $lte: test2.toISOString() }
}
Which again returns nothing. Any help much appreciated!
EDIT: Wanted to emphasize that test1 and test2 are new date objects:
test1 = new Date(qryDateFrom); //Tried .toISOString() as well
test2 = new Date(qryDateTo);
Without .toISOString(), I get a return of values that ignores the dates. With .toISOString I get an empty return.
Here's an example document that should be returned:
{
"_id" : ObjectId("5d0a807345c85d00ac4b7217"),
"text" : "<p>Seriously RV style.</p>",
"blurtDate" : ISODate("2019-06-19T18:35:31.156Z"),
"blurtImg" : "04643410-92c1-11e9-80b6-a3262311afff.png",
"vote" : 0,
"author" : {
"id" : ObjectId("5cb5df0ef7a3570bb4ac6e05"),
"name" : "Benjamin Paine"
},
"__v" : 0
}
When I remove .toISOString(), I get documents outside the expected date range, such as this one from May (the query should only return documents between June 2 and July 2):
{
"_id" : ObjectId("5d07ebaf9a035117e4546349"),
"text" : "<p>A start to something more...</p>",
"blurtDate" : ISODate("2019-05-15T19:36:15.737Z"),
"blurtImg" : "2be7a160-9137-11e9-933f-6966b2e503c7.png",
"vote" : 0,
"author" : {
"id" : ObjectId("5cb5df0ef7a3570bb4ac6e05"),
"name" : "Benjamin Paine"
},
"__v" : 0
}
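Your docs contain actual Date objects, so remove the .toISOString() calls from your query; a string is not compared against a Date the way you expect. You'll also need to combine your $gte and $lte terms into a single object. In the version without $and, the two 'blurtDate' keys collide: a JavaScript object literal keeps only the last one, so the $gte condition is silently dropped.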
$match: {
'author.id': { $ne: req.user._id },
'blurtDate': { $gte: test1, $lte: test2 }
}
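The same two points can be illustrated in Python, where a dict literal treats repeated keys the same way a JavaScript object literal does. This is a sketch only: build_match is a hypothetical helper, the author id is a placeholder string, and the dates are the ones printed in the question.

```python
from datetime import datetime, timezone


def build_match(author_id, start, end):
    """Build a $match stage with one combined range condition on blurtDate."""
    return {
        "$match": {
            "author.id": {"$ne": author_id},
            # Both bounds live in ONE object under ONE key.
            "blurtDate": {"$gte": start, "$lte": end},
        }
    }


# Real datetime objects, not ISO strings, so they are compared as dates.
start = datetime(2019, 6, 2, 12, 44, 39, tzinfo=timezone.utc)
end = datetime(2019, 7, 2, 12, 44, 39, tzinfo=timezone.utc)
stage = build_match("placeholder-author-id", start, end)

# A repeated key keeps only the last value: the lower bound is lost.
collided = {"blurtDate": {"$gte": 1}, "blurtDate": {"$lte": 2}}
```

Here collided ends up with a single blurtDate entry holding only the $lte bound, which is why the original query let through documents dated before the start of the range.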

Mongo DB - how to query for id dependent on oldest date in array of a field

Let's say I have a collection called phone_audit with documents of the following form: _id, which is the phone number, and value, whose items array always contains 2 entries (each with an id and a date).
Please see below:
{
"_id" : {
"phone_number" : "+012345678"
},
"value" : {
"items" : [
{
"_id" : "c14b4ac1db691680a3fb65320fba7261",
"updated_at" : ISODate("2016-03-14T12:35:06.533Z")
},
{
"_id" : "986b58e55f8606270f8a43cd7f32392b",
"updated_at" : ISODate("2016-07-23T11:17:53.552Z")
}
]
}
},
......
I need to get a list of _id values for every entry in that collection representing the older of the two items in each document.
So in the above - result would be [c14b4ac1db691680a3fb65320fba7261,...]
Any pointers on the type of query to execute would be very helpful, even if the exact syntax is not correct.
With aggregate(), you can $unwind value.items, $sort by updated_at, then use $first to get the oldest:
[
{
"$unwind": "$value.items"
},
{
"$sort": { "value.items.updated_at": 1 }
},
{
"$group":{
_id: "$_id.phone_number",
oldest:{$first:"$value.items"}
}
},
{
"$project":{
value_id: "$oldest._id"
}
}
]
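To check the pipeline's output on a few documents, the same selection can be done client-side. The sketch below is not the aggregation itself but a plain Python equivalent for one document; oldest_item_id is a hypothetical helper and sample is the document from the question.

```python
from datetime import datetime


def oldest_item_id(doc):
    """Return the _id of the item with the earliest updated_at in one document."""
    items = doc["value"]["items"]
    return min(items, key=lambda item: item["updated_at"])["_id"]


# Sample document copied from the question (ISODate values as datetime objects).
sample = {
    "_id": {"phone_number": "+012345678"},
    "value": {
        "items": [
            {
                "_id": "c14b4ac1db691680a3fb65320fba7261",
                "updated_at": datetime(2016, 3, 14, 12, 35, 6, 533000),
            },
            {
                "_id": "986b58e55f8606270f8a43cd7f32392b",
                "updated_at": datetime(2016, 7, 23, 11, 17, 53, 552000),
            },
        ]
    },
}

oldest = oldest_item_id(sample)
```

For the sample document this picks the item updated in March 2016, which matches the result expected in the question.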

check if value exists in array field in mongodb

I want to check if user id exists inside an array field of mongodb (using meteor)
db.posts.find().pretty()
{
"_id" : "hT3ezqEyTaiihoh6Z",
"body" : "hey\n",
"authorId" : "AyJo5nf2Lkdqd6aRh",
"createdAt" : ISODate("2016-05-13T06:19:34.726Z"),
"updatedAt" : ISODate("2016-05-13T06:19:34.726Z"),
"likecount" : 0,
"already_voted" : [ ] }
db.posts.find( { _id:"hT3ezqEyTaiihoh6Z"},{ already_voted: { $in : ["AyJo5nf2Lkdqd6aRh"]} }).count()
1
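It gives a count of 1, whereas I am expecting it to be 0.
Your logic is fine; only the syntax is wrong. The second argument to find() is a projection, not a filter, so your already_voted condition was never applied to the query. Put both conditions in the first argument: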
db.posts
.find({
_id: "hT3ezqEyTaiihoh6Z",
already_voted: { $in: ["AyJo5nf2Lkdqd6aRh"] },
})
.count();
This should work.
You can also simply use the count method with the filter. There is no need for two operations, find and then count.
db.posts
.count({
_id: "hT3ezqEyTaiihoh6Z",
already_voted: { $in: ["AyJo5nf2Lkdqd6aRh"] }
});
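From Python, the combined filter might be built as below. This is a sketch with hypothetical helper names (voted_filter, has_voted); the second function is a local check on an already loaded document, not a database call.

```python
def voted_filter(post_id, user_id):
    """Build ONE filter document holding both conditions."""
    return {"_id": post_id, "already_voted": {"$in": [user_id]}}


def has_voted(post, user_id):
    """Local equivalent of the check for a document already in memory."""
    return user_id in post.get("already_voted", [])


# Sample post copied from the question (only the relevant fields).
sample_post = {
    "_id": "hT3ezqEyTaiihoh6Z",
    "authorId": "AyJo5nf2Lkdqd6aRh",
    "already_voted": [],
}

query = voted_filter("hT3ezqEyTaiihoh6Z", "AyJo5nf2Lkdqd6aRh")
voted = has_voted(sample_post, "AyJo5nf2Lkdqd6aRh")
```

With pymongo, the query dict could be passed to collection.count_documents(query). For the sample post the array is empty, so the local check is False, matching the expected count of 0.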

Upsert with pymongo and a custom _id field

I'm attempting to store pre-aggregated performance metrics in a sharded mongodb according to this document.
I'm trying to update the minute sub-documents in a record that may or may not exist with an upsert like so (self.collection is a pymongo collection instance):
self.collection.update(query, data, upsert=True)
query:
{ '_id': u'12345CHA-2RU020130304',
'metadata': { 'adaptor_id': 'CHA-2RU',
'array_serial': 12345,
'date': datetime.datetime(2013, 3, 4, 0, 0, tzinfo=<UTC>),
'processor_id': 0}
}
data:
{ 'minute': { '16': { '45': 1.6693091}}}
The problem is that the 'minute' subdocument only ever contains the last hour: { minute: metric } entry. It never gains new entries for other hours; each update overwrites the single existing entry.
I've also tried this with a $set style data entry:
{ '$set': { 'minute': { '16': { '45': 1.6693091}}}}
but it ends up being the same.
What am I doing wrong?
In both of the examples listed you are simply setting a field ('minute') to a particular value. The only reason it looks like an addition the first time you update is that the field does not exist yet and so must be created.
It's hard to determine exactly what you are shooting for here, but I think what you could do is alter your schema a little so that 'minute' is an array. Then you could use $push to add values regardless of whether they are already present or $addToSet if you don't want duplicates.
I had to alter your document a little to make it valid in the shell, so my _id (and some other fields) are slightly different to yours, but it should still be close enough to be illustrative:
db.foo.find({'_id': 'u12345CHA-2RU020130304'}).pretty()
{
"_id" : "u12345CHA-2RU020130304",
"metadata" : {
"adaptor_id" : "CHA-2RU",
"array_serial" : 12345,
"date" : ISODate("2013-03-18T23:28:50.660Z"),
"processor_id" : 0
}
}
Now let's add a minute field with an array of documents instead of a single document:
db.foo.update({'_id': 'u12345CHA-2RU020130304'}, { $addToSet : {'minute': { '16': {'45': 1.6693091}}}})
db.foo.find({'_id': 'u12345CHA-2RU020130304'}).pretty()
{
"_id" : "u12345CHA-2RU020130304",
"metadata" : {
"adaptor_id" : "CHA-2RU",
"array_serial" : 12345,
"date" : ISODate("2013-03-18T23:28:50.660Z"),
"processor_id" : 0
},
"minute" : [
{
"16" : {
"45" : 1.6693091
}
}
]
}
Then, to illustrate the addition, add a slightly different entry (since I am using $addToSet, the entry must differ for a new element to be added):
db.foo.update({'_id': 'u12345CHA-2RU020130304'}, { $addToSet : {'minute': { '17': {'48': 1.6693391}}}})
db.foo.find({'_id': 'u12345CHA-2RU020130304'}).pretty()
{
"_id" : "u12345CHA-2RU020130304",
"metadata" : {
"adaptor_id" : "CHA-2RU",
"array_serial" : 12345,
"date" : ISODate("2013-03-18T23:28:50.660Z"),
"processor_id" : 0
},
"minute" : [
{
"16" : {
"45" : 1.6693091
}
},
{
"17" : {
"48" : 1.6693391
}
}
]
}
I ended up setting the fields like this:
query:
{ '_id': u'12345CHA-2RU020130304',
'metadata': { 'adaptor_id': 'CHA-2RU',
'array_serial': 12345,
'date': datetime.datetime(2013, 3, 4, 0, 0, tzinfo=<UTC>),
'processor_id': 0}
}
I'm setting the metrics like this:
data = {"$set": {}}
for metric in csv:
    date_utc = metric['date'].astimezone(pytz.utc)
    data["$set"]["minute.%d.%d" % (date_utc.hour,
                                   date_utc.minute)] = float(metric['metric'])
which creates data like this:
{"$set": {'minute.16.45': 1.6693091,
'minute.16.46': 1.566343,
'minute.16.47': 1.22322}}
So that when self.collection.update(query, data, upsert=True) is run it updates those fields.
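Why the dotted keys help can be shown with a toy model of $set in plain Python. This is only an illustration of the semantics under simplifying assumptions (nested dicts only, no arrays); apply_set is a hypothetical function, not part of pymongo.

```python
def apply_set(doc, updates):
    """Toy model of $set: each key is a dotted path, and the value
    replaces whatever currently sits at that path."""
    for path, value in updates.items():
        *parents, leaf = path.split(".")
        target = doc
        for key in parents:
            # Walk down, creating intermediate subdocuments as needed.
            target = target.setdefault(key, {})
        target[leaf] = value
    return doc


# Setting the whole 'minute' field replaces the existing subdocument.
whole = {"minute": {"16": {"45": 1.6693091}}}
apply_set(whole, {"minute": {"17": {"48": 1.6693391}}})

# Setting a dotted path only touches that one leaf.
dotted = {"minute": {"16": {"45": 1.6693091}}}
apply_set(dotted, {"minute.17.48": 1.6693391})
```

After these calls, whole contains only hour "17", while dotted keeps hour "16" and gains hour "17". That is the difference between the original update and the dotted-key version.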