MongoDB: How to count a field if its value matches a condition?

I have a document that includes a field like this:
{
...
log: [
{
utc_timestamp: ISODate("2014-11-15T10:26:47.337Z"),
type: "clicked"
},
{
utc_timestamp: ISODate("2014-10-15T16:12:51.959Z"),
type: "emailed"
},
{
utc_timestamp: ISODate("2014-10-15T16:10:51.959Z"),
type: "clicked"
},
{
utc_timestamp: ISODate("2014-09-15T04:59:19.431Z"),
type: "emailed"
},
{
utc_timestamp: ISODate("2014-09-15T04:58:19.431Z"),
type: "clicked"
},
],
...
}
How do I get the count of log entries of type "clicked" from this month, only if there is not a log entry of type "emailed" this month?
In other words, I want to find out which clicks have not been sent a related email.
So, in this example, the count would be 1 since the most recent "clicked" entry doesn't have an "emailed" entry.
Note: For this use case, clicks don't have unique IDs - this is all the data that is logged.

Use the following aggregation pipeline:
db.click_log.aggregate([
{ "$match" : { "log.type" : { "$ne" : "emailed" } } }, // drop docs that have any "emailed" entry in log (regardless of month)
{ "$unwind" : "$log" }, // unwind to get log elements as separate docs
{ "$project" : { "_id" : 1, "log" : 1, "month" : { "$month" : "$log.utc_timestamp" } } },
{ "$match" : { "log.type" : "clicked", "month" : <# of month> } }, // keep only log elements of type "clicked" from the given month
{ "$group" : { "_id" : "$_id", "count" : { "$sum" : 1 } } } // collect clicked elements from same original doc and count number
])
This will return, for each document not having "emailed" as a value of log.type, the count of elements of the array log that have log.type value clicked and with timestamp from the current month. If you want a sliding 30-day period for month, change the $match to be a range query with $gt and $lt covering the desired time period.
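Note that the first $match in the pipeline drops every document that has ever logged an "emailed" entry, and $month on its own does not distinguish between years. A minimal sketch of a variant scoped to an explicit date window follows. It assumes the server accepts $not combined with $elemMatch in a query, and buildClickedCountPipeline is a hypothetical helper name, not part of any API:

```javascript
// Hypothetical helper: builds a pipeline that counts "clicked" log entries
// inside [start, end) for documents with no "emailed" entry in that window.
function buildClickedCountPipeline(start, end) {
  const inWindow = { $gte: start, $lt: end };
  return [
    // Keep only documents without an "emailed" entry inside the window.
    { $match: { log: { $not: { $elemMatch: { type: "emailed", utc_timestamp: inWindow } } } } },
    // Produce one document per log entry.
    { $unwind: "$log" },
    // Keep only "clicked" entries inside the window.
    { $match: { "log.type": "clicked", "log.utc_timestamp": inWindow } },
    // Count the remaining entries per original document.
    { $group: { _id: "$_id", count: { $sum: 1 } } }
  ];
}

// Example window: November 2014 (UTC).
const pipeline = buildClickedCountPipeline(
  new Date("2014-11-01T00:00:00Z"),
  new Date("2014-12-01T00:00:00Z")
);
// In the mongo shell: db.click_log.aggregate(pipeline)
```

For the sample document in the question, November 2014 contains one "clicked" entry and no "emailed" entry, so this window should give a count of 1.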

You can use a query similar to the one below.
db.dbversitydotcom_col.aggregate([
{ $unwind: "$log" },
{ $match: { "log.type" : "clicked", "log.utc_timestamp" : <your required date> } },
{ $group: { _id: null, count: { $sum: 1 } } } // count the matching log entries
])
Please refer to http://dbversity.com/mongodb-importance-of-aggregation-framework/ for a more detailed explanation.

Related

Count of a nested value of all entries in mongodb collection

I have a collection named outbox which has this kind of structure
"_id" :ObjectId("5a94e02bb0445b1cc742d795"),
"track" : {
"added" : {
"date" : ISODate("2020-12-03T08:48:51.000Z")
}
},
"provider_status" : {
"job_number" : "",
"count" : {
"total" : 1,
"sent" : 0,
"delivered" : 0,
"failed" : 0
},
"delivery" : []
}
I have two tasks. First, I want the sums of "total", "sent", "delivered" and "failed" across all entries in the collection, regardless of their ObjectId. Second, I want the same sums only for a given ObjectId between a start and an end date.
I am trying to find total using this query
db.outbox.aggregate(
{ $group: { _id : null, sum : { $sum: "$provider_status.count.total" } } });
But I am getting this error as shown
Since I do not have much experience in mongodb I don't have any idea how to do these two tasks. Need help here.
It looks like you are running this in Robo 3T.
You need to enclose the pipeline stages in an array, like this:
db.test.aggregate([ //See here
{
$group: {
_id: null,
sum: {
$sum: "$provider_status.count.total"
}
}
}
])//See here
This is not needed in the playground, because it normalizes the pipeline before submitting it to the server.
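Both tasks from the question can be sketched with one small pipeline builder. This is only an illustration: buildCountTotalsPipeline is a hypothetical helper name, the dates are placeholders, and the filter on track.added.date assumes that this is the date the question means by "Start and End date". An _id condition can be added to the same filter object in the shell.

```javascript
// Hypothetical helper: sums the four counters under provider_status.count.
// `filter` is optional; the default {} covers the whole collection.
function buildCountTotalsPipeline(filter = {}) {
  const counters = ["total", "sent", "delivered", "failed"];
  const group = { _id: null }; // _id: null puts every document in one group
  for (const name of counters) {
    group[name] = { $sum: "$provider_status.count." + name };
  }
  return [{ $match: filter }, { $group: group }];
}

// Task 1: totals over every document.
const allTotals = buildCountTotalsPipeline();

// Task 2: totals restricted to a date window (placeholder dates).
const windowTotals = buildCountTotalsPipeline({
  "track.added.date": {
    $gte: new Date("2020-12-01T00:00:00Z"),
    $lt: new Date("2021-01-01T00:00:00Z")
  }
});
// In the mongo shell: db.outbox.aggregate(allTotals)
```

Note that the stages are always passed as an array, which sidesteps the error described above.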

Spring data MongoDb query based on last element of nested array field

I have the following data (Cars):
[
{
"make" : "Ferrari",
"model" : "F40",
"services" : [
{
"type" : "FULL",
"date_time" : ISODate("2019-10-31T09:00:00.000Z"),
},
{
"type" : "FULL",
"scheduled_date_time" : ISODate("2019-11-04T09:00:00.000Z"),
}
],
},
{
"make" : "BMW",
"model" : "M3",
"services" : [
{
"type" : "FULL",
"scheduled_date_time" : ISODate("2019-10-31T09:00:00.000Z"),
},
{
"type" : "FULL",
"scheduled_date_time" : ISODate("2019-11-04T09:00:00.000Z"),
}
],
}
]
Using Spring data MongoDb I would like a query to retrieve all the Cars where the scheduled_date_time of the last item in the services array is in-between a certain date range.
A query which I used previously when using the first item in the services array is like:
mongoTemplate.find(Query.query(
where("services.0.scheduled_date_time").gte(fromDate)
.andOperator(
where("services.0.scheduled_date_time").lt(toDate))),
Car.class);
Note the 0 index since it's first one as opposed to the last one (for my current requirement).
I thought using an aggregate along with a projection and .arrayElementAt(-1) would do the trick but I haven't quite got it to work. My current effort is:
Aggregation agg = newAggregation(
project().and("services").arrayElementAt(-1).as("currentService"),
match(where("currentService.scheduled_date_time").gte(fromDate)
.andOperator(where("currentService.scheduled_date_time").lt(toDate)))
);
AggregationResults<Car> results = mongoTemplate.aggregate(agg, Car.class, Car.class);
return results.getMappedResults();
Any help suggestions appreciated.
Thanks,
This mongo aggregation retrieves all the Cars where the scheduled_date_time of the last item in the services array is in-between a specific date range.
[{
$addFields: {
last: {
$arrayElemAt: [
'$services',
-1
]
}
}
}, {
$match: {
'last.scheduled_date_time': {
$gte: ISODate('2019-10-26T04:06:27.307Z'),
$lt: ISODate('2019-12-15T04:06:27.319Z')
}
}
}]
I was trying to write it in spring-data-mongodb without luck.
They do not support $addFields yet, see here.
Since version 2.2.0.RELEASE, spring-data-mongodb includes Aggregation Repository Methods.
With those, the above query should be:
interface CarRepository extends MongoRepository<Car, String> {
@Aggregation(pipeline = {
"{ $addFields : { last:{ $arrayElemAt: ['$services',-1] }} }",
"{ $match: { 'last.scheduled_date_time' : { $gte : '$?0', $lt: '$?1' } } }"
})
List<Car> getCarsWithLastServiceDateBetween(LocalDateTime start, LocalDateTime end);
}
This method logs this query
[{ "$addFields" : { "last" : { "$arrayElemAt" : ["$services", -1]}}}, { "$match" : { "last.scheduled_date_time" : { "$gte" : "$2019-11-03T03:00:00Z", "$lt" : "$2019-11-05T03:00:00Z"}}}]
The date parameters are not parsing correctly. I didn't spend much time making it work.
If you want the Car Ids this could work.
public List<String> getCarsIdWithServicesDateBetween(LocalDateTime start, LocalDateTime end) {
return template.aggregate(newAggregation(
unwind("services"),
group("id").last("services.scheduled_date_time").as("date"),
match(where("date").gte(start).lt(end))
), Car.class, Car.class)
.getMappedResults().stream()
.map(Car::getId)
.collect(Collectors.toList());
}
Query Log
[{ "$unwind" : "$services"}, { "$group" : { "_id" : "$_id", "date" : { "$last" : "$services.scheduled_date_time"}}}, { "$match" : { "date" : { "$gte" : { "$date" : 1572750000000}, "$lt" : { "$date" : 1572922800000}}}}]

Query to count number of occurrence in array grouped by day

I have the following document structure:
(trackerEventsCollection) =
{
"_id" : ObjectId("5b26c4fb7c696201040c8ed1"),
"trackerId" : ObjectId("598fc51324h51901043d76de"),
"trackingEvents" : [
{
"type" : "checkin",
"eventSource" : "app",
"timestamp" : ISODate("2017-08-25T06:34:58.964Z")
},
{
"type" : "power",
"eventSource" : "app",
"timestamp" : ISODate("2017-08-25T06:51:23.795Z")
},
{
"type" : "position",
"eventSource" : "app",
"timestamp" : ISODate("2017-08-25T06:51:23.985Z")
}
]
}
I would like to write a query that counts the number of trackingEvents with "type" : "power", grouped by day. This seems quite tricky to me because the parent document has no date, so I have to rely on the timestamp field of the trackingEvents array members.
I'm not a very experienced MongoDB user and so far couldn't work out how this can be achieved.
I would really appreciate any help, thanks.
To process the elements of your nested array as separate documents, you need to use $unwind. In the next stage you can use $match to filter by type. Then you can group by day and count the occurrences. The key point is that the grouping key has to contain the year, month and day, as in the following code:
db.trackerEvents.aggregate([
{ $unwind: "$trackingEvents" },
{ $match: { "trackingEvents.type": "power" } },
{
$group: {
_id: {
year: { $year:"$trackingEvents.timestamp" },
month:{ $month:"$trackingEvents.timestamp" },
day: { $dayOfMonth:"$trackingEvents.timestamp" }
},
count: { $sum: 1 }
}
}
])
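As an alternative to the three-part grouping key, the day can be collapsed into a single string. The sketch below assumes a server version that supports $dateToString, which formats dates in UTC unless told otherwise. buildDailyCountPipeline is a hypothetical helper name:

```javascript
// Hypothetical helper: counts events of one type per UTC calendar day.
function buildDailyCountPipeline(eventType) {
  return [
    // One document per tracking event.
    { $unwind: "$trackingEvents" },
    // Keep only events of the requested type.
    { $match: { "trackingEvents.type": eventType } },
    {
      $group: {
        // Day key such as "2017-08-25", derived from the event timestamp.
        _id: { $dateToString: { format: "%Y-%m-%d", date: "$trackingEvents.timestamp" } },
        count: { $sum: 1 }
      }
    },
    // Return the days in chronological order.
    { $sort: { _id: 1 } }
  ];
}

const dailyPower = buildDailyCountPipeline("power");
// In the mongo shell: db.trackerEvents.aggregate(dailyPower)
```

A string key of this form sorts chronologically, which is why the final $sort on _id is enough to order the days.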

Combine mongo $push and $currentDate to include time in new array element

I am trying to add a new document to a mongo array and I require one of the fields to be the current timestamp. This is for field level versioning but I can't figure out how to combine $push and $currentDate to get the result I would like.
Can someone point me in the right direction?
db.tmp.adviceReportingJourney.update(
{ _id : "5525f99be4b041151d51386e5525f99be4b041151d513870" },
{
$push: {
"$currentDate": {
"Conversation1MeetingCreated" : {
"vid" : 4,
"ts" : {"$type": "timestamp"},
"data" : 1428552213559
}
}
}
}
)
You can set the current time on the client side instead, using your language's equivalent of Date.now() ;-)
Like this in the shell (adapt it to your driver; I assume Java ;-)):
db.tmp.adviceReportingJourney.update(
{ _id : "5525f99be4b041151d51386e5525f99be4b041151d513870" },
{
$push: {
"Conversation1MeetingCreated" : {
"vid" : 4,
"ts" : new Date(), // current time, generated by the client
"data" : 1428552213559
}
}
}
)
UPDATE: When running mongo > 3.0 you can use $currentDate. According to the $currentDate documentation, it is an update operator, so it only works with db.collection.update() and db.collection.findAndModify().
To update the timestamp of the embedded document "Conversation1MeetingCreated", use an update document like this:
{
$currentDate: {
"Conversation1MeetingCreated.ts": { $type: "timestamp" }
},
$set: {
"Conversation1MeetingCreated.vid" : 4,
"Conversation1MeetingCreated.data": 1428552213559
}
}
Hope it helps.

Mongo DB - how to query for id dependent on oldest date in array of a field

Let's say I have a collection called phone_audit with documents of the following form: _id, which holds the phone number, and value, whose items array always contains 2 entries (each with an id and a date).
Please see below:
{
"_id" : {
"phone_number" : "+012345678"
},
"value" : {
"items" : [
{
"_id" : "c14b4ac1db691680a3fb65320fba7261",
"updated_at" : ISODate("2016-03-14T12:35:06.533Z")
},
{
"_id" : "986b58e55f8606270f8a43cd7f32392b",
"updated_at" : ISODate("2016-07-23T11:17:53.552Z")
}
]
}
},
......
I need to get a list of _id values for every entry in that collection representing the older of the two items in each document.
So in the above - result would be [c14b4ac1db691680a3fb65320fba7261,...]
Any pointers on the type of query to execute would be very helpful, even if the exact syntax is not correct.
With aggregate(), you can $unwind value.items, $sort by updated_at, then use $first to get the oldest:
[
{
"$unwind": "$value.items"
},
{
"$sort": { "value.items.updated_at": 1 }
},
{
"$group":{
_id: "$_id.phone_number",
oldest:{$first:"$value.items"}
}
},
{
"$project":{
value_id: "$oldest._id"
}
}
]
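To sanity-check what the pipeline should return, the same selection can be sketched in plain JavaScript on a copy of the sample document. This is only a client-side illustration of the logic, not a replacement for the aggregation, and oldestItemIds is a hypothetical helper name:

```javascript
// Sample document copied from the question, with ISODate replaced by Date.
const docs = [{
  _id: { phone_number: "+012345678" },
  value: {
    items: [
      { _id: "c14b4ac1db691680a3fb65320fba7261", updated_at: new Date("2016-03-14T12:35:06.533Z") },
      { _id: "986b58e55f8606270f8a43cd7f32392b", updated_at: new Date("2016-07-23T11:17:53.552Z") }
    ]
  }
}];

// For each document, pick the _id of the item with the earliest updated_at.
function oldestItemIds(documents) {
  return documents.map(doc =>
    doc.value.items.reduce((a, b) => (a.updated_at <= b.updated_at ? a : b))._id
  );
}

const ids = oldestItemIds(docs);
console.log(ids); // [ 'c14b4ac1db691680a3fb65320fba7261' ]
```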