Query to count the number of occurrences in an array, grouped by day - mongodb

I have the following document structure:
(trackerEventsCollection) =
{
"_id" : ObjectId("5b26c4fb7c696201040c8ed1"),
"trackerId" : ObjectId("598fc51324h51901043d76de"),
"trackingEvents" : [
{
"type" : "checkin",
"eventSource" : "app",
"timestamp" : ISODate("2017-08-25T06:34:58.964Z")
},
{
"type" : "power",
"eventSource" : "app",
"timestamp" : ISODate("2017-08-25T06:51:23.795Z")
},
{
"type" : "position",
"eventSource" : "app",
"timestamp" : ISODate("2017-08-25T06:51:23.985Z")
}
]
}
I would like to write a query that counts the number of trackingEvents with "type" : "power", grouped by day. This seems tricky to me because the parent document has no date, so I have to rely on the timestamp field of the trackingEvents array members.
I'm not an experienced MongoDB user and so far couldn't work out how to achieve this.
I would really appreciate any help, thanks.

To process the elements of your nested array as separate documents, you need to use $unwind. In the next stage you can use $match to filter by type. Then you can group by day and count the occurrences. The point is that you have to build a grouping key containing the year, month and day, as in the following code:
db.trackerEvents.aggregate([
{ $unwind: "$trackingEvents" },
{ $match: { "trackingEvents.type": "power" } },
{
$group: {
_id: {
year: { $year:"$trackingEvents.timestamp" },
month:{ $month:"$trackingEvents.timestamp" },
day: { $dayOfMonth:"$trackingEvents.timestamp" }
},
count: { $sum: 1 }
}
}
])
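To see what the pipeline computes, here is a minimal plain-JavaScript sketch that mimics the three stages in memory on the sample document. It does not call MongoDB at all: countByDay and the trackerDocs array are names made up for this illustration, and days are taken in UTC, which is what $year, $month and $dayOfMonth use when no timezone is given.

```javascript
// In-memory stand-in for the sample collection (plain Dates instead of ISODate).
const trackerDocs = [
  {
    trackingEvents: [
      { type: "checkin", eventSource: "app", timestamp: new Date("2017-08-25T06:34:58.964Z") },
      { type: "power", eventSource: "app", timestamp: new Date("2017-08-25T06:51:23.795Z") },
      { type: "position", eventSource: "app", timestamp: new Date("2017-08-25T06:51:23.985Z") }
    ]
  }
];

// Hypothetical helper: emulates $unwind -> $match -> $group for one event type.
function countByDay(documents, type) {
  const counts = new Map();
  for (const doc of documents) {
    for (const event of doc.trackingEvents) {        // $unwind
      if (event.type !== type) continue;             // $match
      const t = event.timestamp;
      // Same key parts as $year / $month / $dayOfMonth, all in UTC.
      const key = `${t.getUTCFullYear()}-${t.getUTCMonth() + 1}-${t.getUTCDate()}`;
      counts.set(key, (counts.get(key) || 0) + 1);   // $group with $sum: 1
    }
  }
  return Object.fromEntries(counts);
}

const powerCounts = countByDay(trackerDocs, "power");
console.log(powerCounts);
```

The key is built from year, month and day together for the same reason as in the pipeline: grouping on the day of month alone would merge the 25th of every month into one bucket.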

Related

Spring data MongoDb query based on last element of nested array field

I have the following data (Cars):
[
{
"make" : "Ferrari",
"model" : "F40",
"services" : [
{
"type" : "FULL",
"date_time" : ISODate("2019-10-31T09:00:00.000Z"),
},
{
"type" : "FULL",
"scheduled_date_time" : ISODate("2019-11-04T09:00:00.000Z"),
}
],
},
{
"make" : "BMW",
"model" : "M3",
"services" : [
{
"type" : "FULL",
"scheduled_date_time" : ISODate("2019-10-31T09:00:00.000Z"),
},
{
"type" : "FULL",
"scheduled_date_time" : ISODate("2019-11-04T09:00:00.000Z"),
}
],
}
]
Using Spring Data MongoDB, I would like a query that retrieves all the Cars where the scheduled_date_time of the last item in the services array falls within a certain date range.
A query which I used previously when using the first item in the services array is like:
mongoTemplate.find(Query.query(
where("services.0.scheduled_date_time").gte(fromDate)
.andOperator(
where("services.0.scheduled_date_time").lt(toDate))),
Car.class);
Note the 0 index since it's first one as opposed to the last one (for my current requirement).
I thought using an aggregate along with a projection and .arrayElementAt(-1) would do the trick but I haven't quite got it to work. My current effort is:
Aggregation agg = newAggregation(
project().and("services").arrayElementAt(-1).as("currentService"),
match(where("currentService.scheduled_date_time").gte(fromDate)
.andOperator(where("currentService.scheduled_date_time").lt(toDate)))
);
AggregationResults<Car> results = mongoTemplate.aggregate(agg, Car.class, Car.class);
return results.getMappedResults();
Any help or suggestions appreciated.
Thanks,
This mongo aggregation retrieves all the Cars where the scheduled_date_time of the last item in the services array is in-between a specific date range.
[{
$addFields: {
last: {
$arrayElemAt: [
'$services',
-1
]
}
}
}, {
$match: {
'last.scheduled_date_time': {
$gte: ISODate('2019-10-26T04:06:27.307Z'),
$lt: ISODate('2019-12-15T04:06:27.319Z')
}
}
}]
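As a sanity check of the logic, here is a small plain-JavaScript sketch of what these two stages do, run in memory rather than against MongoDB. The sample cars are only loosely based on the question (the BMW dates are changed so that one car falls outside the range), and carsWithLastServiceBetween is a made-up helper name.

```javascript
// Made-up sample data shaped like the Cars above (plain Dates instead of ISODate).
const cars = [
  { make: "Ferrari", model: "F40", services: [
    { type: "FULL", scheduled_date_time: new Date("2019-10-31T09:00:00.000Z") },
    { type: "FULL", scheduled_date_time: new Date("2019-11-04T09:00:00.000Z") }
  ] },
  { make: "BMW", model: "M3", services: [
    { type: "FULL", scheduled_date_time: new Date("2019-10-31T09:00:00.000Z") },
    { type: "FULL", scheduled_date_time: new Date("2019-10-01T09:00:00.000Z") }
  ] }
];

// Hypothetical helper: emulates $addFields with $arrayElemAt -1, then the $match.
function carsWithLastServiceBetween(list, from, to) {
  return list.filter((car) => {
    const services = car.services || [];
    const last = services[services.length - 1];      // $arrayElemAt: ["$services", -1]
    if (!last || !(last.scheduled_date_time instanceof Date)) return false;
    // $gte: from, $lt: to (half-open range)
    return last.scheduled_date_time >= from && last.scheduled_date_time < to;
  });
}

const matchingModels = carsWithLastServiceBetween(
  cars,
  new Date("2019-11-01T00:00:00.000Z"),
  new Date("2019-11-10T00:00:00.000Z")
).map((car) => car.model);
console.log(matchingModels);
```

Note that only the last array element is looked at, whatever its date: the BMW has an earlier service on 2019-10-31, but its last entry decides.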
I was trying to write it in spring-data-mongodb without luck.
They do not support $addFields yet.
Since version 2.2.0.RELEASE, spring-data-mongodb includes aggregation repository methods.
With those, the above query should be:
interface CarRepository extends MongoRepository<Car, String> {
@Aggregation(pipeline = {
"{ $addFields : { last:{ $arrayElemAt: [$services,-1] }} }",
"{ $match: { 'last.scheduled_date_time' : { $gte : '$?0', $lt: '$?1' } } }"
})
List<Car> getCarsWithLastServiceDateBetween(LocalDateTime start, LocalDateTime end);
}
This method logs the following query:
[{ "$addFields" : { "last" : { "$arrayElemAt" : ["$services", -1]}}}, { "$match" : { "last.scheduled_date_time" : { "$gte" : "$2019-11-03T03:00:00Z", "$lt" : "$2019-11-05T03:00:00Z"}}}]
The date parameters are not parsed correctly. I didn't spend much time making it work.
If you only want the Car ids, this could work:
public List<String> getCarsIdWithServicesDateBetween(LocalDateTime start, LocalDateTime end) {
return template.aggregate(newAggregation(
unwind("services"),
group("id").last("services.date").as("date"),
match(where("date").gte(start).lt(end))
), Car.class, Car.class)
.getMappedResults().stream()
.map(Car::getId)
.collect(Collectors.toList());
}
Query Log
[{ "$unwind" : "$services"}, { "$group" : { "_id" : "$_id", "date" : { "$last" : "$services.scheduled_date_time"}}}, { "$match" : { "date" : { "$gte" : { "$date" : 1572750000000}, "$lt" : { "$date" : 1572922800000}}}}]

MongoDB distinct problems

Here is one of my documents in JSON format:
{
"_id" : ObjectId("5bfdb412a80939b6ed682090"),
"accounts" : [
{
"_id" : ObjectId("5bf106eee639bd0df4bd8e05"),
"accountType" : "DDA",
"productName" : "DDA1"
},
{
"_id" : ObjectId("5bf106eee639bd0df4bd8df8"),
"accountType" : "VSA",
"productName" : "VSA1"
},
{
"_id" : ObjectId("5bf106eee639bd0df4bd8df9"),
"accountType" : "VSA",
"productName" : "VSA2"
}
]
}
I want to write a query that gets all productName values (without duplicates) where accountType = VSA.
I wrote this mongo query:
db.Collection.distinct("accounts.productName", {"accounts.accountType": "VSA" })
I expect: ['VSA1', 'VSA2']
I get: ['DDA1', 'VSA1', 'VSA2']
Does anybody know why the query doesn't work with distinct?
The second parameter of the distinct method represents:
A query that specifies the documents from which to retrieve the distinct values.
The point is that this filter selects whole documents, not individual array elements. The document you showed contains an element with "accounts.accountType": "VSA", so the whole document matches and the productName of every element in its accounts array is returned, including the DDA one.
To fix that, you have to use the Aggregation Framework: $unwind the nested array before you apply the filter, then use $group with $addToSet to get the unique values. Try:
db.col.aggregate([
{
$unwind: "$accounts"
},
{
$match: {
"accounts.accountType": "VSA"
}
},
{
$group: {
_id: null,
uniqueProductNames: { $addToSet: "$accounts.productName" }
}
}
])
which prints:
{ "_id" : null, "uniqueProductNames" : [ "VSA2", "VSA1" ] }
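The difference between the two approaches can be sketched in plain JavaScript, without MongoDB, on the document from the question (ObjectIds left out). Both function names are made up for this illustration, and the results are sorted only to make the output stable, since $addToSet does not guarantee an order.

```javascript
// In-memory copy of the sample document.
const accountDocs = [
  {
    accounts: [
      { accountType: "DDA", productName: "DDA1" },
      { accountType: "VSA", productName: "VSA1" },
      { accountType: "VSA", productName: "VSA2" }
    ]
  }
];

// Roughly what distinct() with a document-level filter does:
// the filter keeps or drops whole documents, then every element contributes.
function distinctFromMatchingDocs(documents) {
  const values = new Set();
  for (const doc of documents) {
    if (!doc.accounts.some((a) => a.accountType === "VSA")) continue;
    for (const a of doc.accounts) values.add(a.productName);
  }
  return [...values].sort();
}

// Roughly what $unwind -> $match -> $group with $addToSet does:
// the filter is applied to each array element on its own.
function distinctFromMatchingElements(documents) {
  const values = new Set();
  for (const doc of documents) {
    for (const a of doc.accounts) {                            // $unwind
      if (a.accountType === "VSA") values.add(a.productName);  // $match + $addToSet
    }
  }
  return [...values].sort();
}

const perDocument = distinctFromMatchingDocs(accountDocs);
const perElement = distinctFromMatchingElements(accountDocs);
console.log(perDocument, perElement);
```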

Find in nested arrays mongodb

Below the data structure of my services in MongoDB:
"serviceInfo" : {
"title" : "Lorem ipsum",
"options" : [
{
"startDate" : ISODate("2018-10-01T00:00:00.000Z"),
"endDate" : ISODate("2018-10-31T00:00:00.000Z"),
"availabilities" : [
{
"businessDay" : {
"id" : 1,
"name" : "Monday"
},
}
]
}
]
Now, I want to query all the services available on Monday during the period between startDate and endDate.
I tried this code, but I get an empty array as the result instead of my document.
db.collection('services').find({
'serviceInfo.options': {
$elemMatch: {
'startDate': { $lte: new Date(req.query.date) },
'endDate': { $gte: new Date(req.query.date) },
'availabilities': {
$elemMatch: {
'businessDay.id': req.query.day
}
}
}
}
}).toArray()
I guess my problem is in the nested availabilities array, but I can't find the correct way to write the query.
Thanks in advance for your help.
I found my problem.
'businessDay.id' expected an Int32 and parseInt(req.query.day) did the trick.
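For anyone hitting the same thing, here is a minimal sketch of why the conversion matters. Query-string values normally arrive as strings, and MongoDB equality does not match the string "1" against the number 1. The query object below is a made-up stand-in for req.query, and buildAvailabilityFilter is a hypothetical helper that only builds the filter object; it does not talk to the database.

```javascript
// Made-up stand-in for req.query: values from the URL are strings.
const query = { date: "2018-10-15", day: "1" };
const storedDayId = 1; // businessDay.id is stored as a number

const matchesAsString = query.day === storedDayId;                      // string vs number
const matchesAsNumber = Number.parseInt(query.day, 10) === storedDayId; // number vs number

// Hypothetical helper: converts the inputs first, then builds the same filter as above.
function buildAvailabilityFilter(q) {
  const date = new Date(q.date);
  const dayId = Number.parseInt(q.day, 10);
  if (Number.isNaN(dayId) || Number.isNaN(date.getTime())) {
    throw new TypeError("date and day must be a valid date and an integer");
  }
  return {
    "serviceInfo.options": {
      $elemMatch: {
        startDate: { $lte: date },
        endDate: { $gte: date },
        availabilities: { $elemMatch: { "businessDay.id": dayId } }
      }
    }
  };
}

const filter = buildAvailabilityFilter(query);
console.log(matchesAsString, matchesAsNumber, JSON.stringify(filter));
```

The resulting object could then be passed to find() in place of the inline filter from the question.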

Mongo DB - how to query for id dependent on oldest date in array of a field

Let's say I have a collection called phone_audit with documents of the following form: _id, which is the phone number, and value, containing items that always holds 2 entries (each with an id and a date).
Please see below:
{
"_id" : {
"phone_number" : "+012345678"
},
"value" : {
"items" : [
{
"_id" : "c14b4ac1db691680a3fb65320fba7261",
"updated_at" : ISODate("2016-03-14T12:35:06.533Z")
},
{
"_id" : "986b58e55f8606270f8a43cd7f32392b",
"updated_at" : ISODate("2016-07-23T11:17:53.552Z")
}
]
}
},
......
I need to get a list of _id values, one for every document in that collection, representing the older of the two items in each document.
So in the above, the result would be [c14b4ac1db691680a3fb65320fba7261,...]
Any pointers on the type of query to execute would be very helpful, even if the exact syntax is not correct.
With aggregate(), you can $unwind value.items, $sort by updated_at, then use $first to get the oldest:
[
{
"$unwind": "$value.items"
},
{
"$sort": { "value.items.updated_at": 1 }
},
{
"$group":{
_id: "$_id.phone_number",
oldest:{$first:"$value.items"}
}
},
{
"$project":{
value_id: "$oldest._id"
}
}
]
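The effect of the pipeline can be sketched in plain JavaScript on the sample document, without MongoDB. Sorting ascending and taking $first per phone number amounts to picking the item with the smallest updated_at in each document; oldestItemIds is a made-up name, and the sketch assumes items is never empty, as stated in the question.

```javascript
// In-memory copy of the sample document (plain Dates instead of ISODate).
const audits = [
  {
    _id: { phone_number: "+012345678" },
    value: {
      items: [
        { _id: "c14b4ac1db691680a3fb65320fba7261", updated_at: new Date("2016-03-14T12:35:06.533Z") },
        { _id: "986b58e55f8606270f8a43cd7f32392b", updated_at: new Date("2016-07-23T11:17:53.552Z") }
      ]
    }
  }
];

// Hypothetical helper: for each document, keep the item with the earliest updated_at.
// Assumes items has at least one entry (reduce without an initial value throws on []).
function oldestItemIds(documents) {
  return documents.map((doc) => {
    const oldest = doc.value.items.reduce((a, b) => (b.updated_at < a.updated_at ? b : a));
    return oldest._id;
  });
}

const oldestIds = oldestItemIds(audits);
console.log(oldestIds);
```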

MongoDB: How to count a field if its value matches a condition?

I have a document that includes a field like this:
{
...
log: [
{
utc_timestamp: ISODate("2014-11-15T10:26:47.337Z"),
type: "clicked"
},
{
utc_timestamp: ISODate("2014-10-15T16:12:51.959Z"),
type: "emailed"
},
{
utc_timestamp: ISODate("2014-10-15T16:10:51.959Z"),
type: "clicked"
},
{
utc_timestamp: ISODate("2014-09-15T04:59:19.431Z"),
type: "emailed"
},
{
utc_timestamp: ISODate("2014-09-15T04:58:19.431Z"),
type: "clicked"
},
],
...
}
How do I get the count of log entries of type "clicked" from this month, only if there is not a log entry of type "emailed" this month?
In other words, I want to find out which clicks have not been sent a related email.
So, in this example, the count would be 1 since the most recent "clicked" entry doesn't have an "emailed" entry.
Note: For this use case, clicks don't have unique IDs - this is all the data that is logged.
Use the following aggregation pipeline:
db.click_log.aggregate([
{ "$match" : { "log.type" : { "$ne" : "emailed" } } }, // get rid of docs that have an "emailed" value in log.type
{ "$unwind" : "$log" }, // unwind to get log elements as separate docs
{ "$project" : { "_id" : 1, "log" : 1, "month" : { "$month" : "$log.utc_timestamp" } } }, // add the month number of each log element
{ "$match" : { "log.type" : "clicked", "month" : <# of month> } }, // get rid of log elements not from this month and that aren't type clicked
{ "$group" : { "_id" : "$_id", "count" : { "$sum" : 1 } } } // collect clicked elements from same original doc and count number
])
This will return, for each document not having "emailed" as a value of log.type, the count of elements of the array log that have log.type value clicked and with timestamp from the current month. If you want a sliding 30-day period for month, change the $match to be a range query with $gt and $lt covering the desired time period.
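For the sliding-window variant, the range bounds have to be computed on the client. A minimal sketch, assuming a 30-day window ending at a given instant: slidingWindowMatch is a made-up helper that only builds the stage object, which would take the place of the second $match (the one after $unwind).

```javascript
// Hypothetical helper: builds a $match stage for "clicked" entries
// whose timestamp lies in the `days` days before `now`.
function slidingWindowMatch(now, days) {
  const msPerDay = 24 * 60 * 60 * 1000;
  const from = new Date(now.getTime() - days * msPerDay);
  return {
    $match: {
      "log.type": "clicked",
      "log.utc_timestamp": { $gt: from, $lt: now }
    }
  };
}

// Example with a fixed "now" so the result is reproducible.
const windowStage = slidingWindowMatch(new Date("2014-11-30T00:00:00.000Z"), 30);
console.log(JSON.stringify(windowStage));
```

With this stage the $project that adds the month field is no longer needed, because the filter works on the timestamp directly.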
You can use a query similar to the one below.
db.dbversitydotcom_col.aggregate([ { $unwind: "$log" },
{ $match: { "log.type" : "clicked", "log.utc_timestamp" : "your required date" } } ]).itcount()
Please refer to http://dbversity.com/mongodb-importance-of-aggregation-framework/ for a more detailed explanation.