MongoDB custom sorting - mongodb

i have a collection of records as follows
{
"_id":417,
"ptime":ISODate("2013-11-26T11:18:42.961Z"),
"p":{
"type":"1",
"txt":"test message"
},
"users":[
{
"uid":"52872ed59542f",
"pt":ISODate("2013-11-26T11:18:42.961Z")
},
{
"uid":"524eb460986e4",
"pt":ISODate("2013-11-26T11:18:42.961Z")
},
{
"uid":"524179060781e",
"pt":ISODate("2013-11-27T12:48:35Z")
}
],
},
{
"_id":418,
"ptime":ISODate("2013-11-25T11:18:42.961Z"),
"p":{
"type":"1",
"txt":"test message 2"
},
"users":[
{
"uid":"524eb460986e4",
"pt":ISODate("2013-11-23T11:18:42.961Z")
},
{
"uid":"52872ed59542f",
"pt":ISODate("2013-11-24T11:18:42.961Z")
},
{
"uid":"524179060781e",
"pt":ISODate("2013-11-22T12:48:35Z")
}
],
}
How to sort the above records with descending order of ptime and pt where users uid ="52872ed59542f" ?

If you want to do such a sort, you probably want to store your data in a different way. MongoDB in generally is not near as good with manipulating nested documents as top level fields. In your case, I would recommend splitting out ptime, pt and uid into their own collection:
messages
{
"_id":417,
"ptime":ISODate("2013-11-26T11:18:42.961Z"),
"type":"1",
"txt":"test message"
},
users
{
"id":417,
"ptime":ISODate("2013-11-26T11:18:42.961Z"),
"uid":"52872ed59542f",
"pt":ISODate("2013-11-26T11:18:42.961Z")
},
{
"id":417,
"ptime":ISODate("2013-11-26T11:18:42.961Z"),
"uid":"524eb460986e4",
"pt":ISODate("2013-11-26T11:18:42.961Z")
},
{
"id":417,
"ptime":ISODate("2013-11-26T11:18:42.961Z"),
"uid":"524179060781e",
"pt":ISODate("2013-11-27T12:48:35Z")
}
You can then set an index on the users collection for uid, ptime and pt.
You will need to do two queries to also get the text messages themselves though.

You can use the Aggregation Framework to sort by first ptime and then users.pt field as follows.
db.users.aggregate(
{$sort : {'ptime' : 1}},
{$unwind : "$users"},
{$match: {"users.uid" : "52872ed59542f"}},
{$sort : {'users.pt' : 1}},
{$group : {_id : {id : "$_id", "ptime" : "$ptime", "p" : "$p"}, users : {$push : "$users"}}},
{$group : {_id : "$_id.id", "ptime" : {$first : "$_id.ptime"}, "p" : {$first : "$_id.p"}, users : {$push : "$users"}}}
);

db.yourcollection.find(
{
users:{
$elemMatch:{uid:"52872ed59542f"}
}
}).sort({ptime:-1})
But you will have problems with order by pt field. You should use Aggregation Framework to project data or use Derick's approach.

Related

Spring data MongoDb query based on last element of nested array field

I have the following data (Cars):
[
{
"make" : “Ferrari”,
"model" : “F40",
"services" : [
{
"type" : "FULL",
“date_time" : ISODate("2019-10-31T09:00:00.000Z"),
},
{
"type" : "FULL",
"scheduled_date_time" : ISODate("2019-11-04T09:00:00.000Z"),
}
],
},
{
"make" : "BMW",
"model" : “M3",
"services" : [
{
"type" : "FULL",
"scheduled_date_time" : ISODate("2019-10-31T09:00:00.000Z"),
},
{
"type" : "FULL",
“scheduled_date_time" : ISODate("2019-11-04T09:00:00.000Z"),
}
],
}
]
Using Spring data MongoDb I would like a query to retrieve all the Cars where the scheduled_date_time of the last item in the services array is in-between a certain date range.
A query which I used previously when using the first item in the services array is like:
mongoTemplate.find(Query.query(
where("services.0.scheduled_date_time").gte(fromDate)
.andOperator(
where("services.0.scheduled_date_time").lt(toDate))),
Car.class);
Note the 0 index since it's first one as opposed to the last one (for my current requirement).
I thought using an aggregate along with a projection and .arrayElementAt(-1) would do the trick but I haven't quite got it to work. My current effort is:
Aggregation agg = newAggregation(
project().and("services").arrayElementAt(-1).as("currentService"),
match(where("currentService.scheduled_date_time").gte(fromDate)
.andOperator(where("currentService.scheduled_date_time").lt(toDate)))
);
AggregationResults<Car> results = mongoTemplate.aggregate(agg, Car.class, Car.class);
return results.getMappedResults();
Any help suggestions appreciated.
Thanks,
This mongo aggregation retrieves all the Cars where the scheduled_date_time of the last item in the services array is in-between a specific date range.
[{
$addFields: {
last: {
$arrayElemAt: [
'$services',
-1
]
}
}
}, {
$match: {
'last.scheduled_date_time': {
$gte: ISODate('2019-10-26T04:06:27.307Z'),
$lt: ISODate('2019-12-15T04:06:27.319Z')
}
}
}]
I was trying to write it in spring-data-mongodb without luck.
They do not support $addFields yet, see here.
Since version 2.2.0 RELEASE spring-data-mongodb includes the Aggregation Repository Methods
The above query should be
interface CarRepository extends MongoRepository<Car, String> {
#Aggregation(pipeline = {
"{ $addFields : { last:{ $arrayElemAt: [$services,-1] }} }",
"{ $match: { 'last.scheduled_date_time' : { $gte : '$?0', $lt: '$?1' } } }"
})
List<Car> getCarsWithLastServiceDateBetween(LocalDateTime start, LocalDateTime end);
}
This method logs this query
[{ "$addFields" : { "last" : { "$arrayElemAt" : ["$services", -1]}}}, { "$match" : { "last.scheduled_date_time" : { "$gte" : "$2019-11-03T03:00:00Z", "$lt" : "$2019-11-05T03:00:00Z"}}}]
The date parameters are not parsing correctly. I didn't spend much time making it work.
If you want the Car Ids this could work.
public List<String> getCarsIdWithServicesDateBetween(LocalDateTime start, LocalDateTime end) {
return template.aggregate(newAggregation(
unwind("services"),
group("id").last("services.date").as("date"),
match(where("date").gte(start).lt(end))
), Car.class, Car.class)
.getMappedResults().stream()
.map(Car::getId)
.collect(Collectors.toList());
}
Query Log
[{ "$unwind" : "$services"}, { "$group" : { "_id" : "$_id", "date" : { "$last" : "$services.scheduled_date_time"}}}, { "$match" : { "date" : { "$gte" : { "$date" : 1572750000000}, "$lt" : { "$date" : 1572922800000}}}}]

mongoDB distict problems

It's one of my data as JSON format:
{
"_id" : ObjectId("5bfdb412a80939b6ed682090"),
"accounts" : [
{
"_id" : ObjectId("5bf106eee639bd0df4bd8e05"),
"accountType" : "DDA",
"productName" : "DDA1"
},
{
"_id" : ObjectId("5bf106eee639bd0df4bd8df8"),
"accountType" : "VSA",
"productName" : "VSA1"
},
{
"_id" : ObjectId("5bf106eee639bd0df4bd8df9"),
"accountType" : "VSA",
"productName" : "VSA2"
}
]
}
I want to make a query to get all productName(no duplicate) of accountType = VSA.
I write a mongo query:
db.Collection.distinct("accounts.productName", {"accounts.accountType": "VSA" })
I expect: ['VSA1', 'VSA2']
I get: ['DDA','VSA1', 'VSA2']
Anybody knows why the query doesn't work in distinct?
Second parameter of distinct method represents:
A query that specifies the documents from which to retrieve the distinct values.
But the thing is that you showed only one document with nested array of elements so whole document will be returned for your condition "accounts.accountType": "VSA".
To fix that you have to use Aggregation Framework and $unwind nested array before you apply the filtering and then you can use $group with $addToSet to get unique values. Try:
db.col.aggregate([
{
$unwind: "$accounts"
},
{
$match: {
"accounts.accountType": "VSA"
}
},
{
$group: {
_id: null,
uniqueProductNames: { $addToSet: "$accounts.productName" }
}
}
])
which prints:
{ "_id" : null, "uniqueProductNames" : [ "VSA2", "VSA1" ] }

I need to count how many children orgs are assigned to a parent org in MongoDB

I'm new to the MongoDB world. I'm trying to figure out how to count the number of children organizations assigned to a parent organization. I have documents that have this general structure:
{
"_id" : "001",
"parentOrganization" : {
"organizationId" : "pOrg1"
},
"childOrganization" : {
"organizationId" : "cOrg1"
}
},
{
"_id" : "002",
"parentOrganization" : {
"organizationId" : "pOrg1"
},
"childOrganization" : {
"organizationId" : "cOrg2"
}
},
{
"_id" : "003",
"parentOrganization" : {
"organizationId" : "pOrg2"
},
"childOrganization" : {
"organizationId" : "cOrg3"
}
}
Each document has a parentOrganization with an associated childOrganization. There may be multiple documents with the same parentOrganization, but different childOrganizations. There may also be multiple documents with the same parent/child relationship. Additionally, there may even be a case where a child org may associate with multiple parent orgs.
I'm trying to group by parentOrganization and then count the number of unique childOrganization's associated with each parentOrganization, as well as display the unique id's.
I have tried using an aggregation framework with $match and $group, but I'm still not getting into the child organization parts to count them. Here is what I'm currently attempting:
var s1 = {$match: {"parentOrganization.organizationId": {$exists: true}}};
var s2 = {$group: {_id: "$parentOrganization.organizationId", count: {$sum: "$childOrganization.organizationId"}}};
db.collection.aggregate(s1, s2);
My results are returning the parentOrganization, but my $sum is not returning the number of associated childOrganizations:
/* 1 */
{
"_id" : "pOrg1",
"count" : 0
}
/* 2 */
{
"_id" : "pOrg2",
"count" : 0
}
I get the feeling it is a bit more complicated than my limited knowledge has access to at this time. What details am I missing in this query?
Your $sum is referencing the childOrganization.organizationId value, which is a string. When $sum references a string, it will return the value 0.
I was a unsure of exactly what you were asking for, but I believe that these aggregations can help you on your way.
This will return a count of documents groups by the parentOrganization.organizationId
db.collection.aggregate({$group: {"_id":"$parentOrganization.organizationId", "count": {"$sum": 1}}})
Output:
{ "_id" : "pOrg2", "count" : 1 }
{ "_id" : "pOrg1", "count" : 2 }
This will return a count of unique parent/child organizations:
db.collection.aggregate(
{$group: {"_id": {"parentOrganization": "$parentOrganization.organizationId", "childOrganization": "$childOrganization.organizationId"}, "count":{$sum:1}}})
Output:
{ "_id" : { "parentOrganization" : "pOrg2", "childOrganization" : "cOrg3" }, "count" : 1 }
{ "_id" : { "parentOrganization" : "pOrg1", "childOrganization" : "cOrg2" }, "count" : 1 }
{ "_id" : { "parentOrganization" : "pOrg1", "childOrganization" : "cOrg1" }, "count" : 1 }
This will return a count of unique child organizations and get the set of unique child organizations as well using $addToSet. One caveat of using $addToSet is that the MongoDB 16MB limit on document size still holds. This means that if your collection is large enough such that the size of the set will make one document greater than 16MB, the command will fail. The first $group will create a set of child organizations grouped by parent organization. The $project is used simply to add the total size of the set to the result.
db.collection.aggregate([
{$group: {"_id" : "$parentOrganization.organizationId", "childOrgs" : { "$addToSet" : "$childOrganization.organizationId"}}},
{$project: {"_id" : "$_id", "uniqueChildOrgsCount": {"$size" : "$childOrgs"}, "uniqueChildOrgs": "$childOrgs"}}])
Output:
{ "_id" : "pOrg2", "uniqueChildOrgsCount" : 1, "uniqueChildOrgs" : [ "cOrg3" ]}
{ "_id" : "pOrg1", "uniqueChildOrgsCount" : 2, "uniqueChildOrgs" : [ "cOrg2", "cOrg1" ]}
During these aggregations, I left out the $match statement you included for simplicity, but you could add that back as well.

MongoDB Group by field and show array of grouped items?

I have a collection of Projects in where projects are like this:
{
"_id" : ObjectId("57e3e55c638cb8b971"),
"allocInvestor" : "Example Investor",
"fieldFoo" : "foo bar",
"name" : "GTP 3 (Roof Lease)"
}
I want to receive a list of projects grouped by allocInvestor field and only show fields: name and id
If I use aggregate and $group like this:
db.getCollection('projects').aggregate([
{"$group" : {
_id:"$allocInvestor", count:{$sum:1}
}
}
])
I receive a count of project per allocInvestor but I need is to receive the list of allocInvestor with subarray of projects per allocInvestor.
I'm using meteor by the way in case that helps. But I first want to get the query right on mongodb then try for meteor.
You can use $push or $addToSet to create a list of name and _id per every group.
$push allows duplicates and $addToSet does not add an element to the list again, if it is already there.
db.getCollection('projects').aggregate([
{ "$group" : { _id : "$allocInvestor",
count : {$sum : 1},
"idList" : {"$addToSet" : "$_id"},
"nameList" : {"$addToSet":"$name"}
}
}
]);
To get the name and _id data in a single list:
db.getCollection('projects').aggregate([
{ "$group" : { _id : "$allocInvestor", "projects" : {"$addToSet" : {id : "$_id", name: "$name"}}}},
{"$project" : {"_id" : 0, allocInvestor : "$_id", "projects" : 1 }}
]);
Use the $$ROOT operator to reference the entire document and then use project to eliminate the fields that you do not require.
db.projects.aggregate([
{"$group" : {
"_id":"$allocInvestor",
"projects" : {"$addToSet" : "$$ROOT"}
}
},
{"$project" : {
"_id":0,
"allocInvestor":"$_id",
"projects._id":1
"projects.name":1
}
}
])

Mongo DB - how to query for id dependent on oldest date in array of a field

Lets say I have a collection called phone_audit with document entries of the following form - _id which is the phone number, and value containing items that always contains 2 entries (id, and a date).
Please see below:
{
"_id" : {
"phone_number" : "+012345678"
},
"value" : {
"items" : [
{
"_id" : "c14b4ac1db691680a3fb65320fba7261",
"updated_at" : ISODate("2016-03-14T12:35:06.533Z")
},
{
"_id" : "986b58e55f8606270f8a43cd7f32392b",
"updated_at" : ISODate("2016-07-23T11:17:53.552Z")
}
]
}
},
......
I need to get a list of _id values for every entry in that collection representing the older of the two items in each document.
So in the above - result would be [c14b4ac1db691680a3fb65320fba7261,...]
Any pointers at the type of query to execute would be v.helpful even if the exact syntax is not correct.
With aggregate(), you can $unwind value.items, $sort by update_at, then use $first to get the oldest:
[
{
"$unwind": "$value.items"
},
{
"$sort": { "value.items.updated_at": 1 }
},
{
"$group":{
_id: "$_id.phone_number",
oldest:{$first:"$value.items"}
}
},
{
"$project":{
value_id: "$oldest._id"
}
}
]