MongoDB Aggregation in custom Spring Data #Query - annotations

Anyone know how to shoehorn MongoDB aggregate queries into spring-data custom #Query?
I would like to group my entities and retrieve all the latest version by date.
The aggregate query is
db.collection.aggregate(
[
{ $sort: { url: 1, checkedOn: 1 } },
{
$group: {
_id: {url: "$url", path: "$path" },
lastChecked: { $last: "$checkedOn" }
}
}
])
This runs fine and returns a collection like this:
{
"result" : [
{
"_id" : {
"url" : "http://www.google.co.uk",
"path" : "/about"
},
"lastChecked" : ISODate("2015-03-17T11:43:01.222Z")
},
{
"_id" : {
"url" : "http://www.google.co.uk",
"path" : "/about2"
},
"lastChecked" : ISODate("2015-03-17T11:43:04.213Z")
}
],
"ok" : 1
}
I'd like to do this as part of a spring-data custom query:
#RepositoryRestResource(collectionResourceRel = "healthcheck", path = "healthcheck")
public interface LinkRepository extends MongoRepository<Link, String> {
#Query("{ 'status' : { $gt: 299 }, 'path' : ?0 }" )
List<CheckedLink> findByPathAndInvalid(#Param("path") String path);
#Query(**aggregate query goes here**)
List<CheckedLink> findByLatest();
}
I can't put the aggregate query directly into #Query because the resulting collection is not directly mapped to the Link object.
Only reference I can find is on an old Spring-Sandbox project but this has not been updated since 2013:
https://github.com/hantsy/spring-sandbox/wiki/spring-data-mongo#custom-query

The #Query annotation can be only used to fill the query part of a mongo request. As other annotation classes are CountQuery, DeleteQuery and ExistsQuery, I assume that this is not possible using an annotation.

Related

Spring data MongoDb query based on last element of nested array field

I have the following data (Cars):
[
{
"make" : “Ferrari”,
"model" : “F40",
"services" : [
{
"type" : "FULL",
“date_time" : ISODate("2019-10-31T09:00:00.000Z"),
},
{
"type" : "FULL",
"scheduled_date_time" : ISODate("2019-11-04T09:00:00.000Z"),
}
],
},
{
"make" : "BMW",
"model" : “M3",
"services" : [
{
"type" : "FULL",
"scheduled_date_time" : ISODate("2019-10-31T09:00:00.000Z"),
},
{
"type" : "FULL",
“scheduled_date_time" : ISODate("2019-11-04T09:00:00.000Z"),
}
],
}
]
Using Spring data MongoDb I would like a query to retrieve all the Cars where the scheduled_date_time of the last item in the services array is in-between a certain date range.
A query which I used previously when using the first item in the services array is like:
mongoTemplate.find(Query.query(
where("services.0.scheduled_date_time").gte(fromDate)
.andOperator(
where("services.0.scheduled_date_time").lt(toDate))),
Car.class);
Note the 0 index since it's first one as opposed to the last one (for my current requirement).
I thought using an aggregate along with a projection and .arrayElementAt(-1) would do the trick but I haven't quite got it to work. My current effort is:
Aggregation agg = newAggregation(
project().and("services").arrayElementAt(-1).as("currentService"),
match(where("currentService.scheduled_date_time").gte(fromDate)
.andOperator(where("currentService.scheduled_date_time").lt(toDate)))
);
AggregationResults<Car> results = mongoTemplate.aggregate(agg, Car.class, Car.class);
return results.getMappedResults();
Any help suggestions appreciated.
Thanks,
This mongo aggregation retrieves all the Cars where the scheduled_date_time of the last item in the services array is in-between a specific date range.
[{
$addFields: {
last: {
$arrayElemAt: [
'$services',
-1
]
}
}
}, {
$match: {
'last.scheduled_date_time': {
$gte: ISODate('2019-10-26T04:06:27.307Z'),
$lt: ISODate('2019-12-15T04:06:27.319Z')
}
}
}]
I was trying to write it in spring-data-mongodb without luck.
They do not support $addFields yet, see here.
Since version 2.2.0 RELEASE spring-data-mongodb includes the Aggregation Repository Methods
The above query should be
interface CarRepository extends MongoRepository<Car, String> {
#Aggregation(pipeline = {
"{ $addFields : { last:{ $arrayElemAt: [$services,-1] }} }",
"{ $match: { 'last.scheduled_date_time' : { $gte : '$?0', $lt: '$?1' } } }"
})
List<Car> getCarsWithLastServiceDateBetween(LocalDateTime start, LocalDateTime end);
}
This method logs this query
[{ "$addFields" : { "last" : { "$arrayElemAt" : ["$services", -1]}}}, { "$match" : { "last.scheduled_date_time" : { "$gte" : "$2019-11-03T03:00:00Z", "$lt" : "$2019-11-05T03:00:00Z"}}}]
The date parameters are not parsing correctly. I didn't spend much time making it work.
If you want the Car Ids this could work.
public List<String> getCarsIdWithServicesDateBetween(LocalDateTime start, LocalDateTime end) {
return template.aggregate(newAggregation(
unwind("services"),
group("id").last("services.date").as("date"),
match(where("date").gte(start).lt(end))
), Car.class, Car.class)
.getMappedResults().stream()
.map(Car::getId)
.collect(Collectors.toList());
}
Query Log
[{ "$unwind" : "$services"}, { "$group" : { "_id" : "$_id", "date" : { "$last" : "$services.scheduled_date_time"}}}, { "$match" : { "date" : { "$gte" : { "$date" : 1572750000000}, "$lt" : { "$date" : 1572922800000}}}}]

How to remove an element from inner array of nested array pymongo using $ pull

Here is my news document structure
{
"_id" : ObjectId("5bff0903bd9a221229c7c9b2"),
"title" : "Test Page",
"desc" : "efdfr",
"mediaset_list" : [
{
"_id" : ObjectId("5bfeff94bd9a221229c7c9ae"),
"medias" : [
{
"_id" : ObjectId("5bfeff83bd9a221229c7c9ac"),
"file_type" : "png",
"file" : "https://aws.com/gg.jpg",
"file_name" : "edf.jpg"
},
{
"_id" : ObjectId("5bfeff83bd9a221229c7c9ad"),
"file_type" : "mov",
"file" : "https://aws.com/gg.mov",
"file_name" : "abcd.mov"
}
]
}
]}
The queries that i've tried are given below
Approach 1
db.news.find_and_modify({},{'$pull': {"mediaset_list": {"medias": {"$elemMatch" : {"_id": ObjectId('5bfeff83bd9a221229c7c9ac')}} }}})
Approach 2
db.news.update({},{'$pull': {"mediaset_list.$.medias": {"_id": ObjectId('5bfeff83bd9a221229c7c9ac')}} })
Issue we are facing
The above queries are removing entire elements inside 'mediaset_list' . But i only want to remove the element inside 'medias' matching object ID.
Since you have two nested arrays you have to use arrayFilters to indicate which element of outer array should be modified, try:
db.news.update({ _id: ObjectId("5bff0903bd9a221229c7c9b2") },
{ $pull: { "mediaset_list.$[item].medias": { _id: ObjectId("5bfeff83bd9a221229c7c9ad") } } },
{ arrayFilters: [ { "item._id": ObjectId("5bfeff94bd9a221229c7c9ae") } ] })
So item is used here as a placeholder which will be used by MongoDB to determine which element of mediaset_list needs to be modified and the condition for this placeholder is defined inside arrayFilters. Then you can use $pull and specify another condition for inner array to determine which element should be removed.
From #micki's mongo shell query (Answer above) , This is the pymongo syntax which will update all news document with that media id .
db.news.update_many({},
{
"$pull":
{ "mediaset_list.$[item].medias": { "_id": ObjectId("5bfeff83bd9a221229c7c9ad") } } ,
},
array_filters=[{ "item._id": ObjectId("5bfeff94bd9a221229c7c9ae")}],
upsert=True)

mongoDB distict problems

It's one of my data as JSON format:
{
"_id" : ObjectId("5bfdb412a80939b6ed682090"),
"accounts" : [
{
"_id" : ObjectId("5bf106eee639bd0df4bd8e05"),
"accountType" : "DDA",
"productName" : "DDA1"
},
{
"_id" : ObjectId("5bf106eee639bd0df4bd8df8"),
"accountType" : "VSA",
"productName" : "VSA1"
},
{
"_id" : ObjectId("5bf106eee639bd0df4bd8df9"),
"accountType" : "VSA",
"productName" : "VSA2"
}
]
}
I want to make a query to get all productName(no duplicate) of accountType = VSA.
I write a mongo query:
db.Collection.distinct("accounts.productName", {"accounts.accountType": "VSA" })
I expect: ['VSA1', 'VSA2']
I get: ['DDA','VSA1', 'VSA2']
Anybody knows why the query doesn't work in distinct?
Second parameter of distinct method represents:
A query that specifies the documents from which to retrieve the distinct values.
But the thing is that you showed only one document with nested array of elements so whole document will be returned for your condition "accounts.accountType": "VSA".
To fix that you have to use Aggregation Framework and $unwind nested array before you apply the filtering and then you can use $group with $addToSet to get unique values. Try:
db.col.aggregate([
{
$unwind: "$accounts"
},
{
$match: {
"accounts.accountType": "VSA"
}
},
{
$group: {
_id: null,
uniqueProductNames: { $addToSet: "$accounts.productName" }
}
}
])
which prints:
{ "_id" : null, "uniqueProductNames" : [ "VSA2", "VSA1" ] }

How to watch for changes to specific fields in MongoDB change stream with Spring Data Mongodb

According to the documentation of the mongodb changestream, I can watch for updates on specific fields on the collecttion:
const changeStream = collection.watch(
[{
$match: {
$and: [
{ "updateDescription.updatedFields.a": { $exists: true } },
{ operationType: "update" }
]
}
}],
{
fullDocument: "updateLookup"
}
);
I would like to know, how to achieve the same thing with spring data mongodb reactive.
When I build a change stream with
Flux<ChangeStreamEvent<TestObject>> changeStream =
mongoTemplate.changeStream(
newAggregation(
match(
where("operationType").is("update")
.and("updateDescription").exists(true)
)
),
TestObject.class,
ChangeStreamOptions.empty(),
"testObject"
);
I get all updates. A log of one of these change events looks like:
ChangeStreamEvent {
raw=ChangeStreamDocument{
resumeToken={ "_data" : { "$binary" : "glsU9sYAAAATRmRfaWQAZFsU9sayZFockO0phgBaEATGaP42X4VI+4SrTyscwRtsBA==", "$type" : "00" } },
namespace=test.testObject,
fullDocument=Document{{_id=5b14f6c6b2645a1c90ed2986, a=a, b=b, _class=com.example.demo.TestObject}},
documentKey={ "_id" : { "$oid" : "5b14f6c6b2645a1c90ed2986" } },
clusterTime=null,
operationType=OperationType{value='update'},
updateDescription=UpdateDescription{removedFields=[], updatedFields={ "b" : "b" }}},
targetType=class com.example.demo.TestObject
}
However, when I change the filter to
newAggregation(
match(
where("operationType").is("update")
.and("updateDescription.updatedFields").exists(true)
)
)
or
newAggregation(
match(
where("operationType").is("update")
.and("updateDescription.updatedFields.b").exists(true)
)
)
I get not change events. So here is the question: how can I define the Criteria to match updates on updatedField.b?
A sample Document from the collection is:
{
"_id" : ObjectId("5b14f049b2645a4a24b33df1"),
"a" : "va",
"b" : "vb",
"_class" : "com.example.demo.TestObject"
}
The example with the first "working" filter can be downloaded at:
my test github repo

Retrieving value of an emedded object in mongo

Followup Question
Thanks #4J41 for your spot on resolution. Along the same lines, I'd also like to validate one other thing.
I have a mongo document that contains an array of Strings, and I need to convert this particular array of strings into an array of object containing a key-value pair. Below is my curent appraoch to it.
Mongo Record:
Same mongo record in my initial question below.
Current Query:
templateAttributes.find({platform:"V1"}).map(function(c){
//instantiate a new array
var optionsArray = [];
for (var i=0;i< c['available']['Community']['attributes']['type']['values'].length; i++){
optionsArray[i] = {}; // creates a new object
optionsArray[i].label = c['available']['Community']['attributes']['type']['values'][i];
optionsArray[i].value = c['available']['Community']['attributes']['type']['values'][i];
}
return optionsArray;
})[0];
Result:
[{label:"well-known", value:"well-known"},
{label:"simple", value:"simple"},
{label:"complex", value:"complex"}]
Is my approach efficient enough, or is there a way to optimize the above query to get the same desired result?
Initial Question
I have a mongo document like below:
{
"_id" : ObjectId("57e3720836e36f63695a2ef2"),
"platform" : "A1",
"available" : {
"Community" : {
"attributes" : {
"type" : {
"values" : [
"well-known",
"simple",
"complex"
],
"defaultValue" : "well-known"
},
[......]
}
I'm trying to query the DB and retrieve only the value of defaultValue field.
I tried:
db.templateAttributes.find(
{ platform: "A1" },
{ "available.Community.attributes.type.defaultValue": 1 }
)
as well as
db.templateAttributes.findOne(
{ platform: "A1" },
{ "available.Community.attributes.type.defaultValue": 1 }
)
But they both seem to retrieve the entire object hirarchy like below:
{
"_id" : ObjectId("57e3720836e36f63695a2ef2"),
"available" : {
"Community" : {
"attributes" : {
"type" : {
"defaultValue" : "well-known"
}
}
}
}
}
The only way I could get it to work was with find and map function, but it seems to be convoluted a bit.
Does anyone have a simpler way to get this result?
db.templateAttributes.find(
{ platform: "A1" },
{ "available.Community.attributes.type.defaultValue": 1 }
).map(function(c){
return c['available']['Community']['attributes']['type']['defaultValue']
})[0]
Output
well-known
You could try the following.
Using find:
db.templateAttributes.find({ platform: "A1" }, { "available.Community.attributes.type.defaultValue": 1 }).toArray()[0]['available']['Community']['attributes']['type']['defaultValue']
Using findOne:
db.templateAttributes.findOne({ platform: "A1" }, { "available.Community.attributes.type.defaultValue": 1 })['available']['Community']['attributes']['type']['defaultValue']
Using aggregation:
db.templateAttributes.aggregate([
{"$match":{platform:"A1"}},
{"$project": {_id:0, default:"$available.Community.attributes.type.defaultValue"}}
]).toArray()[0].default
Output:
well-known
Edit: Answering the updated question: Please use aggregation here.
db.templateAttributes.aggregate([
{"$match":{platform:"A1"}}, {"$unwind": "$available.Community.attributes.type.values"},
{$group: {"_id": null, "val":{"$push":{label:"$available.Community.attributes.type.values",
value:"$available.Community.attributes.type.values"}}}}
]).toArray()[0].val
Output:
[
{
"label" : "well-known",
"value" : "well-known"
},
{
"label" : "simple",
"value" : "simple"
},
{
"label" : "complex",
"value" : "complex"
}
]