How to get Spring Data Mongo Aggregations to work like they do with mongo - mongodb

I am attempting a simple projection using Spring Data Mongo's Aggregation API.
The pipeline step I want to do is:
{
$project : {
"account._id" : 1,
"account.position" : 1
}
}
This is what I have tried (along with a ton of other tweaks because nothing seems to work):
ProjectionOperation project1 = Aggregation.project("account._id", "account.position");
This is how the documentation says to do it here: https://docs.spring.io/spring-data/mongodb/docs/current/reference/html/#mongo.aggregation.projection
However, the actual document that is rendered by that projection ends up looking like:
{
$project : {
_id : "$account._id",
position : "$account.position"
}
}
This behaves completely differently from the projection that I want to use.
Does anyone know how to get a projection like I want out of Spring Data Mongo Aggregation API, or is this a bug I need to report?
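To make the difference between the two forms concrete, here is a minimal plain-JavaScript sketch (no MongoDB needed) that mimics what each $project produces for a single group document after the $lookup. The sample document and the helper names projectInclusion and projectRenamed are made up for illustration; they are not part of any MongoDB or Spring API.

```javascript
// Hypothetical group document as it might look after the $lookup stage.
const afterLookup = {
  _id: "g1",
  name: "Group 1",
  account: [
    { _id: "a1", position: "ABC", other: "x" },
    { _id: "a2", position: "ZZZ", other: "y" },
  ],
};

// Mimics { $project: { "account._id": 1, "account.position": 1 } }:
// the array is kept and each element is trimmed to the included fields.
function projectInclusion(doc) {
  return {
    _id: doc._id, // the top-level _id is included by default
    account: doc.account.map(({ _id, position }) => ({ _id, position })),
  };
}

// Mimics { $project: { _id: "$account._id", position: "$account.position" } }:
// each path is evaluated across the array and yields a top-level array of values.
function projectRenamed(doc) {
  return {
    _id: doc.account.map((a) => a._id),
    position: doc.account.map((a) => a.position),
  };
}

console.log(JSON.stringify(projectInclusion(afterLookup)));
console.log(JSON.stringify(projectRenamed(afterLookup)));
```

The first shape still contains an account array, so a later $unwind on "$account" has something to work on; the second shape has no account field at all.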
Update 8/29/2019 - Adding more data to build out the context:
Two collections are involved: "groups" and "accounts"
A group looks something like this:
{
_id : ObjectId("..."),
name: ...,
ownerId: ObjectId("..."),
other stuff...
}
An account looks something like this:
{
_id : ObjectId("..."),
position : "ABC",
memberships : [{
groupId: ObjectId("..."),
otherstuff: ...,
}],
other stuff...
}
My whole aggregation looks like this and works as desired in mongodb shell: (trying to get a list of all account ids of a particular type that are members of any groups owned by a particular user)
groups.aggregate(
{
$match : {
ownerId : ObjectId("XYZ"),
}
},
{
$lookup: {
from: "accounts",
localField: "_id",
foreignField: "memberships.groupId",
as: "account"
}
},
{
$project: {
"account._id" : 1,
"account.position" : 1
}
},
{
$unwind: "$account"
},
{
$match: {
"account.position" : "ZZZ"
}
},
{
$project: {
_id : 0,
accountId : "$account._id"
}
})
Java version of the Aggregation:
MatchOperation match1 = Aggregation.match(
where("ownerId").is(accountId));
LookupOperation lookupOperation = LookupOperation.newLookup()
.from("accounts")
.localField("_id")
.foreignField("memberships.groupId")
.as("account");
// This doesn't work correctly on nested fields:
ProjectionOperation project1 = Aggregation.project(
"account._id",
"account.position");
Aggregation aggregation = Aggregation.newAggregation(
match1,
lookupOperation,
project1,
unwind("account"),
match(where("account.position").is("ZZZ")),
project().and("account._id").as("accountId"));

If you want your aggregation to look like it does in the mongo shell, you could try something like this:
Aggregation aggregation = Aggregation.newAggregation(
match1,
lookupOperation,
// This is your project operation
new AggregationOperation() {
@Override
public Document toDocument(AggregationOperationContext aggregationOperationContext) {
// Same shape as the shell stage: { $project: { "account._id": 1, "account.position": 1 } }
Document project = new Document("$project",
new Document(
"account._id", 1
).append("account.position", 1)
);
return aggregationOperationContext.getMappedObject(project);
}
},
unwind("account"),
match(where("account.position").is("ZZZ")),
project().and("account._id").as("accountId")
);
You can check my answer here in a more generic way

Related

How do I update a field in a sub-document array with a field from the document in MongoDB?

I have a large amount of data (~160M items) where a date value wasn't populated on the sub-document array fields, but was populated on the parent document. I'm very new to MongoDB and having trouble figuring out how to $set the field to match. Here's a sample of the data:
{
"_id": "5f11d4c48663f32e940696ed",
"Widgets":[{
"WidgetId":663,
"Name":"Super Widget 2.0",
"Created":null,
"LastUpdated":null
}],
"Status":3,
"LastUpdated":null,
"Created": "2018-11-09T18:22:16.000Z"
}
My knowledge of MongoDB is pretty limited but here's the basic aggregation I have created for part of the pipeline and where I'm struggling:
db.sample.aggregate(
[
{
"$match" : {
"Donors.$.Created" : {
"$exists" : true
}
}
},
{
"$match" : {
"Widgets.$.Created" : null
}
},
{
"$set" : {
"Widgets.$.Created" : "Created" // <- This is where I can't figure out how to define the reference to the parent "Created" field
}
}
]
);
The desired output would be:
{
"_id": "5f11d4c48663f32e940696ed",
"Widgets":[{
"WidgetId":663,
"Name":"Super Widget 2.0",
"Created":"2018-11-09T18:22:16.000Z",
"LastUpdated":null
}],
"Status":3,
"LastUpdated":null,
"Created": "2018-11-09T18:22:16.000Z"
}
Thanks for any assistance.
Are you attempting to add the Created field to sub documents on query/aggregation? Or are you attempting to update/save the Created field on the subdocuments?
The positional $ operator is an update operator, meant to be used with updateMany or updateOne, not with aggregate.
https://docs.mongodb.com/manual/reference/operator/query-array/
https://docs.mongodb.com/manual/reference/operator/update-array/
If you just want to add the parent's Created field to all subdocuments on query/aggregation, this is all you have to do: https://mongoplayground.net/p/yHDHULCSTIz
db.collection.aggregate([
{
"$addFields": {
"Widgets.Created": "$Created"
}
}
])
If you're attempting to save the parent's Created field to all subdocuments:
db.sample.updateMany({"Widgets.Created" : null}, [{$set: {"Widgets.Created" : "$Created"}}])
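Note: This matches any doc that has a subdocument with a null Created field and updates all the subdocuments in that doc, including ones that already had a value. Passing an aggregation pipeline to updateMany requires MongoDB 4.2 or later.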
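If overwriting subdocuments that already have a Created value is a concern, one possible variation is to rebuild the array with $map and only fall back to the parent's value when the widget's own Created is null. This is a sketch under the assumption that the collection and field names match the sample above; the filter and pipeline variable names are just for illustration, and it has not been run against the 160M-document collection from the question.

```javascript
// Filter: documents with at least one widget whose Created is null.
const filter = { "Widgets.Created": null };

// Update pipeline (MongoDB 4.2+): rebuild the Widgets array, keeping each
// widget's own Created when it is set and using the parent's Created otherwise.
const pipeline = [
  {
    $set: {
      Widgets: {
        $map: {
          input: "$Widgets",
          as: "w",
          in: {
            $mergeObjects: [
              "$$w",
              { Created: { $ifNull: ["$$w.Created", "$Created"] } },
            ],
          },
        },
      },
    },
  },
];

// In the mongo shell this would be passed as:
//   db.sample.updateMany(filter, pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

On a collection this large it is probably worth trying the update on a small copy first.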

How to apply group by on nested document in MongoDB using MongoTemplate?

db.students.aggregate([
{ $unwind: "$details" },
{
$group: {
_id: {
sid: "$details.student._id",
statuscode: "$details.studentStatus.statusCode"
},
total: { $sum: 1 }
}
}
]);
The query works fine, and I need to convert it to MongoTemplate.
Sample document:
{
"_id" : 59,
"details" : [
{
"student" : {
"_id" : "5d3145a8523a2e602e5e0200"
},
"studentStatus" : {
"statusCode" : 1
}
}
]
}
The Spring Data MongoTemplate code for the given aggregation is as follows.
Note that I have added a project stage before the group. This projection is required: if the nested fields ("details.student._id" and "details.studentStatus.statusCode") are used directly within the group stage, the aggregation fails with the error "FieldPath field names may not contain '.'." because the fields cannot be resolved (this only happens when you use more than one field in the grouping).
The result is same as that of the aggregation you have provided. I have used the latest of Spring and MongoDB drivers with Java 8.
MongoOperations mongoOps = new MongoTemplate(MongoClients.create(), "spr_test");
Aggregation agg = newAggregation(
unwind("details"),
project("_id")
.and("details.student._id").as("sid")
.and("details.studentStatus.statusCode").as("statuscode"),
group("sid", "statuscode")
.count().as("total")
);
AggregationResults<Document> aggResults = mongoOps.aggregate(agg, "students", Document.class);
aggResults.forEach(System.out::println);

Group by array of document in Spring Mongo Db

How can I group by tagValue in Spring and MongoDb?
MongoDB Query :
db.feed.aggregate([
{ $group: { _id: "$feedTag.tagValue", number: { $sum : 1 } } },
{ $sort: { _id : 1 } }
])
How can I do the same thing in Spring MongoDB, may be using Aggregation method?
Sample document of feed collections:
{
"_id" : ObjectId("556846dd1df42d5d579362fd"),
"feedTag" : [
{
"tagName" : "sentiment",
"tagValue" : "neutral",
"modelName" : "sentiment"
}
],
"createdDate" : "2015-05-28"
}
To group by tagValue, since this is an array field, you need to apply the $unwind pipeline step before the group to split the array so that you can get the actual count:
db.feed.aggregate([
{
"$unwind": "$feedTag"
},
{
"$group": {
"_id": "$feedTag.tagValue",
"number": { "$sum" : 1 }
}
},
{ "$sort": { "_id" : 1 } }
])
The following is the equivalent example in Spring Data MongoDB:
import static org.springframework.data.domain.Sort.Direction.ASC;
import static org.springframework.data.mongodb.core.aggregation.Aggregation.*;
Aggregation agg = newAggregation(
unwind("feedTag"),
group("feedTag.tagValue").count().as("number"),
sort(ASC, "_id")
);
// Convert the aggregation result into a List
AggregationResults<Feed> results = mongoTemplate.aggregate(agg, "feed", Feed.class);
List<Feed> feedCount = results.getMappedResults();
In the code above, a new Aggregation object is created via the newAggregation static factory method, which is passed the list of aggregation operations that define the pipeline.
The first step uses the unwind operation to generate a new document for each tag within the "feedTag" array.
In the second step, the group operation defines a group for each embedded "feedTag.tagValue" value, for which the occurrence count is aggregated via the count aggregation operator.
The third step sorts the resulting groups by their tagValue in ascending order via the sort operation.
Finally, call the aggregate method on the MongoTemplate to let MongoDB perform the actual aggregation operation with the created Aggregation as an argument.
Note that the input collection is explicitly specified as "feed" in the call to the aggregate method. If the name of the input collection is not specified explicitly, it is derived from the input class passed as the first parameter to the newAggregation method.
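To see why the unwind step matters, the three stages can be mimicked in plain JavaScript on a couple of made-up feed documents. This is only an illustration of the pipeline semantics (the sample data and variable names are assumptions), not code that talks to MongoDB or Spring.

```javascript
// Hypothetical feed documents; the second one carries two tags.
const feed = [
  { _id: 1, feedTag: [{ tagName: "sentiment", tagValue: "neutral" }] },
  {
    _id: 2,
    feedTag: [
      { tagName: "sentiment", tagValue: "positive" },
      { tagName: "topic", tagValue: "neutral" },
    ],
  },
];

// $unwind: emit one document per element of the feedTag array.
const unwound = feed.flatMap((doc) =>
  doc.feedTag.map((tag) => ({ ...doc, feedTag: tag }))
);

// $group by feedTag.tagValue with { $sum: 1 }.
const counts = new Map();
for (const doc of unwound) {
  const key = doc.feedTag.tagValue;
  counts.set(key, (counts.get(key) || 0) + 1);
}

// $sort by _id ascending.
const result = [...counts]
  .map(([_id, number]) => ({ _id, number }))
  .sort((a, b) => (a._id < b._id ? -1 : a._id > b._id ? 1 : 0));

console.log(result);
```

Without the unwind, the group key for the second document would be the whole array of tag values rather than each individual value, so the counts would be per distinct array instead of per tag.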

MongoDB Aggregation with DBRef

Is it possible to aggregate on data that is stored via DBRef?
Mongo 2.6
Let's say I have transaction data like:
{
_id : ObjectId(...),
user : DBRef("user", ObjectId(...)),
product : DBRef("product", ObjectId(...)),
source : DBRef("website", ObjectId(...)),
quantity : 3,
price : 40.95,
total_price : 122.85,
sold_at : ISODate("2015-07-08T09:09:40.262-0700")
}
The trick is "source" is polymorphic in nature - it could be different $ref values such as "webpage", "call_center", etc that also have different ObjectIds. For example DBRef("webpage", ObjectId("1")) and DBRef("webpage",ObjectId("2")) would be two different webpages where a transaction originated.
I would like to ultimately aggregate by source over a period of time (like a month):
db.coll.aggregate( { $match : { sold_at : { $gte : start, $lt : end } } },
{ $project : { source : 1, total_price : 1 } },
{ $group : {
_id : { "source.$ref" : "$source.$ref" },
count : { $sum : "$total_price" }
} } );
The trick is you get a path error trying to use a variable starting with $ either by trying to group by it or by trying to transform using expressions via project.
Any way to do this? Actually trying to push this data via aggregation to a subcollection to operate on it there. Trying to avoid a large cursor operation over millions of records to transform the data so I can aggregate it.
With Mongo 4, I solved this issue in the following way:
Having this structure:
{
"_id" : LUUID("144e690f-9613-897c-9eab-913933bed9a7"),
"owner" : {
"$ref" : "person",
"$id" : NumberLong(10)
},
...
...
}
I needed to use "owner.$id" field. But because of "$" in the name of field, I was unable to use aggregation.
I transformed "owner.$id" -> "owner" using following snippet:
db.activities.aggregate([
{
$addFields: {
"owner": {
$arrayElemAt: [{ $objectToArray: "$owner" }, 1]
}
}
},
{
$addFields: {
"owner": "$owner.v"
}
},
{"$group" : {_id:"$owner", count:{$sum:1}}},
{$sort:{"count":-1}}
])
Detailed explanations here - https://dev.to/saurabh73/mongodb-using-aggregation-pipeline-to-extract-dbref-using-lookup-operator-4ekl
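For anyone wondering what the two $addFields stages actually do to the document, here is a small plain-JavaScript imitation. The sample activity document and the objectToArray helper are made up for illustration; the helper only mimics the shape that $objectToArray returns.

```javascript
// Hypothetical activity document with a DBRef-style owner.
const activity = { _id: "act1", owner: { $ref: "person", $id: 10 } };

// Mimics $objectToArray: an object becomes [{ k, v }, ...] in field order.
const objectToArray = (obj) =>
  Object.entries(obj).map(([k, v]) => ({ k, v }));

// First $addFields: take element 1 of the k/v array, i.e. the "$id" entry.
const step1 = { ...activity, owner: objectToArray(activity.owner)[1] };

// Second $addFields: replace owner with that entry's value.
const step2 = { ...step1, owner: step1.owner.v };

console.log(step1.owner); // the { k, v } pair for "$id"
console.log(step2.owner); // the bare id value
```

Note that index 1 relies on "$id" being the second field of the DBRef, after "$ref", which is the order DBRefs are stored in.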
You cannot use DBRef values with the aggregation framework. Instead you need to use the JavaScript processing of mapReduce in order to access the property naming that they use:
db.coll.mapReduce(
function() {
emit( this.source.$ref, this["total_price"] )
},
function(key,values) {
return Array.sum( values );
},
{
"query": { "sold_at": { "$gte": start, "$lt": end } },
"out": { "inline": 1 }
}
)
You really should not be using DBRef at all. The usage is basically deprecated now and if you feel you need some external referencing then you should be "manually referencing" this with your own code or implemented by some other library, with which you can do so in a much more supported way.

How can I write a Mongoose find query that uses another field as its conditional?

Consider the following:
I have a Mongoose model called 'Person'. In the schema for the Person model, each Person has two fields: 'children' and 'maximum_children'. Both fields are of type Number.
I would like to write a find query that returns Persons when that Person's 'children' value is less than its 'maximum_children' value.
I have tried:
person_model.find({
children: {
$lt: maximum_children
}
}, function (error, persons) {
// DO SOMETHING ELSE
});
and
person_model.find({
children: {
$lt: 'maximum_children'
}
}, function (error, persons) {
// DO SOMETHING ELSE
});
I'm doing something wrong in trying to specify the field name that I want to compare 'children' against.
OK.
I found a solution, just after I posted this question.
The answer seems to be:
person_model.find({
// $where evaluates a JavaScript expression against each document (available as `this`)
$where: "this.children < this.maximum_children"
}, function (error, persons) {
// DO SOMETHING ELSE
});
Seems to work OK, although it seems messy.
$where must execute its JavaScript conditional against every doc, so its performance can be quite poor. Instead, you can use aggregate to include a new field in a $project stage that indicates whether the doc matches or not, and then filter on that:
person_model.aggregate([
{$project: {
isMatch: {$lt: ['$children', '$maximum_children']},
doc: '$$ROOT'
}},
{$match: {isMatch: true}},
{$project: {_id: 0, doc: 1}}
], function(err, results) {...});
This uses $$ROOT to include the original doc as the doc field of the projection, with a final $project used to remove the isMatch field that was added.
results looks like:
{
"doc" : {
"_id" : ObjectId("54d04591257efd80c6965ada"),
"children" : 5,
"maximum_children" : 10
}
},
{
"doc" : {
"_id" : ObjectId("54d04591257efd80c6965add"),
"children" : 5,
"maximum_children" : 6
}
}
If you want to remove the added doc level of the objects you can use Array#map on results like so:
results = results.map(function(item) { return item.doc; });
Which reshapes results to put them back into their original form:
{
"_id" : ObjectId("54d04591257efd80c6965ada"),
"children" : 5,
"maximum_children" : 10
},
{
"_id" : ObjectId("54d04591257efd80c6965add"),
"children" : 5,
"maximum_children" : 6
}
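As a rough illustration of what each stage of that pipeline produces, the same steps can be imitated with plain array methods on a few made-up Person documents. The sample data and variable names are assumptions for illustration only; nothing here calls Mongoose or MongoDB.

```javascript
// Hypothetical Person documents.
const persons = [
  { _id: 1, children: 5, maximum_children: 10 },
  { _id: 2, children: 7, maximum_children: 6 },
  { _id: 3, children: 5, maximum_children: 6 },
];

// Stage 1 ($project): compute isMatch and keep the original doc ($$ROOT).
const projected = persons.map((p) => ({
  _id: p._id,
  isMatch: p.children < p.maximum_children,
  doc: p,
}));

// Stage 2 ($match): keep only the docs where isMatch is true.
const matched = projected.filter((p) => p.isMatch === true);

// Stage 3 ($project): drop _id and isMatch, leaving only { doc }.
const results = matched.map(({ doc }) => ({ doc }));

// Client-side unwrap with Array#map, as in the answer.
const unwrapped = results.map((item) => item.doc);

console.log(unwrapped);
```

Only the documents whose children value is below maximum_children survive, and after the unwrap they are back in their original shape.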