Multiplication and group by in MongoDB

I have a collection in MongoDB with documents like the following:
{
"_id" : ObjectId("54901212f315dce7077204af"),
"Date" : ISODate("2014-10-20T04:00:00.000Z"),
"Type" : "Twitter",
"Entities" : [
{
"ID" : 2,
"Name" : "test1",
"Sentiment" : {
"Value" : 20,
"Neutral" : 1
}
},
{
"ID" : 1,
"Name" : "test1",
"Sentiment" : {
"Value" : 1,
"Neutral" : 1
}
},
{
"ID" : 3,
"Name" : "test1",
"Sentiment" : {
"Value" : 2,
"Neutral" : 1
}
}
]
}
I have several such documents. For example, on 2014-10-20 there might be 5 tweets, each with different sentiment values. I want to group by date, sum the sentiment values for each date, and multiply that sum by the number of documents for that date. For example, suppose 2014-10-20 has 2 documents: one with sentiment values 20, 1, 2 (like the document shown above) and another with just 5. Then the value for 2014-10-20 should be (20+1+2+5) * 3 (because this tweet is repeated for 3 entities) * 2 (because there are 2 tweet documents on this date) = 168. If I ignore the number of documents, my code works well, as follows:
DBObject unwind = new BasicDBObject("$unwind", "$Entities"); // "$unwind" converts object with array into many duplicate objects, each with one from array
collectionG = db.getCollection("GraphDataCollection");
DBObject groupFields = new BasicDBObject( "_id", "$Date");
groupFields.put("value", new BasicDBObject( "$sum", "$Entities.Sentiment.Value"));
DBObject groupBy = new BasicDBObject("$group", groupFields );
DBObject sort = new BasicDBObject("$sort", new BasicDBObject("Date", 1));
stages.add(unwind);
stages.add(groupBy);
DBObject project = new BasicDBObject("_id",0);
project.put("Date","$_id");
project.put("value",1);
stages.add(new BasicDBObject("$project",project));
stages.add(sort);
AggregationOutput output = collectionG.aggregate(stages);
Now the result for 2014-10-20, for example, is 28, but I want 168.
Can anyone help me?
Update: the latest version of the code that I used is as follows:
DBCollection collectionG;
collectionG = db.getCollection("GraphDataCollection");
List<DBObject> stages = new ArrayList<DBObject>();
ArrayList<DBObject> andArray = null;
DBObject groupFields = new BasicDBObject( "_id", "$_id");
groupFields.put("value", new BasicDBObject( "$sum", "$Entities.Sentiment.Value"));
groupFields.put("date", new BasicDBObject( "$first", "$Date"));
DBObject groupBy = new BasicDBObject("$group", groupFields );
stages.add(groupBy);
DBObject groupByDate = new BasicDBObject( "_id", "$date");
groupByDate.put("value",new BasicDBObject("$sum","$value"));
groupByDate.put("count",new BasicDBObject("$sum",1));
DBObject dtGrp = new BasicDBObject("$group", groupByDate );
stages.add(dtGrp);
DBObject project = new BasicDBObject("_id",1);
project.put("value",new BasicDBObject("$multiply",
new Object[]{"$value","$count"}));
stages.add(new BasicDBObject("$project",project));
AggregationOutput output = collectionG.aggregate(stages);
System.out.println(output.results());

Unwind Entities:
DBObject unwind = new BasicDBObject("$unwind", "$Entities");
stages.add(unwind);
Group by _id to find the sum of all the Entities sentiment values per document.
DBObject groupFields = new BasicDBObject( "_id", "$_id");
groupFields.put("value", new BasicDBObject( "$sum", "$Entities.Sentiment.Value"));
groupFields.put("date", new BasicDBObject( "$first", "$Date"));
DBObject groupBy = new BasicDBObject("$group", groupFields );
stages.add(groupBy);
Group by Date now, to get the sum of total Entities Value, and the count of documents per group.
DBObject groupByDate = new BasicDBObject( "_id", "$date");
groupByDate.put("value",new BasicDBObject("$sum","$value"));
groupByDate.put("count",new BasicDBObject("$sum",1));
DBObject dtGrp = new BasicDBObject("$group", groupByDate );
stages.add(dtGrp);
Project value as the multiplicative result of count and value, for each group.
DBObject project = new BasicDBObject("_id",1);
project.put("value",new BasicDBObject("$multiply",
new Object[]{"$value","$count"}));
stages.add(new BasicDBObject("$project",project));
If your dates differ in their time component (for example by milliseconds), you need to group by the day, month, and year together in the second group stage, and add a sort stage if necessary.
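As a rough illustration of what these stages compute, the following plain-Java sketch replays them on in-memory data, without a MongoDB connection. The `DailySentiment` class and its nested `Tweet` type are hypothetical stand-ins and not part of the driver. Dates are truncated to the UTC calendar day, which is one assumed way of handling timestamps that differ by milliseconds.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class DailySentiment {
    // Hypothetical in-memory stand-in for one tweet document (not a driver class).
    static final class Tweet {
        final Instant date;
        final List<Integer> sentimentValues; // one value per entry in "Entities"

        Tweet(String isoDate, Integer... sentimentValues) {
            this.date = Instant.parse(isoDate);
            this.sentimentValues = Arrays.asList(sentimentValues);
        }
    }

    // Mirrors the stages: per-document sum, per-day sum and count, then sum * count.
    static Map<LocalDate, Long> valuePerDay(List<Tweet> tweets) {
        Map<LocalDate, long[]> perDay = new TreeMap<>(); // [0] = sum, [1] = document count
        for (Tweet t : tweets) {
            if (t.sentimentValues.isEmpty()) {
                continue; // $unwind drops documents whose array is empty
            }
            long docSum = 0;
            for (int v : t.sentimentValues) {
                docSum += v; // $unwind followed by the first $group (by _id)
            }
            LocalDate day = t.date.atZone(ZoneOffset.UTC).toLocalDate(); // ignore time of day
            long[] acc = perDay.computeIfAbsent(day, d -> new long[2]);
            acc[0] += docSum; // second $group: { $sum: "$value" }
            acc[1] += 1;      // second $group: { $sum: 1 }
        }
        Map<LocalDate, Long> result = new TreeMap<>();
        for (Map.Entry<LocalDate, long[]> e : perDay.entrySet()) {
            result.put(e.getKey(), e.getValue()[0] * e.getValue()[1]); // $multiply
        }
        return result;
    }

    public static void main(String[] args) {
        List<Tweet> tweets = Arrays.asList(
            new Tweet("2014-10-20T04:00:00.000Z", 20, 1, 2),
            new Tweet("2014-10-20T04:00:00.250Z", 5));
        System.out.println(valuePerDay(tweets)); // prints {2014-10-20=56}
    }
}
```

For the two sample documents from the question this gives (20+1+2+5) * 2 = 56 for 2014-10-20. The extra factor of 3 in the question's example is not part of these stages and would need its own multiplication.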

Related

MongoDB shell aggregation does not work in Morphia

Consider the following MongoDB collection:
{
"_id" : ObjectId("..."),
"myId": 12345,
"root": {
basicData: {
code: "CODE"
},
data: [
{
descriptions: {
description: [
{
text: "...",
language: "de"
}
]
}
}
]
}}
I'm trying to get documents filtered by "myId" and "code", but with descriptions in only one specific language. In the shell, the following command seems to work properly:
db.Items.aggregate([
{ "$match" : { "myId" : 40943 , "root.basicData.code" : "A_CODE"}},
{ "$unwind" : "$root.data"},
{ "$unwind" : "$root.data.descriptions.description"},
{ "$match" : { "root.data.descriptions.description.language" : "de"}}
])
In Morphia I try to do the following to get to the same result:
AggregationPipeline pipeline = dataStore.createAggregation(Item.class);
Query<Item> matchIdAndCode = dataStore.createQuery(Item.class);
matchIdAndCode.field("myId").equal(myid);
matchIdAndCode.field("root.basicData.code").equal(code);
pipeline.match(matchIdAndCode);
pipeline.unwind("root.data");
pipeline.unwind("root.data.descriptions.description");
Query<Item> matchLanguage = dataStore.createQuery(Item.class);
matchLanguage.field("root.data.descriptions.description.language").equal(language);
pipeline.match(matchLanguage);
Iterator<Item> itemAggregate = pipeline.aggregate(Item.class);
but the iterator does not contain any items. I am not sure where to look for further errors, especially because when I copy the stages of the Morphia aggregation pipeline into the shell, I get the expected result.
You are missing the $ sign in the following lines:
pipeline.unwind("root.data");
pipeline.unwind("root.data.descriptions.description");
They should be:
pipeline.unwind("$root.data");
pipeline.unwind("$root.data.descriptions.description");
As a workaround, I now use the MongoDB Java driver directly. My working solution:
List<DBObject> stages = new ArrayList<DBObject>();
DBCollection collection = dataStore.getCollection(Item.class);
// match
DBObject matchFields = new BasicDBObject("myId", myid);
matchFields.put("root.basicData.code", code); // same dotted path as in the shell $match
DBObject match = new BasicDBObject("$match", matchFields );
stages.add(match);
// unwind
DBObject unwindDescriptiveData = new BasicDBObject("$unwind", "$root.data");
stages.add(unwindDescriptiveData);
DBObject unwindDescription = new BasicDBObject("$unwind", "$root.data.descriptions.description");
stages.add(unwindDescription);
// match
DBObject languageMatchFields = new BasicDBObject("root.data.descriptions.description.language", language);
DBObject languageMatch = new BasicDBObject("$match", languageMatchFields );
stages.add(languageMatch);
AggregationOutput aggregate = collection.aggregate(stages);
The results can then be mapped to the POJO with Morphia again:
List<Item> items = new ArrayList<Item>();
for (Iterator<DBObject> iterator = aggregate.results().iterator(); iterator.hasNext();) {
items.add(morphia.fromDBObject(Item.class, iterator.next()));
}
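To make the effect of the two $unwind stages and the final $match more concrete, here is a small plain-Java sketch that walks the same nesting in memory. The `UnwindSketch` class and its `Description` type are hypothetical and simplify the document to a list of `root.data` elements, each holding its list of descriptions; it is an illustration, not a replacement for the pipeline.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class UnwindSketch {
    // Hypothetical stand-in for one entry of descriptions.description.
    static final class Description {
        final String text;
        final String language;

        Description(String text, String language) {
            this.text = text;
            this.language = language;
        }
    }

    // Returns one text per (data element, description) pair whose language matches.
    static List<String> textsInLanguage(List<List<Description>> data, String language) {
        List<String> result = new ArrayList<>();
        for (List<Description> descriptions : data) {   // like $unwind on root.data
            for (Description d : descriptions) {        // like $unwind on ...descriptions.description
                if (language.equals(d.language)) {      // like the final $match on language
                    result.add(d.text);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Description>> data = Arrays.asList(
            Arrays.asList(new Description("Hallo", "de"), new Description("Hello", "en")),
            Arrays.asList(new Description("Welt", "de")));
        System.out.println(textsInLanguage(data, "de")); // prints [Hallo, Welt]
    }
}
```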

Fetch mongo documents based on date

We insert MongoDB documents that share an identifier, and each document contains a subarray.
insert 1 :
db.test.insert(
{
"companyId" : "123",
"persons" : [
{
"joiningDate" : NumberLong("1431674741623"),
"name" : "Rajesh"
}
],
})
insert 2 :
db.test.insert(
{
"companyId" : "123",
"persons" : [
{
"joiningDate" : NumberLong("1431674741653"),
"name" : "Rahul"
}
],
})
I would like to retrieve the company details by company id, merge the persons into one array list, and sort the persons by joining date.
Currently I can retrieve the data using QueryBuilder, but I cannot sort the persons by date. I could use a Java Comparator for that, but I am looking for an API in the MongoDB Java driver that achieves the same.
Thanks.
You should use the aggregation framework: first $unwind the persons array, then $sort by persons.joiningDate, and finally $group with $push, as below:
db.test.aggregate({
"$match": {
"companyId": "123" // match companyId
}
}, {
"$unwind": "$persons" //unwind person array
}, {
"$sort": {
"persons.joiningDate": -1 //sort joining date
}
}, {
"$group": {
"_id": "$companyId",
"persons": {
"$push": "$persons" //push all sorted data into persons
}
}
}).pretty()
To convert this to Java, use the Java driver's aggregation API as follows:
// unwind persons
DBObject unwind = new BasicDBObject("$unwind", "$persons");
// create pipeline operations, with the $match companyId
DBObject match = new BasicDBObject("$match", new BasicDBObject("companyId", "123"));
// sort persons by joining date
DBObject sortDates = new BasicDBObject("$sort", new BasicDBObject("persons.joiningDate", -1)); // -1 and 1 descending or ascending resp.
// Now the $group operation
DBObject groupFields = new BasicDBObject("_id", "$companyId");
groupFields.put("persons", new BasicDBObject("$push", "$persons"));
DBObject group = new BasicDBObject("$group", groupFields);
// run aggregation
List < DBObject > pipeline = Arrays.asList(match, unwind,sortDates, group);
AggregationOutput output = test.aggregate(pipeline);
for(DBObject result: output.results()) {
System.out.println(result);
}
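For comparison, the client-side alternative mentioned in the question (a Java Comparator) could look roughly like this. The `PersonMerge` class and its `Person` type are hypothetical stand-ins for the entries of the persons arrays; the sketch assumes the documents for one company have already been fetched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class PersonMerge {
    // Hypothetical stand-in for one entry of the "persons" array.
    static final class Person {
        final String name;
        final long joiningDate; // epoch milliseconds, as in the inserts above

        Person(String name, long joiningDate) {
            this.name = name;
            this.joiningDate = joiningDate;
        }
    }

    // Merges the persons arrays of all fetched documents and sorts them newest first.
    static List<Person> mergeAndSort(List<List<Person>> personsPerDocument) {
        List<Person> merged = new ArrayList<>();
        for (List<Person> persons : personsPerDocument) {
            merged.addAll(persons);
        }
        // Descending by joiningDate, matching the -1 used in the $sort stage.
        merged.sort(Comparator.comparingLong((Person p) -> p.joiningDate).reversed());
        return merged;
    }

    public static void main(String[] args) {
        List<Person> doc1 = Arrays.asList(new Person("Rajesh", 1431674741623L));
        List<Person> doc2 = Arrays.asList(new Person("Rahul", 1431674741653L));
        for (Person p : mergeAndSort(Arrays.asList(doc1, doc2))) {
            System.out.println(p.name + " " + p.joiningDate);
        }
    }
}
```

The aggregation keeps the merging and sorting on the server, which is usually preferable when the arrays are large; the comparator version is only a fallback.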

Nested query with aggregation in Mongo

I have a document in MongoDB:
{
"_id" : ObjectId("111111111111111111111111"),
"taskName" : "scan",
"nMapRun" : {
...
"hosts" : {
...
"distance" : {
"value" : "1"
},..
}
I'm interested in the field nMapRun.hosts.distance.value.
How do I get the ten largest values of this field?
Could you give an example in Java?
The aggregation operation in shell:
db.collection.aggregate([
{$sort:{"nMapRun.hosts.distance.value":-1}},
{$limit:10},
{$group:{"_id":null,"values":{$push:"$nMapRun.hosts.distance.value"}}},
{$project:{"_id":0,"values":1}}
])
You need to build the corresponding DBObjects for each stage as below:
DBObject sort = new BasicDBObject("$sort",
new BasicDBObject("nMapRun.hosts.distance.value", -1));
DBObject limit = new BasicDBObject("$limit", 10);
DBObject groupFields = new BasicDBObject( "_id", null);
groupFields.put("values",
new BasicDBObject( "$push","$nMapRun.hosts.distance.value"));
DBObject group = new BasicDBObject("$group", groupFields);
DBObject fields = new BasicDBObject("values", 1);
fields.put("_id", 0);
DBObject project = new BasicDBObject("$project", fields );
Running the aggregation pipeline:
List<DBObject> pipeline = Arrays.asList(sort, limit, group, project);
AggregationOutput output = coll.aggregate(pipeline);
output.results().forEach(i -> System.out.println(i));
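One caveat worth checking: in the sample document the value is stored as a string ("1"), and strings are ordered lexicographically by default, so "9" sorts above "10". The plain-Java sketch below only illustrates that difference; the `TopValues` class is hypothetical and does not talk to MongoDB.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class TopValues {
    // Largest n values when compared as strings (lexicographic order).
    static List<String> topAsStrings(List<String> values, int n) {
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(Comparator.reverseOrder());
        return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
    }

    // Largest n values when parsed and compared as numbers.
    static List<Integer> topAsNumbers(List<String> values, int n) {
        List<Integer> sorted = new ArrayList<>();
        for (String v : values) {
            sorted.add(Integer.parseInt(v));
        }
        sorted.sort(Comparator.reverseOrder());
        return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("1", "10", "9", "2");
        System.out.println(topAsStrings(values, 2)); // prints [9, 2]
        System.out.println(topAsNumbers(values, 2)); // prints [10, 9]
    }
}
```

If the distances are meant to be numeric, storing them as numbers is one way to make the $sort stage return the expected top ten.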

How to get the distinct list of ids from the Mongo Collection by datetime ordering?

How can I get the distinct list of ids from a Mongo collection, ordered by datetime? This is what I tried:
DBObject queryObject = new BasicDBObject("dateTime",-1);
List<String> objects= MDB.getCollection("messages").distinct("_stdid",queryObject);
This is tricky, because each distinct value for _stdid can have a different dateTime field value. Which one do you want to pick? The first, the last, the average?
You will also need to use the aggregation framework, and not a straight distinct. On the MongoDB shell, you would use (if you wanted the first of all the dateTime values that map to a single _stdid field):
db.messages.aggregate( [
{ $group: { _id: '$_stdid', dateTime : { $first: '$dateTime' } } },
{ $sort: { dateTime : -1 } }
] );
In Java, this looks like:
// create our pipeline operations, first with the $group operation
DBObject groupFields = new BasicDBObject( "_id", "$_stdid" );
groupFields.put( "dateTime", new BasicDBObject( "$first", "$dateTime" ) );
DBObject group = new BasicDBObject( "$group", groupFields );
// then the $sort
DBObject sort = new BasicDBObject( "$sort", new BasicDBObject( "dateTime", -1 ) ); // -1 = descending, as in the shell version
// run aggregation
AggregationOutput output = MDB.getCollection("messages").aggregate( group, sort );
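The logic of this pipeline can be sketched in plain Java to show which dateTime ends up representing each id. The `DistinctByDate` class and its `Message` type are hypothetical, and dateTime is assumed to be a numeric timestamp here purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DistinctByDate {
    // Hypothetical stand-in for one document of the "messages" collection.
    static final class Message {
        final String stdId;
        final long dateTime; // assumed numeric timestamp

        Message(String stdId, long dateTime) {
            this.stdId = stdId;
            this.dateTime = dateTime;
        }
    }

    // Keeps the first dateTime seen per id (like $first), then orders ids newest first.
    static List<String> distinctIdsNewestFirst(List<Message> messages) {
        Map<String, Long> firstSeen = new LinkedHashMap<>();
        for (Message m : messages) {
            firstSeen.putIfAbsent(m.stdId, m.dateTime); // later duplicates are ignored
        }
        List<Map.Entry<String, Long>> entries = new ArrayList<>(firstSeen.entrySet());
        entries.sort(Map.Entry.<String, Long>comparingByValue().reversed()); // like $sort with -1
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, Long> e : entries) {
            ids.add(e.getKey());
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Message> messages = Arrays.asList(
            new Message("a", 10), new Message("b", 30),
            new Message("a", 50), new Message("c", 20));
        System.out.println(distinctIdsNewestFirst(messages)); // prints [b, c, a]
    }
}
```

Note that "a" is ranked by its first timestamp (10), not its latest (50). If the latest value should decide, `$max` (or a `$sort` before the `$group`) would be the closer fit.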

Find objects by array in mongodb (or java)

I've got a collection (dataset) like this:
{
"_id" : ObjectId("515611c1c6e3718ee42a5655"),
"id": "Product1",
"type": "ProductType4",
"productFeature": [
{
"id": "ProductFeature1"
},
{
"id": "ProductFeature2"
},
{
"id": "ProductFeature3"
}
],
"productPropertyNumeric": 25
},
... and more product objects...
{
"_id" : ObjectId("515611c1c6e3718ee42a5666"),
"id": "ProductFeature1",
"label": "blablabla"
},
{
"_id" : ObjectId("515611c1c6e3718ee42a5667"),
"id": "ProductFeature2",
"label": "blebleble"
},
{
"_id" : ObjectId("515611c1c6e3718ee42a5668"),
"id": "ProductFeature3",
"label": "blublublu"
} ... and more feature objects...
Given Product1, I need to find the features, with their labels, that are listed in its "productFeature" array.
I have tried in Mongo shell to find them (using a variable, for example):
var aaa = db.dataset.find({ id: "Product1" })
db.dataset.find({ id: "aaa.productFeature.id" })
But it doesn't work. If somebody knows how to find objects by the contents of an array, please help me.
Thanks very much.
PS: A Java solution would be best. As an example, this is how I currently run a query:
BasicDBObject query = new BasicDBObject();
query.put("type","ProductType4");
query.put("productPropertyNumeric", new BasicDBObject("$gt", 10));
DBCursor cursor = coll.find(query).sort( new BasicDBObject("label", 1));
while (cursor.hasNext()){
System.out.println(cursor.next().get("id"));
}
Here is my answer to my own question. I hope this helps someone.
BasicDBObject query = new BasicDBObject();
BasicDBObject field = new BasicDBObject();
query.put("id", "Product1");
field.put("id", 1);
field.put("productFeature", 1);
field.put("_id", 0);
DBCursor cursor = coll.find(query, field);
while (cursor.hasNext()) {
BasicDBObject result = (BasicDBObject) cursor.next();
System.out.println(result);
ArrayList<BasicDBObject> features = (ArrayList<BasicDBObject>) result.get("productFeature");
for (BasicDBObject embedded : features) {
String featuresId = (String) embedded.get("id");
BasicDBObject query2 = new BasicDBObject();
BasicDBObject field2 = new BasicDBObject();
query2.put("id", featuresId);
field2.put("id", 1);
field2.put("label", 1);
field2.put("_id", 0);
DBCursor cursor2 = coll.find(query2, field2);
while (cursor2.hasNext()) {
System.out.println(cursor2.next());
}
}
}
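The nested loop above issues one query per feature. A possible variation, sketched here under assumptions, collects the feature ids first so that a single query using the `$in` operator can fetch all labels at once. The `FeatureLookup` class is hypothetical and works on plain `java.util` maps and lists so it can run without a database.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FeatureLookup {
    // Collects the "id" of every embedded entry in the product's "productFeature" array.
    static List<String> featureIds(Map<String, Object> product) {
        List<String> ids = new ArrayList<>();
        Object features = product.get("productFeature");
        if (features instanceof List) {
            for (Object feature : (List<?>) features) {
                if (feature instanceof Map) {
                    Object id = ((Map<?, ?>) feature).get("id");
                    if (id != null) {
                        ids.add(id.toString());
                    }
                }
            }
        }
        return ids;
    }

    // Builds the filter { "id": { "$in": [ ...ids ] } } as nested maps.
    static Map<String, Object> inFilter(List<String> ids) {
        Map<String, Object> in = new LinkedHashMap<>();
        in.put("$in", ids);
        Map<String, Object> filter = new LinkedHashMap<>();
        filter.put("id", in);
        return filter;
    }

    public static void main(String[] args) {
        Map<String, Object> f1 = new LinkedHashMap<>();
        f1.put("id", "ProductFeature1");
        Map<String, Object> f2 = new LinkedHashMap<>();
        f2.put("id", "ProductFeature2");
        Map<String, Object> product = new LinkedHashMap<>();
        product.put("id", "Product1");
        product.put("productFeature", Arrays.asList(f1, f2));
        System.out.println(inFilter(featureIds(product)));
    }
}
```

With the driver classes used in this thread, the same filter would be written as `new BasicDBObject("id", new BasicDBObject("$in", ids))` and passed to `coll.find(...)` together with the projection on `id` and `label`.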
You have to supply the "path" in the document structure to the field you want to query on from the document root. In this case the path is 'productFeature' --> 'id'. Instead of an arrow MongoDB uses a dot (.), e.g.,
db.dataset.find({ "productFeature.id" : "Product1" });
In Java you do something very similar:
BasicDBObject query = new BasicDBObject("productFeature.id", "Product1");
DBCursor cursor = coll.find(query).sort( new BasicDBObject("label", 1));
while (cursor.hasNext()){
System.out.println(cursor.next().get("id"));
}
In Java you could also use the Query class in combination with MongoTemplate.
import java.util.List;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;
@Autowired
private MongoTemplate mongoTemplate;
...
public List<YourObjectClass> findProduct1(){
Query query = new Query();
query.addCriteria(Criteria.where("productFeature.id").is("Product1"));
List<YourObjectClass> result = this.mongoTemplate.find(query, YourObjectClass.class);
return result;
}