MongoDB Aggregation Framework - mongodb

I have a document that's structured as follows:
{
'_id' => 'Star Wars',
'count' => 1234,
'spelling' => [ ( 'Star wars' => 10, 'Star Wars' => 15, 'sTaR WaRs' => 5) ]
}
I would like to get the top N documents (by descending count), but with only one spelling per document (the one with the highest value). Is there a way to do this with the aggregation framework?
I can easily get the top 10 results (using $sort and $limit). But how do I get only one spelling for each?
So for example, if I have the following three records:
{
'_id' => 'star_wars',
'count' => 1234,
'spelling' => [ ( 'Star wars' => 10, 'Star Wars' => 15, 'sTaR WaRs' => 5) ]
}
{
'_id' => 'willow',
'count' => 2211,
'spelling' => [ ( 'willow' => 300, 'Willow' => 550) ]
}
{
'_id' => 'indiana_jones',
'count' => 12,
'spelling' => [ ( 'indiana Jones' => 10, 'Indiana Jones' => 25, 'indiana jones' => 5) ]
}
And I ask for the top 2 results, I'll get:
{
'_id' => 'willow',
'count' => 2211,
'spelling' => 'Willow'
}
{
'_id' => 'star_wars',
'count' => 1234,
'spelling' => 'Star Wars'
}
(or something to this effect)
Thanks!

Your schema as designed would make using anything but a MapReduce difficult as you've used the keys of the object as values. So, I adjusted your schema to better match with MongoDB's capabilities (in JSON format as well for this example):
{
'_id' : 'star_wars',
'count' : 1234,
'spellings' : [
{ spelling: 'Star wars', total: 10},
{ spelling: 'Star Wars', total : 15},
{ spelling: 'sTaR WaRs', total : 5} ]
}
Note that it's now an array of objects with a specific key name, spelling, and a value for the total (I didn't know what that number actually represented, so I've called it total in my examples).
On to the aggregation:
db.so.aggregate([
    { $unwind: '$spellings' },
    { $project: {
        'spelling' : '$spellings.spelling',
        'total': '$spellings.total',
        'count': '$count'
      }
    },
    { $sort : { total : -1 } },
    { $group : { _id : '$_id',
        count: { $first: '$count' },
        largest : { $first : '$total' },
        spelling : { $first: '$spelling' }
      }
    }
])
$unwind: unwind the spellings array so that each element becomes its own document that the pipeline can work with.
$project: flatten each document down to the fields the pipeline needs. In this case, the specific spelling, the total, and the count.
$sort: sort on the total in descending order, so that the following $group can use $first.
$group: group by _id so that only the $first (highest-total) spelling for each _id is returned. The count is returned the same way: because of the flattening, every temporary document for a given _id carries the same count field.
Results:
[
{
"_id" : "star_wars",
"count" : 1234,
"largest" : 15,
"spelling" : "Star Wars"
},
{
"_id" : "indiana_jones",
"count" : 12,
"largest" : 25,
"spelling" : "Indiana Jones"
},
{
"_id" : "willow",
"count" : 2211,
"largest" : 550,
"spelling" : "Willow"
}
]
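The pipeline above returns one spelling per document, but it does not yet order or cap the output. To get the top N by descending count, as the question asks, one option is to append a $sort on count and a $limit after the $group. This is a minimal sketch assuming the adjusted schema above; the helper name topSpellingsPipeline and its parameter n are made up for illustration.

```javascript
// Hypothetical helper: rebuilds the pipeline from the answer above and
// appends the two stages needed for "top N by descending count".
function topSpellingsPipeline(n) {
  return [
    { $unwind: '$spellings' },
    { $project: {
        spelling: '$spellings.spelling',
        total: '$spellings.total',
        count: '$count'
    } },
    // Highest total first, so $first picks the most common spelling.
    { $sort: { total: -1 } },
    { $group: {
        _id: '$_id',
        count: { $first: '$count' },
        largest: { $first: '$total' },
        spelling: { $first: '$spelling' }
    } },
    // $group does not guarantee output order, so sort on count before limiting.
    { $sort: { count: -1 } },
    { $limit: n }
  ];
}
```

In the mongo shell this would be run as db.so.aggregate(topSpellingsPipeline(2)). With the three sample documents from the question, that should yield willow followed by star_wars.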

Related

Symfony how to mongo-odm-aggregation-bundle

I'm hesitant to ask this question, but I cannot find a solution to my problem.
I use the mongo-odm-aggregation-bundle to run an aggregation on my data.
I don't know how to use this bundle correctly; the documentation is not explicit enough, and the result is not what I expect.
In MongoDB, my aggregation code is:
{ $group: {
    _id: { Epreuve: "$EPREUVE", month: { $month: "$DATE" },
           day: { $dayOfMonth: "$DATE" }, year: { $year: "$DATE" } },
    total: { $sum: "$SCORE" },
    nbmots: { $sum: "$NBMOTS" },
    moymots: { $avg: "$NBMOTS" },
    moytemps: { $avg: "$CHRONOS" },
    position: { $sum: 1 }
} }
And the result is:
{
"_id" : {
"Epreuve" : "Verbe",
"month" : NumberInt(2),
"day" : NumberInt(21),
"year" : NumberInt(2017)
},
"total" : NumberLong(430),
"nbmots" : NumberLong(16),
"moymots" : 16.0,
"moytemps" : 147.24,
"position" : 1.0
}
In Symfony, I use this sample to test:
$expr = new \Solution\MongoAggregation\Pipeline\Operators\Expr;
$aq = $this->get('doctrine_mongodb.odm.default_aggregation_query')->getCollection('PortailBundle:DataScoreMotsMeles')->createAggregateQuery();
$result = $aq->match(['SESSION'=>$currentSession])
->group(['_id' => [ 'Epreuve' => "EPREUVE", ],
'Score' => $expr->sum("SCORE"),
'nbMots'=> $expr->sum("NBMOTS"),
'moyMots'=> $expr->avg("NBMOTS"),
'moyTemps'=> $expr->avg("CHRONOS"),
'count' => $expr->sum(1)])
->sort(['count' => -1])
->limit(10)
->getQuery()
->aggregate()
->toArray();
The result is:
Array ( [0] => Array ( [_id] => Array ( [Epreuve] => **EPREUVE** ) [Score] => **0** [nbMots] => **0** [moyMots] => [moyTemps] => [count] => 3 ) )
The problem is that the result is 0 every time.
That is expected, because I use:
$expr->sum("NBMOTS")
instead of :
$expr->sum('$NBMOTS')
But if I use '$NBMOTS' it doesn't work. So how do I do this? I need your help.
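For comparison, the raw pipeline that the builder call above appears intended to produce can be written out directly. This is only a sketch in shell syntax and says nothing about the bundle's own API; the helper name scorePipeline is hypothetical, and the field names are taken from the question. In a raw pipeline, field references need the $ prefix. Without it MongoDB treats the string as a literal, so $sum yields 0 and $avg yields null, which matches the output shown above.

```javascript
// Hypothetical helper: the raw aggregation pipeline equivalent to the
// match/group/sort/limit chain in the question.
function scorePipeline(currentSession) {
  return [
    { $match: { SESSION: currentSession } },
    { $group: {
        // '$EPREUVE' is a field reference; 'EPREUVE' would be a literal string.
        _id: { Epreuve: '$EPREUVE' },
        Score: { $sum: '$SCORE' },
        nbMots: { $sum: '$NBMOTS' },
        moyMots: { $avg: '$NBMOTS' },
        moyTemps: { $avg: '$CHRONOS' },
        // Counts the documents in each group.
        count: { $sum: 1 }
    } },
    { $sort: { count: -1 } },
    { $limit: 10 }
  ];
}
```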

How to apply correctly $limit and $skip in subfields?

I'm just starting with MongoDB and am having trouble with the following schema.
{
"_id" : "AAA",
"events" : [
{
"event" : "001",
"time" : 1456823333
},
{
"event" : "002",
"time" : 1456828888
},
{
"event" : "003",
"time" : 1456825555
},...
]
}
I want to get the events sorted by date and apply limit and skip.
I'm using the following query:
$op = array(
array('$match' => array('_id' => $userId)),
array('$unwind' => '$events'),
array('$sort' => array('events.time' => -1)),
array('$group' => array('_id' => '$_id',
'events' => array('$push' => '$events')))
//,array('$project' => array('_id' => 1, 'events' => array('$events', 0, 3)))
//,array('$limit' => 4)
//,array('$skip' => 3)
);
$result= Mongo->aggregate('mycollection', $op);
I have tried everything to filter with $project or with $limit and $skip, but none of it works.
How should I apply the limit and skip conditions to events?
If I do not apply the "limit" conditions above, the result is ordered correctly.
Result:
{ "waitedMS":0,
"result":[
{
"_id":"AAA",
"events":[
{
"event":"002",
"time":1456828888,
},
{
"event":"003",
"time":1456825555,
},
{
"event":"001",
"time":1456823333,
},...
}
],
"ok":1
}
The order is correct, but I cannot limit the number of results for paging.
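One way to page through the events, offered as a sketch: after the $group there is only one document per user, so a $limit or $skip placed there acts on whole documents rather than on individual events. Applying $skip and then $limit right after the $sort, while each event is still its own document, pages the events themselves. The helper name eventsPagePipeline and its parameters are hypothetical; the pipeline is written in shell syntax and would need translating to PHP arrays.

```javascript
// Hypothetical helper: pages the events of one user, newest first.
function eventsPagePipeline(userId, skip, limit) {
  return [
    { $match: { _id: userId } },
    // One document per event from here on.
    { $unwind: '$events' },
    { $sort: { 'events.time': -1 } },
    // $skip must come before $limit, otherwise the limit is applied first.
    { $skip: skip },
    { $limit: limit },
    // Reassemble the selected page into a single document.
    { $group: { _id: '$_id', events: { $push: '$events' } } }
  ];
}
```

For example, eventsPagePipeline('AAA', 3, 4) describes the second page when pages hold a varying number of events, skipping the 3 newest and returning the next 4. On MongoDB 3.2 or later, a $project stage using { $slice: ['$events', skip, limit] } after the $group is an alternative.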

how to check two fields having same values in same table in laravel 5 with mongodb

I need to check whether two fields in the same collection have the same value in Laravel 5. I am using MongoDB.
{
"id": "565d23ef5c2a4c9454355679",
"title": "Event1",
"summary": "test",
"total": NumberInt(87),
"remaining": NumberInt(87),
"status": "1"
}
I need to check whether the "total" and "remaining" fields are the same. How do I write this query in Laravel 5.1? Please help.
One approach you could take would be using the aggregation framework methods from the raw MongoDB collection object provided from the underlying driver. In the mongo shell, you would essentially run the following aggregation pipeline operation to compare the two fields and return the documents which satisfy that criteria:
db.collection.aggregate([
    {
        "$project": {
            "isMatch": { "$eq" : ["$total", "$remaining"] }, // similar to "valueof(total) == valueof(remaining)"
            "id" : 1,
            "title" : 1,
            "summary" : 1,
            "total" : 1,
            "remaining" : 1,
            "status" : 1
        }
    },
    {
        "$match": { "isMatch": true } // filter to get documents that only satisfy "valueof(total) == valueof(remaining)"
    }
]);
Or using the $where operator in the find() query:
db.collection.find({ "$where" : "this.total == this.remaining" })
Thus in Laravel, you can get the documents using raw expressions as follows. Note the single quotes: in a double-quoted PHP string, "$project" or "$total" would be interpolated as a PHP variable.
$result = DB::collection('collectionName')->raw(function ($collection)
{
    return $collection->aggregate(array(
        array(
            '$project' => array(
                'id' => 1,
                'title' => 1,
                'summary' => 1,
                'total' => 1,
                'remaining' => 1,
                'status' => 1,
                'isMatch' => array(
                    '$eq' => array('$total', '$remaining')
                )
            )
        ),
        array(
            '$match' => array(
                'isMatch' => true
            )
        )
    ));
});
In the case of $where, you can inject the expressions directly into the query:
Model::whereRaw(array('$where' => 'this.total == this.remaining'))->get();
Or using the raw expression on the internal MongoCollection object executed on the query builder. Note that using the raw() method requires using a cursor because it is a low-level call:
$result = Model::raw()->find(array('$where' => 'this.total == this.remaining'));
Collectionname::whereRaw(array('$where' => "this.field1 > this.field2"))
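On MongoDB 3.6 or later, the $expr query operator allows the same field-to-field comparison inside a plain find() filter, without a $project stage or server-side JavaScript. A minimal sketch in shell syntax; the helper name sameValueFilter is hypothetical, and the resulting filter would still need to be passed through Laravel's raw query methods as PHP arrays.

```javascript
// Hypothetical helper: builds a find() filter that matches documents
// where two fields hold equal values, using $expr (MongoDB 3.6+).
function sameValueFilter(fieldA, fieldB) {
  // The '$' prefix turns each field name into a field reference.
  return { $expr: { $eq: ['$' + fieldA, '$' + fieldB] } };
}
```

In the mongo shell this would be used as db.collection.find(sameValueFilter('total', 'remaining')).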

MongoDB: Upserting and Sub documents

Let's assume the following schema:
{
'_id' : 'star_wars',
'count' : 1234,
'spellings' : [
{ spelling: 'Star wars', total: 10},
{ spelling: 'Star Wars', total : 15},
{ spelling: 'sTaR WaRs', total : 5} ]
}
I can update the count and one of the spellings by doing this:
db.movies.update(
{_id: "star_wars",
'spellings.spelling' : "Star Wars" },
{ $inc :
{ 'spellings.$.total' : 1,
'count' : 1 }}
)
But this form of update doesn't work with upsert. i.e., if I try to update (with upsert) with an _id that doesn't exist, or with a spelling that doesn't already exist, nothing happens.
Is there a solution that allows me to upsert when updating ($inc) a sub-document?
Thanks!
You could change your schema a little, though. If your documents looked like this:
{
'_id' : 'star_wars',
'count' : 1234,
'spellings' :
{
'Star wars': 10,
'Star Wars': 15,
'sTaR WaRs': 5
}
}
Your updates would become as simple as:
db.movies.update({_id:"star_wars"},{$inc:{"spellings.Star Wars":1}},true)
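The one-liner above only increments the spelling. To keep count in step and to build the field path from a variable, something like the following could be used. The helper name spellingUpsert is hypothetical. One caveat of using spellings as field names: a spelling containing a dot would be read as a nested path, and a name starting with $ is problematic as well, so this sketch rejects both.

```javascript
// Hypothetical helper: builds the arguments for an upsert that increments
// both one spelling and the overall count, under the keyed-object schema.
function spellingUpsert(id, spelling) {
  if (spelling.includes('.') || spelling.startsWith('$')) {
    throw new Error('spelling cannot be used as a field name: ' + spelling);
  }
  return {
    filter: { _id: id },
    // Computed key, e.g. "spellings.Star Wars".
    update: { $inc: { ['spellings.' + spelling]: 1, count: 1 } },
    // With upsert, a missing document or a missing spelling is created.
    options: { upsert: true }
  };
}
```

In the mongo shell: var u = spellingUpsert('star_wars', 'Star Wars'); db.movies.update(u.filter, u.update, u.options);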

MongoDB associative array - pull possible?

Having the following Array:
array(
'id' => 12,
'keys' => array('x1' => array('idx' => 12, 'text'=> '1123145'),
'x2' => array('idx' => 14, 'text'=> '1123142'),
'x3' => array('idx' => 12, 'text'=> '1123145'),
'x4' => array('idx' => 14, 'text'=> '1123145')
)
)
I want to pull all keys with idx 12. So I do the following:
$mdb->db->collection->update(array('id' => 12), array('$pull' => array('keys' => array('idx' => 12))));
But it doesn't work. What's the problem?
This is impossible to do with this schema.
You are trying to pull the elements with idx = 12 from keys, but the problem is that keys is not an array: it is an object whose fields (x1, x2, ...) are each an object themselves, and $pull only works on arrays.
The only way to do what you want with minimal modification is to change the schema like this:
{
    "_id" : 12,
    "keys" : [
        {
            "type" : 'x1',
            "idx" : 12,
            "text" : "1111"
        },
        {
            "type" : 'x2',
            "idx" : 14,
            "text" : "1111"
        },
        {
            "type" : 'x3',
            "idx" : 12,
            "text" : "1111"
        },
        {
            "type" : 'x4',
            "idx" : 14,
            "text" : "1111"
        }
    ]
}
Then you can run your query as follows:
db.XXX.update(
{ "_id" : 12},
{
'$pull' : {
'keys' : {
'idx' : 12
}
}
}
);
I hope you will be able to translate this into PHP arrays, since it looks to me like you are using PHP for querying.
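To move existing data to the array form, the keyed object can be converted client-side before writing it back. This is a sketch in plain JavaScript; toKeyArray and pullByIdx are hypothetical helpers, and pullByIdx only mimics in memory what the $pull above does on the server.

```javascript
// Hypothetical helper: turns { x1: {...}, x2: {...} } into
// [ { type: 'x1', ... }, { type: 'x2', ... } ].
function toKeyArray(keys) {
  return Object.entries(keys).map(([type, value]) => ({ type, ...value }));
}

// Hypothetical helper: in-memory stand-in for { $pull: { keys: { idx: idx } } }.
function pullByIdx(keyArray, idx) {
  return keyArray.filter((k) => k.idx !== idx);
}

// Sample data from the question.
const keys = {
  x1: { idx: 12, text: '1123145' },
  x2: { idx: 14, text: '1123142' },
  x3: { idx: 12, text: '1123145' },
  x4: { idx: 14, text: '1123145' }
};

const asArray = toKeyArray(keys);
const remaining = pullByIdx(asArray, 12);
// Logs the types of the entries left after removing idx 12: x2 and x4.
console.log(remaining.map((k) => k.type));
```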