How to select subdocuments with MongoDB

I have a collection with a subdocument tags, like:
Collection News:
  title (string)
  tags: [tag1, tag2, ...]
I want to select all the tags that start with a pattern, but return only the matching tags.
I already use a regex, but it returns all the news documents that contain a matching tag. Here is the query:
db.news.find( {"tags": /^proga/i}, ["tags"] ).sort( {"tags": 1} ).limit( 0 ).skip( 0 )
My question is: how can I retrieve only the tags that match the pattern?
(The final goal is to make an autocomplete field)
I also tried using distinct, but I didn't find a way to combine distinct with a find; it always returns all the tags :(
Thanks for your time

A bit late to the party, but hopefully this will help others who are hunting for a solution. I've found a way to do this using the aggregation framework, combining $project and $unwind with $match by chaining them together. I've done it using PHP, but you should get the gist:
$ops = array(
    array('$match' => array(
        'collectionColumn' => 'value',
    )),
    array('$project' => array(
        'subCollection' => 1
    )),
    array('$unwind' => '$subCollection'),
    array('$match' => array(
        'subCollection.subColumn' => 'subColumnValue'
    ))
);
The first $match and $project are just used to filter things out and make it faster; then the $unwind on the subcollection spits out each subcollection item one by one, so the items can be filtered by the final $match.
Hope that helps.
UPDATE (from Ryan Wheale):
You can then $group the data back into its original structure. It's like having an $elemMatch which returns more than one subdocument:
array('$group' => array(
    '_id' => '$_id',
    'subCollection' => array(
        '$push' => '$subCollection'
    )
));
I translated this from Node to PHP, so I haven't tested in PHP. If anybody wants the Node version, leave a comment below and I will oblige.
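Applied to the news/tags question at the top, the whole pipeline might look roughly like the sketch below with the legacy PHP driver (untested; the proga prefix comes from the question, and the second $match is needed because $unwind emits one document per tag, including tags that do not start with the pattern):
$ops = array(
    array('$match'   => array('tags' => new MongoRegex('/^proga/i'))),  // only news that have a matching tag
    array('$project' => array('tags' => 1)),                            // keep just the tags field
    array('$unwind'  => '$tags'),                                       // one document per tag
    array('$match'   => array('tags' => new MongoRegex('/^proga/i'))),  // drop the non-matching tags
    array('$group'   => array(                                          // regroup per news document
        '_id'  => '$_id',
        'tags' => array('$push' => '$tags')
    ))
);
$result = $db->news->aggregate($ops);
For an autocomplete list you could instead group on the tag itself ('_id' => '$tags') so that each distinct matching tag appears only once.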

Embedded documents are not collections. Look at your query: db.news.find will return documents from the news collection. tags is not a collection, and cannot be filtered.
There is a feature request for this "virtual collection feature" (SERVER-142), but don't expect to see this too soon, because it's "planned but not scheduled".
You can do the filtering client-side, or move the tags to a separate collection. If you retrieve only a subset of fields (just the tags field), this should be reasonably fast.
Hint: your regex uses the /i flag, which makes it impossible to use an index. Your db strings should be case-normalized (e.g. all upper case).
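As a rough, untested sketch of the client-side option with the legacy PHP driver (it assumes the tags are stored case-normalized, as suggested above, so the anchored regex can use an index):
$prefix = strtoupper('proga');
$cursor = $db->news->find(
    array('tags' => new MongoRegex('/^' . preg_quote($prefix, '/') . '/')), // anchored and case-sensitive
    array('tags' => 1)                                                      // fetch only the tags field
);
$matching = array();
foreach ($cursor as $doc) {
    foreach ($doc['tags'] as $tag) {
        if (strpos($tag, $prefix) === 0) { // keep only tags that actually start with the prefix
            $matching[$tag] = true;        // deduplicate for the autocomplete list
        }
    }
}
$tags = array_keys($matching);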


Perl module for Elasticsearch Percolator

I'm trying to use the Elasticsearch Percolator with Perl, and I have found this cool module.
The Percolation methods are listed here
As far as I can tell they're just read methods, so it is only possible to read the queries index and see if a query already exists, count the queries matched, etc.
Unless I'm missing something, it is not possible to add queries via the Percolator interface, so what I did is use the normal method to create a document in the .percolator type as follows:
my $e = Search::Elasticsearch->new( nodes => 'localhost:9200' );
$e->create(
    index => 'my_index',
    type  => '.percolator',
    id    => $max_idx,
    body  => {
        query => {
            match => {
                ...whatever the query is....
            },
        },
    },
);
Is that the best way of adding a query to the percolator index via the Perl module?
Thanks!
As per DrTech's answer, the code I posted looks to be the correct way of doing it.

Get one most recent comment for each post in mongodb

I have two collections in mongodb, one for posts and one for comments. What would be the best approach to get one most recent comment for each post? I'm looking for a similar solution but for mongodb: http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/
You should be able to do this with the aggregation framework by combining $group with $max.
I would like to give you an exact solution, but I can't do so unless you give an example of your data.
By the way: the proper way to structure this data in MongoDB would be to put the comments into a sub-array of the posts.
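For illustration only, here is a rough sketch of that idea with the PHP driver; post_id, created and body are assumed field names, since no sample data was posted. Sorting newest-first and taking $first inside $group is the same trick as $group with $max, except that it keeps the whole comment rather than just the timestamp:
$result = $db->comments->aggregate(array(
    array('$sort'  => array('created' => -1)),      // newest comments first
    array('$group' => array(
        '_id'     => '$post_id',                     // one result per post
        'created' => array('$first' => '$created'),  // fields of the newest comment
        'body'    => array('$first' => '$body')
    ))
));
// $result['result'] holds one document per post: its most recent comment.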
Just in case anyone else has a similar problem, I solved mine using map-reduce:
First I created a map function like this:
$map = "function() { emit(this.post_id, this); }";
and a reduce function:
$reduce = "function(k, vals) {".
"var newest = null;".
"for ( var i in vals ) {".
"if ( newest === null ) {".
"newest = vals[i];".
"}".
"else {".
"if ( vals[i]['_id'] > newest['_id'])".
"newest = vals[i]".
"}".
"}".
"return newest;".
"}";
and a new collection with the necessary data is ready after running:
$commentsAggregated = $db->command(array(
    "mapreduce" => "comments",
    "map"       => $map,
    "reduce"    => $reduce,
    "query"     => $query,
    "out"       => array("merge" => "commentsCollectionNew")
));
$getComments = $db->selectCollection($commentsAggregated['result'])->find();
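The newest comment for a single post can then be read back from the merged collection; a small, untested example ($postId here is whatever id the map function emitted as the key):
$newest = $db->commentsCollectionNew->findOne(array('_id' => $postId));
// $newest['value'] holds the comment document chosen by the reduce function.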

Working with nested documents/arrays in Lithium and MongoDB

I'm new to both MongoDB and Lithium and I can't really find the "good way" of working with nested documents. I noticed that when I try
$user = Users::find('first' ... );
$user->somenewfield = array('key' => 'val');
what I get for "somenewfield" is a Document object. But there is also a DocumentArray class - what is the difference between them?
When I call
$user->save();
this results in Mongo (as expected):
"somenewfield" : {
"key": "value"
}
OK, but when I later want to add a new key-value to the array and try
$user->somenewfield['newkey'] = 'newval';
var_dump($user->somenewfield->to('array')); // shows the old and the new key-value pairs
$user->save(); // does not work - the new pair is not added
What is the correct way of adding a new array to a document using Lithium? What is the correct way of updating the array or adding new values to it? Shall I always give a key for the array value?
Thanks for the help in advance. I'm kinda stuck ... reading the documentation, reading the code ... but at some points it gets difficult to find out everything alone :)
Edit:
What I found in the end was that the way to use nested arrays is with $push and $pull:
Users::update(
    array('$push' => array('games' => (string) $game->_id)),
    array(
        '_id'   => $this->user()->_id,
        'games' => array('$ne' => (string) $game->_id)
    ),
    array('atomic' => false)
);
I think there are some quirks in handling subdocuments; you can try:
$somenewfield = $user->somenewfield;
$somenewfield->newkey = $newvalue;
$user->somenewfield = $somenewfield;
$user->save();
Or the alternative syntax:
$user->{'somenewfield.newkey'} = $newvalue;
$user->save();
You should be able to find more examples in the tests (look in tests/data at any tests for Document).

Multiple document update issue in mongodb

Does MongoDB support updating multiple documents if the selector (first argument) matches more than one document inside a collection?
In the example below, the first call works fine, as it selects only a particular document and modifies its zip value.
In the second case, the collection $addresses has multiple documents with 'home' => 'canada', but it doesn't update anything.
Can someone please help me?
$addresses->update(array('_id' => new MongoId('4f69de380c211d6c21000001')),
    array('$set' => array('zip' => 20)));
$addresses->update(array('home' => 'canada'),
    array('$set' => array('zip' => 20)));
Edit:
Equivalent JavaScript shell command:
db.addresses.update({home: "canada"}, {$set: {zip: 20}})
It updates the zip value of the first encountered match; is this the expected behavior?
The console command updates at least one document, while PHP does nothing if the selector matches more than one document.
As long as the second statement matches anything, it will still update only a single document; that is the default behavior.
If you want it to apply to multiple documents, you should pass in the multiple flag, as documented here:
http://www.php.net/manual/en/mongocollection.update.php
In your case the query should read:
$addresses->update(array('home' => 'canada'),
    array('$set' => array('zip' => 20)),
    array('multiple' => true)
);

Does MongoDB's Map/Reduce sort work?

If the following is used
Analytic.collection.map_reduce(map, reduce,
:query => {:page => subclass_name},
:sort => [[:pageviews, Mongo::DESCENDING]]).find.to_a
it won't sort by pageviews. Alternatively, if it is an array of hashes:
Analytic.collection.map_reduce(map, reduce,
:query => {:page => subclass_name},
:sort => [{:pageviews => Mongo::DESCENDING}]).find.to_a
it won't work either. I think the reason it has to be an array is to specify the first field to sort by, and so on. I also tried just a flat array instead of an array of arrays, like in the first code listing above, and that didn't work either.
Is it not working? This is the spec: http://api.mongodb.org/ruby/current/Mongo/Collection.html#map_reduce-instance_method
What are you trying to do? Sort is really only useful in conjunction with limit: it's applied before the map so you can just MapReduce the latest 20 items or something. If you're trying to sort the results, you can just do a normal sort on the output collection.
OK, it is a little bit tricky:
After the map_reduce(), a Mongo::Collection object is returned, but the structure is like:
[{"_id":123.0,"value":{"pageviews":3621.0,"timeOnPage":206024.0}},
{"_id":1320.0,"value":{"pageviews":6584.0,"timeOnPage":373195.0}},
...
]
so to do the sort, it has to be:
Analytic.collection.map_reduce(map, reduce,
:query => {:page => subclass_name}).find({},
:sort => [['value.pageviews', Mongo::DESCENDING]])
Note the value.pageviews part.