Perl module for Elasticsearch Percolator - perl

I'm trying to use the Elasticsearch Percolator with Perl, and I have found this cool module.
The percolation methods are listed here.
As far as I can tell, they are all read methods: you can check whether a query already exists in the queries index, count the matching queries, and so on.
Unless I'm missing something, it is not possible to add queries via the Percolator interface, so I used the normal create method to store a document under the .percolator type, as follows:
    my $e = Search::Elasticsearch->new( nodes => 'localhost:9200' );
    $e->create(
        index => 'my_index',
        type  => '.percolator',
        id    => $max_idx,
        body  => {
            query => {
                match => {
                    ...whatever the query is....
                },
            },
        },
    );
Is that the best way to add a query to the percolator index via the Perl module?
Thanks!

As per DrTech's answer, the code I posted looks to be the correct way to do it.
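For anyone who wants to see what that call amounts to on the wire, here is a minimal Python sketch of the equivalent REST request, assuming an Elasticsearch 1.x/2.x cluster where percolator queries are stored under the .percolator type. The helper name percolator_request and the example match query are hypothetical, made up purely for illustration; no request is actually sent.

```python
import json


def percolator_request(index, query_id, query):
    """Build (method, path, body) for registering a percolator query.

    Assumes the Elasticsearch 1.x/2.x layout, where a percolator query is
    an ordinary document stored under the special '.percolator' type.
    """
    path = f"/{index}/.percolator/{query_id}"
    body = json.dumps({"query": query})
    return "PUT", path, body


# Hypothetical query, standing in for the elided one in the question.
example_query = {"match": {"message": "bonsai tree"}}

method, path, body = percolator_request("my_index", 1, example_query)
print(method, path)
print(body)
```

One design point to keep in mind: in Search::Elasticsearch, create fails if a document with the same index, type and id already exists, whereas index overwrites it, so index is the call to reach for if you ever need to replace a stored query.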

Related

Syncing the data from mongoDB to elasticsearch using Logstash

I am trying to sync a MongoDB database to Elasticsearch. I am using the logstash-input-mongodb and logstash-output-elasticsearch plugins.
The issue is that the MongoDB input plugin does not extract all the fields from documents inserted into MongoDB, so only a few fields end up in Elasticsearch. I also get the entire document as a log entry in the Elasticsearch index. I tried adjusting the filters in the Logstash config file to change what is sent to Elasticsearch, but could not make it work.
Any help or suggestions would be great.
Edit:
Mongo schema:
    A: {
        B: 'sometext',
        C: {G: 'someText', H: 'some text'}
    },
    D: [
        {E: 'sometext', F: 'sometext'},
        {E: 'sometext', F: 'sometext'},
        {E: 'sometext', F: 'sometext'}
    ]
Logstash configuration:
    input {
        mongodb {
            uri => 'mongodb://localhost:27017/testDB'
            placeholder_db_dir => '/opt/logstash-mongodb/'
            placeholder_db_name => 'logstash_sqlite.db'
            collection => 'testCOllection'
            batch_size => 1000
        }
    }
    output {
        stdout {
            codec => rubydebug
        }
        elasticsearch {
            action => "index"
            index => "testdb_testColl"
            hosts => ["localhost:9200"]
        }
    }
Output to Elasticsearch:
    {
        //some metadata
        A_B: 'sometext',
        A_C_G: 'someText',
        A_C_H: 'some text',
        log_entry: 'contains complete document inserted to mongoDB'
    }
We are not getting property D of the Mongo collection in Elasticsearch.
I hope this explains the problem in more detail.
Because your configuration looked good to me, I checked the issues of the phutchins/logstash-input-mongodb repo and found this one: "array not stored to elasticsearch", which pretty much describes your problem. It is still an open issue, but you might want to try the workaround suggested by ivancruzbht. That workaround uses the Logstash ruby filter to parse the log_entry field, which, as you confirmed, contains all the fields, including D.

Perl mongo->collection count using '$in' operator

I was wondering whether it is possible to use the "in" operator, as you can from the mongo shell, with the Perl MongoDB::Collection module. I have tried a number of things, but haven't quite got the result I am expecting. I've checked the docs and other posts on Stack Overflow but can't seem to find anything specifically about this, unless I am overlooking something.
http://docs.mongodb.org/manual/reference/operator/query/in/
The count query I am running via the mongo shell is
mongo:PRIMARY> db.getCollection("Results").count( { TestClass : "TestClass", TestMethod : { $in: ["method1" , "method2", "method3"] } })
181605
I have tried this a few different ways: passing the list as an array, as hash refs, or as a pre-built string...
    my $count = $mongo->{collection}->count({
        'TimeStamp'  => { '$gt' => $ft, '$lt' => $tt },
        'TestClass'  => $TestClass,
        'TestMethod' => { '$in' => [$whitelist->methods] },
        'Result'     => $result
    });
where dumping $whitelist->methods gives:
    $VAR1 = {
        'method1' => 1,
        'method2' => 1,
        'method3' => 1
    };
I've looked high and low for an answer. Does anyone know if the driver is currently capable of using the $in operator like this? The alternative, looping through the methods returned from a previous query and adding up the counts, would require more code.
The only other Stack Overflow post I have seen about the $in operator is "$in mongoDB operator with _id in perl", which recommends using http://api.mongodb.org/perl/current/MongoDB/OID.html, but I don't think that is relevant to my example, as it looks to be more about IDs.
Any help or discussion would be greatly appreciated.
The problem is that the $in clause expects an array reference as its value, but you are supplying a hashref (as Dumper's output shows): [$whitelist->methods] is an array containing that single hashref. The easiest way to turn the hash into a list of method names is to apply the keys function:
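    # ...
    'TestMethod' => { '$in' => [keys %{$whitelist->methods}] }
... or just [keys $whitelist->methods] if you're using Perl 5.14+, since, as the documentation puts it:
    starting with Perl 5.14, keys can take a scalar EXPR, which must contain a reference to an unblessed hash or array
Note that this form was experimental and was removed again in Perl 5.24, so the explicit %{...} dereference is the portable choice.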
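The same shape problem is easy to see in a language-neutral way. Below is a small Python sketch (not Perl, and not using any MongoDB driver) that builds the filter document from a mapping like the one Dumper printed. The function name build_count_filter is hypothetical; the point is only that the value under $in must be a flat list of the mapping's keys, not the mapping itself.

```python
# Stand-in for what $whitelist->methods returns: a mapping keyed by method name.
whitelist_methods = {"method1": 1, "method2": 1, "method3": 1}


def build_count_filter(test_class, methods):
    """Build a MongoDB-style filter whose $in value is a flat list of keys."""
    return {
        "TestClass": test_class,
        # sorted() iterates over the mapping's keys and returns a list,
        # which is the shape $in expects.
        "TestMethod": {"$in": sorted(methods)},
    }


query = build_count_filter("TestClass", whitelist_methods)
print(query)
```

Sorting is not required by MongoDB; it is only there to make the output deterministic.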

$in mongoDB operator with _id in perl

I'm trying to execute this query with Perl against a MongoDB database:
    $db->$collection->find({"_id" : { "$in" : ["4f520122ecf6171327000137", "4f4f49c09d1bd90728000034"]}});
But it returns nothing, when it should return two documents. What is wrong with this query?
Thank you.
Edit: This doesn't work either:
    $db->$collection->find( {_id => "4f520122ecf6171327000137"} );
First, make sure you're using the correct syntax. Your first example is not valid Perl code, since you're passing a chunk of JSON as the query parameter.
Second, assuming these ID values are MongoDB ObjectIds, you'll need to create OID objects in order to differentiate them from ordinary strings. Also make sure to use single quotes ('') around $in; otherwise Perl will try to interpolate $in as a variable (which presumably has nothing in it).
So I assume you want to do something like this:
    $db->$collection->find( {
        "_id" => {
            '$in' => [
                MongoDB::OID->new( value => "4f520122ecf6171327000137" ),
                MongoDB::OID->new( value => "4f4f49c09d1bd90728000034" )
            ]
        }
    } );
Edit: Additionally, using autoloaded method names to retrieve collections has been deprecated for a while. You're better off using $db->get_collection( "collection name" )->find( ... ).

Mongodb findAndModify embedded document - how do you know which one you've modified?

findAndModify in mongodb is great, but I am having a little trouble knowing which embedded document I modified.
Here is an example where a Post embeds_many Comments. (I'm using Mongoid ORM but the question is generic to any MongoDB setup).
    begin
      p = Post.asc(:id).where(comments: { '$elemMatch' => { reserved: false } }).find_and_modify({ '$set' => { 'comments.$.reserved' => true } }, { new: true })
      # now I need to find which comment I just reserved
      c = p.comments.select { |comment| comment.reserved }.first
      ...
    ensure
      c.update_attribute :reserved, false
    end
Ok this sort of works, but if I have multiple processes running this simultaneously my select could choose a comment that another process had reserved (race condition).
This is the closest I have for now (reserving by process id):
    begin
      p = Post.asc(:id).where(comments: { '$elemMatch' => { reserved: nil } }).find_and_modify({ '$set' => { 'comments.$.reserved' => Process.pid } }, { new: true })
      # now I need to find which comment I just reserved
      c = p.comments.select { |comment| comment.reserved == Process.pid }.first
      ...
    ensure
      c.update_attribute :reserved, nil
    end
This seems to work. Is this the best way to do it, or is there a better pattern?
I was able to solve it by generating a SecureRandom.hex value and setting it on the embedded document with find_and_modify. You can then loop through the embedded documents and see which one carries your hex value, which tells you which one you are working with.
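To make the token idea concrete, here is a rough in-memory Python sketch of the pattern. It uses plain dicts instead of MongoDB or Mongoid, and the function names reserve_first_free and find_by_token are hypothetical. In a real setup the reservation step would be the atomic find_and_modify call; the loop here only imitates its effect and is not itself safe across processes.

```python
import secrets


def reserve_first_free(comments):
    """Tag the first unreserved comment with a fresh random token.

    Stands in for the atomic find_and_modify; returns (token, comment),
    or (None, None) if every comment is already reserved.
    """
    token = secrets.token_hex(16)
    for comment in comments:
        if comment["reserved"] is None:
            comment["reserved"] = token
            return token, comment
    return None, None


def find_by_token(comments, token):
    """Look up the comment carrying our token, as you would after the update."""
    return next((c for c in comments if c["reserved"] == token), None)


post = {"comments": [
    {"text": "first", "reserved": "held-by-another-worker"},
    {"text": "second", "reserved": None},
    {"text": "third", "reserved": None},
]}

token, reserved = reserve_first_free(post["comments"])
mine = find_by_token(post["comments"], token)
print(mine["text"])  # second
```

Compared with Process.pid, a random token stays unique across machines and across threads that share a pid, which is presumably why it is the more robust marker.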

Multiple document update issue in mongodb

Does MongoDB support updating multiple documents when the selector (first argument) matches more than one document in a collection?
In the example below, the first call works fine, as it selects one particular document and modifies its zip value.
In the second case, the $addresses collection has multiple documents with 'home' => 'canada', and nothing is updated.
Can someone please help me?
    $addresses->update(array('_id' => new MongoId('4f69de380c211d6c21000001')),
                       array('$set' => array('zip' => 20)));
    $addresses->update(array('home' => 'canada'),
                       array('$set' => array('zip' => 20)));
Edit:
The equivalent JavaScript command
    db.addresses.update({home: "canada"}, {$set: {zip: 20}})
updates the zip value of the first matching document only. Is this the expected behavior?
The console command updates at least one document, but PHP does nothing if the selector matches more than one document.
As long as the second statement matches anything, it should always update a single document.
If you want it to apply to multiple documents, you should pass in the multiple flag, as documented here:
http://www.php.net/manual/en/mongocollection.update.php
In your case the query should read:
    $addresses->update(array('home' => 'canada'),
                       array('$set' => array('zip' => 20)),
                       array('multiple' => true)
    );
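If the difference between the two modes is still unclear, the following Python sketch imitates it on a plain list of dicts. It does not talk to MongoDB; the update function is a hypothetical stand-in that only mirrors the documented behavior of stopping after the first match unless the multiple flag is set.

```python
import copy


def update(collection, selector, changes, multiple=False):
    """Apply $set-style changes to matching documents; return how many changed.

    Mimics the default single-document update: without multiple=True,
    only the first match is modified.
    """
    modified = 0
    for doc in collection:
        if all(doc.get(key) == value for key, value in selector.items()):
            doc.update(changes)
            modified += 1
            if not multiple:
                break
    return modified


addresses = [
    {"home": "canada", "zip": 1},
    {"home": "canada", "zip": 2},
    {"home": "usa", "zip": 3},
]

single = copy.deepcopy(addresses)
n_single = update(single, {"home": "canada"}, {"zip": 20})

multi = copy.deepcopy(addresses)
n_multi = update(multi, {"home": "canada"}, {"zip": 20}, multiple=True)

print(n_single, [d["zip"] for d in single])  # 1 [20, 2, 3]
print(n_multi, [d["zip"] for d in multi])    # 2 [20, 20, 3]
```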