Elasticsearch searching with perl client - perl

I'm attempting to do something that should be simple but I cannot get it to work. I've looked and search all over to find detailed doc for perl search::elsticsearch. I can only find CPAN doc and as far as search is concerned it is barely mentioned. I've search here and cannot find a duplicate question.
I have elasticsearch and filebeat. Filebeat is sending syslog to elasticsearch. I just want to search for messages with matching text and date range. I can find the messages but when I try to add date range the query fails. Here is the query from kibana dev tools.
GET _search
{
"query": {
"bool": {
"filter": [
{ "term": { "message": "metrics" }},
{ "range": { "timestamp": { "gte": "now-15m" }}}
]
}
}
}
I don't get exactly what I'm looking for but there isn't an error.
Here is my attempt with perl
my $results=$e->search(
body => {
query => {
bool => {
filter => {
term => { message => 'metrics' },
range => { timestamp => { 'gte' => 'now-15m' }}
}
}
}
}
);
This is the error.
[Request] ** [http://x.x.x.x:9200]-[400]
[parsing_exception]
[range] malformed query, expected [END_OBJECT] but found [FIELD_NAME],
with: {"col":69,"line":1}, called from sub Search::Elasticsearch::Role::Client::Direct::__ANON__
at ./elasticsearchTest.pl line 15.
With vars: {'body' => {'status' => 400,'error' => {
'root_cause' => [{'col' => 69,'reason' => '[range]
malformed query, expected [END_OBJECT] but found [FIELD_NAME]',
'type' => 'parsing_exception','line' => 1}],'col' => 69,
'reason' => '[range] malformed query, expected [END_OBJECT] but found [FIELD_NAME]',
'type' => 'parsing_exception','line' => 1}},'request' => {'serialize' => 'std',
'path' => '/_search','ignore' => [],'mime_type' => 'application/json',
'body' => {
'query' => {
'bool' =>
{'filter' => {'range' => {'timestamp' => {'gte' => 'now-15m'}},
'term' => {'message' => 'metrics'}}}}},
'qs' => {},'method' => 'GET'},'status_code' => 400}
Can someone help me figure out how to search with the search::elasticsearch perl module?

Multiple filter clauses must be passed as separate JSON objects within an array (like in your initial JSON query), not multiple filters in the same JSON object. This maps to how you must create the Perl data structure.
filter => [
{term => { message => 'metrics' }},
{range => { timestamp => { 'gte' => 'now-15m' }}}
]

Related

Parsing data elements and it's attributes from dataset into an array

I have a data set with different types of plans, the data below shows the activeplans and pastplans elements, (and other plans not included in the example).
I think the data can be represented as follows
$data->activeplans->activeplanloc[activeplan->{}, activeplan->{},
...activeplan->{}] $data->pastplans->pastplanloc[pastplan->{},
pastplan->{}, ...pastplan->{}]
Each plan element has number of attributes, for instance id, lat, long, numpersons (and other attributes not included in the example)
My goal is to loop through all the plan items and extract the attributes.
Also note, the ...planloc[] outer element and the lat/long fields it contains along with the empty ...plan[] - can be ignored.
This is the loop I tried to do it with, but I'm stuck on exacting the activeplan elements, can you help correct my syntax error, I don't now how to properly load the elements into an array given this data stucture?
foreach my $planArrayItem (#{$data->{"activeplans"}->{"activeplanloc"}->{"activeplan"}{}}) {
#...
if (exists $planArrayItem->{numpersons}) {
$tmp .= "<li>Number of personal: $projArrayItem->{numpersons}</li>";
}
#...
}
Oh, and this is the data set.
{ 'updatetime' => '3/24/2021 11:44:19 AM', 'pastplans' =>
{ 'pastplanloc' => [ { 'longitude' => '-29.51502', 'latitude' =>
'32.307558', 'pastplan' => { 'planclass' => 'A', 'longitude' =>
'-29.51502', 'id' => '211', 'latitude' => '32.307558',
'numlocations' => '15' } }, { 'longitude' => '-28.798305',
'latitude' => '32.656135', 'pastplan' => [ { 'id' => '214',
'longitude' => '-28.798305', 'latitude' => '32.656135',
'planclass' => 'E', 'numlocations' => '16' }, { 'longitude' =>
'-28.798305', 'id' => '215', 'latitude' => '32.656135', 'planclass'
=> 'C', 'numlocations' => '21' } ] } ] }, 'activeplans' =>
{ 'activeplanloc' => [ { 'latitude' => '33.132491', 'activeplan'
=> [ { 'planclass' => 'B', 'longitude' => '-25.304968', 'id' =>
'942', 'latitude' => '33.132491', 'numpersons' => '17' },
{ 'numpersons' => '21', 'planclass' => 'G', 'id' => '943',
'longitude' => '-25.304968', 'latitude' => '33.132491' } ],
'longitude' => '-25.304968' }, { 'latitude' => '33.097290',
'activeplan' => { 'numpersons' => '31', 'id' => '944',
'longitude' => '-25.295086', 'latitude' => '33.097290',
'planclass' => 'M' }, 'longitude' => '-25.295086' } ] } };
This is the XML format if there is a better way to format it while reading in perhaps?
<?xml version="1.0" encoding="utf-8"?>
<plans>
<updatetime>3/24/2021 11:44:19 AM</updatetime>
<pastplans>
<pastplanloc latitude="32.307558" longitude="-29.51502">
<pastplan planclass="A" id="211" numpersons="15" latitude="32.307558" longitude="-29.51502"/>
</pastplanloc>
<pastplanloc latitude="32.656135" longitude="-28.798305">
<pastplan planclass="E" id="214" numpersons="16" latitude="32.656135" longitude="-28.798305"/>
<pastplan planclass="C" id="215" numpersons="21" latitude="32.656135" longitude="-28.798305"/>
</pastplanloc>
</pastplans>
<activeplans>
<activeplanloc latitude="33.132491" longitude="-25.304968">
<activeplan planclass="B" id="942" numpersons="17" latitude="33.132491" longitude="-25.304968"/>
<activeplan planclass="G" id="943" numpersons="21" latitude="33.132491" longitude="-25.304968"/>
</activeplanloc>
<activeplanloc latitude="33.097290" longitude="-25.295086">
<activeplan planclass="M" id="944" numpersons="31" latitude="33.097290" longitude="-25.295086"/>
</activeplanloc>
</activeplans>
</plans>
I am quite certain that:
foreach my $planArrayItem (#{$data->{"activeplans"}->{"activeplanloc"}}) {
#...
if ($planArrayItem->{"activeplan"}{numpersons}) {
$tmp.= "<li>Number of personal: ".$planArrayItem->{"activeplan"}->{numpersons}."</li>";
}
}
is the code you are looking for. As you stated above "activeplanloc" contains an array which reference an activeplan. So the outer loop has to iterate over this.
With your new "data set", the correct code is:
foreach my $planArrayItem (#{$data->{"activeplans"}->{"activeplanloc"}}) {
#...
my $plans = ref $planArrayItem->{"activeplan"} eq "ARRAY" ?
$planArrayItem->{"activeplan"} : [$planArrayItem->{"activeplan"}];
foreach my $plan (#$plans) {
if ($plan->{numpersons}) {
$tmp .= "<li>Number of personal: ".$plan->{numpersons}."</li>";
}
}
}
See https://pastebin.com/QeD6ZwZz for a working example

Real time sync between mongodb and elastic search

Using Logstash-input-mongodb able to insert records at real time. But when it comes to update, that is not happening as expected. Can anyone guide me on this.
logstash-mongodb.conf
input {
mongodb {
uri => 'mongodb://127.0.0.1:27017/test-db'
placeholder_db_dir => '/opt/logstash-mongodb/'
placeholder_db_name => 'logstash_sqlite.db'
collection => 'mycol'
batch_size => 5000
generateId => true
}
}
filter{
mutate { remove_field => "_id" }
}
output {
elasticsearch {
hosts => [ "http://localhost:9200" ]
index => "test-index"
}
}

ElasticSearch (search_context_missing_exception) with Search::ElasticSearch::Scroll

I'm using Search::Elasticsearch and Search::Elasticsearch::Scroll for search and scroll into my elasticsearch server.
In scrolling process, for some querys, I'm seeing the next errors while I'm scrolling the search results:
2016/03/22 11:03:38 - 265885 FATAL: [Daemon.pm][8221]: Something gone wrong, error $VAR1 = bless( {
'msg' => '[Missing] ** [http://localhost:9200]-[404] Not Found, called from sub Search::Elasticsearch::Scroll::next at searcher.pl line 92. With vars: {\'body\' => {\'hits\' => {\'hits\' => [],\'max_score\' => \'0\',\'total\' => 5215},\'timed_out\' => bless( do{\\(my $o = 0)}, \'JSON::XS::Boolean\' ),\'_shards\' => {\'failures\' => [{\'index\' => undef,\'reason\' => {\'reason\' => \'No search context found for id [4920053]\',\'type\' => \'search_context_missing_exception\'},\'shard\' => -1},{\'index\' => undef,\'reason\' => {\'reason\' => \'No search context found for id [5051485]\',\'type\' => \'search_context_missing_exception\'},\'shard\' => -1},{\'index\' => undef,\'reason\' => {\'reason\' => \'No search context found for id [4920059]\',\'type\' => \'search_context_missing_exception\'},\'shard\' => -1},{\'index\' => undef,\'reason\' => {\'reason\' => \'No search context found for id [5051496]\',\'type\' => \'search_context_missing_exception\'},\'shard\' => -1},{\'index\' => undef,\'reason\' => {\'reason\' => \'No search context found for id [5051500]\',\'type\' => \'search_context_missing_exception\'},\'shard\' => -1}],\'failed\' => 5,\'successful\' => 0,\'total\' => 5},\'_scroll_id\' => \'c2NhbjswOzE7dG90YWxfaGl0czo1MjE1Ow==\',\'took\' => 2},\'request\' => {\'serialize\' => \'std\',\'path\' => \'/_search/scroll\',\'ignore\' => [],\'mime_type\' => \'application/json\',\'body\' => \'c2Nhbjs1OzQ5MjAwNTM6bHExbENzRDVReEc0OV9UMUgzd3Vkdzs1MDUxNDg1OnJrQ3lsUkRKVHRxRWRWeURoOTB4WVE7NDkyMDA1OTpscTFsQ3NENVF4RzQ5X1QxSDN3dWR3OzUwNTE0OTY6cmtDeWxSREpUdHFFZFZ5RGg5MHhZUTs1MDUxNTAwOnJrQ3lsUkRKVHRxRWRWeURoOTB4WVE7MTt0b3RhbF9oaXRzOjUyMTU7\',\'qs\' => {\'scroll\' => \'1m\'},\'method\' => \'GET\'},\'status_code\' => 404}
',
'stack' => [
[
'searcher.pl',
92,
'Search::Elasticsearch::Scroll::next'
]
],
'text' => '[http://localhost:9200]-[404] Not Found',
'vars' => {
'body' => {
'hits' => {
'hits' => [],
'max_score' => '0',
'total' => 5215
},
'timed_out' => bless( do{\(my $o = 0)}, 'JSON::XS::Boolean' ),
'_shards' => {
'failures' => [
{
'index' => undef,
'reason' => {
'reason' => 'No search context found for id [4920053]',
'type' => 'search_context_missing_exception'
},
'shard' => -1
},
{
'index' => undef,
'reason' => {
'reason' => 'No search context found for id [5051485]',
'type' => 'search_context_missing_exception'
},
'shard' => -1
},
{
'index' => undef,
'reason' => {
'reason' => 'No search context found for id [4920059]',
'type' => 'search_context_missing_exception'
},
'shard' => -1
},
{
'index' => undef,
'reason' => {
'reason' => 'No search context found for id [5051496]',
'type' => 'search_context_missing_exception'
},
'shard' => -1
},
{
'index' => undef,
'reason' => {
'reason' => 'No search context found for id [5051500]',
'type' => 'search_context_missing_exception'
},
'shard' => -1
}
],
'failed' => 5,
'successful' => 0,
'total' => 5
},
'_scroll_id' => 'c2NhbjswOzE7dG90YWxfaGl0czo1MjE1Ow==',
'took' => 2
},
'request' => {
'serialize' => 'std',
'path' => '/_search/scroll',
'ignore' => [],
'mime_type' => 'application/json',
'body' => 'c2Nhbjs1OzQ5MjAwNTM6bHExbENzRDVReEc0OV9UMUgzd3Vkdzs1MDUxNDg1OnJrQ3lsUkRKVHRxRWRWeURoOTB4WVE7NDkyMDA1OTpscTFsQ3NENVF4RzQ5X1QxSDN3dWR3OzUwNTE0OTY6cmtDeWxSREpUdHFFZFZ5RGg5MHhZUTs1MDUxNTAwOnJrQ3lsUkRKVHRxRWRWeURoOTB4WVE7MTt0b3RhbF9oaXRzOjUyMTU7',
'qs' => {
'scroll' => '1m'
},
'method' => 'GET'
},
'status_code' => 404
},
'type' => 'Missing'
}, 'Search::Elasticsearch::Error::Missing' );
The code I'm using is the next one (simplified) :
# Retrieve scroll
my $scroll = $self->getScrollBySignature($item);
# Retrieve all affected documents ids
while (my #docs = $scroll->next(500)) {
# Do stuff with #docs
}
The function getScrollBySignature have the next code in order to call to elasticSearch
my $scroll = $self->{ELASTIC}->scroll_helper(
index => $self->{INDEXES},
search_type => 'scan',
ignore_unavailable => 1,
body => {
size => $self->{PAGINATION},
query => {
filtered => {
filter => {
bool => {
must => [{term => {signature_id => $item->{profileId}}}, {terms => {channel_type_id => $type}}]
}
}
}
}
}
);
As you can see, I'm doing the scroll without passing scroll parameter then as documentation says, the time that scroll is alive is 1 min.
The elasticSearch is a cluster of 3 servers, and the query that ends with that error retrieves a bit more than 5000 docs.
My first solution was to update the life time for scroll to 5 minutes and the error didn't appear.
The question is, as I understand every time I'm calling $scroll->next() the life time off scroll affected is upgraded 1m more, then how is possible to receive those context related errors?
I'm doing something in a bad manner?
Thank you all.
The first thing that comes to mind is that the timer is not updated. Have you checked this? You can do a query every 10 seconds for example and see if at the 6th query it gives you the error ...
Well, a good rule of thumb is inside a ->next() block, don't stay by iteration more than time that you've configured in scroll.
Between each call of ->next() you cannot stay more than that time configured. If you stay more, the scroll may be not be there and the error earch_context_missing_exception will appear.
My solution for this problem was inside next block only store data into array/hash structure and once the scroll process ended work with all data.
The solution of the question example:
# Retrieve scroll
my $scroll = $self->getScrollBySignature($item);
# Retrieve all affected documents ids
my #allDocs;
while (my #docs = $scroll->next(500)) {
push #allDocs, map {$_->{_id}} #docs
}
foreach (#allDocs) {
# Do stuff with doc
}

Reading specific values in hash of hashes - Perl

I have an hash of hash map as below. Please note that the hash map is very huge which contains PluginsStatus as Success or Error. When PluginsStatus for a key is Success then I need not process anything (I have handled this scenario) but if its Error I need to to display in the order - PluginsStatus, PluginspatchLogName, PluginsLogFileName_0, PluginsLogFileLink_0, PluginsLogFileErrors_0 and so on.
Please note, I do not know exactly how many keys (in hash of a hash) i.e. PluginsLogFileName, PluginsLogFileLink, PluginsLogFileErrors exists i.e. it is dynamic.
$VAR1 = { 'Applying Template Changes' => {
'PluginsLogFileErrors_2' => 'No Errors',
'PluginsStatus' => 'Error',
'PluginsLogFileName_1' => 'Applying_Template_Changes_2015-05-12_02-57-40AM.log',
'PluginsLogFileName_2' => 'ApplyingTemplates.log',
'PluginsLogFileErrors_1' => 'ERROR: FAPSDKEX-00024 : Error in undeploying template.Cause : Unknown.Action : refer to log file for more details.',
'PluginspatchLogName' => '2015-05-11_08-14-28PM.log',
'PluginsLogFileLink_0' => '/tmp/xpath/2015-05-11_08-14-28PM.log',
'PluginsLogFileName_0' => '2015-05-11_08-14-28PM.log',
'PluginsLogFileErrors_0' => 'No Errors',
'PluginsLogFileLink_2' => 'configlogs/ApplyingTemplates.log',
'PluginsLogFileLink_1' => 'configlogs/Applying_Template_Changes_2015-05-12_02-57-40AM.log'
},
'Configuring Keystore Service' => {
'PluginsStatus' => 'Error',
'PluginsLogFileName_1' => 'Configuring_Keystore_Service_2015-05-11_11-11-37PM.log',
'PluginsLogFileErrors_1' => 'ERROR: Failed to query taxonomy attribute AllProductFamilyAndDomains.',
'PluginspatchLogName' => '2015-05-11_08-14-28PM.log',
'PluginsLogFileLink_0' => '/tmp/xpath/2015-05-11_08-14-28PM.log',
'PluginsLogFileName_0' => '2015-05-11_08-14-28PM.log',
'PluginsLogFileErrors_0' => 'No Errors',
'PluginsLogFileLink_1' => 'configlogs/Configuring_Keystore_Service_2015-05-11_11-11-37PM.log'
},
'Applying Main Configuration' => {
'PluginsStatus' => 'Error',
'PluginspatchLogName' => '2015-05-11_08-14-28PM.log',
'PluginsLogFileName_0' => 'Applying_Main_Configuration_2015-05-12_01-11-21AM.log',
'PluginsLogFileLink_0' => '/tmp/xpath/2015-05-11_08-14-28PM.log',
'PluginsLogFileErrors_0' => 'ERROR: Failed to query taxonomy attribute AllProductFamilyAndDomains.apps.ad.common.exception.ADException: Failed to query taxonomy attribute AllProductFamilyAndDomains.... 104 lines more'
}
};
Below is an output snippet I am looking for:
Plugin name is = Applying Template Changes
PluginsStatus = Error
PluginspatchLogName = 2015-05-11_08-14-28PM.log
PluginsLogFileName_0 = 2015-05-11_08-14-28PM.log
PluginsLogFileLink_0 = /tmp/xpath/2015-05-11_08-14-28PM.log
PluginsLogFileErrors_0 = No Errors
PluginsLogFileName_1 = Applying_Template_Changes_2015-05-12_02-57-40AM.log
PluginsLogFileLink_1 = configlogs/Applying_Template_Changes_2015-05-12_02- 57-40AM.log
PluginsLogFileErrors_1 = ERROR: FAPSDKEX-00024 : Error in undeploying template.Cause : Unknown.Action : refer to log file for more details.,
PluginsLogFileName_2 = ApplyingTemplates.log
PluginsLogFileLink_2 = configlogs/ApplyingTemplates.log
PluginsLogFileErrors_2 = No Errors`
Please let me know if someone could help me here ?
You have built a hash that is less than ideal for your purposes. You should create a LogFile hash element that has an array as its value. After that the process is trivial
{
"Applying Main Configuration" => {
LogFile => [
{
Errors => "ERROR: Failed to query taxonomy attribute AllProductFamilyAndDomains.apps.ad.common.exception.ADException: Failed to query taxonomy attribute AllProductFamilyAndDomains.... 104 lines more",
Link => "/tmp/xpath/2015-05-11_08-14-28PM.log",
Name => "Applying_Main_Configuration_2015-05-12_01-11-21AM.log",
},
],
patchLogName => "2015-05-11_08-14-28PM.log",
Status => "Error",
},
"Applying Template Changes" => {
LogFile => [
{
Errors => "No Errors",
Link => "/tmp/xpath/2015-05-11_08-14-28PM.log",
Name => "2015-05-11_08-14-28PM.log",
},
{
Errors => "ERROR: FAPSDKEX-00024 : Error in undeploying template.Cause : Unknown.Action : refer to log file for more details.",
Link => "configlogs/Applying_Template_Changes_2015-05-12_02-57-40AM.log",
Name => "Applying_Template_Changes_2015-05-12_02-57-40AM.log",
},
{
Errors => "No Errors",
Link => "configlogs/ApplyingTemplates.log",
Name => "ApplyingTemplates.log",
},
],
patchLogName => "2015-05-11_08-14-28PM.log",
Status => "Error",
},
"Configuring Keystore Service" => {
LogFile => [
{
Errors => "No Errors",
Link => "/tmp/xpath/2015-05-11_08-14-28PM.log",
Name => "2015-05-11_08-14-28PM.log",
},
{
Errors => "ERROR: Failed to query taxonomy attribute AllProductFamilyAndDomains.",
Link => "configlogs/Configuring_Keystore_Service_2015-05-11_11-11-37PM.log",
Name => "Configuring_Keystore_Service_2015-05-11_11-11-37PM.log",
},
],
patchLogName => "2015-05-11_08-14-28PM.log",
Status => "Error",
},
}
Just iterate over the keys of the hash. Use the $hash{key}{inner_key} syntax to get into the nested hash.
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
my %error = ( 'Applying Template Changes' => {
'PluginsLogFileErrors_2' => 'No Errors',
'PluginsStatus' => 'Error',
'PluginsLogFileName_1' => 'Applying_Template_Changes_2015-05-12_02-57-40AM.log',
# ...
'PluginsLogFileLink_0' => '/tmp/xpath/2015-05-11_08-14-28PM.log',
'PluginsLogFileErrors_0' => 'ERROR: Failed to query taxonomy attribute AllProductFamilyAndDomains.apps.ad.common.exception.ADException: Failed to query taxonomy attribute AllProductFamilyAndDomains.... 104 lines more',
},
);
for my $step (keys %error) {
print "Plugin name is = $step\n";
for my $detail (sort keys %{ $error{$step} }) {
print "$detail = $error{$step}{$detail}\n";
}
}

How do I set the HTTP User-Agent when using Search::Elasticsearch?

I'm using Search::Elasticsearch to query MetaCPAN.
my $es = Search::Elasticsearch->new(
cxn_pool => 'Static::NoPing',
nodes => 'api.metacpan.org:80',
);
my $scroller = $es->scroll_helper(
index => 'v0',
type => 'release',
search_type => 'scan',
scroll => '2m',
size => $size,
body => {
fields => [qw(author archive date)],
query => { range => { date => { gte => $date } } },
},
);
This works ok, but I'd like to set the HTTP User-Agent header to a custom value so my requests can be identified if there's a problem. How do I do that with Search::Elasticsearch?
You can pass arguments to the handle constructor using handle_args. So for the default HTTP::Tiny you would use agent:
my $es = Search::Elasticsearch->new(
cxn_pool => 'Static::NoPing',
nodes => 'api.metacpan.org:80',
handle_args => { agent => "youragent/0.1" },
);