Logstash translate plugin problem parsing CSV

I'm trying to parse all the columns of a CSV file (except the first one, obviously).
The plugin only gets the second column as the result of the filter.
All the other columns are ignored.
It should be possible, according to this sentence from the documentation:
When using a CSV dictionary, multiple values in the translation must
be extracted with another filter e.g. Dissect or KV. Note that the
fallback is a string, so on no match the fallback setting needs to be formatted so that a filter can extract the multiple values to the correct fields.
Here is my Logstash code:
translate {
field => "idBatch"
dictionary_path => "D:\idBatch-description.csv"
refresh_interval => 500
destination => "donneesDictionnaireExterne"
# Default data when there is no match
fallback => "Aucune correspondance trouvée,10000"
add_tag => [ "import_CSV_ok"]
}
# Map the data from the external dictionary
dissect {
mapping => {
"donneesDictionnaireExterne" => "%{descriptionBatch},%{maxDuration}"
# EXAMPLE for GAR01B0: Batch d'injection Archive;86408
}
}
Here is a sample of my CSV file:
"GDA08A0_SupPdc","Batch de Suppression de PDC","9999"
"GDI01A0_Parsing","Moteur de parsing des etats internes","9999"
Does anyone know why it doesn't work?

The translate filter will ignore everything after the second column, so you will need to change the format of your dictionary.
Your dictionary needs to look something like this:
"GDA08A0_SupPdc","Batch de Suppression de PDC;9999"
"GDI01A0_Parsing","Moteur de parsing des etats internes;9999"
Then your dissect filter will be like this one:
dissect {
mapping => {
"donneesDictionnaireExterne" => "%{descriptionBatch};%{maxDuration}"
}
}
You can also use a mutate filter to remove the donneesDictionnaireExterne field.
mutate {
remove_field => ["donneesDictionnaireExterne"]
}
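Putting it all together, a sketch of the complete filter section could look like this (it reuses the settings from your original translate block; note that the fallback value then also needs a semicolon instead of a comma so the dissect mapping can still split it on no match, as the documentation excerpt you quoted points out):
filter {
  translate {
    field => "idBatch"
    dictionary_path => "D:\idBatch-description.csv"
    refresh_interval => 500
    destination => "donneesDictionnaireExterne"
    # Fallback formatted like the new dictionary values so dissect can split it
    fallback => "Aucune correspondance trouvée;10000"
    add_tag => [ "import_CSV_ok" ]
  }
  # Split the looked-up value into its two fields
  dissect {
    mapping => {
      "donneesDictionnaireExterne" => "%{descriptionBatch};%{maxDuration}"
    }
  }
  # Optionally drop the intermediate field
  mutate {
    remove_field => ["donneesDictionnaireExterne"]
  }
}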
Finally the output for your example is:
{
  "descriptionBatch" => "Batch de Suppression de PDC",
  "maxDuration" => "9999",
  "@version" => "1",
  "@timestamp" => 2019-04-02T02:10:45.107Z,
  "idBatch" => "GDA08A0_SupPdc",
  "message" => "{ \"idBatch\":\"GDA08A0_SupPdc\"}",
  "tags" => [
    [0] "import_CSV_ok"
  ],
  "host" => "hostname"
}
{
  "descriptionBatch" => "Moteur de parsing des etats internes",
  "maxDuration" => "9999",
  "@version" => "1",
  "@timestamp" => 2019-04-02T02:10:45.109Z,
  "idBatch" => "GDI01A0_Parsing",
  "message" => "{ \"idBatch\":\"GDI01A0_Parsing\"}",
  "tags" => [
    [0] "import_CSV_ok"
  ],
  "host" => "hostname"
}

Related

Elasticsearch searching with Perl client

I'm attempting to do something that should be simple, but I cannot get it to work. I've looked and searched all over for detailed documentation on the Perl Search::Elasticsearch module; I can only find the CPAN docs, and as far as search is concerned it is barely mentioned. I've searched here and cannot find a duplicate question.
I have Elasticsearch and Filebeat. Filebeat is sending syslog to Elasticsearch. I just want to search for messages with matching text and a date range. I can find the messages, but when I try to add the date range the query fails. Here is the query from the Kibana dev tools.
GET _search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "message": "metrics" }},
        { "range": { "timestamp": { "gte": "now-15m" }}}
      ]
    }
  }
}
I don't get exactly what I'm looking for, but there isn't an error.
Here is my attempt with Perl:
my $results = $e->search(
    body => {
        query => {
            bool => {
                filter => {
                    term => { message => 'metrics' },
                    range => { timestamp => { 'gte' => 'now-15m' }}
                }
            }
        }
    }
);
This is the error.
[Request] ** [http://x.x.x.x:9200]-[400]
[parsing_exception]
[range] malformed query, expected [END_OBJECT] but found [FIELD_NAME],
with: {"col":69,"line":1}, called from sub Search::Elasticsearch::Role::Client::Direct::__ANON__
at ./elasticsearchTest.pl line 15.
With vars: {'body' => {'status' => 400,'error' => {
'root_cause' => [{'col' => 69,'reason' => '[range]
malformed query, expected [END_OBJECT] but found [FIELD_NAME]',
'type' => 'parsing_exception','line' => 1}],'col' => 69,
'reason' => '[range] malformed query, expected [END_OBJECT] but found [FIELD_NAME]',
'type' => 'parsing_exception','line' => 1}},'request' => {'serialize' => 'std',
'path' => '/_search','ignore' => [],'mime_type' => 'application/json',
'body' => {
'query' => {
'bool' =>
{'filter' => {'range' => {'timestamp' => {'gte' => 'now-15m'}},
'term' => {'message' => 'metrics'}}}}},
'qs' => {},'method' => 'GET'},'status_code' => 400}
Can someone help me figure out how to search with the Search::Elasticsearch Perl module?
Multiple filter clauses must be passed as separate JSON objects within an array (as in your initial JSON query), not as multiple filters in the same JSON object. This maps directly onto how you must build the Perl data structure.
filter => [
{term => { message => 'metrics' }},
{range => { timestamp => { 'gte' => 'now-15m' }}}
]
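Putting that back into your original call, a sketch of the corrected request would be (same client object and field names as in your attempt):
my $results = $e->search(
    body => {
        query => {
            bool => {
                filter => [
                    { term  => { message   => 'metrics' } },
                    { range => { timestamp => { 'gte' => 'now-15m' } } }
                ]
            }
        }
    }
);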

Unable to ingest the syslog-logstash.conf for remove & replace functions

I am just a newbie to the ELK stack and am trying some testing on it. I'm able to run some tests, but when I try a filter with grok & mutate to remove & replace some fields from my syslog output, I get the error below.
21:58:47.976 [LogStash::Runner] ERROR logstash.agent - Cannot create pipeline {:reason=>"Expected one of #, {, ,, ] at line 21, column 9 (byte 496) after filter {\n if [type] == \"syslog\" {\n grok {\n match => { \"message\" => \"%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:hostname} %{DATA:program}(?:\\[%{POSINT:pid}\\])?: %{GREEDYDATA:syslog_message}\" }\n }\n date {\n match => [ \"syslog_timestamp\", \"MMM d HH:mm:ss\", \"MMM dd HH:mm:ss\" ]\n }\n mutate {\n remove_field => [\n \"message\",\n \"pid\",\n \"port\"\n "}
Below is my config file:
# cat logstash-syslog2.conf
input {
file {
path => [ "/scratch/rsyslog/*/messages.log" ]
type => "syslog"
}
}
filter {
if [type] == "syslog" {
grok {
match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:hostname} %{DATA:program}(?:\[%{POSINT:pid}\])?: %{GREEDYDATA:syslog_message}" }
}
date {
match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
}
mutate {
remove_field => [
"message",
"pid",
"port"
"_grokparsefailure"
]
}
mutate {
replace => [
"#source_host", "%{allLogs_hostname}"
"#message", "%{allLogs_message}"
]
}
mutate {
remove => [
"allLogs_hostname",
"syslog_message",
"syslog_timestamp"
]
}
}
output {
if [type] == "syslog" {
elasticsearch {
hosts => "localhost:9200"
index => "%{type}-%{+YYYY.MM.dd}"
}
}
}
Please suggest what I'm doing wrong and help me understand the remove & replace functions for Logstash.
PS: my ELK version is 5.4
The config you posted has a lot of syntax errors; Logstash has its own config language and expects the config file to follow its rules.
This link has the complete Logstash config language reference.
I made some corrections to your config file and posted it here. I have added my comments and an explanation of what was wrong in the config file itself.
input
{
file
{
path => [ "/scratch/rsyslog/*/messages.log" ]
type => "syslog"
}
}
filter
{
if [type] == "syslog"
{
grok
{
match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:hostname} %{DATA:program}(?:\[%{POSINT:pid}\])?: %{GREEDYDATA:syslog_message}" }
}
date
{
match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
}
# Have merged it with the remove_field option below
#mutate {
# remove_field => [
# "message",
# "pid",
# "port",
# "_grokparsefailure"
# ]
#}
mutate
{
# The replace option only accepts a hash as its value type, with the syntax shown below
# For more details visit the link below
# https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html#plugins-filters-mutate-replace
replace => {
"#source_host" => "%{allLogs_hostname}"
"#message" => "%{allLogs_message}"
}
}
mutate
{
# Mutate does not have a remove option; I guess your intention is to remove the event fields,
# hence I have used the remove_field option here
# The remove_field option only accepts an array as its value type, as shown below
# For details, read the link below
# https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html#plugins-filters-mutate-remove_field
remove_field => [
"message",
"pid",
"port",
"_grokparsefailure",
"allLogs_hostname",
"syslog_message",
"syslog_timestamp"
]
}
}
}
output
{
if [type] == "syslog"
{
elasticsearch
{
# The hosts option only takes a uri as its value type; originally you provided a plain string as its value
# For more info please read the link below
# https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html#plugins-outputs-elasticsearch-hosts
hosts => ["localhost:9200"]
index => "%{type}-%{+YYYY.MM.dd}"
}
}
}
You can test whether the config file is syntactically correct by using the Logstash command line option -t; this option will test the config file and report whether it is syntactically correct.
bin/logstash -f 'path-to-your-config-file' -t
Please let me know if anything needs clarification.
You have to add a comma after "port" in your logstash configuration file.
mutate {
remove_field => [
"message",
"pid",
"port",
"_grokparsefailure"
]
}

Logstash - Custom Timestamp Error

I am trying to input a timestamp field in Logstash and I am getting a dateparsefailure message.
My message:
2014-08-01;11:00:22.123
Pipeline file
input {
stdin{}
#beats {
# port => "5043"
# }
}
# optional.
filter {
date {
locale => "en"
match => ["message", "YYYY-MM-dd;HH:mm:ss.SSS"]
target => "#timestamp"
add_field => { "debug" => "timestampMatched"}
}
}
output {
elasticsearch {
hosts => [ "127.0.0.1:9200" ]
}
stdout { codec => rubydebug }
}
Can someone tell me what I am missing?
Update 1
I referred to the link - How to remove trailing newline from message field - and now it works.
But in my log message I have multiple values other than the timestamp:
<B 2014-08-01;11:00:22.123 Field1=Value1 Field2=Value2
When I give this as input, it does not work. How do I read a part of the log and use it as the timestamp?
Update 2
It works now.
I changed the config file as below:
filter {
kv
{
}
mutate {
strip => "message"
}
date {
locale => "en"
match => ["timestamp1", "YYYY-MM-dd;HH:mm:ss.SSS"]
target => "#timestamp"
add_field => { "debug" => "timestampMatched"}
}
}
I am posting the answer below, along with the steps I used to solve the issue, so that I can help people like me.
Step 1 - I read the message in the form of key and value pairs.
Step 2 - I trimmed off the extra space that led to the parse exception.
Step 3 - I read the timestamp value and the other fields into their respective fields.
input {
beats {
port => "5043"
}
}
# optional.
filter {
kv { }
date {
match => [ "timestamp", "yyyy-MM-dd HH:mm:ss,SSS" ]
remove_field => [ "timestamp" ]
}
}
output {
elasticsearch {
hosts => [ "127.0.0.1:9200" ]
}
}

Logstash date ISO8601 convert

I want to convert the Logstash date type to this type, 2015-12-03 03:01:00, and compute [message][3] - [message][1].
The date match doesn't work; how can I do this?
Also, is the %{[message][0]} expression right or not?
filter {
multiline {
.............
}
grok {
match => { "message" => "%{GREEDYDATA:message}" }
overwrite => ["message"]
}
mutate {
gsub => ["message", "\n", " "]
split => ["message", " "]
}
date {
match => [ "%{[message][0]}","ISO8601"} ]
}
}
The message output looks like this:
"message" => [
[0] "2015-12-03T01:33:22+00:00"
[1]
[2]
[3] "2015-12-03T01:33:24+00:00"
]
Assuming your input is:
2015-12-03T01:33:22+00:00\n\n2015-12-03T01:33:24+00:00
You can grok that without split:
match => { "message" => "%{TIMESTAMP_ISO8601:string1}\\n\\n%{TIMESTAMP_ISO8601:string2}" }
You can then use the date{} filter with string1 or string2 as input.
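For example, a minimal sketch of that follow-up date filter (string1 is the field captured by the grok pattern above; target is shown only for clarity, since @timestamp is already the default):
date {
  match  => [ "string1", "ISO8601" ]
  target => "@timestamp"
}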

Why am I getting a "Odd number of elements in anonymous hash" warning in Perl?

Help, I'm trying to create a new post in my WordPress blog with custom fields using the following Perl script via the metaWeblog API over XML-RPC, but there seems to be an issue with the custom fields. Only the second custom field (width) ever seems to get posted; I can't get the "height" to publish properly. When I add another field, I get the "Odd number of elements in anonymous hash" warning. This has got to be something simple; would someone kindly sanity-check my syntax? Thanks.
#!/usr/bin/perl -w
use strict;
use RPC::XML::Client;
use Data::Dumper;
my $cli=RPC::XML::Client->new('http://www.sitename.com/wp/xmlrpc.php');
my $appkey="perl"; # doesn't matter
my $blogid=1; # doesn't matter (except blogfarm)
my $username="Jim";
my $passwd='_____';
my $text=<<'END';
This is the post content...
You can also include html tags...
See you!
END
my $publish=0; # set to 1 to publish, 0 to put post in drafts
my $resp = $cli->send_request('metaWeblog.newPost',
    $blogid,
    $username,
    $passwd,
    {
        'title' => "this is doodoo",
        'description' => $text,
        'custom_fields' => {
            { "key" => "height", "value" => 500 },
            { "key" => "width", "value" => 750 }
        },
    },
    $publish);
exit 0;
While technically valid syntax, it's not doing what you think.
'custom_fields' => {
{ "key" => "height", "value" => 500 },
{ "key" => "width", "value" => 750 }
},
is roughly equivalent to something like:
'custom_fields' => {
'HASH(0x881a168)' => { "key" => "width", "value" => 750 }
},
which is certainly not what you want. (The 0x881a168 part will vary; it's actually the address where the hashref is stored.)
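As an illustration (not from the original post), the warning itself shows up as soon as the braces contain an odd number of elements, which is exactly what happens when a third custom field such as a hypothetical "depth" is added:
use strict;
use warnings;

# Three hashrefs inside {...}: the first is stringified into a key, the second
# becomes its value, and the third is left without a partner, so Perl warns
# "Odd number of elements in anonymous hash".
my $custom_fields = {
    { "key" => "height", "value" => 500 },
    { "key" => "width",  "value" => 750 },
    { "key" => "depth",  "value" => 100 },   # hypothetical third field
};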
I'm not sure what the correct syntax for custom fields is. You can try
'custom_fields' => [
{ "key" => "height", "value" => 500 },
{ "key" => "width", "value" => 750 }
],
which will set custom_fields to an array of hashes. But that may not be right. It depends on what send_request expects.