Ingest email attachments into ElasticSearch

I'm trying to use an ELK pipeline to read emails over IMAP, extract generic attachments (mainly PDF, possibly doc or ppt) and index them in ElasticSearch.
This is what I have managed so far:
Loading base64 data from a file directly into ElasticSearch with Logstash, using the Ingest Attachment Processor on ElasticSearch to read the base64 content.
Loading data from IMAP (Exchange email): all the email information is loaded correctly into ElasticSearch, except the attachment (which is what I need).
The first solution works and does what I want, except that it doesn't extract attachments directly from the email, and the base64 data is hardcoded in the files.
With the second solution I see a field x-ms-has-attach: yes in Kibana, but the attachment itself is nowhere to be found. Is the imap plugin intended to load only the content of the email, without the attachment?
What am I missing? Could you suggest a pipeline that achieves what I am looking for?
This is my logstash configuration for the first example:
input {
  file {
    path => "/my/path/to/data/*"
    start_position => "beginning"
    # sincedb_path => "/my/path/to/sincedb"
    sincedb_path => "/dev/null"
    close_older => 0
    tags => ["attachment"]
  }
}
output {
  elasticsearch {
    index => "email-attachment"
    hosts => [ "localhost:9200" ]
  }
}
This is the pipeline:
PUT _ingest/pipeline/email-attachment
{
  "description": "Pipeline to parse an email and its attachments",
  "processors": [
    {
      "attachment" : {
        "field" : "message"
      }
    },
    {
      "remove" : {
        "field" : "message"
      }
    },
    {
      "date_index_name" : {
        "field" : "@timestamp",
        "index_name_prefix" : "email-attachment-",
        "index_name_format": "yyyy-MM",
        "date_rounding" : "M"
      }
    }
  ]
}
This is my logstash configuration for the second example:
input {
  imap {
    host => "my.domain.it"
    password => "mypassword"
    user => "myuser"
    port => 12345
    type => "imap"
    secure => true
    strip_attachment => true
  }
}
output {
  elasticsearch {
    index => "email-attachment"
    hosts => [ "localhost:9200" ]
  }
}
UPDATE
I'm using version 5.2.2

In the end I defined a totally different pipeline.
I read emails using a Ruby application with the mail library (you can find it on GitHub), which makes it quite easy to extract attachments.
Then I put the base64 encoding of those attachments directly into ElasticSearch, using the Ingest Attachment Processor.
I filter on content_type just to be sure to load only "real" attachments, because multipart emails treat any multimedia content in the body (e.g. images) as an attachment.
P.S.
Using the mail library, you should do something like:
Mail.defaults do
  retriever_method :imap, { :address => address,
                            :port => port,
                            :user_name => user_name,
                            :password => password,
                            :enable_ssl => enable_ssl,
                            :openssl_verify_mode => openssl_verify_mode }
end
and then call new_messages = Mail.find(keys: ['NOT','SEEN']) to retrieve unseen messages.
Then iterate over new_messages and over each message's attachments. After that, you can encode an attachment simply using encoded = Base64.strict_encode64(attachment.body.to_s). Please inspect new_messages to check the exact field names to use.
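To make the filtering and encoding step concrete, here is a minimal sketch using only the Ruby standard library. The Attachment struct is a hypothetical stand-in for the attachment objects the mail library returns, the ALLOWED_TYPES list is an assumption, and the "message" field name is chosen only because the ingest pipeline in the question reads that field.

```ruby
require 'base64'
require 'json'

# Hypothetical stand-in for an attachment part; with the mail library the
# real objects would come from each message's attachments.
Attachment = Struct.new(:filename, :content_type, :body)

# Content types treated as "real" attachments (an assumption, adjust as needed).
ALLOWED_TYPES = ['application/pdf', 'application/msword'].freeze

# Returns one JSON document per accepted attachment. The base64 payload is
# stored in "message", the field the attachment processor is configured to read.
def attachment_documents(attachments)
  attachments
    .select { |a| ALLOWED_TYPES.any? { |t| a.content_type.start_with?(t) } }
    .map do |a|
      { 'filename' => a.filename,
        'message'  => Base64.strict_encode64(a.body.to_s) }.to_json
    end
end
```

Each returned string could then be indexed through the ingest pipeline; inline images and other body parts are skipped because their content type is not in the list.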

Your problem might come from strip_attachment => true in the imap input plugin.

Related

How can I add an api_key to a Logstash conf file for sending data?

I want to send data from a REST API to Elasticsearch with Logstash, but I don't know how to add the API URL and API key to the Logstash conf file.
Here I am sharing my logstash conf file.
You need to set the Authorization header for the API key, like below:
input {
  http_poller {
    urls => {
      test1 => {
        method => get
        url => "https://my-api-url"
        headers => {
          Accept => "application/json"
          Authorization => "Basic ZWxhc3RpY"
        }
      }
    }
  }
}
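In case it is unclear where a value like the one after "Basic" comes from: for HTTP Basic authentication it is the base64 encoding of user:password. A small Ruby sketch follows; the helper name is made up for illustration, and whether your API expects Basic or some other scheme depends on the API itself.

```ruby
require 'base64'

# Hypothetical helper: builds the value of an HTTP Basic Authorization
# header, i.e. "Basic " followed by base64("user:password").
def basic_auth_header(user, password)
  "Basic #{Base64.strict_encode64("#{user}:#{password}")}"
end

# Example with placeholder credentials.
puts basic_auth_header('myuser', 'mypassword')
```

The printed string is what goes into the Authorization entry of the headers hash in the http_poller input.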

Create index with the same name as request path value, using ElasticSearch output

This is my logstash.conf:
input {
  http {
    host => "127.0.0.1"
    port => 31311
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
  }
  stdout {
    codec => "rubydebug"
  }
}
As a test, I ran this command in PowerShell:
C:\Users\Me\Downloads\curl-7.64.1-win64-mingw\bin> .\curl.exe -XPUT 'http://127.0.0.1:31311/twitter'
The following output was displayed inside my Logstash terminal:
{
  "@timestamp" => 2019-04-09T08:32:09.250Z,
  "message" => "",
  "@version" => "1",
  "headers" => {
    "request_path" => "/twitter",
    "http_version" => "HTTP/1.1",
    "http_user_agent" => "curl/7.64.1",
    "request_method" => "PUT",
    "http_accept" => "*/*",
    "content_length" => "0",
    "http_host" => "127.0.0.1:31311"
  },
  "host" => "127.0.0.1"
}
When I then ran
C:\Users\Me\Downloads\curl-7.64.1-win64-mingw\bin> .\curl.exe -XGET "http://127.0.0.1:9200/_cat/indices"
inside PowerShell, I saw
yellow open logstash-2019.04.09 1THStdPfQySWl1WPNeiwPQ 5 1 0 0 401b 401b
An index named logstash-2019.04.09 has been created in response to my PUT request, following the ElasticSearch convention.
My question is: if I want the index to have the same name as the {index_name} parameter I pass in the command .\curl.exe -XPUT 'http://127.0.0.1:31311/{index_name}', how should I configure the ElasticSearch output in my logstash.conf file?
EDIT: Just to clarify, I want {index_name} to be read dynamically every single time I make a PUT request to create a new index. Is that even possible?
It is possible with the index output configuration option.
This configuration can be dynamic using the %{foo} syntax. Since you want the value of [headers][request_path] to be in the index configuration, you can do something like this:
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "%{[headers][request_path]}"
  }
}
For this to work, the value of the [headers][request_path] field must not contain a space or any of these characters: " * \ < | , > / ?. Your request path starts with a forward slash, so it has to be cleaned up first.
I recommend that you use the gsub configuration option of the mutate filter. So, to remove all the forward slashes, you should have something like this:
filter {
  mutate {
    gsub => ["[headers][request_path]","/",""]
  }
}
If the request path has several forward slashes, you could replace them with some character that will be accepted by elasticsearch.
So, your final logstash.conf file should look like this:
input {
  http {
    host => "127.0.0.1"
    port => 31311
  }
}
filter {
  mutate {
    gsub => ["[headers][request_path]","/",""]
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "%{[headers][request_path]}"
  }
  stdout {
    codec => "rubydebug"
  }
}
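If you want to check beforehand what index name a given request path would end up as, the cleanup can be mimicked outside Logstash. This is only a rough Ruby sketch with a made-up helper name: unlike the gsub above, which removes just the forward slashes, it drops every character from the forbidden list, and it also lowercases the result because Elasticsearch index names must be lowercase.

```ruby
# Characters listed above as not allowed in an index name (space included).
FORBIDDEN = /[ "*\\<|,>\/?]/

# Hypothetical helper: strips forbidden characters from a request path and
# lowercases what is left, approximating the mutate/gsub cleanup.
def index_name_from_path(request_path)
  request_path.gsub(FORBIDDEN, '').downcase
end

puts index_name_from_path('/twitter')
```

A path that is empty after cleaning (for example a bare "/") would still not be a usable index name, so that case needs separate handling.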
You can do so by adding an index configuration setting to your elasticsearch output section, for example:
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "yourindexnamehere"
  }
  stdout {
    codec => "rubydebug"
  }
}

Logstash fails to send emails when I use a variable as an email trigger keyword

I have configured Logstash so that we can dynamically set the alert keyword that triggers an email when it appears in a message.
Logstash fails to send emails when I use a variable as the email trigger keyword.
My old configuration worked: I got emails whenever the ERROR keyword appeared in a message.
if "ERROR" in [message] {
  email {
    address => "mailsrv.unix.gsm1900.org"
    port => 25
    from => "logstash_alert@t-mobile.com"
    subject => "(${SPRING_PROFILES_ACTIVE}) Logstash Alert from ${APPLICATION_NAME}"
    via => "smtp"
    to => "${CLIENT_MAIL}"
    body => "In host ${HOST_IP:HOST_NOT_SET} the event line that occurred: %{message}"
  }
}
New config: it is not sending any emails. I have set this variable to the ERROR keyword in the /etc/default/logstash file.
if "${EXCEPTION_STRING}" in [message] {
  email {
    address => "mailsrv.unix.gsm1900.org"
    port => 25
    from => "logstash_alert@t-mobile.com"
    subject => "(${SPRING_PROFILES_ACTIVE}) Logstash Alert from ${APPLICATION_NAME}"
    via => "smtp"
    to => "${CLIENT_MAIL}"
    body => "In host ${HOST_IP:HOST_NOT_SET} the event line that occurred: %{message}"
  }
}
Please help here. Thank you
Jump in your wayback machine to 2016 to see that variables are not supported in conditionals. That post provides a workaround of setting the variable into metadata, which can then be used in the conditional:
mutate {
  add_field => { "[@metadata][EXCEPTION_STRING]" => "${EXCEPTION_STRING}" }
}
if [@metadata][EXCEPTION_STRING] in [message] {
  ...
}

How can I embed metadata into a custom XMP field with exiftool?

Can someone please explain how to embed metadata into a custom metadata field in an MP4 file with exiftool? I've searched all the docs and it seems to be related to the config file that needs to be created. Here is what I'm working with. (I know this isn't even close, as it's not doing XMP fields, but I haven't found a single working example with XMP fields yet.)
%Image::ExifTool::UserDefined = (
    'Image::ExifTool::Exif::Main' => {
        0xd001 => {
            Name => 'Show',
            Writable => 'string',
            WriteGroup => 'IFD0',
        },
    },
);
1; #end
The command I'm trying to run is:
exiftool -config exifToolConfig -show="Lightning" /reachengine/media/mezzanines/2015/02/13/13/CanyonFlight.mp4
Running this in a linux environment.
What is the proper way to set XMP metadata on custom metadata fields via ExifTool on MP4 files in Linux?
The sample exiftool config file contains a number of working examples of custom XMP tags.
Basically, it is done like this:
%Image::ExifTool::UserDefined = (
    'Image::ExifTool::XMP::Main' => {
        xxx => {
            SubDirectory => {
                TagTable => 'Image::ExifTool::UserDefined::xxx',
            },
        },
    },
);
%Image::ExifTool::UserDefined::xxx = (
    GROUPS => { 0 => 'XMP', 1 => 'XMP-xxx', 2 => 'Other' },
    NAMESPACE => { 'xxx' => 'http://ns.myname.com/xxx/1.0/' },
    WRITABLE => 'string',
    MyNewXMPTag => { },
);
Then the command is
exiftool -config myconfig -mynewxmptag="some value" myfile.mp4

Including multiple messages in a Logstash output email

Does anybody know a way to include multiple messages in the same email from Logstash?
Currently this is the configuration that I am using:
if [LOGLEVEL] == "ERROR" AND [type] == "application" {
  email {
    from => "logstash@example.com"
    subject => "Application error on %{host}"
    to => "foo@example.com"
    via => "smtp"
    body => "%{message}"
    replyto => "bar@example.com"
  }
}
and it is sending emails, however what I'd like to be able to do is to send, say, the previous 20 messages from the same logfile, so that there is more information in the emails. Is it possible to use a query as the body of the email?
If that's not possible has anyone been able to get the emails to send a link to a page or location in the Logstash server where more details can be found?
I'm using Logstash version 1.4.2 and have checked the documentation at http://logstash.net/docs/1.4.2/outputs/email but I can't see anything that might allow me to do what I'm trying to do. I've also tried searching for examples of what I want on Google, but I can't find anything where people are including more information than what is in the current event.
Thanks,
Bill
message_format might help you:
if [LOGLEVEL] == "ERROR" AND [type] == "application" {
  email {
    from => "logstash@example.com"
    subject => "Application error on %{host}"
    to => "foo@example.com"
    via => "smtp"
    message_format => "%{message} yourlink, etc..."
    body => "%{message}"
    replyto => "bar@example.com"
  }
}