Encoding issue when streaming logs from AWS Kinesis to Elasticsearch via Logstash

I've got an AWS Kinesis data stream called "otelpoc".
In Logstash, I'm using the Kinesis input plugin - see here.
My Logstash config is as follows:
input {
  kinesis {
    kinesis_stream_name => "otelpoc"
    region => "ap-southeast-2"
    codec => json { }
  }
}
output {
  elasticsearch {
    hosts => ["elasticsearch:9200"]
    index => "otelpoc-logstash-%{+YYYY.MM.dd}"
  }
}
I can put events to Kinesis using the AWS CLI as follows:
aws kinesis put-record --stream-name otelpoc --data file://payload.json --partition-key 1
payload.json looks like this:
{
  "message": "this is a test",
  "level": "error"
}
... but when I do this I see an error in Logstash as follows:
Received an event that has a different character encoding than you configured. {:text=>"\\x99\\xEB,j\\a\\xAD\\x86+\\\"\\xB1\\xAB^\\xB2\\xD9^\\xBD\\xE9^\\xAE\\xBA+", :expected_charset=>"UTF-8"}
Interestingly, the message still gets output to Elasticsearch and I can view it in Kibana as shown below:
I'm not sure what I should be doing with the character encoding... I've tried several things in Logstash, but with no success, e.g. changing the codec in the kinesis input to something like the following:
codec => plain {
  charset => "UTF-8"
}
... but no luck... I tried to decode the encoded text in a few online decoders, but I'm not really sure what encoding I'm trying to decode from... is anyone able to help?
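For what it's worth, one way to see exactly what bytes ended up in the stream is to read the raw records back with the AWS CLI; a rough sketch, assuming the default first shard ID (the Data field comes back base64-encoded):
# fetch an iterator for the first shard, then read the raw records back
SHARD_ITERATOR=$(aws kinesis get-shard-iterator \
  --stream-name otelpoc \
  --shard-id shardId-000000000000 \
  --shard-iterator-type TRIM_HORIZON \
  --query 'ShardIterator' --output text)
aws kinesis get-records --shard-iterator "$SHARD_ITERATOR"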
EDIT: using v6.7.1 of ELK stack, which is quite old, but I don't think this is the issue...

I never resolved this when publishing messages to Kinesis using the AWS CLI, but for my specific use case I was trying to send logs to Kinesis using the awskinesis exporter for the OpenTelemetry (OTEL) collector agent - see here.
Using the otlp_json encoding, it worked, e.g.:
awskinesis:
  aws:
    stream_name: otelpoc
    region: ap-southeast-2
  encoding:
    name: otlp_json
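For context, a minimal sketch of how that exporter might be wired into a collector pipeline; the otlp receiver and the pipeline layout here are assumptions, not part of my original config:
receivers:
  otlp:
    protocols:
      grpc:

exporters:
  awskinesis:
    aws:
      stream_name: otelpoc
      region: ap-southeast-2
    encoding:
      name: otlp_json

service:
  pipelines:
    # ship logs received over OTLP to the Kinesis stream
    logs:
      receivers: [otlp]
      exporters: [awskinesis]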

Related

AWS Kinesis throwing CloudWatchException

I am trying out Scala code that uses the KCL library to read a Kinesis stream. I keep getting this CloudWatchException and would like to know why:
16:16:06.629 [aws-akka-http-akka.actor.default-dispatcher-20] DEBUG software.amazon.awssdk.request - Received error response: 400
16:16:06.638 [cw-metrics-publisher] WARN software.amazon.kinesis.metrics.CloudWatchMetricsPublisher - Could not publish 16 datums to CloudWatch
software.amazon.awssdk.services.cloudwatch.model.CloudWatchException: When Content-Type:application/x-www-form-urlencoded, URL cannot include query-string parameters (after '?'): '/?Action=PutMetricData&Version=2010-08-01&Namespace=......
Any idea what's causing this, or is it, as I suspect, a bug in the Kinesis library?

Error while submitting PySpark Application through Livy REST API

I want to submit a PySpark application to Livy through the REST API to invoke the Hive Warehouse Connector. Based on this answer in the Cloudera community:
https://community.cloudera.com/t5/Community-Articles/How-to-Submit-Spark-Application-through-Livy-REST-API/ta-p/247502
I created a test1.json as follows
{
  "jars": ["hive-warehouse-connector-assembly-1.0.0.3.1.0.0-78.jar"],
  "pyFiles": ["pyspark_hwc-1.0.0.3.1.0.0-78.zip"],
  "file": ["test1.py"]
}
and call InvokeHTTP. But I get this error: "Cannot deserialize instance of java.lang.String out of START_ARRAY token\n at [Source: (org.eclipse.jetty.server.HttpInputOverHTTP); line: 1, column: 224] (through reference chain: org.apache.livy.server.batch.CreateBatchRequest[\"file\"
I think the 'file' field with test1.py is wrong. Can anyone tell me how to submit this?
This works with a simple spark-submit test1.py
All suggestions are welcome
The following works.
For basic Hive access, use the JSON below:
{
    "file":"hdfs-path/test1.py"
}
For Hive LLAP access, use JSON like the below:
{
"jars": ["<path-to-jar>/hive-warehouse-connector-assembly-1.0.0.3.1.0.0-78.jar"],
"pyFiles": ["<path-to-zip>/hive_warehouse_connector/pyspark_hwc-1.0.0.3.1.0.0-78.zip"],
"file": "<path-to-file>/test3.py"
}
Interestingly, when I put the zip in the "archives" field it gives an error. It works in the "pyFiles" field, though, as shown above.
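For reference, the same JSON can also be posted outside NiFi with a plain HTTP call; a sketch, assuming Livy is listening on its default port 8998:
# submit the batch defined in test1.json to Livy's batches endpoint
curl -X POST \
  -H "Content-Type: application/json" \
  -d @test1.json \
  http://<livy-host>:8998/batches
The response contains a batch id, which can then be polled with GET /batches/<id> to check the application state.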

logstash is throwing exception template file not found

I'm trying to install the docker-elk stack using docker-compose. Elasticsearch and Kibana are working fine, but Logstash is not connecting to Elasticsearch and is throwing the error shown below. I'm installing this for the first time, so I don't have much knowledge about it.
logstash-5-6 | [2017-11-26T06:09:06,455][ERROR][logstash.outputs.elasticsearch] Failed to install template. {:message=>"Template file '' could not be found!", :class=>"ArgumentError", :backtrace=>["/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.4.2-java/lib/logstash/outputs/elasticsearch/template_manager.rb:37:in `read_template_file'", "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.4.2-java/lib/logstash/outputs/elasticsearch/template_manager.rb:23:in `get_template'", "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.4.2-java/lib/logstash/outputs/elasticsearch/template_manager.rb:7:in `install_template'", "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.4.2-java/lib/logstash/outputs/elasticsearch/common.rb:58:in `install_template'", "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.4.2-java/lib/logstash/outputs/elasticsearch/common.rb:25:in `register'", "/usr/share/logstash/logstash-core/lib/logstash/output_delegator_strategies/shared.rb:9:in `register'", "/usr/share/logstash/logstash-core/lib/logstash/output_delegator.rb:43:in `register'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:290:in `register_plugin'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:301:in `register_plugins'", "org/jruby/RubyArray.java:1613:in `each'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:301:in `register_plugins'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:310:in `start_workers'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:235:in `run'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:398:in `start_pipeline'"]}
logstash-5-6 | [2017-11-26T06:09:06,455][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//elasticsearch-5-6:9201"]}
Logstash.conf
input {
  tcp {
    port => 5001
  }
}
## Add your filters / logstash plugins configuration here
output {
  elasticsearch {
    hosts => "localhost:9201"
  }
}
Providing a custom template and updating its path in the output plugin solved the issue.
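For anyone hitting the same error, a minimal sketch of what that can look like in the elasticsearch output; the template path and name here are placeholders, not the actual values from my setup:
output {
  elasticsearch {
    hosts => "localhost:9201"
    # point the output at a template file that actually exists in the container
    manage_template => true
    template => "/usr/share/logstash/templates/my-template.json"
    template_name => "my-template"
    template_overwrite => true
  }
}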

Using logstash for email alert

I installed Logstash 5.5.2 on our Windows server and I would like to send an email alert when I identify certain sentences.
My output section is the following:
output {
  tcp {
    host => "host.com"
    port => 1234
    codec => "json_lines"
  }
  if "The message was Server with id " in [log_message] {
    email {
      to => "<myName#company.com>"
      from => "<otherName#company.com>"
      subject => "Issue appearance"
      body => "The input is: %{incident}"
      domain => "smtp.intra.company.com"
      port => 25
      #via => "smtp"
    }
  }
}
While debugging I got the following messages:
[2017-09-11T13:19:39,181][ERROR][logstash.plugins.registry] Problems loading a plugin with {:type=>"output", :name=>"email", :path=>"logstash/outputs/email", :error_message=>"NameError", :error_class=>NameError
[2017-09-11T13:19:39,186][DEBUG][logstash.plugins.registry] Problems loading the plugin with {:type=>"output", :name=>"email"}
[2017-09-11T13:19:39,195][ERROR][logstash.agent ] Cannot create pipeline {:reason=>"Couldn't find any output plugin named 'email'. Are you sure this is correct? Trying to load the email output plugin resulted in this error: Problems loading the requested plugin named email of type output.
I guess this says that I don't have the email plugin installed.
Can someone suggest a way to fix this?
Using another solution is not an option, just in case that someone suggests it.
Thanks and regards,
Fotis
I tried to follow the instructions in the official documentation, but the option of creating an offline plugin pack didn't work.
So what I did was create a running Logstash instance on my client, run the command to install the output-email plugin (logstash-plugin install logstash-output-email), and afterwards copy this instance to my server (which had no internet access).
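Roughly, the steps looked like this; the paths are examples, not the exact ones I used:
# on the client machine, which has internet access
cd /path/to/logstash
bin/logstash-plugin install logstash-output-email
bin/logstash-plugin list | grep email    # verify the plugin is now present

# then copy the whole Logstash directory to the offline server
scp -r /path/to/logstash user@server:/path/to/logstash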

Logstash TCP Input Crashing

We have a Logstash (v2.3) setup with one queue server running RabbitMQ, 10 Elasticsearch nodes, and a web node for Kibana. Everything "works" and we have a large number of servers sending logs to the queue server. Most of the logs make it in, but we've noticed many that just never show up.
Looking in the logstash.log file, we see the following start showing up:
{:timestamp=>"2016-07-15T16:21:34.638000+0000", :message=>"A plugin had an unrecoverable error. Will restart this plugin.\n Plugin: <LogStash::Inputs::Tcp type=>\"syslog\", port=>5544, codec=><LogStash::Codecs::JSONLines charset=>\"UTF-8\", delimiter=>\"\\n\">, add_field=>{\"deleteme\"=>\"\"}, host=>\"0.0.0.0\", data_timeout=>-1, mode=>\"server\", ssl_enable=>false, ssl_verify=>true, ssl_key_passphrase=><password>>\n Error: closed stream", :level=>:error}
This repeats about every second or so. We initially thought maybe the max connections limit was being hit, but netstat only shows about ~4,000 connections, and our limit should be upwards of 65,000.
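For reference, we counted connections on the TCP input port with something like the following (the exact flags may differ from what we ran):
# count established connections to the syslog TCP input on port 5544
netstat -tan | grep ':5544' | grep -c ESTABLISHED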
Why is this TCP plugin crashing so much?
Everything I've read online hints that this is an older issue that was resolved in newer versions of Logstash, which we've long since installed. What's confusing is that it is partially working: we're getting a ton of logs but also seem to be missing quite a few.
Relevant conf file on Queue server:
queue.mydomain.com:
input {
  tcp {
    type => "syslog"
    port => "5544"
  }
  udp {
    type => "syslog"
    port => "5543"
  }
}
output {
  rabbitmq {
    key => "thekey"
    exchange => "theexchange"
    exchange_type => "direct"
    user => "username"
    password => "password"
    host => "127.0.0.1"
    port => 5672
    durable => true
    persistent => true
  }
}
We recently added UDP to the above conf to test with it, but logs aren't reliably making it in via UDP either.
Just in case the Elasticsearch cluster conf is relevant:
We have a 10-node Elasticsearch cluster, set up to pull from the queue server. This works as intended and is on the same version of Logstash as the queue server. The nodes pull from the RabbitMQ server with this conf:
input {
  rabbitmq {
    durable => "true"
    host => "***.**.**.**"
    key => "thekey"
    exchange => "theexchange"
    queue => "thequeue"
    user => "username"
    password => "password"
  }
}
Does anyone have any ideas for us to try to figure out what's up with the tcp input plugin?
Thanks.