rsyslog doesn't write logs to Kafka

I'm new to rsyslog and Kafka, and I'm having trouble getting the following log pipeline to work:
nginx log -> rsyslog-imudp -> rsyslog-omkafka -> kafka
Here is the nginx conf:
log_format jsonlog '{'
'"host": "$host",'
'"server_addr": "$server_addr",'
'"http_x_forwarded_for":"$http_x_forwarded_for",'
'"remote_addr":"$remote_addr",'
'"time_local":"$time_local",'
'"request_method":"$request_method",'
'"request_uri":"$request_uri",'
'"status":$status,'
'"body_bytes_sent":$body_bytes_sent,'
'"http_referer":"$http_referer",'
'"http_user_agent":"$http_user_agent",'
'"upstream_addr":"$upstream_addr",'
'"upstream_status":"$upstream_status",'
'"upstream_response_time":"$upstream_response_time",'
'"request_time":$request_time'
'}';
access_log syslog:server=server_ip,facility=local7,tag=nginx_access_log jsonlog;
And the rsyslog conf:
module(load="imudp")
input(type="imudp" port="514")
module(load="omkafka")
template(name="nginxLog" type="string" string="%msg%")
if $inputname == "imudp" then {
    action(type="omkafka"
           template="nginxLog"
           broker=["localhost:9092"]
           topic="rsyslog_logstash"
           partitions.auto="on"
           confParam=[
               "socket.keepalive.enable=true"
           ]
    )
}
Unfortunately, I get no output in the consumer terminal:
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic rsyslog_logstash --from-beginning
Maybe it's the template, but I cannot find much documentation about it.

Use the rsyslog impstats module to check how many messages your imudp input has received and how many have been sent on to omkafka.
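A minimal sketch of loading impstats, assuming you want the counters written to a dedicated file every 60 seconds (the interval and file path here are arbitrary choices, not from the original post):
module(load="impstats"
       interval="60"            # emit counters every 60 seconds
       severity="7"
       log.syslog="off"         # don't mix stats into the normal syslog stream
       log.file="/var/log/rsyslog-stats.log")
Each interval should produce counters (e.g., submitted/failed) for the imudp input and the omkafka action queue, which tells you where messages are being dropped.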

For debugging, stop rsyslog and run it in the foreground with debug output:
rsyslogd -nd
(I suggest first running it with rsyslogd -n and checking your Kafka topic again, and only then debugging further.)
You may also have a problem with SELinux, which can be resolved with:
sudo semanage port -d -t unreserved_port_t -p tcp 9092
sudo semanage port -a -t http_port_t -p tcp 9092

Related

Loading CSV data into Kafka

I was working on event monitoring / microservice monitoring with Kafka, following the guide at https://rmoff.net/2020/06/17/loading-csv-data-into-kafka/
When I got to this step:
kafkacat -b kafka:29092 -t orders_spooldir_00 \
-C -o-1 -J \
-s key=s -s value=avro -r http://schema-registry:8081 | \
jq '.payload'
I got an error, and I'm not sure what went wrong: the Docker end or the server end?
I would appreciate any help on how I can proceed! Thanks.
% ERROR: Failed to query metadata for topic orders_spooldir_00: Local: Broker transport failure
kafka-connect | [2021-11-09 16:02:35,778] ERROR [source-csv-spooldir-00|worker] WorkerConnector{id=source-csv-spooldir-00} Error while starting connector (org.apache.kafka.connect.runtime.WorkerConnector:193)

Kafka: Connection issues - "All servers failed to process request"

Setup
We have a 3-node Kafka cluster processing messages that come in through nginx. nginx hands each message off to a PHP script, which in turn forks a Python process and calls KafkaClient, SimpleProducer & send_messages.
ZooKeeper is running on the same hosts as Kafka; nginx is on a separate host. Ports 2181, 2182, 3888, and 9092 are all open. No errors are seen when starting ZooKeeper or Kafka. All of this is set up on AWS in the same VPC.
Kafka & ZooKeeper run as the kafka user, nginx runs as nginx, and php-fpm runs as apache.
Versions
Kafka: 0.8.2
Python: 2.7.5
Relevant snippets from property files.
zookeeper.properties
dataDir=/data/zookeeper
clientPort=2181
maxClientCnxns=100
tickTime=2000
initLimit=5
syncLimit=2
server.1=172.31.41.78:2888:3888
server.2=172.31.45.245:2888:3888
server.3=172.31.23.101:2888:3888
producer.properties
metadata.broker.list=172.31.41.78:9092,172.31.45.245:9092,172.31.23.101:9092
consumer.properties
zookeeper.connect=172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181
zookeeper.connection.timeout.ms=6000
server.properties (setup with appropriate IP on other machines)
port=9092
advertised.host.name=172.31.41.78
host.name=172.31.41.78
zookeeper.connect=172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181
zookeeper.connection.timeout.ms=60000
PHP code:
function sendDataToKafka($_data, $_srcType) {
    try {
        $pyKafka = "/usr/bin/python /etc/nginx/html/postMon.py " . $_srcType;
        $dspec = array(
            0 => array("pipe", "r"),              // child stdin
            1 => array("pipe", "w"),              // child stdout
            2 => array("file", "/dev/null", "a")  // child stderr, discarded
        );
        $process = proc_open($pyKafka, $dspec, $pipes);
        if (is_resource($process)) {
            if (fwrite($pipes[0], $_data) == true) {
                fclose($pipes[0]);
                echo stream_get_contents($pipes[1]);
                fclose($pipes[1]);
                proc_close($process);
                echo "Process completed";
            }
        }
    } catch (Exception $e) {  // generic catch, mirroring the Python script below
        echo $e->getMessage();
    }
}
Python code:
import sys, json, time, ConfigParser
import traceback
sys.path.append("/etc/nginx/html/kafka_python-0.9.4-py2.7.egg")
from kafka import KafkaClient, SimpleProducer

try:
    srcMap = {
        'Alert': 'alerts'
    }
    topic = srcMap.get(sys.argv[1], 'events')
    data = ''
    data = 'Testing static Kafka message'
    print 'Host: 172.31.23.101:9092'
    kafka = KafkaClient("172.31.23.101:9092")
    producer = SimpleProducer(kafka, random_start=True)
    producer.send_messages(topic, data)
except Exception as e:  # most generic exception you can catch
    print str(e)
Scenarios
Scenario 1:
Running
bin/kafka-console-producer.sh --zookeeper 172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181 --topic alerts
in one shell, and
./kafka-console-consumer.sh --zookeeper 172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181 --topic alerts
in another, we are able to view messages.
Scenario 2:
Running the Python code from the command line (on the nginx host), we are able to view messages from the consumer.
Scenario 3:
Running the PHP code from the command line (on the nginx host), we are able to view messages from the consumer.
Scenario 4:
Calling the REST URL from a REST client (such as Postman) or curl, we get the following message:
<html>
<body>
Host: 172.31.23.101:9092
All servers failed to process request
<pre>Process completed</pre></body>
<html>
This shows traffic reaching nginx and nginx executing the PHP & Python scripts, but the request errors out at the first call to Kafka, when KafkaClient is constructed. Somehow the Python process is unable to reach Kafka.
We don't know if this is a user-permission issue or a silly config mistake.
Also:
We have a similar working setup in another VPC.
The security groups, config files, codebase, properties etc. are consistent.
Upgrading is not an option in the near term.
Any pointers, help, or a fresh pair of eyes would really help us get going.
Thanks!
Finally figured out that the apache user did not have the right permissions.
selinuxconlist apache helped fix the issue.
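If you hit the same wall, here is a quick sketch for confirming an SELinux denial and allowing outbound connections from processes running under Apache (the boolean name is the usual one on RHEL/CentOS; verify it applies to your distro):
# look for recent AVC denials around the failing request
sudo ausearch -m avc -ts recent
# persistently allow httpd (and the scripts it spawns) to open outbound network connections
sudo setsebool -P httpd_can_network_connect 1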

Problems publishing messages with kafka running on mesos DCOS

I have a small cluster running DC/OS. I was able to successfully install Kafka following this guide. Running
$ dcos kafka connection
gives
{
"address": [
"10.131.17.126:9475",
"10.131.24.6:9655",
"10.131.14.192:9181"
],
"zookeeper": "master.mesos:2181/dcos-service-kafka",
"dns": [
"broker-0.kafka.mesos:9475",
"broker-1.kafka.mesos:9655",
"broker-2.kafka.mesos:9181"
]
}
I can create topics, and I've examined ZooKeeper with the CLI tool; the state appears to be good:
get /dcos-service-kafka/brokers/ids/0
{"jmx_port":-1,"timestamp":"1474206074029","endpoints":["PLAINTEXT://10.131.17.126:9475"],"host":"10.131.17.126","version":3,"port":9475}
get /dcos-service-kafka/brokers/ids/1
{"jmx_port":-1,"timestamp":"1474206120002","endpoints":["PLAINTEXT://10.131.24.6:9655"],"host":"10.131.24.6","version":3,"port":9655}
get /dcos-service-kafka/brokers/ids/2
{"jmx_port":-1,"timestamp":"1474206122985","endpoints":["PLAINTEXT://10.131.14.192:9181"],"host":"10.131.14.192","version":3,"port":9181}
However, when I try publishing:
echo "Hello, World." | ./kafka-console-producer.sh --broker-list 10.131.17.126:9475, 10.131.24.6:9655, 10.131.14.192:9181 --topic topic1
I get
[2016-09-18 18:49:32,909] ERROR Error when sending message to topic topic1 with key: null, value: 13 bytes with error: Failed to update metadata after 60000 ms. (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
I suspect it might be something to do with private vs. public IP addresses, and perhaps host.name in server.properties.
Can anyone give some suggestions as to how I might debug (and hopefully fix!) the problem so I can successfully publish messages?
Thanks
AJ
Update: this does appear to have been caused by missing entries in /etc/hosts. I've updated my Terraform script to write these during setup, and your example above now works as expected.
Thanks for your help.
Edit: for anyone looking in the future, this was a problem in /etc/hosts caused by a Terraform script.
Your suspicion is correct. Those are private IP addresses, which are not addressable from outside the cluster. In order to communicate with Kafka you will either have to set up a VPN so that those IP addresses become reachable, or run your publishing command on a machine inside the cluster.
Also, it looks like you're running a DC/OS version earlier than 1.8. If you use 1.8, you'll get an easier broker endpoint to use, regardless of the dynamically assigned IP addresses: the named VIP broker.kafka.l4lb.thisdcos.directory:9092. However, this too is only addressable from machines in the cluster.
Setting up haproxy or nginx to point at the named VIP is also a way to get easy external access to a service (in this case Kafka) running on a DC/OS cluster. You would want to ensure that these proxies run on a public agent. See here for more details.
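For illustration, a minimal nginx TCP-proxy sketch along those lines, assuming nginx was built with the stream module and runs on a public agent (the listen port is an arbitrary choice; note that a plain TCP proxy only helps if the advertised broker addresses are themselves reachable by the client):
stream {
    server {
        listen 9092;
        # forward raw TCP to the named VIP inside the cluster
        proxy_pass broker.kafka.l4lb.thisdcos.directory:9092;
    }
}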
Here is an example of installing, producing and consuming from the default Kafka installation:
~ $ dcos package install kafka
Installing Marathon app for package [kafka] version [1.1.11-0.10.0.0]
Installing CLI subcommand for package [kafka] version [1.1.11-0.10.0.0]
New command available: dcos kafka
DC/OS Kafka Service is being installed.
Documentation: https://docs.mesosphere.com/usage/services/kafka/
Issues: https://docs.mesosphere.com/support/
~ $ dcos kafka connection
{
"address": [
"10.0.3.64:9951",
"10.0.3.68:9795",
"10.0.3.66:9531"
],
"zookeeper": "master.mesos:2181/dcos-service-kafka",
"dns": [
"broker-0.kafka.mesos:9951",
"broker-1.kafka.mesos:9795",
"broker-2.kafka.mesos:9531"
],
"vip": "broker.kafka.l4lb.thisdcos.directory:9092"
}
~ $ dcos kafka topic create topic0
{
"message": "Output: Created topic \"topic0\".\n"
}
~ $ dcos node ssh --master-proxy --leader
core@ip-10-0-7-56 ~ $ wget http://download.nextag.com/apache/kafka/0.10.0.1/kafka_2.11-0.10.0.1.tgz
core@ip-10-0-7-56 ~ $ tar xf kafka_2.11-0.10.0.1.tgz
core@ip-10-0-7-56 ~ $ cd kafka_2.11-0.10.0.1
core@ip-10-0-7-56 ~/kafka_2.11-0.10.0.1 $ bin/kafka-console-producer.sh --broker-list broker.kafka.l4lb.thisdcos.directory:9092 --topic topic0
This is a message
This is another message
^Ccore@ip-10-0-7-56 ~/kafka_2.11-0.10.0.1 $ bin/kafka-console-consumer.sh --zookeeper master.mesos:2181/dcos-service-kafka --topic topic0 --from-beginning
This is a message
This is another message
^CProcessed a total of 2 messages
$ bin/kafka-console-producer.sh --broker-list 10.0.3.64:9951,10.0.3.68:9795,10.0.3.66:9531 --topic topic0
foo
bar
baz
^Ccore@ip-10-0-7-56 ~/kafka_2.11-0.10.0.1 $ bin/kafka-console-consumer.sh --zookeeper master.mesos:2181/dcos-service-kafka --topic topic0 --from-beginning
This is a message
This is another message
foo
bar
baz
^CProcessed a total of 5 messages

How to enable remote JMX on Kafka brokers (for JmxTool)?

I enabled JMX on Kafka brokers by adding
KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
-Djava.rmi.server.hostname=<server_IP>
-Djava.net.preferIPv4Stack=true"
However, when I use kafka.tools.JmxTool to get the JMX metrics, it outputs Unix timestamps only. Why?
./bin/kafka-run-class.sh kafka.tools.JmxTool \
--object-name 'kafka.server:type=BrokerTopicMetrics,name=AllTopicsMessagesInPerSec' \
--jmx-url "service:jmx:rmi:///jndi/rmi://<server_IP>:9111/jmxrmi"
How can I have it print out the metrics?
Edit bin/kafka-run-class.sh and set the KAFKA_JMX_OPTS variable:
KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=your.kafka.broker.hostname -Djava.net.preferIPv4Stack=true"
Update bin/kafka-server-start.sh and add the line below:
export JMX_PORT=PORT
You must set the JMX_PORT variable, or add the following line to bin/kafka-server-start.sh:
export JMX_PORT=${JMX_PORT:-9999}
Then you will be able to connect to Kafka's JMX metrics. I use the jconsole tool with the address localhost:9999.
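For example, once the broker is up with JMX enabled, you can point jconsole at it directly (host and port are placeholders matching the setup above):
jconsole localhost:9999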
Setting JMX_PORT inside bin/kafka-run-class.sh will clash with ZooKeeper if you are running ZooKeeper on the same node.
It is best to set the JMX port individually inside the corresponding server-start scripts (a scripted sketch of these edits follows the steps):
1. Insert the line export JMX_PORT=${JMX_PORT:-9998} before the last line of the $KAFKA_HOME/bin/zookeeper-server-start.sh file.
2. Restart the ZooKeeper server.
3. Repeat steps 1 and 2 for all ZooKeeper nodes in the cluster.
4. Insert the line export JMX_PORT=${JMX_PORT:-9999} before the last line of the $KAFKA_HOME/bin/kafka-server-start.sh file.
5. Restart the Kafka broker.
6. Repeat steps 4 and 5 for all brokers in the cluster.
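As a sketch, the same edits can be scripted, assuming GNU sed and that $KAFKA_HOME points at your installation (back the scripts up first):
# insert the JMX_PORT export just before the final line of each start script
sed -i '$i export JMX_PORT=${JMX_PORT:-9998}' "$KAFKA_HOME/bin/zookeeper-server-start.sh"
sed -i '$i export JMX_PORT=${JMX_PORT:-9999}' "$KAFKA_HOME/bin/kafka-server-start.sh"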
If you're running via systemd:
Edit /etc/systemd/system/multi-user.target.wants/kafka.service.
In the [Service] section, add the line:
Environment=JMX_PORT=9989
Reload: systemctl daemon-reload
Restart: systemctl restart kafka
Enjoy the beans: echo 'beans' | java -jar jmxterm-1.0-alpha-4-uber.jar -l localhost:9989 -n 2>&1
This is Kafka 2.3.0.
Using jconsole for Available MBeans
You should use jconsole first to discover the names of the available MBeans.
The proper name of the MBean you wanted to query metrics for is kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec (the AllTopics prefix was used in older versions). Thanks AndyTheEntity.
Enabling Remote JMX (with no authentication or SSL)
As described in Monitoring and Management Using JMX Technology you should set certain system properties when you start the Java VM of a Kafka broker.
Kafka's bin/kafka-run-class.sh shell script makes the configuration painless as it does the basics for you and sets KAFKA_JMX_OPTS.
# JMX settings
if [ -z "$KAFKA_JMX_OPTS" ]; then
KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false "
fi
For remote JMX you should also set com.sun.management.jmxremote.port, which Kafka's bin/kafka-run-class.sh shell script sets from the JMX_PORT environment variable.
# JMX port to use
if [ $JMX_PORT ]; then
KAFKA_JMX_OPTS="$KAFKA_JMX_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT "
fi
With that, enabling remote JMX is as simple as the following command:
JMX_PORT=9999 ./bin/kafka-server-start.sh config/server.properties
Using JmxTool
With the above, run the JmxTool:
$ ./bin/kafka-run-class.sh kafka.tools.JmxTool \
--object-name 'kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec'
Trying to connect to JMX url: service:jmx:rmi:///jndi/rmi://:9999/jmxrmi.
"time","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:Count","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:EventType","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:FifteenMinuteRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:FiveMinuteRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:MeanRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:OneMinuteRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:RateUnit"
1567586728595,0,messages,0.0,0.0,0.0,0.0,SECONDS
1567586730597,0,messages,0.0,0.0,0.0,0.0,SECONDS
...
You can use the --one-time option to print the JMX metrics just once:
$ ./bin/kafka-run-class.sh kafka.tools.JmxTool \
--object-name 'kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec' \
--one-time true
Trying to connect to JMX url: service:jmx:rmi:///jndi/rmi://:9999/jmxrmi.
"time","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:Count","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:EventType","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:FifteenMinuteRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:FiveMinuteRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:MeanRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:OneMinuteRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:RateUnit"
1567586898459,0,messages,0.0,0.0,0.0,0.0,SECONDS
Edit bin/kafka-run-class.sh:
vim kafka_2.11-0.10.1.1/bin/kafka-run-class.sh
Then add the first two lines below and comment out the other lines as I have done. (Note: after doing this, the scripts in this Kafka installation can no longer be used for client operations such as listing topics; for client operations, download a separate copy to a different location and use its scripts.)
export JMX_PORT=9096
KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=<ipaddress> -Dcom.sun.management.jmxremote.port=$JMX_PORT -Dcom.sun.management.jmxremote.rmi.port=$JMX_PORT"
# JMX settings
#if [ -z "$KAFKA_JMX_OPTS" ]; then
# KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false "
#fi
# JMX port to use
#if [ $JMX_PORT ]; then
# KAFKA_JMX_OPTS="$KAFKA_JMX_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT "
#fi
Use kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec
The AllTopics prefix was used in older versions. You can specify a topic using kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec,topic=<topic-name>
src: http://grokbase.com/t/kafka/users/164ksnhff0/enable-jmx-on-kafka-brokers
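For instance, a sketch of querying the per-topic variant with JmxTool (the topic name and JMX port are placeholders):
./bin/kafka-run-class.sh kafka.tools.JmxTool \
--object-name 'kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec,topic=mytopic' \
--jmx-url 'service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi'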
Kafka provides everything you need. When you start your server, activate KAFKA_JMX_OPTS with this command:
$KAFKA_JMX_OPTS JMX_PORT=[your_port_number] ./kafka-server-start.sh -daemon ../config/server.properties
This activates remote JMX on the given port; you can then connect JConsole or another monitoring tool.
Just before calling kafka-server-start.sh, add the following exports. It worked like a charm in my case. You can set whatever port you like for JMX_PORT, and you should fill in your broker's address for the $BROKER_IP part.
export JMX_PORT=9900
export KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=$BROKER_IP -Djava.net.preferIPv4Stack=true"
This is the standard Kafka start procedure:
bin/kafka-server-start.sh config/server.properties
And this is the Kafka start procedure with JMX enabled:
JMX_PORT=8004 bin/kafka-server-start.sh config/server.properties

Terminate Kafka Console Consumer when all the messages have been read

I know there has to be a way to do this, but I am not able to figure it out. I need to stop the Kafka consumer once I have read all the messages from the queue.
Can somebody provide any info on this?
You can pass the -consumer-timeout-ms parameter with a value when starting the consumer, and it will throw an exception if no messages have been read during that time. For example, to stop the consumer if no new messages have arrived in the last 2 seconds:
kafka.consumer.ConsoleConsumer -consumer-timeout-ms 2000
You can see this and all the other input options here.
Currently, Kafka version 2.11-2.1.1 ships a script called kafka-console-consumer.sh.
It has a new flag: --timeout-ms.
Basically, this flag sets the maximum time, in milliseconds, to wait for a new message before exiting.
You can use this property to end your console consumer after it has read all the messages.
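For example, a sketch that drains a topic and exits once no new message arrives for 2 seconds (the broker address and topic are placeholders):
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
--topic mytopic --from-beginning --timeout-ms 2000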
You can use SimpleConsumerShell with the no-wait-at-logend option. See SystemTools - SimpleConsumerShell.
For example:
./kafka-run-class.bat kafka.tools.SimpleConsumerShell --broker-list localhost:9092 --topic kafkademo --partition 0 --no-wait-at-logend
If you are not dead set on using the Scala client, try kafkacat with the -e option, which tells it to exit when the end of the partition has been reached.
E.g., to consume all messages from mytopic partition 2 and then exit:
$ kafkacat -b mybroker -t mytopic -p 2 -o beginning -e
Or to consume the last 3000 messages and then exit:
$ kafkacat -b mybroker -t mytopic -p 2 -o -3000 -e