Kafka: Monitor the lag for consumers manually assigned to topic partitions - apache-kafka

I'm working with the Kafka 0.9.1 new consumer API. The consumer is manually assigned to a partition. For this consumer, I would like to see its progress (meaning the lag). Since I added the group id consumer-tutorial as a property, I assumed that I could use the command
bin/kafka-consumer-groups.sh --new-consumer --describe --group consumer-tutorial --bootstrap-server localhost:9092
(as explained here http://www.confluent.io/blog/tutorial-getting-started-with-the-new-apache-kafka-0.9-consumer-client)
Unfortunately, my consumer group's details are not shown by the above command, so I cannot monitor the progress of my consumer (its lag). How can I monitor the lag in the scenario described above (manually assigned partition)?
The code is:
import java.util.Arrays;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "consumer-tutorial");
props.put("key.deserializer", StringDeserializer.class.getName());
props.put("value.deserializer", StringDeserializer.class.getName());
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
String topic = "my-topic";
TopicPartition topicPartition = new TopicPartition(topic, 0);
// manual assignment: no consumer group coordination happens here
consumer.assign(Arrays.asList(topicPartition));
consumer.seekToBeginning(topicPartition);
try {
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(1000);
        for (ConsumerRecord<String, String> record : records)
            System.out.println(record.offset() + ": " + record.value());
        consumer.commitSync();
    }
} finally {
    consumer.close();
}

Just in case you don't want to write code to get this info or run command-line tools/shell scripts ad hoc, there are a number of tools that will capture Kafka metrics, including consumer lag. Off the top of my head: Burrow and SPM for Kafka do a good job. Here is a bit of background about Kafka offsets, consumer lag, and a few metrics derived from what Kafka exposes via JMX. HTH.

If you are interested in JMX exposure of consumer group lag, here is the agent I wrote:
https://github.com/peterkovgan/kafka9.offsets
You can run this agent on some Kafka node and expose offset lag statistics to external readers.
There are examples of how to use this agent with Telegraf
(https://influxdata.com/time-series-platform/telegraf/).
In the end (combining, e.g., Telegraf, InfluxDB and Grafana) you can see nice graphs of offset lags for several consumer groups.

In the kafka-consumer-groups.sh command, make sure the group name matches your group.id exactly: --group consumer-tutorial, not consumer-tutorial-group.

The problem with your code is directly related to the manual assignment of consumers to topic-partitions.
You specify a consumer group in the group.id property; however, the group ID is only used when you subscribe to a topic (or a set of topics) via the KafkaConsumer.subscribe() API. In your example, you are using the assign() method, which manually attaches the client to the specified topic-partition pairs without utilising the underlying consumer group primitives. This is why you are unable to see the consumer lag. Tools such as Burrow will not work in this case either, because they query the offsets of the consumer group, which does not exist.
There are two options available to you:
Use the consumer group feature properly, via the subscribe() API. This is the dominant use case for Kafka. However, seekToBeginning() will not work in this case either, as the offsets will be entirely managed by the consumer group.
Drop the consumer group altogether and manage both partition assignments and offsets manually. This gives you the maximum possible flexibility but is a lot of work, and you might find yourself reinventing the wheel. Most people will not go down this path, unless the consumer group feature of Kafka does not suit your needs.
The choice will depend squarely on your use case. For conventional stream processing, #1 is the idiomatic approach. This is what Kafka was designed for. #2 implies that you know what you are doing and transfers all of the group management responsibility onto your application.
Note: Kafka does not have a "partial" mode where you do some of the group management and Kafka does the rest. It's either all in or nothing at all.
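To illustrate option #1, here is a minimal sketch, reusing the topic and group names from the question: with subscribe(), the group coordinator assigns partitions and tracks committed offsets under consumer-tutorial, so kafka-consumer-groups.sh can report the lag.
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "consumer-tutorial");
props.put("key.deserializer", StringDeserializer.class.getName());
props.put("value.deserializer", StringDeserializer.class.getName());
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
// subscribe() instead of assign(): partitions are assigned by the group coordinator
consumer.subscribe(Arrays.asList("my-topic"));
try {
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(1000);
        for (ConsumerRecord<String, String> record : records)
            System.out.println(record.offset() + ": " + record.value());
        consumer.commitSync(); // committed offsets are what the lag tooling compares against
    }
} finally {
    consumer.close();
}
If you go with option #2 instead, note that newer clients (0.10.1+) let you compute the lag in-process; a sketch, assuming the endOffsets() API is available in your client version and that consumer is the manually assigned consumer from the question:
// lag per partition = log end offset minus the consumer's current position
Map<TopicPartition, Long> endOffsets = consumer.endOffsets(consumer.assignment());
for (Map.Entry<TopicPartition, Long> entry : endOffsets.entrySet()) {
    long lag = entry.getValue() - consumer.position(entry.getKey());
    System.out.println(entry.getKey() + " lag: " + lag);
}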

You can use a simple and powerful lag-monitoring tool called
prometheus-kafka-consumer-group-exporter
See the URL below:
https://github.com/braedon/prometheus-kafka-consumer-group-exporter
After installation, run the command below to export the consumer metrics on your required port:
/usr/bin/python3 /usr/local/bin/prometheus-kafka-consumer-group-exporter -p PORT -b KAFKA_CLUSTER_IP_PORT
After running the above command, verify the data at http://YOUR-SERVER-IP:PORT, e.g. 127.0.0.1:9208.
Now you can use any Prometheus-compatible scraper for dashboards and alerting; I am using Prometheus & Grafana.
This can run on any shared server (Kafka broker, ZooKeeper server, Prometheus server, or any other) because it has very low overhead on system resources.

Related

Kafka consumer group description does not include all topics [duplicate]

What I want to achieve is to be sure that my Kafka Streams consumer does not have lag.
I have a simple Kafka Streams application that materializes one topic as a store in the form of a GlobalKTable.
When I try to describe the consumer on Kafka with the command:
kafka-consumer-groups --bootstrap-server localhost:9092 --describe --group my-application-id
I can't see any results, and there is no error either. When I describe all consumer groups with:
kafka-consumer-groups --bootstrap-server localhost:9092 --describe --all-groups
my application's consumer is listed correctly.
Any idea where to find additional information about why I can't describe the consumer?
(Any other Kafka Streams consumers that write to topics can be described correctly.)
If your application only materializes a topic into a GlobalKTable, no consumer group is formed. Internally, the "global consumer" does not use subscribe() but assign(), there is no consumer group.id configured (as you can verify from the logs), and no offsets are committed.
The reason is that all application instances need to consume all topic partitions (i.e., the broadcast pattern). However, a consumer group is designed such that different instances read different partitions of the same topic. Also, per consumer group, only one offset can be committed per partition; if multiple instances read the same partition and committed offsets using the same group.id, the commits would overwrite each other.
Hence, using a consumer group while "broadcasting" data does not work.
However, all consumers expose the "lag" metrics records-lag-max and records-lag (cf. https://kafka.apache.org/documentation/#consumer_fetch_monitoring). Hence, you should be able to hook in via JMX to monitor the lag. Kafka Streams exposes client metrics via KafkaStreams#metrics(), too.
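For illustration, a minimal sketch of pulling those metrics via KafkaStreams#metrics(); streams is assumed to be your started KafkaStreams instance, and Metric#metricValue() requires a reasonably recent client:
// print the max consumer record lag per client (consumer-fetch-manager-metrics group)
for (Map.Entry<MetricName, ? extends Metric> entry : streams.metrics().entrySet()) {
    if (entry.getKey().name().equals("records-lag-max")) {
        System.out.println(entry.getKey().group() + " " + entry.getKey().tags()
                + " = " + entry.getValue().metricValue());
    }
}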


Cannot setup consumer group in Kafka with Python

I'm new to Kafka and I've tried the kafka-python package.
I managed to set up a simple producer and consumer, which can send and receive messages. In this case the consumer does not use a consumer group, as below:
consumer = KafkaConsumer(queue_name, bootstrap_servers='kafka:9092')
However, when I start to use a group_id as below, it stops receiving any messages:
consumer = KafkaConsumer(bootstrap_servers='kafka:9092', auto_offset_reset='earliest', group_id='my-group')
consumer.subscribe([queue_name])
For comparison, I've also tried the confluent-kafka-python package, where I have the following consumer code, which also doesn't work:
consumer = Consumer({
    'bootstrap.servers': 'kafka:9092',
    'group.id': 'mygroup',
    'auto.offset.reset': 'earliest'
})
consumer.subscribe([queue_name])
Also, running ./kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list gives an empty result.
Is there any configuration I'm missing here?
By default, the consumer starts consuming from the last committed offsets, which in your case is probably the last offset.
auto.offset.reset only applies when there are no committed offsets. Since the consumer automatically commits offsets by default, this usually only happens the first time you run it (there are a few other cases, but they don't matter in this example).
So to see messages flowing, you need to either start producing once your consumer is running or use a different group name to allow auto.offset.reset to apply.
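If the group has already committed offsets and you want to re-consume from the beginning without changing the group name, newer tooling (0.11+) can also reset the committed offsets while the group is inactive; a hedged example in the style of the commands above (the topic name is a placeholder):
kafka-consumer-groups.sh --bootstrap-server kafka:9092 --group my-group --topic YOUR_TOPIC --reset-offsets --to-earliest --execute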

Kafka broker list vs. one broker when creating a connection?

I am running a Kafka cluster.
Sample code:
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
        "localhost:9092,localhost:9093,localhost:9094");
// key/value serializers are also required to construct a producer
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
KafkaProducer<String, String> producer = new KafkaProducer<>(props);
Is mentioning the whole broker list mandatory?
What will happen if I provide only localhost:9092? Will it always use this particular broker?
What if the localhost:9092 broker is down?
Is the behavior the same for the consumer API?
Is mentioning the whole broker list mandatory?
No. The broker list does not have to contain the full set of servers. However, it is recommended to specify more than one in case of server failure.
What will happen if I provide only localhost:9092? Will it always use this particular broker?
No. Even if only localhost:9092 is specified in bootstrap.servers, the client retrieves the full broker list by sending a Metadata request to that broker. After that, any broker can serve the client.
What if localhost:9092 broker is down?
That localhost:9092 is down only affects the partitions on that broker. No impact on the partitions on the rest brokers. However, if the client application was down as well, it could not find the cluster anymore even after it came back, since it failed to connect to the already-down localhost:9092. That's why it's recommended for users to provide several brokers instead of just one.
Is the behavior the same for the consumer API?
Yes, all of the above holds true for the consumer as well.
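For the consumer, a minimal sketch with the same multi-broker bootstrap list (group name illustrative); the client only needs one reachable entry to discover the full cluster via metadata:
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,
        "localhost:9092,localhost:9093,localhost:9094");
props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);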

Messages sent to all consumers with the same consumer group name

There is the following consumer code:
from kafka.client import KafkaClient
from kafka.consumer import SimpleConsumer

kafka = KafkaClient("localhost", 9092)
consumer = SimpleConsumer(kafka, "my-group", "my-topic")
consumer.seek(0, 2)  # seek to the end of the topic
for message in consumer:
    print message
kafka.close()
Then I produce messages with the script:
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-topic
The thing is that when I start the consumers as two different processes, I receive new messages in each process. However, I want each message to be sent to only one consumer, not broadcast.
In the Kafka documentation (https://kafka.apache.org/documentation.html) it is written:
If all the consumer instances have the same consumer group, then this
works just like a traditional queue balancing load over the consumers.
I see that group for these consumers is the same - my-group.
How can I make it so that a new message is read by exactly one consumer instead of being broadcast?
The consumer-group API was not officially supported until Kafka v0.8.1 (released March 12, 2014). For earlier server versions, consumer groups do not work correctly. And as of this post, the kafka-python library does not attempt to send group offset data:
https://github.com/mumrah/kafka-python/blob/c9d9d0aad2447bb8bad0e62c97365e5101001e4b/kafka/consumer.py#L108-L115
It's hard to tell from the example above what your ZooKeeper configuration is, or if there is one at all. You'll need a ZooKeeper cluster for the consumer group information to be persisted, i.e., which consumer within each group has consumed up to a given offset.
A solid example is here:
Official Kafka documentation - Consumer Group Example
This should not happen - make sure that both of the consumers are registered under the same consumer group in the ZooKeeper znodes. Each message to a topic should be consumed by a consumer group exactly once, so one consumer out of everyone in the group should receive the message, not what you are experiencing. What version of Kafka are you using?
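For contrast, a minimal sketch using the modern Java consumer (0.9+), where two processes running this same code with the same group.id split the topic's partitions between them instead of each receiving every message (topic and group names reused from the question):
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "my-group");
props.put("key.deserializer", StringDeserializer.class.getName());
props.put("value.deserializer", StringDeserializer.class.getName());
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("my-topic"));
// each group member is assigned a disjoint subset of the topic's partitions
while (true) {
    for (ConsumerRecord<String, String> record : consumer.poll(1000))
        System.out.println(record.value());
}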