Applications that run Kafka consumers may also expose REST services as part of the same deployment.
The problem is that offsets cannot be manipulated while the consumer group is active; the group has to be made inactive by stopping the application. That also means the REST services are down for that amount of time.
Please suggest whether there is a way to keep them in the same deployment and still allow offset manipulation without downtime, or whether they should not be bundled together at all. Thanks.
"manipulation of offsets cannot happen when the consumer group is active and requires the consumer group to be inactive by stopping the application."
Since you tagged the question with spring-kafka, I assume you are using Spring for Apache Kafka.
You can stop the listener container(s), which will close the consumer(s); then restart them.
You can also manipulate the offsets programmatically, while the consumers are running.
https://docs.spring.io/spring-kafka/docs/current/reference/html/#seek
Related
In my case, it is a valid possibility that a consumer is offline for a longer period. During that offline period, events are still published to the topic.
When the consumer comes back online, it will re-use its existing consumer group, which has been lagging. Is it possible to skip forward to the latest message only? That is, ignore all earlier messages. In other words, I want to alter the offset to the latest message prior to consuming.
There is the spring.kafka.consumer.auto-offset-reset property, but as far as I understand, it only applies to new consumer groups. Here, I am re-using an existing consumer group when the consumer comes back online. That said, if it were possible to automatically prune a consumer group when its consumer goes offline, this property could work, but I am not sure whether such functionality exists.
I am working with the Spring Boot Kafka integration.
You can use the consumer's seek method: look up the end offset of each assigned partition, subtract one from it, seek to that position, commit, and start polling.
Otherwise, simply don't reuse the same group. Generate a unique group ID and/or disable auto commits; then you're guaranteed to always use the auto.offset.reset config, and lag is meaningless across app restarts.
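To make the "subtract one" step concrete, here is a minimal sketch of just the offset arithmetic, using plain Java collections. The class and method names are hypothetical, and partitions are keyed by their integer number for brevity. In a real consumer, the end offsets would come from something like KafkaConsumer.endOffsets(...) and the results would be passed to seek(...).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: works out where each partition should resume so only the newest record is read. */
final class SkipToLatest {
    private SkipToLatest() {}

    /**
     * @param endOffsets partition number -> log end offset (the offset the NEXT produced record will get)
     * @return partition number -> offset to seek to before polling
     */
    static Map<Integer, Long> seekTargets(Map<Integer, Long> endOffsets) {
        Map<Integer, Long> targets = new HashMap<>();
        for (Map.Entry<Integer, Long> entry : endOffsets.entrySet()) {
            // End offset minus one is the position of the last existing record;
            // clamp at 0 so a partition that never received data does not go negative.
            targets.put(entry.getKey(), Math.max(entry.getValue() - 1, 0L));
        }
        return targets;
    }
}
```

If you want to skip everything, including the last record, seek to the end offset itself instead of subtracting one. Note that on compacted or transactional topics the position at "end offset minus one" may not hold an ordinary record, so treat this as an approximation.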
Is there any way to detect the crash or shutdown of a consumer?
I want the Kafka server to publish an event to all Kafka clients (publishers, consumers, ...) when that happens.
Is it possible?
Kafka keeps track of the consumed offsets per consumer group in a special internal topic (__consumer_offsets). You could set up a dedicated "monitoring service" that constantly consumes from that internal topic and triggers any notification/alerting mechanisms as needed, so that your publishers and other consumers are notified programmatically. This other SO question has more details about that.
Depending on your use case, lag monitoring is also a really good way to know whether your consumers are falling behind and/or have crashed. There are multiple solutions for that out there, or again, you could build your own to customize the alerting/notification behavior.
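If you build your own lag monitor, the core check is small. Below is a rough sketch of the arithmetic only, with hypothetical class and method names; it assumes you sample the group's committed offset and the partition's end offset at regular intervals (for example through the admin client or the consumer's committed(...) and endOffsets(...) calls).

```java
/** Hypothetical helper for a home-grown lag monitor; works on offsets sampled for one partition. */
final class LagCheck {
    private LagCheck() {}

    /** Lag = records produced but not yet committed by the group. Never negative. */
    static long lag(long committedOffset, long endOffset) {
        return Math.max(endOffset - committedOffset, 0L);
    }

    /**
     * Heuristic: the group looks stalled on this partition when its committed offset
     * did not move between two samples although unread records are waiting.
     */
    static boolean looksStalled(long previousCommitted, long currentCommitted, long currentEndOffset) {
        boolean noProgress = currentCommitted == previousCommitted;
        boolean workPending = currentEndOffset > currentCommitted;
        return noProgress && workPending;
    }
}
```

A stalled partition is only a hint that the consumer may have crashed; a slow consumer or a long commit interval looks the same, so you would probably want several consecutive stalled samples before alerting.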
I am interested in monitoring the consuming behavior. In particular, I would like to know when which messages were read by which consumer group. Is there an offset or consumer history that I can access?
If it helps, I use Confluent Cloud for setting up the topics, etc.
If I understand your question correctly, you would like to know when events were processed by your consumer?
In that case, you should just add logging to your consumer code, then use a log-collection tool such as Elasticsearch or Splunk, the same way you would track logs/history across any other service.
Is there any mechanism to get notified (by a specific logfile entry or similar) when an event within a Kafka topic expires due to retention policies? (I know this should be avoided by design, but still.)
I know about consumer lag monitoring tools that track offset discrepancies between a published event and the related consumer groups, but afaik they only provide numbers (the offset difference).
In other words: how can we find out whether Kafka events were never consumed and therefore expired?
The log cleaner thread will output deletion events to the broker logs, but they reflect log segment files, not particular messages.
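If the broker logs are too coarse, one workaround (my own suggestion, not a built-in Kafka notification) is to compare a group's committed offset with the earliest offset still available in the partition. If the committed offset is lower, the records in between were deleted before the group read them. A minimal sketch of that comparison, with hypothetical names, assuming you fetch the two values yourself (for example via committed(...) and beginningOffsets(...) on a consumer):

```java
/** Hypothetical helper: estimates how many offsets expired before a group consumed them. */
final class RetentionLoss {
    private RetentionLoss() {}

    /**
     * @param committedOffset next offset the group would read on this partition
     * @param logStartOffset  earliest offset still present on the broker
     * @return number of offsets in [committedOffset, logStartOffset) that are gone, or 0 if none
     */
    static long expiredUnconsumed(long committedOffset, long logStartOffset) {
        return Math.max(logStartOffset - committedOffset, 0L);
    }
}
```

This counts offsets rather than messages, so on compacted topics the number can overstate the loss. It also only tells you that data is gone at the moment you check, so it has to run periodically to be useful for alerting.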
1. Is it possible to track who the consumers are in Kafka? Perhaps some way for a consumer to 'register' itself as a consumer to a Kafka topic?
2. If #1 is possible, then is it also possible for Kafka to track the time when a consumer consumed a message?
It is possible to implement these features in the application itself, but I wonder if Kafka already provides some way to do this. I can't find any documentation on these features so perhaps this is not possible in Kafka, but it would be great to get confirmation. Thank you!
You can track consumer groups, but I do not think you can track the individual consumers within a group very easily. For a group, Kafka gives you the lag, and to get a time out of that you would need to read the records across that offset difference and look at their timestamps.
There is no other such "registration process".
What you could do is develop an "interceptor" that is able to track messages and times throughout the system. That is how Confluent Control Center is able to graphically display if/when consumers get messages.
However, that requires additional configuration on all consumers; more specifically, the interceptor has to be on each consumer's classpath.