The following subscribed topics are not assigned to any members - apache-kafka

I'm still quite new to Kafka, and something happened that I don't understand. I have 2 apps. One uses Spring Kafka's listeners to consume binary Avro messages and process them. For debugging purposes I wrote a trivial second CLI app, which takes the topic, offset, brokers, etc. and consumes using plain Kafka classes, without Spring Kafka. Both were working for a while. Recently something went wrong with the server.
The main app now does not consume any messages. During startup it sometimes(!) prints a message like: "The following subscribed topics are not assigned to any members ..." When I enable debugging, I see it's spinning on "Leader for partition ... unavailable for fetching offset, wait for metadata refresh".
OK, I could understand from that that "something went wrong" with the cluster. What puzzles me a lot is that our CLI app has no issues whatsoever connecting to the cluster and prints all messages from the beginning of the topic. I verified multiple times that both apps use the same brokers and the same topic, with different groupIds. Neither app specifies a partition id anywhere.
What could cause one app to have no problem reading from the topic, while the other is unable to "connect" and consumes neither preexisting nor new messages?

Related

Kafka - redriving events in an error topic

We've implemented some resilience in our Kafka consumer by having a main topic, a retry topic and an error topic, as outlined in this blog.
I'm wondering what patterns teams out there are using to redrive events from the error topic back into the retry topic for reprocessing. Do you use some kind of GUI to help with this redrive? I foresee a need to potentially append all events from the error topic onto the retry topic, but also to selectively skip certain events in the error topic if they can't be reprocessed.
Two patterns I've seen:
redeploy the app with a new topic config (via environment variables or other external config),
or use a scheduled task within the code that checks the upstream DLQ topic(s).
If you want to use a GUI, that's fine, but it seems like more work for little gain, as there's no tooling already built around that.
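The selective-skip part of either pattern reduces to a filter over the events drained from the error topic before they are republished to the retry topic. A minimal sketch of that decision step, assuming events are identified by id strings (the `RedriveFilter` name and the skip-list are made up for illustration; the actual consume/produce plumbing is omitted):

```java
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical redrive filter: given event ids drained from the error
// topic, keep only those that should be republished to the retry topic,
// skipping ids that have been flagged as unprocessable.
public class RedriveFilter {
    public static List<String> selectForRedrive(List<String> errorEvents,
                                                Set<String> skipIds) {
        Predicate<String> redrivable = id -> !skipIds.contains(id);
        return errorEvents.stream()
                          .filter(redrivable)
                          .collect(Collectors.toList());
    }
}
```

A scheduled task would run something like this over each batch it drains, then produce the surviving events to the retry topic and commit the error-topic offsets.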

How to check the progress status of the messages in kafka?

I have designed a REST POST API in Java which publishes a message to a particular Kafka topic, let's say "ProductTopic".
In the background, a microservice is listening to this "ProductTopic" topic, consumes the messages and saves them to the DB. Now I would like to write a GET REST API to see the progress of the job (which gives the output of the job): how many messages have been successfully consumed and how many are still pending, so that the end user has an idea of what's happening.
Is there a way to achieve this? I searched a lot on Google; all I found were command-line queries to see the consumption of the messages, and no Java implementation example from the Confluent side. Any help would be appreciated.
You should check the consumer lag for the consumer group of your service. Lag is approximately endOffset - currentOffset. You can find examples here
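As a sketch of that arithmetic, summing endOffset minus committed offset across partitions gives the group's total lag. The class name and the offset maps below are illustrative; in a real service the maps would come from the Kafka consumer/admin APIs:

```java
import java.util.Map;

public class LagCalculator {
    // Lag per partition is approximately endOffset - committedOffset,
    // clamped at zero in case the committed offset briefly leads the
    // cached end offset.
    public static long totalLag(Map<Integer, Long> endOffsets,
                                Map<Integer, Long> committedOffsets) {
        long total = 0;
        for (Map.Entry<Integer, Long> e : endOffsets.entrySet()) {
            long committed = committedOffsets.getOrDefault(e.getKey(), 0L);
            total += Math.max(0, e.getValue() - committed);
        }
        return total;
    }

    public static void main(String[] args) {
        // Partition 0: end 100, committed 90 -> lag 10
        // Partition 1: end 50, committed 50 -> lag 0
        Map<Integer, Long> end = Map.of(0, 100L, 1, 50L);
        Map<Integer, Long> committed = Map.of(0, 90L, 1, 50L);
        System.out.println(totalLag(end, committed)); // prints 10
    }
}
```

A GET endpoint could return this total (pending messages) alongside the sum of committed offsets (consumed messages) to show progress.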

kafka Manager not showing consumer lag or Sum of partition offsets

I don't know if it is a config issue or not. Kafka Manager can get everything except consumer lag. Can anyone help me?
I'm using Kafka 2.2.0.
I also checked the logs; no errors at all.
Also, one important thing: my code base is processing messages and inserting them into the database.
The reason I posted this question is that I cannot see whether there is any lag on the topic,
and I need that to decide how many consumers I have to run.

Throttling of messages on consumer side

I am a beginner with Kafka and have developed a consumer for Kafka messages which looks good right now.
However, while testing the consumer, a requirement came up that some throttling of messages may be needed on the consumer side.
The consumer (.NET Core, using Confluent), after receiving messages, calls an API, and the API processes the message. As part of this process, it does a number of reads and writes to the database.
The scenario is: the consumer may receive millions, or at least a few thousand, messages daily. This puts load on the DB during processing.
So I am thinking of putting some throttling on receiving messages in the Kafka consumer so the DB will not be overloaded. I have checked the options for poll, but they don't seem to be all I want.
For example, within 10 minutes the consumer can receive only 100k messages. Something like that.
Could anybody please suggest how to implement throttling of messages on a Kafka consumer, or is there a better way this can be handled?
I investigated more and learned from an expert that "throttling on the consumer side is not easy to implement, since the Kafka consumer is implemented in such a way as to read and process messages as soon as they are available in the Kafka topic. So, speed is a benefit in the Kafka world :)"
It seems I can't do much on the Kafka consumer side. I am thinking of looking at the other side; maybe separating reads (to a replica) and writes to the database can help.
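That said, one common workaround is to pace the processing loop yourself rather than the client: count messages per time window and back off (or pause the partition) once the limit is hit. A minimal fixed-window sketch of that counter, in Java for illustration though the asker is on .NET (class name, window size and limit are all illustrative):

```java
// Minimal fixed-window rate limiter sketch: allow at most `limit`
// acquisitions per window. In a consumer loop, call tryAcquire()
// before handling each record and sleep or pause the partition when
// it returns false.
public class FixedWindowLimiter {
    private final long windowMillis;
    private final int limit;
    private long windowStart;
    private int count;

    public FixedWindowLimiter(long windowMillis, int limit) {
        this.windowMillis = windowMillis;
        this.limit = limit;
        this.windowStart = System.currentTimeMillis();
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) {
            windowStart = now; // start a new window
            count = 0;
        }
        if (count < limit) {
            count++;
            return true;
        }
        return false; // caller should back off
    }
}
```

For "100k messages per 10 minutes", the window would be 600,000 ms and the limit 100,000; pausing the partition while throttled avoids the consumer being kicked from the group for not polling.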

Jboss Messaging. sending one message per time

We are using JBoss 5.1.0, with a topic for storing our messages, and our client makes a durable subscription to get those messages.
Everything is working fine, but there is one issue: we get data from a TCP client, process it, and put it on the topic at around 10 messages per second, while our client reads one message at a time. There is a huge gap between the two, and after some time the JBoss topic holds many messages and crashes with an out-of-memory error.
Is there any workaround for this?
Basically the producer is producing 10x more messages than the consumer can handle. If this situation is stable (not only during peaks), it will never work.
If you limit the producer to send only one message per second (which is of course possible, e.g. check out RateLimiter), what will you do with the extra messages on the producer side? If they are not queueing up in the topic, they will queue up on the producer side.
You have a few choices:
somehow tune your consumer to process messages faster, so the topic never fills up;
tune the topic to use persistent storage. This is much better: not only does the topic not store everything in memory, but you might also get transactional behaviour (messages are durable);
put a queue in front of the topic for the messages you want to send, and process one message per second. That queue must be persistent and must be able to hold more messages than the topic currently can.
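The last choice, a producer-side buffer drained at a fixed rate, can be sketched with a bounded in-memory queue. In production it would need to be persistent, and `send` here stands in for the real JMS publish; the class name, queue size and rate are illustrative:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class PacedSender {
    // Bounded buffer between the TCP receiver and the JMS topic.
    // offer() makes the producer back off instead of flooding the topic.
    private final BlockingQueue<String> buffer = new ArrayBlockingQueue<>(10_000);

    public boolean enqueue(String message) {
        return buffer.offer(message); // false -> producer must retry or drop
    }

    // Drain one message per second toward the topic; `send` stands in
    // for the real JMS publish call.
    public void drainLoop(java.util.function.Consumer<String> send)
            throws InterruptedException {
        while (true) {
            String msg = buffer.take(); // blocks until a message is available
            send.accept(msg);
            Thread.sleep(1000); // pace: one message per second
        }
    }
}
```

The bounded capacity is what turns the out-of-memory crash into explicit backpressure: when the buffer is full, `enqueue` returns false and the TCP-facing side must decide to retry, drop, or slow its own source.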