I have a strange problem. After upgrading to JBoss EAP 6.1 Final (AS 7), the HornetQ queue listeners fire only after a delay of about one day. I can see that the queue holds more data than the max size limit, so the data is being paged by JBoss. But why the delay in consuming the messages?
Any help would be appreciated.
Some weeks ago my project was updated to use Kafka 3.2.1 instead of the version that ships with Spring Boot 2.7.3 (3.1.1). We made this upgrade to avoid an issue in Kafka Streams – illegal state and argument exceptions were not ending up in the uncaught exception handler.
On the consumer side, we also moved to the cooperative sticky assignor.
In parallel, we started some resiliency tests, and we began to see Kafka records that are no longer consumed on some partitions when using a Kafka batch listener. The issue occurred after several rebalances caused by the tests (deployment is done in Kubernetes, and we stopped some pods, microservices, and broker instances). The issue is not present on every listener. The Kafka brokers and microservices are up and running.
During our investigations:
we enabled Kafka events and can clearly see that the consumer is started;
we can see in the logs that the partitions that are not consuming events are assigned;
debug logging has been enabled on the KafkaMessageListenerContainer, and we see a lot of occurrences of Receive: 0 records and Commit list: {}.
Are there any blocking points to using Kafka 3.2.1 with Spring Boot 2.7.3 / Spring Kafka 2.8.8?
Any help or advice to progress our investigation is more than welcome.
Multiple listeners are defined, and the retry seems to be fired from another listener (a shared error handler?).
This is a known bug, fixed in the next release:
https://github.com/spring-projects/spring-kafka/issues/2382
https://github.com/spring-projects/spring-kafka/commit/3de1e89ba697ead04de171cfa35273bb0daddbe6
A temporary workaround is to give each container its own error handler.
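A minimal sketch of that workaround, assuming Spring Kafka 2.8+ (the DefaultErrorHandler back-off values are illustrative):

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.listener.DefaultErrorHandler;
import org.springframework.util.backoff.FixedBackOff;

@Configuration
public class KafkaErrorHandlingConfig {

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
            ConsumerFactory<String, String> consumerFactory) {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory);
        // Create a fresh error handler per container instead of sharing one
        // bean across all listeners, so retry state cannot leak between them.
        factory.setContainerCustomizer(container ->
                container.setCommonErrorHandler(
                        new DefaultErrorHandler(new FixedBackOff(1000L, 3L))));
        return factory;
    }
}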
I am using Spring Kafka to consume messages with a ConcurrentMessageListenerContainer. In production I am seeing it stop consuming messages abruptly, without any errors; sometimes even a single consumer within a JVM stops consuming while the other consumers are still consuming (I have 15 partitions and 3 JVMs, each with a concurrency of 5).
When I restart the JVM, it starts consuming again!
Is there any way I can periodically check whether a consumer has died, and restart it without restarting the JVM?
Most likely, the consumer thread is "stuck" in your code somewhere. I suggest you take a thread dump when this happens, to see what the thread is doing.
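For the second part of the question: if you want to detect a consumer that has stopped polling and restart its container without restarting the JVM, Spring Kafka can publish a NonResponsiveConsumerEvent when the container's monitorInterval is configured. A rough sketch (the listener id "myListener" is a placeholder; note that a restart will not help if the thread is permanently stuck in user code):

import org.springframework.context.event.EventListener;
import org.springframework.kafka.config.KafkaListenerEndpointRegistry;
import org.springframework.kafka.event.NonResponsiveConsumerEvent;
import org.springframework.kafka.listener.MessageListenerContainer;
import org.springframework.stereotype.Component;

@Component
public class StuckConsumerRestarter {

    private final KafkaListenerEndpointRegistry registry;

    public StuckConsumerRestarter(KafkaListenerEndpointRegistry registry) {
        this.registry = registry;
    }

    // Published when the monitor task notices the consumer has not polled
    // recently (containerProperties.setMonitorInterval(...) must be set).
    @EventListener
    public void onNonResponsiveConsumer(NonResponsiveConsumerEvent event) {
        MessageListenerContainer container = registry.getListenerContainer("myListener");
        if (container != null) {
            container.stop();
            container.start();
        }
    }
}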
We are observing that Kafka brokers occasionally take much more time to load logs on startup than usual: 40 minutes instead of at most 1 minute. This happens during a rolling restart following the procedure described by Confluent, and after the broker has reported that controlled shutdown was successful.
Kafka Setup
Confluent Platform 5.5.0
Kafka Version 2.5.0
3 Replicas (minimum 2 in sync)
Controlled broker shutdown enabled
1TB of AWS EBS for Kafka log storage
Other potentially useful information
We make extensive use of Kafka Streams
We use exactly-once processing and transactional producers/consumers
Observations
It is not always the same broker that takes a long time.
It does not occur only when the broker is the active controller.
A log partition that loads quickly (15ms) can take a long time (9549 ms) for the same broker a day later.
We experienced this issue before on Kafka 2.4.0 but after upgrading to 2.5.0 it did not occur for a few weeks.
Does anyone have an idea what could be causing this? Or what additional information would be useful to track down the issue?
I am using a stateless processor with Kafka Streams 1.0 and Kafka broker 1.0.1.
The problem is that my CustomProcessor gets closed every few seconds, which triggers a rebalance. I am using the following configs:
session.timeout.ms=15000
heartbeat.interval.ms=3000       # set to 1/3 of session.timeout.ms
max.poll.interval.ms=2147483647  # Integer.MAX_VALUE; intensive computation can take up to 10 min per message (NLP operations)
max.poll.records=1
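For reference, one way to pass consumer-level settings like these to the embedded consumer of a Kafka Streams application is StreamsConfig's consumer prefix. A minimal sketch (the application id and bootstrap servers are placeholders):

import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.streams.StreamsConfig;

Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");  // placeholder
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");  // placeholder
// Consumer-level settings are forwarded to the embedded consumer
// by prefixing them with "consumer.".
props.put(StreamsConfig.consumerPrefix(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG), 15000);
props.put(StreamsConfig.consumerPrefix(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG), 3000);
props.put(StreamsConfig.consumerPrefix(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG), Integer.MAX_VALUE);
props.put(StreamsConfig.consumerPrefix(ConsumerConfig.MAX_POLL_RECORDS_CONFIG), 1);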
Despite this configuration, and my understanding of how the Kafka timeout configurations work, I see the consumer rebalancing every few seconds.
I have already been through the articles below and other Stack Overflow questions about how to tune long-running operations and avoid a very long session timeout that would make failure detection too late, but I still see the unexpected behavior, unless I am misunderstanding something:
KIP-62
Diff between session.timeout.ms and max.poll.interval
Kafka kstreams processing timeout
For the consumer environment setup, I have 8 machines, each with 16 cores, consuming from 1 topic with 100 partitions, and I am following the practices recommended in this Confluent doc.
Any pointers?
I figured it out. After lots of debugging and enabling verbose logging for both the Kafka Streams client and the broker, it turned out to be two things:
There is a critical bug in Streams 1.0.0 (HERE), so I upgraded my client version from 1.0.0 to 1.0.1.
I updated the value of the consumer property default.deserialization.exception.handler from org.apache.kafka.streams.errors.LogAndFailExceptionHandler to org.apache.kafka.streams.errors.LogAndContinueExceptionHandler (sketched below).
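A minimal sketch of that second change, assuming the property is set programmatically (the application id and bootstrap servers are placeholders):

import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.errors.LogAndContinueExceptionHandler;

Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");  // placeholder
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");  // placeholder
// Log and skip records that fail deserialization instead of failing
// the stream thread (the default LogAndFailExceptionHandler fails).
props.put(StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
        LogAndContinueExceptionHandler.class);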
After the above two changes, everything has been running perfectly with no restarts. I am using Grafana to monitor the restarts, and for the past 48 hours there has not been a single one.
I might do more troubleshooting to determine which of the two items above was the real fix, but I am in a hurry to deploy to production, so if anybody is interested, feel free to start from there; otherwise, once I have time I will do the further analysis and update the answer!
So happy to get this fixed!!!
Is there a way to tell JMS in JBoss to delay processing of messages already in the persistent queue for a while, e.g. 2 minutes, while JBoss starts?
As it is right now, when we restart JBoss, JMS starts dispatching messages to the MessageListeners even before JBoss has started properly.
We're running JBoss 4.2.3.
I have found an annotation called @Depends, where an EJB or other service can list what it depends on:
http://docs.jboss.org/ejb3/docs/reference/build/reference/en/html/jboss_extensions.html
To actually start an EJB only when the server is up and listening, this works best:
http://community.jboss.org/wiki/BarrierController
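A rough sketch of how the two can be combined with JBoss 4.2's EJB3 extensions: the MDB declares a dependency on the barrier MBean, which the BarrierController starts only once the server is fully up. The ObjectName and queue name below are placeholders that must match your *-service.xml:

import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.jms.Message;
import javax.jms.MessageListener;
import org.jboss.annotation.ejb.Depends;

// This MDB is not started until the barrier MBean reaches the STARTED
// state, which the BarrierController triggers after server startup.
@MessageDriven(activationConfig = {
    @ActivationConfigProperty(propertyName = "destinationType",
            propertyValue = "javax.jms.Queue"),
    @ActivationConfigProperty(propertyName = "destination",
            propertyValue = "queue/myQueue")  // placeholder queue
})
@Depends("jboss:service=BarrierController,barrier=startupBarrier")  // placeholder ObjectName
public class DelayedQueueListener implements MessageListener {

    public void onMessage(Message message) {
        // handle the message once the server has fully started
    }
}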