Can we control the transaction retry interval in MDB? If so, please provide an example or direct me to the documentation. We want to set up a time interval of 3 min for MDB transactions. The desire is that if the query fails the first time, then it retries after 3 min have elapsed.
Vairam;
Take a look at the HornetQ documentation for message redelivery. The issues you need to consider are:
The redelivery delay (you indicated 3 minutes).
The number of times the message should be redelivered.
If you elect not to redeliver indefinitely, the final action taken when the last redelivery attempt fails, which could be:
Drop the message.
Enqueue the message to the designated DLQ.
Enqueue the message to some other queue.
Setting the redelivery delay
Delayed redelivery is defined in the address-setting configuration.
Example:
<!-- delay redelivery of messages for 3 minutes (180,000 ms) -->
<address-setting match="jms.queue.exampleQueue">
   <redelivery-delay>180000</redelivery-delay>
</address-setting>
Setting the maximum number of redeliveries and DLQ configuration
This can be defined declaratively by specifying the DLQ configuration in the address-setting configuration:
Example:
<!-- undelivered messages in exampleQueue will be sent to the dead letter address
deadLetterQueue after 3 unsuccessful delivery attempts
-->
<address-setting match="jms.queue.exampleQueue">
   <dead-letter-address>jms.queue.deadLetterQueue</dead-letter-address>
   <max-delivery-attempts>3</max-delivery-attempts>
</address-setting>
If you want to drop the message after the designated number of redelivery failures, check the message header "JMSXDeliveryCount"; if that count equals the maximum number of redeliveries, simply suppress any exceptions and commit the transaction.
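For example, a minimal sketch of that approach in an MDB with container-managed transactions; the queue name, the limit of 3 attempts, and the process() helper are illustrative and must be adapted to your configuration:
import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;

@MessageDriven(activationConfig = {
    @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
    @ActivationConfigProperty(propertyName = "destination", propertyValue = "queue/exampleQueue")
})
public class ExampleMDB implements MessageListener {

    private static final int MAX_DELIVERY_ATTEMPTS = 3; // must match max-delivery-attempts

    @Override
    public void onMessage(Message message) {
        try {
            process(message); // your business logic (placeholder)
        } catch (Exception e) {
            if (getDeliveryCount(message) >= MAX_DELIVERY_ATTEMPTS) {
                // Final attempt: suppress the exception so the transaction commits
                // and the message is dropped instead of being redelivered.
                return;
            }
            // Earlier attempts: rethrow so the transaction rolls back and the
            // broker redelivers after the configured redelivery-delay.
            throw new RuntimeException(e);
        }
    }

    private int getDeliveryCount(Message message) {
        try {
            return message.getIntProperty("JMSXDeliveryCount");
        } catch (JMSException e) {
            return 1; // be conservative if the property cannot be read
        }
    }

    private void process(Message message) throws Exception {
        // ...
    }
}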
Related
If using SeekToCurrentErrorHandler with stateful retry, such that the message is polled from the broker for each retry, there is a risk that over a long retry period a consumer group rebalance could cause the partition to be re-assigned to another consumer. The stateful retry period/attempts would then be reset, as the new consumer has no knowledge of the state of the retry.
For example, if the maximum retry period were 24 hours but consumer group rebalancing happened on average every 12 hours, the retry could never complete, and the message (and those behind it) would eventually expire from the topic once they exceeded the retention period (assuming the cause of the retryable exception was not resolved in that time). The message would not end up on the DLT after 24 hours as expected, because the retries would never be exhausted due to the reset.
I assume that even if a consumer is retrying by re-polling messages, there is no guarantee that, following a rebalance, this consumer would retain assignment to this partition. Or is it the case that we can be confident that, as long as this consumer instance is alive, it would typically retain assignment to the partition it is polling?
Are there best practices/guidelines on the use of stateful retry to cater for this?
With stateless retry, any total retry time that exceeds the poll timeout causes rebalancing and duplicate message delivery, so to avoid that the retry period must be very limited. Or is the guideline to allow this and ensure messages are deduplicated by the consumer, so that duplicates are acceptable and long-running stateless retries can be configured?
Is the only safe and stable option for enabling a retry period of something like several hours (e.g. to cater for a service being unavailable for this period) to use retry topics?
Thanks,
Rob.
The whole point of stateful retry was to avoid a rebalance; without it, the consumer would be delayed up to the aggregate of all retry attempt delays.
However, retry in the listener adapter (including stateful retry) has now been deprecated because the error handler can now do everything the RetryTemplate can do (back off, exception classification, etc.).
With stateful retry (or backoffs in the error handler), the longest back off must be less than max.poll.interval.ms.
A 24-hour backoff is, frankly, ridiculous; it would be better to just stop the container and restart it a day later.
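As a rough illustration (not part of the original answer), configuring the back off on the error handler in Spring for Apache Kafka might look like the sketch below. The 5-second interval and 3 attempts are illustrative, the total back off must stay below max.poll.interval.ms, and in recent versions SeekToCurrentErrorHandler has been superseded by DefaultErrorHandler:
import org.springframework.context.annotation.Bean;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.listener.SeekToCurrentErrorHandler;
import org.springframework.util.backoff.FixedBackOff;

// inside a @Configuration class
@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
        ConsumerFactory<String, String> consumerFactory) {
    ConcurrentKafkaListenerContainerFactory<String, String> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory);
    // Re-seek and retry the failed record 3 times, 5 seconds apart; after that
    // the default recoverer logs the record and it is skipped.
    factory.setErrorHandler(new SeekToCurrentErrorHandler(new FixedBackOff(5000L, 3L)));
    return factory;
}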
I am going through the documentation and it is a little confusing about the parameter "max.in.flight.requests.per.connection":
The maximum number of unacknowledged requests the client will send on a single connection before blocking. Note that if this setting is set to be greater than 1 and there are failed sends, there is a risk of message re-ordering due to retries (i.e., if retries are enabled).
Does the phrase "unacknowledged requests" refer to per producer, per connection, or per client?
Edit
Please see the answer below from Eugene. I'm not sure if this answer was wrong or if Kafka changed the behaviour in the 2 years between the answers.
Original answer
It's per partition. Kafka internally might multiplex connections (e.g. to send several requests using a single connection for different topics/partitions that are handled by the same broker), or have an individual connection per partition, but these are performance concerns that are mostly dealt with inside the client.
The documentation of retries sheds some more light (and clarifies that it is per partition):
Setting a value greater than zero will cause the client to resend any record whose send fails with a potentially transient error. Note that this retry is no different than if the client resent the record upon receiving the error. Allowing retries without setting max.in.flight.requests.per.connection to 1 will potentially change the ordering of records because if two batches are sent to a single partition, and the first fails and is retried but the second succeeds, then the records in the second batch may appear first. Note additionally that produce requests will be failed before the number of retries has been exhausted if the timeout configured by delivery.timeout.ms expires first before successful acknowledgement. Users should generally prefer to leave this config unset and instead use delivery.timeout.ms to control retry behavior.
This is a setting per connection, per broker. If you have a producer, then internally it uses a Sender thread that dispatches batches from the RecordAccumulator to the broker (in simpler words: sends messages). This Sender thread is allowed to have at most ${max.in.flight.requests.per.connection} requests for which it has not yet received acknowledgements from the broker. Think about it this way: the sender performs a few operations in a typical processing cycle.
Drain batches -> Make Requests -> Pool Connections -> Fire Callbacks.
So at some point (Pool Connections) it can send a request to the broker but not wait for a response; it will check for the response in the next cycle. It can have up to max.in.flight.requests.per.connection such unacknowledged requests.
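To make that concrete, here is a hedged sketch of a plain Java producer configuration that sets the parameter explicitly; the broker address and serializers are placeholders, and 5 is simply the client's default:
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
// Upper bound on requests the Sender thread may have outstanding (unacknowledged)
// on each broker connection; lower it to 1 if retries must never reorder batches.
props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 5);
KafkaProducer<String, String> producer = new KafkaProducer<>(props);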
If ActiveMQ Artemis is configured with a redelivery-delay > 0 and a JMS listener uses ctx.rollback() or ctx.recover() then the broker will redeliver the message as expected. But if a producer pushes a message to the queue during a redelivery then the receiver gets unordered messages.
For example:
Queue: 1 -> message 1 is redelivered as expected
Push during the redelivery phase
Queue: 2,3 -> the receiver gets 2,3,1
With a redelivery-delay of 0 everything is OK, but the frequency of redeliveries on the consumer side is too high. My expectation is that every delivery to the consumer should be stopped until the unacknowledged message is purged from the queue or acknowledged. We are using a queue for the connection with single devices. Every device has its own I/O queue with a single consumer. The word queue suggests strict ordering to me. It would be nice to make this behavior configurable, e.g. "strict_redelivery_order".
What you're seeing is the expected behavior. If you use a redelivery-delay > 0 then delivery order will be broken. If you use a redelivery-delay of 0 then delivery order will not be broken. Therefore, if you want to maintain strict order then use a redelivery-delay of 0.
If the broker blocked delivery of all other messages on the queue during a redelivery delay that would completely destroy message throughput performance. What if the redelivery delay were 60 seconds or 10 minutes? The queue would be blocked that entire time. This would not be tenable for an enterprise message broker serving hundreds or perhaps thousands of clients each of whom may regularly be triggering redeliveries on shared queues. This behavior is not configurable.
If you absolutely must maintain message order even for messages that cannot be immediately consumed and a redelivery-delay of 0 causes redeliveries that are too fast then I see a few potential options (in no particular order):
Configure a dead-letter address and set a max-delivery-attempts to a suitable value so after a few redeliveries the problematic message can be cleared from the queue.
Implement a delay of your own in your client. This could be as simple as catching any exception and using a Thread.sleep() before calling ctx.rollback().
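For instance, a minimal sketch of the second option, assuming a transacted JMSContext named ctx, an already-created queue, and a process() placeholder for the application logic (javax.jms imports omitted); the 60-second pause is illustrative:
JMSConsumer consumer = ctx.createConsumer(queue);
while (true) {
    Message message = consumer.receive();
    try {
        process(message); // application logic (placeholder)
        ctx.commit();
    } catch (Exception e) {
        // Throttle redeliveries in the client so redelivery-delay can stay at 0
        // and the broker preserves strict message order.
        try {
            Thread.sleep(60_000);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        ctx.rollback();
    }
}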
If I am using Kafka Async producer, assume there are X number of messages in buffer.
When they are actually processed on the client, and a broker or a specific partition is down for some time, the Kafka client will retry. If a message fails, would it mark that specific message as failed and move on to the next message (which could lead to out-of-order messages)? Or would it fail the remaining messages in the batch in order to preserve order?
I need to maintain the ordering, so I would ideally want Kafka to fail the batch from the point where it failed, so I can retry from the failure point. How would I achieve that?
As the Kafka documentation says about retries:
Setting a value greater than zero will cause the client to resend any record whose send fails with a potentially transient error. Note that this retry is no different than if the client resent the record upon receiving the error. Allowing retries will potentially change the ordering of records because if two records are sent to a single partition, and the first fails and is retried but the second succeeds, then the second record may appear first.
So, to answer your title question: no, Kafka does not guarantee ordering under async sends with retries.
I am updating the answer based on Peter Davis's question.
I think that if you want to send in batch mode, the only way to secure ordering would be to set max.in.flight.requests.per.connection=1, but as the documentation says:
Note that if this setting is set to be greater than 1 and there are failed sends, there is a risk of message re-ordering due to retries (i.e., if retries are enabled).
Starting with Kafka 0.11.0, there is the enable.idempotence setting, as documented.
enable.idempotence: When set to true, the producer will ensure that exactly one copy of each message is written in the stream. If false, producer retries due to broker failures, etc., may write duplicates of the retried message in the stream. Note that enabling idempotence requires max.in.flight.requests.per.connection to be less than or equal to 5, retries to be greater than 0 and acks must be all. If these values are not explicitly set by the user, suitable values will be chosen. If incompatible values are set, a ConfigException will be thrown.
Type: boolean Default: false
This will guarantee that messages are ordered and that no loss occurs for the duration of the producer session. Unfortunately, the producer cannot set the sequence id, so Kafka can make these guarantees only per producer session.
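A hedged sketch of such an idempotent producer configuration; the broker address and serializers are placeholders:
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
// Turns on the idempotent producer; acks=all, retries > 0 and
// max.in.flight.requests.per.connection <= 5 are required (chosen automatically if unset).
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
props.put(ProducerConfig.ACKS_CONFIG, "all");
KafkaProducer<String, String> producer = new KafkaProducer<>(props);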
Have a look at Apache Pulsar if you need to set the sequence id, which would allow you to use an external sequence id, which would guarantee ordered and exactly-once messaging across both broker and producer failovers.
I am trying to deliver a JMS message after some time passes. My initial idea was to use an expiry queue and to put the messages in a queue that doesn't have any consumers. So I have 3 queues:
WaitQueue - (expiry queue for this one is set to SendQueue)
SendQueue - this one has consumers that process the messages(by default this one has expiryQueue as its timeout queue)
ExpiryQueue - default jboss queue for all messages that really expired(not intentionally)
I insert a message into the WaitQueue with my intended delay as the TimeToLive. After the time expires I expect to see the messages in SendQueue (and the consumers to process them); however, it stays empty and the messages go directly to ExpiryQueue. Any ideas what is wrong?
The statistics for SendQueue show that "Received messages" increases, but the current message count stays at 0, so the messages arrive but get forwarded immediately to the final ExpiryQueue.
Instead of the expiry queue approach, which is more resource intensive, you could consider using a delivery delay at the message level.
In case of HornetQ, you can set the property _HQ_SCHED_DELIVERY.
https://docs.jboss.org/hornetq/2.3.0.Final/docs/user-manual/html/scheduled-messages.html
TextMessage message = session.createTextMessage("This is a scheduled message which will be delivered in 5 sec.");
message.setLongProperty("_HQ_SCHED_DELIVERY", System.currentTimeMillis() + 5000);
producer.send(message);
Since JMS 2.0 (Java EE 7) the delivery delay can also be set on the MessageProducer. See https://github.com/jboss/jboss-jms-api_spec/blob/master/src/main/java/javax/jms/MessageProducer.java#L285
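A short sketch of the JMS 2.0 variant, assuming an existing Session named session and a Queue named queue:
MessageProducer producer = session.createProducer(queue);
producer.setDeliveryDelay(5000); // every message sent by this producer is delayed 5 sec
producer.send(session.createTextMessage("This message will be delivered in 5 sec."));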