Discard duplicate messages only if they are still queued with ActiveMQ Artemis and JBoss EAP 7.1 - activemq-artemis

We're using ActiveMQ Artemis on JBoss EAP 7.1.
We noticed that once a message with a specific _AMQ_DUPL_ID value has passed through the queue, if the producer sends another message with the same _AMQ_DUPL_ID value to the same queue, it is discarded by the broker. However, we need to discard duplicate messages only if they are still in the queue.
Is there a way to achieve this goal?
We use the primary key from the database as the _AMQ_DUPL_ID value. This is the code we use:
public void sendMessage(final T msg, final String id) {
    jmsTemplate.send(destination, new MessageCreator() {
        @Override
        public Message createMessage(Session session) throws JMSException {
            // Use the database primary key as the duplicate detection ID
            Message message = session.createObjectMessage(msg);
            message.setStringProperty("_AMQ_DUPL_ID", id);
            return message;
        }
    });
}
We're looking for a solution because we have a timer that, every 30 seconds, loads from the DB all records with a specific value in their status field and puts them into the JMS queue.
Consumers consume the JMS messages, process them, update their status field, insert/update them in the DB, and open a websocket connection to another application that we don't control. Sometimes a consumer hangs on the websocket call and, consequently, remains busy while the timer continues to fill the queue.
To solve this problem we thought that something like Artemis' duplicate message detection would help. However, when the external app hangs our consumer, we need our timer to be able to put the message on the queue again.

The duplicate message detection in ActiveMQ Artemis is working as designed. Its goal is to avoid any chance that a consumer will receive a duplicate message, which means that even if a message is no longer in the queue (e.g. because it was consumed), any duplicate of that message should still be rejected.
What you're asking for here is akin to asking how you can insert multiple records with the same primary key into a database table. You simply can't, because the entire point of having a primary key is to avoid duplicate records.
I recommend you implement some kind of timeout for the websocket call; otherwise your application will be negatively impacted by a resource you have no control over.
Aside from that, you may be able to use a last-value queue, using the primary key as the value for _AMQ_LVQ_NAME. This guarantees that only one instance of the message will be in the queue at any point. Read the documentation for more details.
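For illustration, here is a minimal sketch of the last-value variant of the sendMessage() method above. The only change is the property name; _AMQ_LVQ_NAME is Artemis' reserved last-value header, and note that the target queue must also be configured as a last-value queue on the broker side:
public void sendMessage(final T msg, final String id) {
    jmsTemplate.send(destination, new MessageCreator() {
        @Override
        public Message createMessage(Session session) throws JMSException {
            Message message = session.createObjectMessage(msg);
            // Messages sharing the same _AMQ_LVQ_NAME value replace one another,
            // so at most one instance with this key is in the queue at any time
            message.setStringProperty("_AMQ_LVQ_NAME", id);
            return message;
        }
    });
}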

Related

Reconsume Kafka Message that failed during processing due to DB error

I am new to Kafka and would like to seek advice on the best practice for handling such a scenario.
Scenario:
I have a Spring Boot application with a consumer method that listens for messages via the @KafkaListener annotation. When a message arrives, the consumer method processes it, which simply performs database updates to different tables via JdbcTemplate.
If the updates to the tables are successful, I manually commit the message by calling the acknowledge() method. If the database update fails, instead of calling acknowledge() I call the nack() method with a given duration (e.g. 10 seconds) so that the message reappears to be consumed again.
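For context, a minimal sketch of the listener described above (topic, table, and column names are placeholders; MANUAL ack mode is assumed on the container factory, and nack(Duration) requires a reasonably recent spring-kafka version):
import java.time.Duration;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.dao.DataAccessException;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.support.Acknowledgment;

public class EventConsumer {

    private final JdbcTemplate jdbcTemplate;

    public EventConsumer(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    @KafkaListener(topics = "my-topic") // placeholder topic name
    public void listen(ConsumerRecord<String, String> record, Acknowledgment ack) {
        try {
            // Apply the DB update for this event (table/column names are made up)
            jdbcTemplate.update("UPDATE some_table SET status = ? WHERE id = ?",
                    "PROCESSED", record.key());
            ack.acknowledge(); // commit the offset on success
        } catch (DataAccessException e) {
            // Re-deliver this record after 10s; everything behind it on the
            // partition also pauses, which is the symptom described below
            ack.nack(Duration.ofSeconds(10));
        }
    }
}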
Things to note
I am not concerned with the ordering of the messages. Whatever event comes I just have to consume and process it, that's all.
I am only given a topic (no retryable topic and no dead letter topic)
Here is the problem
If I do the above, my consumer becomes inconsistent. Let's say I call the nack() method with a duration of 1 minute, meaning the same message will reappear after 1 minute.
Within this 1 minute there could be "x" number of incoming messages to be consumed and processed. The observation was that none of these messages gets consumed and processed.
What I want to know
Hence, I hope someone can advise me on what I am doing wrong and what is the best practice / way to handle such scenarios.
Thanks!
Records are always received in order; there is no way to defer the current record until later while continuing to process the records after it when consuming from a single topic.
Kafka topics are a linear log, not a queue.
You would need to send it to another topic; the @RetryableTopic (non-blocking retries) feature is specifically designed for this use case.
https://docs.spring.io/spring-kafka/docs/current/reference/html/#retry-topic
You could also increase the container concurrency so at least you could continue to process records from other partitions.
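For reference, a rough sketch of the non-blocking retry feature (the annotation attributes and topic name are illustrative; a recent spring-kafka version is assumed):
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.annotation.DltHandler;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.annotation.RetryableTopic;
import org.springframework.retry.annotation.Backoff;

public class RetryingConsumer {

    // Failed records are republished to auto-created retry topics instead of
    // blocking the main topic; attempt count and back-off are arbitrary here
    @RetryableTopic(attempts = "4", backoff = @Backoff(delay = 10_000))
    @KafkaListener(topics = "my-topic") // placeholder topic name
    public void listen(ConsumerRecord<String, String> record) {
        processAndUpdateDatabase(record); // throwing here triggers a retry topic
    }

    @DltHandler
    public void handleDlt(ConsumerRecord<String, String> record) {
        // Record exhausted all retries; log it or park it for manual handling
    }

    private void processAndUpdateDatabase(ConsumerRecord<String, String> record) {
        // DB updates go here
    }
}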

Message order issue in single consumer connected to ActiveMQ Artemis queue

Is there any possibility of a message ordering issue with a single consumer and multiple producers on one queue?
Producer1 publishes message m1 at 2021-06-27 02:57:44.513 and producer2 publishes message m2 at 2021-06-27 02:57:44.514 on the same queue, worker_consumer_queue. Client code connected to the queue and configured as a single consumer should receive m1 first and then m2, correct? Sometimes messages are received in the wrong order. The version is ActiveMQ Artemis 2.17.0.
Even though I mentioned multiple producers, messages are published one after another from the same thread, using the property blockOnDurableSend=false.
I create and close a producer on each message publish. My assumption is that, on the same JVM, the order of published messages in the queue is preserved, whether they come from the same thread or from different threads, even with async sends. The timestamp is from getJMSTimestamp(). Does async publishing maintain order in some internal queue?
If you use blockOnDurableSend=false you're basically saying you don't strictly care about the order, or even whether the message makes it to the broker at all. Using blockOnDurableSend=false basically means "fire and forget."
Furthermore, the JMSTimestamp is not the time when the message is actually sent, as noted in the javax.jms.Message JavaDoc:
The JMSTimestamp header field contains the time a message was handed off to a provider to be sent. It is not the time the message was actually transmitted, because the actual send may occur later due to transactions or other client-side queueing of messages.
With more than one producer there is no guarantee that the messages will be processed in order.
Multiple producers, ActiveMQ Artemis, and one consumer form a distributed system, and the lack of a global clock is a significant characteristic of distributed systems.
Even if the producers and ActiveMQ Artemis were on the same machine and used the same clock, ActiveMQ Artemis could not be guaranteed to receive the messages in the same order the producers created and sent them, because the time to create a message and the time to send a message include variable latencies.
The easiest solution is to trust the order in which messages are received by ActiveMQ Artemis, adding a timestamp with an interceptor or enabling the ingress timestamp; see ARTEMIS-2919 for further details.
If the easiest solution doesn't work, the distributed solution is to implement a total-ordering algorithm for distributed systems, such as Lamport timestamps.
Well, it seems it is not a bug within Artemis; when it comes to a millisecond difference it is more likely network lag or something similar.
So, as a workaround, you could create an algorithm in which a received message waits for ~100 ms before it is actually processed (whatever you want to do with the message), and check whether another message that your application received afterwards was actually sent before it. Basically, keep your own receive queue with a delay.
If there is a message that was sent earlier, you could simply move it up in your own ordering. You could also consider rejecting the first message back to your bus; depending on your settings on queues and topics it could be received again afterwards. A sketch of this idea follows.
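Here is a minimal sketch of that delay-based workaround, assuming the consumer can read a producer-side timestamp from each message (all type and field names are invented for illustration):
import java.util.concurrent.Executors;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Buffer incoming messages for ~100 ms and release them in timestamp order,
// smoothing out small cross-producer ordering differences.
public class ReorderingBuffer {

    private static final long HOLD_BACK_MS = 100;

    private final PriorityBlockingQueue<TimestampedMessage> buffer =
            new PriorityBlockingQueue<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public ReorderingBuffer(Consumer<TimestampedMessage> handler) {
        // Every 10 ms, release any buffered message older than the hold-back window
        scheduler.scheduleAtFixedRate(() -> {
            long cutoff = System.currentTimeMillis() - HOLD_BACK_MS;
            TimestampedMessage head;
            while ((head = buffer.peek()) != null && head.timestamp() <= cutoff) {
                handler.accept(buffer.poll());
            }
        }, 10, 10, TimeUnit.MILLISECONDS);
    }

    public void onMessage(TimestampedMessage msg) {
        buffer.add(msg); // kept sorted by timestamp via compareTo below
    }

    public record TimestampedMessage(long timestamp, Object payload)
            implements Comparable<TimestampedMessage> {
        @Override
        public int compareTo(TimestampedMessage other) {
            return Long.compare(this.timestamp, other.timestamp);
        }
    }
}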

Kafka Streams handling timeout on a cluster

In a Kafka-based distributed JVM application running in several instances, I need to act on the event of "not receiving" a certain message in a specific Kafka topic for a certain configurable amount of time (this timeout value is driven by the business logic and is subject to change).
How can I accomplish this in a cluster-safe way?
Is the goal to trace latency of the end-to-end flow, or is there some trigger that causes a second message to be expected within some configurable time?
If tracking latency, some options include:
1. Add a timestamp to the message. When the message is received, the latency can be calculated and used.
2. Add a UUID, a timestamp, and the current component to the message, and delegate message tracking to a separate service partitioned on UUID.
If some trigger causes a second message to be expected, some options include:
1. Partition the relevant topic in a way that guarantees the expected message will either arrive, or fail to arrive, at only one JVM (similar to option 2 above). This allows a list of expected messages to be kept in memory; remove expected messages when they are received and, every N seconds, handle the "not received" messages.
2. Keep track of the expected messages in a data store (DB/distributed cache). When they are received, remove the records. Periodically, handle the "not received" messages.
Edit:
With the details in the comment, one way to approach this is with a callback-style design. Messages can be routed to a specific server by setting a partition key. By adding an intermediate topic/service partitioned on UUID, it should be possible to achieve this as follows (see the sketch after this list):
1. Send message A to ttl_routing_service. Message A should include a UUID, a TTL, where to send the message (the functional topic), and what to do on expiry.
2. The routing service picks up the message and tracks some metadata (e.g. the TTL and what to do on timeout) in a local cache, or starts a delayed coroutine, then routes message A to the functional topic, including the UUID.
3. On completion of message A's processing, a message can be sent to ttl_routing_service with the UUID, cancelling the delayed action or removing the cached record.
4. If the record is not removed in time, "what to do on expiry" is performed.
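A minimal sketch of the routing service's TTL tracking (the name ttl_routing_service comes from the outline above; a ScheduledExecutorService stands in for the delayed coroutine, and the callback types are illustrative):
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Tracks in-flight messages by UUID and fires an expiry action if no
// completion arrives within the TTL. Keeping this state locally is safe
// because the ttl_routing_service topic is partitioned on UUID, so a given
// UUID is always handled by the same instance.
public class TtlTracker {

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final Map<String, ScheduledFuture<?>> pending = new ConcurrentHashMap<>();

    // Called when message A passes through on its way to the functional topic
    public void track(String uuid, long ttlMillis, Runnable onExpiry) {
        ScheduledFuture<?> expiry = scheduler.schedule(() -> {
            pending.remove(uuid);
            onExpiry.run(); // "what to do on expiry" from the message
        }, ttlMillis, TimeUnit.MILLISECONDS);
        pending.put(uuid, expiry);
    }

    // Called when the completion message for this UUID arrives in time
    public void complete(String uuid) {
        ScheduledFuture<?> expiry = pending.remove(uuid);
        if (expiry != null) {
            expiry.cancel(false);
        }
    }
}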

Kafka Consumes unprocessable messages - How to reprocess broken messages later?

We are implementing a Kafka consumer using Spring Kafka. If I understand correctly, when processing of a single message fails, there are the following options:
Don't care and just ACK
Do some retry handling using a RetryTemplate
If even this doesn't work, do some custom failure handling using a RecoveryCallback
I am wondering what your best practices are for that. I am thinking of simple application exceptions, such as a DeserializationException (for JSON-formatted messages), or longer local storage downtime, etc. In those cases some extra work is needed, like a hotfix deployment, to fix the broken application so it can re-process the faulty messages.
Since losing messages (i.e. not processing them) is not an option for us, the only option left, IMO, is to store the faulty messages in some persistent store, e.g. another "faulty messages" Kafka topic, so that those events can be processed again at a later time and there is no need to stop event processing entirely.
How do you handle these scenarios?
One example is Spring Cloud Stream, which can be configured to publish failed messages to another topic errors.foo; users can then copy them back to the original topic to try again later.
This logic is done in the recovery callback.
We have a use case where we can't drop any messages at all, even faulty ones. So when we encounter a faulty message, we send a default message in place of that faulty record and at the same time send the message to a failed-topic for retry later.
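As a concrete sketch of the "publish failures to another topic" pattern with plain Spring Kafka rather than Spring Cloud Stream (DeadLetterPublishingRecoverer and DefaultErrorHandler are spring-kafka types; the back-off values are placeholders):
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.listener.DeadLetterPublishingRecoverer;
import org.springframework.kafka.listener.DefaultErrorHandler;
import org.springframework.util.backoff.FixedBackOff;

@Configuration
public class ErrorHandlingConfig {

    @Bean
    public DefaultErrorHandler errorHandler(KafkaTemplate<Object, Object> template) {
        // After in-process retries are exhausted, publish the failed record to
        // <original-topic>.DLT (the recoverer's default naming convention)
        DeadLetterPublishingRecoverer recoverer = new DeadLetterPublishingRecoverer(template);
        // Retry each failed record twice, 1 second apart, before giving up
        return new DefaultErrorHandler(recoverer, new FixedBackOff(1000L, 2L));
    }
}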

custom Flume interceptor: intercept() method called multiple times for the same Event

TL;DR
When a Flume source fails to push a transaction to the next channel in the pipeline, does it always keep event instances for the next try?
In general, is it safe to have a stateful Flume interceptor, where processing of events depends on previously processed events?
Full problem description:
I am considering the possibility of leveraging guarantees offered by Apache Kafka regarding the way topic partitions are distributed among consumers in a consumer group to perform streaming deduplication in an existing Flume-based log consolidation architecture.
Using the Kafka Source for Flume and custom routing to Kafka topic partitions, I can ensure that every event that should go to the same logical "deduplication queue" will be processed by a single Flume agent in the cluster (for as long as there are no agent stops/starts within the cluster). I have the following setup using a custom-made Flume interceptor:
[KafkaSource with deduplication interceptor]-->(MemoryChannel)-->[HDFSSink]
It seems that when the Flume Kafka source runner is unable to push a batch of events to the memory channel, the event instances that are part of the batch are passed again to my interceptor's intercept() method. In this case, it was easy to add a tag (in the form of a Flume event header) to processed events to distinguish actual duplicates from events in a failed batch that got re-processed.
However, I would like to know if there is any explicit guarantee that Event instances in failed transactions are kept for the next try or if there is the possibility that events are read again from the actual source (in this case, Kafka) and re-built from zero. In that case, my interceptor will consider those events to be duplicates and discard them, even though they were never delivered to the channel.
EDIT
This is how my interceptor distinguishes an Event instance that was already processed from a non-processed event:
public Event intercept(Event event) {
    Map<String, String> headers = event.getHeaders();
    // tagHeaderName is the name of the header used to tag events, never null
    if (!tagHeaderName.isEmpty()) {
        // Don't look further if event was already processed...
        if (headers.get(tagHeaderName) != null) {
            return event;
        }
        // Mark it as processed otherwise...
        headers.put(tagHeaderName, "");
    }
    // Continue processing of event...
    return event;
}
I encountered a similar issue:
When a sink write fails, the Kafka Source still holds the data that has already been processed by the interceptors. On the next attempt, that data is sent to the interceptors and gets processed again and again. From reading the KafkaSource code, I believe this is a bug.
My interceptor strips some information from the original message and modifies it. Due to this bug, the retry mechanism will never work as expected.
So far, there is no easy solution.