CQRS + ES Implementation Advice - apache-kafka

I'm working on a generic CQRS + ES framework (with Node.js) at my company. Note: only an RDBMS and Redis (without AOF/RDB persistence) are allowed, for various reasons.
I really need some advice on how to implement the CQRS + ES framework.
Setting the ES part aside, I'm struggling with the implementation of message propagation.
Here are the tables I have in the RDBMS.
EventStore: [aggregateId (varchar), aggregateType (varchar), aggregateVersion (bigint), messageId (varchar), messageData (varchar), messageMetadata (varchar), sequenceNumber (bigint)]
EventDelivery: [messageId (varchar, foreign key to EventStore), sequenceId (equal to aggregateId, varchar), sequenceNumber (equal to the one in EventStore, bigint)]
ConsumerGroup: [consumerGroup (varchar), lastSequenceNumberSeen (bigint)]
I also have multiple EventSubscribers:
// In Application 1
@EventSubscriber("consumerGroup1", AccountOpenedEvent)
...
// In Application 2
@EventSubscriber("consumerGroup2", AccountOpenedEvent)
...
Here is the flow when an AccountOpenedEvent is written to the EventStore table.
1. Each application (i.e. application 1 and application 2) scans its codebase to find every @EventSubscriber, creates a consumer group in the ConsumerGroup table with lastSequenceNumberSeen = 0, and then runs a scheduler (with a 100ms polling interval) that polls all events of interest (grouped by consumer group) in EventStore with the condition sequenceNumber >= lastSequenceNumberSeen.
2. For each event (EventStore) from step 1, calculate the sequenceId (here the sequenceId is equal to the aggregateId). The sequenceId, together with the sequenceNumber, is used to guarantee message delivery ordering. Persist it into the EventDelivery table and update lastSequenceNumberSeen = sequenceNumber (this prevents the same event from being scanned again in the next interval).
3. Each application (i.e. application 1 and application 2) has another scheduler (also with a 100ms polling interval) that polls the EventDelivery table (grouped by sequenceId and ordered by sequenceNumber ASC).
4. For each event (EventDelivery) from step 3, call the corresponding message handler. After the message is handled, acknowledge it by deleting the record in EventDelivery.
Since I have 2 applications, I have to split the handling of the AccountOpenedEvent in EventStore into 2 transactions. Because the 2 applications don't know about each other, I can only do this passively. That's why I need the EventDelivery table and the polling scheduler.
I'm assuming I can use redlock + cron to make sure only 1 instance runs the polling jobs, in case application 1 has more than 1 replica.
Application 1 will poll the AccountOpenedEvent, create a record in EventDelivery, and store the lastSequenceNumberSeen in its consumer group.
Application 2 will also poll the AccountOpenedEvent, create a record in EventDelivery, and store the lastSequenceNumberSeen in its consumer group.
Since application 1 and application 2 are in different consumer groups, they treat the event store stream separately.
Here is the first problem: we have 2 schedulers, and we would have more with more consumer groups, which puts a heavy load on the database. How can I solve this? One idea is to convert these 2 schedulers into jobs and put the jobs into a queue that is processed once per interval (let's say 100ms), but this seems like it would introduce large latency if a job is unlucky enough to be placed at the end of the queue.
Here is the second problem: in the above flow, I introduced the 2nd polling job to guarantee message delivery ordering. But unlike the first one, it has no lastSequenceNumberSeen; the 2nd polling job removes the record from EventDelivery once the message is handled. It is common for a message to take more than 100ms to handle, and in that case the same event in EventDelivery will be scanned again.
I'm not sure what the common practice is, and I'm struggling with how to implement this. I did a lot of research on the internet. I see that some people implement message propagation using Debezium + Kafka (I cannot use these 2 tools, and I still don't fully understand how they work).
I know Debezium uses the CDC approach to tail the transaction log of the RDBMS and forward the messages to Kafka. I have also seen recommendations that we should not have multiple subscriptions on the same transaction log. Let's say Debezium guarantees that the event is propagated to Kafka. That means application 1 and application 2 need to subscribe to the Kafka topic, each in a different consumer group (with aggregateId as the partition key). Since Kafka guarantees message ordering within a partition, everything should work fine. But I don't think Kafka would store all messages from the very beginning. Let's say it is configured to store 1000000 messages: when a message handler keeps failing for an unexpected reason, the 1000000 messages after the failed message cannot be handled, and the 1000001st event will be lost... This is a rare case, and I'm not sure whether I understand it correctly, but the database table is the most reliable source since it stores all events from the very beginning. If the system suffers from this case, does that mean I need to manually republish all the events to Kafka to rebuild the projection model?
In another case, suppose I add a new event subscriber that needs the historical events to build its projection model. With Debezium + Kafka, do we need to assign a new consumerGroup and configure it to read the Kafka stream from the very beginning? It has the same problem, as the consumerGroup can only get the last 1000000 events... But this is not an issue if we poll the database table directly instead.
I don't understand why most implementations use a message broker instead of polling the database table.
I really need advice on how to implement a CQRS + ES framework, especially the message propagation part (keep in mind I can only use an RDBMS + Redis (without persistence)).

Related

How to replay Event Sourcing events reliably?

One of the great promises of Event Sourcing is the ability to replay events. When there's no relationship between entities (e.g. blob storage, user profiles) it works great, but how do you replay quickly when there are important relationships to check?
For example: Product(id, name, quantity) and Order(id, list of productIds). If we have CreateProduct and then CreateOrder events, then it will succeed (product is available in warehouse), it's easy to implement e.g. with Kafka (one topic with n1 partitions for products, another with n2 partitions for orders).
During replay everything happens more quickly, and Kafka may reorder the events (e.g. CreateOrder and then CreateProduct), which will give us different behavior than originally (CreateOrder will now fail because product doesn't exist yet). It's because Kafka guarantees ordering only within one topic within one partition. The easy solution would be putting everything into one huge topic with one partition, but this would be completely unscalable, as single-threaded replay of bigger databases could take days at least.
Is there any existing, better solution for quick replaying of related entities? Or should we forget about event sourcing and replaying of events when we need to check relationships in our databases, and replaying is good only for unrelated data?
As a practical necessity when event sourcing, you need the ability to conjure up a stream of events for a particular entity so that you can apply your event handler to build up the state. For Kafka, outside of the case where you have so few entities that you can assign an entire topic partition to just the events for a single entity, this entails a linear scan and filter through a partition. For this reason, Kafka is not well suited to the role of an event store, which is defined by its ability to quickly give you an ordered stream of the events for a particular entity. It is still very likely to be a critical part of any event-driven/event-based system, relaying events published by a service for consumption by other services (at which point, if we consider the event vs. command dichotomy, we're talking about commands from the perspective of the consuming service).
The most popular purpose-built event store is, probably, the imaginatively named Event Store (at least partly due to the involvement of a few prominent advocates of event sourcing in its design and implementation). Alternatively, there are libraries/frameworks like Akka Persistence (JVM with a .Net port) which use existing DBs (e.g. relational SQL DBs, Cassandra, Mongo, Azure Cosmos, etc.) in a way which facilitates their use as an event store.
Event sourcing also as a practical necessity tends to lead to CQRS (they go together very well: event sourcing is arguably the simplest possible persistence model capable of being a write model, while it's nearly useless as a read model). The typical pattern seen is that the command processing component of the system enforces constraints like "product exists before being added to the cart" (how those constraints are enforced is generally a question of whatever concurrency model is in use: the actor model has a high level of mechanical sympathy with this approach, but other models are possible) before writing events to the event store and then the events read back from the event store can be assumed to have been valid as of the time they were written (it's possible to later decide a compensating event needs to be recorded). The events from within the event store can be projected to a Kafka topic for communication to another service (the command processing component is the single source of truth for events).
From the perspective of that other service, as noted, the projected events in the topic are commands (the implicit command for an event is "update your model to account for this event"). Semantically, their provenance as events means that they've been validated and are undeniable (they can be ignored, however). If there's some model validation that needs to occur, that generally entails either a conscious decision to ignore that command or to wait until another command is received which allows that command to be accepted.
OK, you are still thinking about how we developed applications in the last 20 years instead of how we should develop applications in the future. There are frameworks that fit the paradigms of the future perfectly. One of those, mentioned above, is Akka, but more important is one of its sub-components, Akka FSM (Finite State Machine). The finite state machine is a concept we ignored in software development for years, but the future seems to be more and more event based and we can't ignore it anymore.
So how will these help you? Akka is a framework based on the Actor concept. Every Actor is a unique entity with a mailbox. Let's say you have an Order Actor with id 123456789: every event for Order id 123456789 will be processed by this Actor, and its messages will be ordered in its mailbox on a first-in, first-out basis, so you don't need synchronisation logic anymore. You could have millions of Order Actors in your system and they can work in parallel: while Order Actor 123456789 is processing its events, Order Actor 987654321 can process its own, which gives you parallelism and scalability. As long as Kafka guarantees the order of every message for keys 123456789 and 987654321, everything is green.
Now you may ask where the Finite State Machine comes into play. As you mentioned, the problem arises when an addProduct event arrives before the createOrder event (because they are on different Kafka topics). At that point the State Machine behaves differently depending on whether the Order Actor is in the CREATED state or the INITIALISING state: in the CREATED state it will just add the product, while in the INITIALISING state it will probably just stash the event until the createOrder event arrives.
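To make the stash idea concrete, here is a rough sketch in plain Python. It does not use the Akka API; the OrderActor class and its receive method are made up for illustration, and only the two states and two event names come from the description above.

```python
class OrderActor:
    """One actor per order id; events for that id are processed one at a time."""

    def __init__(self, order_id):
        self.order_id = order_id
        self.state = "INITIALISING"
        self.products = []
        self.stash = []  # events that arrived before the order existed

    def receive(self, event_type, payload=None):
        if self.state == "INITIALISING":
            if event_type == "createOrder":
                self.state = "CREATED"
                # Replay everything that was waiting for the order to exist.
                pending, self.stash = self.stash, []
                for stashed_type, stashed_payload in pending:
                    self.receive(stashed_type, stashed_payload)
            else:
                self.stash.append((event_type, payload))
        elif self.state == "CREATED":
            if event_type == "addProduct":
                self.products.append(payload)


actor = OrderActor("123456789")
actor.receive("addProduct", "p-1")   # too early: stashed, not applied
actor.receive("createOrder")         # order now exists, stash is replayed
actor.receive("addProduct", "p-2")   # applied directly
```

The end state is the same whichever of the first two events arrives first, which is what makes a reordered replay harmless for this actor.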
These concepts are explained really well in this video, and if you want to see a practical example I have a blog for it and this one for a more direct dive.
I think I found the solution for scalable (multi-partition) event sourcing:
create in Kafka (or in a similar system) topic named messages
assign users to partitions (e.g by murmurHash(login) % partitionCount)
if a piece of data is mutable (e.g. Product, Order), every partition should contain own copy of the data
if we have e.g. 256 pieces of a product in our warehouse and 64 partitions, we can initially 'give' every partition 8 pieces, so most CreateOrder events will be processed quickly without leaving user's partition
if a user (a partition) sometimes needs to mutate data in other partition, it should send a message there:
for example for Product / Order domain, partitions could work similarly to Walmart/Tesco stores around a country, and the messages sent between partitions ('stores') could be like CreateProduct, UpdateProduct, CreateOrder, SendProductToMyPartition, ProductSentToYourPartition
the message will become an 'event' as if it was generated by a user
the message shouldn't be sent during replay (already sent, no need to do it twice)
This way even when Kafka (or any other event sourcing system) chooses to reorder messages between partitions, we'll still be ok, because we don't ever read any data outside our single-threaded 'island'.
EDIT: As @LeviRamsey noted, this 'single-threaded island' is basically the actor model, and frameworks like Akka can make it a bit easier.

Category projections using kafka and cassandra for event-sourcing

I'm using Cassandra and Kafka for event-sourcing, and it works quite well. But I've just recently discovered a potentially major flaw in the design/set-up. A brief intro to how it is done:
The aggregate command handler is basically a kafka consumer, which consumes messages of interest on a topic:
1.1 When it receives a command, it loads all events for the aggregate, and replays the aggregate event handler for each event to get the aggregate up to current state.
1.2 Based on the command and business logic it then applies one or more events to the event store. This involves inserting the new event(s) into the event store table in Cassandra. The events are stamped with a version number for the aggregate, starting at version 0 for a new aggregate, which makes projections possible. In addition it sends the event to another topic (for projection purposes).
1.3 A Kafka consumer listens on the topic to which these events are published. This consumer acts as a projector. When it receives an event of interest, it loads the current read model for the aggregate. It checks that the version of the event it has received is the expected version, and then updates the read model.
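The version check in step 1.3 can be sketched like this. It is a minimal in-memory illustration: the OrderProjector class, the event dictionary layout and the three return values are assumptions, and a real projector would load and save the read model in its target store.

```python
class OrderProjector:
    def __init__(self):
        # aggregate_id -> {"version": int, "state": dict}
        self.read_models = {}

    def handle(self, event):
        model = self.read_models.get(
            event["aggregate_id"], {"version": -1, "state": {}}
        )
        expected = model["version"] + 1  # versions start at 0 for a new aggregate
        if event["version"] < expected:
            return "duplicate"  # already applied, safe to skip
        if event["version"] > expected:
            return "gap"        # an earlier event is missing, do not apply yet
        state = dict(model["state"])
        state.update(event["data"])
        self.read_models[event["aggregate_id"]] = {
            "version": event["version"],
            "state": state,
        }
        return "applied"
```

Because the check is per aggregate, it says nothing about ordering across aggregates, which is exactly where the category projection question comes in.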
This seems to work very well. The problem is when I want to have what EventStore calls category projections. Let's take the Order aggregate as an example. I can easily project one or more read models per Order. But if I want, for example, a projection which contains a customer's last 30 orders, then I would need a category projection.
I'm just scratching my head over how to accomplish this. I'm curious to know if anyone else is using Cassandra and Kafka for event sourcing. I've read in a couple of places that some people discourage it. Maybe this is the reason.
I know EventStore has support for this built in. Maybe using Kafka as event store would be a better solution.
With this kind of architecture, you have to choose between:
Global event stream per type - simple
Partitioned event stream per type - scalable
Unless your system is fairly high throughput (say at least 10s or 100s of events per second for sustained periods to the stream type in question), the global stream is the simpler approach. Some systems (such as Event Store) give you the best of both worlds, by having very fine-grained streams (such as per aggregate instance) but with the ability to combine them into larger streams (per stream type/category/partition, per multiple stream types, etc.) in a performant and predictable way out of the box, while still being simple by only requiring you to keep track of a single global event position.
If you go partitioned with Kafka:
Your projection code will need to handle concurrent consumer groups accessing the same read models when processing events for different partitions that need to go into the same models. Depending on your target store for the projection, there are lots of ways to handle this (transactions, optimistic concurrency, atomic operations, etc.) but it would be a problem for some target stores
Your projection code will need to keep track of the stream position of each partition, not just a single position. If your projection reads from multiple streams, it has to keep track of lots of positions.
Using a global stream removes both of those concerns, and performance is likely to be good enough.
In either case, you'll likely also want to get the stream position into the long term event storage (i.e. Cassandra) - you could do this by having a dedicated process reading from the event stream (partitioned or global) and just updating the events in Cassandra with the global or partition position of each event. (I have a similar thing with MongoDB - I have a process reading the 'oplog' and copying oplog timestamps into events, since oplog timestamps are totally ordered).
Another option is to drop Cassandra from the initial command processing and use Kafka Streams instead:
Partitioned command stream is processed by joining with a partitioned KTable of aggregates
Command result and events are computed
Atomically, KTable is updated with changed aggregate, events are written to event stream and command response is written to command response stream.
You would then have a downstream event processor that copies the events into Cassandra for easier querying etc. (and which can add the Kafka stream position to each event as it does it to give the category ordering). This can help with catch up subscriptions, etc. if you don't want to use Kafka for long term event storage. (To catch up, you'd just read as far as you can from Cassandra and then switch to streaming from Kafka from the position of the last Cassandra event). On the other hand, Kafka itself can store events forever, so this isn't always necessary.
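The catch-up step can be sketched as below. This is only an outline with in-memory lists standing in for Cassandra and Kafka: the catch_up function and the (position, event) tuple shape are assumptions, and with a real Kafka consumer you would seek to the stored position rather than filter.

```python
def catch_up(stored_events, live_stream, handler):
    """Replay from long-term storage, then continue from the live stream.

    Both inputs are iterables of (position, event) in ascending position order.
    Returns the last position that was handled.
    """
    last_position = -1
    for position, event in stored_events:
        handler(event)
        last_position = position
    for position, event in live_stream:
        if position <= last_position:
            continue  # already replayed from storage
        handler(event)
        last_position = position
    return last_position


handled = []
stored = [(0, "a"), (1, "b"), (2, "c")]            # what the copier has written so far
live = [(1, "b"), (2, "c"), (3, "d"), (4, "e")]    # what the stream still retains
last = catch_up(stored, live, handled.append)
```

This only works if the positions stored alongside the events are the same positions the stream uses, which is why the copier records the stream position on each event.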
I hope this helps a bit with understanding the tradeoffs and problems you might encounter.

How do I implement Event Sourcing using Kafka?

I would like to implement the event-sourcing pattern using kafka as an event store.
I want to keep it as simple as possible.
The idea:
My app contains a list of customers. Customers can be created and deleted. Very simple.
When a request to create a customer comes in, I create the event CUSTOMER_CREATED, including the customer data, and store it in a Kafka topic using a KafkaProducer. The same happens when a customer is deleted, with the event CUSTOMER_DELETED.
Now when I want to list all customers, I have to replay all events that happened so far and then get the current state, meaning a list of all customers.
I would create a temporary customer list, and then processing all the events one by one (create customer, create customer, delete customer, create customer etc). (Consuming these events with a KafkaConsumer). In the end I return the temporary list.
I want to keep it as simple as possible and it's just about giving me an understanding on how event-sourcing works in practice. Is this event-sourcing? And also: how do I create snapshots when implementing it this way?
"when I want to list all customers, I have to replay all events that happened so far"
You actually don't, or at least not after your app starts fresh and is actively collecting / tombstoning the data. I encourage you to look up the "Stream Table Duality", which basically states that your table is the current state of the world in your system, and a snapshot in time of all the streamed events thus far, which would be ((customers added + customers modified) - customers deleted).
The way you implement this in Kafka would be to use a compacted Kafka topic for your customers, which can be read into a Kafka Streams KTable, and persisted in memory or spill to disk (backed by RocksDB). The message key would be some UUID for the customer, or some other identifiable record that cannot change (e.g. not name, email, phone, etc. as all this can change)
With that, you can implement Interactive Queries on it to scan or look up a certain customer's details.
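As a rough illustration of the duality, and of how a snapshot fits in, here is a plain Python fold with no Kafka involved. The apply_events function, the use of None as a tombstone value and the sample records are assumptions for the sketch.

```python
def apply_events(events, table=None, start_offset=0):
    """Fold (key, value) records into a table; a value of None is a tombstone.

    Returns the new table and the offset of the next record to read.
    """
    table = dict(table or {})
    offset = start_offset
    for key, value in events:
        if value is None:
            table.pop(key, None)   # customer deleted
        else:
            table[key] = value     # customer created or modified
        offset += 1
    return table, offset


events = [
    ("c1", {"name": "Ann"}),
    ("c2", {"name": "Bob"}),
    ("c1", None),
    ("c3", {"name": "Cy"}),
]

# Full replay from the beginning.
full_table, end_offset = apply_events(events)

# A snapshot is just the table plus the offset it was taken at.
snapshot, snapshot_offset = apply_events(events[:2])

# Resuming from the snapshot only needs the events after that offset.
resumed, resumed_offset = apply_events(events[snapshot_offset:], snapshot, snapshot_offset)
```

Resuming from the snapshot gives the same table as the full replay, so a snapshot is an optimisation rather than a second source of truth.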
Theoretically you can do Event Sourcing with Kafka as you described, replaying all events at application start, but as you noted, if you need 100 000 events to reach a state it is not practical.
As mentioned in the previous answers, you can use a Kafka Streams KTable for a form of Event Sourcing, but because the KTable is hosted in the key/value database RocksDB, querying the data will be quite limited (you can ask for the state of the customer with id 123456789, but you can't ask for all customers with state CUSTOMER_DELETED).
To achieve that flexibility, we need help from another pattern, Command Query Responsibility Segregation (CQRS). Personally, I advise you to use Kafka as a reliable, extremely performant broker and give the responsibility for Event Sourcing to a dedicated framework like Akka (which synergises naturally with Kafka), with Apache Cassandra persistence and Akka Finite State Machine for the command part and Akka Projection for the query part.
If you want to see a sample of how all these technology stacks play together, I have a blog for it. I hope it can help you.

Event Sourcing - Apache Kafka + Kafka Streams - How to assure atomicity / transactionality

I'm evaluating Event Sourcing with Apache Kafka Streams to see how viable it is for complex scenarios. As with relational databases, I have come across some cases where atomicity/transactionality is essential:
Shopping app with two services:
OrderService: has a Kafka Streams store with the orders (OrdersStore)
ProductService: has a Kafka Streams store (ProductStockStore) with the products and their stock.
Flow:
OrderService publishes an OrderCreated event (with productId, orderId, userId info)
ProductService gets the OrderCreated event and queries its KafkaStreams Store (ProductStockStore) to check if there is stock for the product. If there is stock it publishes an OrderUpdated event (also with productId, orderId, userId info)
The point is that this event would be picked up by the ProductService Kafka Stream, which would process it to decrease the stock. So far so good.
But, imagine this:
Customer 1 places an order, order1 (there is a stock of 1 for the product)
Customer 2 places concurrently another order, order2, for the same product (stock is still 1)
ProductService processes order1 and sends a message OrderUpdated to decrease the stock. This message is put in the topic after the one from order2 -> OrderCreated
ProductService processes order2-OrderCreated and sends a message OrderUpdated to decrease the stock again. This is incorrect since it will introduce an inconsistency (stock should be 0 now).
The obvious problem is that our materialized view (the store) should be updated directly when we process the first OrderUpdated event. However the only way (I know) of updating the Kafka Stream Store is publishing another event (OrderUpdated) to be processed by the Kafka Stream. This way we can't perform this update transactionally.
I would appreciate ideas to deal with scenarios like this.
UPDATE: I'll try to clarify the problematic bit of the problem:
ProductService has a Kafka Streams Store, ProductStock with this stock (productId=1, quantity=1)
OrderService publishes two OrderPlaced events on the orders topic:
Event1 (key=product1, productId=product1, quantity=1, eventType="OrderPlaced")
Event2 (key=product1, productId=product1, quantity=1, eventType="OrderPlaced")
ProductService has a consumer on the orders topic. For simplicity let's suppose a single partition to assure messages consumption in order. This consumer executes the following logic:
if ("OrderPlaced".equals(event.get("eventType"))) {
    Order order = new Order();
    order.setId((String) event.get("orderId"));
    order.setProductId((Integer) event.get("productId"));
    order.setUid(event.get("uid").toString());
    // Query ProductStock to check availability
    Integer productStock = getProductStock(order.getProductId());
    if (productStock > 0) {
        // Named "reserved" so it does not clash with the incoming "event"
        Map<String, Object> reserved = new HashMap<>();
        reserved.put("name", "ProductReserved");
        reserved.put("orderId", order.getId());
        reserved.put("productId", order.getProductId());
        // Writes a ProductReserved event to the orders topic
        orderProcessor.output().send(MessageBuilder.withPayload(reserved).build(), 500);
    } else {
        // XXX CANCEL ORDER
    }
}
ProductService also has a Kafka Streams processor that is responsible to update the stock:
KStream<Integer, JsonNode> stream = kStreamBuilder.stream(integerSerde, jsonSerde, "orders");
stream.xxx().yyy(() -> {...}, "ProductsStock");
Event1 would be processed first and since there is still 1 available product it would generate the ProductReserved event.
Now, it's Event2's turn. If it is consumed by the ProductService consumer BEFORE the ProductService Kafka Streams Processor processes the ProductReserved event generated by Event1, the consumer would still see a stock of 1 for product1 in the ProductStock store and generate a ProductReserved event for Event2, producing an inconsistency in the system.
This answer is a little late for your original question, but let me answer anyway for completeness.
There are a number of ways to solve this problem, but I would encourage addressing it in an event-driven way. This would mean you (a) validate there is enough stock to process the order and (b) reserve the stock as a single step, all within a single KStreams operation. The trick is to rekey by productId; that way you know orders for the same product will be executed sequentially on the same thread (so you can't get into the situation where Order1 & Order2 reserve stock of the same product twice).
There is a post that discusses how to do this: https://www.confluent.io/blog/building-a-microservices-ecosystem-with-kafka-streams-and-ksql/
Maybe more usefully there is some sample code also showing how it can be done:
https://github.com/confluentinc/kafka-streams-examples/blob/1cbcaddd85457b39ee6e9050164dc619b08e9e7d/src/main/java/io/confluent/examples/streams/microservices/InventoryService.java#L76
Note how in this KStreams code the first line rekeys to productId, then a Transformer is used to (a) validate there is sufficient stock to process the order and (b) reserve the stock required by updating the state store. This is done atomically, using Kafka's Transactions feature.
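The core of the idea, stripped of Kafka Streams, looks roughly like this. It is a plain Python sketch rather than the code from the linked sample: the InventoryPartition class, the modulo-based partition_for function and the OrderRejected result name are assumptions for illustration.

```python
def partition_for(product_id, partition_count):
    # Illustrative only: Kafka's default partitioner hashes the serialized key.
    return product_id % partition_count


class InventoryPartition:
    """Stock and processing for the products that are keyed to one partition.

    Orders for a partition are processed one at a time, so the stock check
    and the reservation cannot interleave with another order for the same product.
    """

    def __init__(self, stock):
        self.stock = dict(stock)

    def process(self, order):
        product_id, quantity = order["productId"], order["quantity"]
        available = self.stock.get(product_id, 0)
        if available >= quantity:
            # Validate and reserve in the same step, against the same state.
            self.stock[product_id] = available - quantity
            return {"orderId": order["orderId"], "result": "ProductReserved"}
        return {"orderId": order["orderId"], "result": "OrderRejected"}


partitions = [InventoryPartition({}), InventoryPartition({1: 1})]
orders = [
    {"orderId": "order1", "productId": 1, "quantity": 1},
    {"orderId": "order2", "productId": 1, "quantity": 1},
]
# Both orders are keyed by productId, so they land on the same partition in order.
results = [partitions[partition_for(o["productId"], 2)].process(o) for o in orders]
```

With the scenario from the question (stock of 1, two concurrent orders), the first order is reserved and the second is rejected, because the second check already sees the reduced stock.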
This same problem is typical in assuring consistency in any distributed system. Instead of going for strong consistency, typically the process manager/saga pattern is used. This is somewhat similar to the 2-phase commit in distributed transactions but implemented explicitly in application code. It goes like this:
The Order Service asks the Product Service to reserve N items. The Product Service either accepts the command and reduces stock or rejects the command if it doesn't have enough items available. Upon a positive reply to the command the Order Service can now emit an OrderCreated event (although I'd call it OrderPlaced, as "placed" sounds more idiomatic to the domain and "created" is more generic, but that's a detail). The Product Service either listens for OrderPlaced events or an explicit ConfirmReservation command is sent to it. Alternatively, if something else happened (e.g. failed to clear funds), an appropriate event can be emitted or a CancelReservation command sent explicitly to the ProductService. To cater for exceptional circumstances, the ProductService may also have a scheduler (in KafkaStreams punctuation can come in handy for this) to cancel reservations that weren't confirmed or aborted within a timeout period.
The technicalities of the orchestration of the two services and handling the error conditions and compensating actions (cancelling reservation in this case) can be handled in the services directly, or in an explicit Process Manager component to segregate this responsibility. Personally I'd go for an explicit Process Manager that could be implemented using Kafka Streams Processor API.
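The reservation side of this saga might look like the sketch below. It is an in-memory outline only: the ProductService class, its method names and the 60 second timeout are assumptions, and the expire method stands in for whatever scheduler or punctuation mechanism is used.

```python
class ProductService:
    def __init__(self, stock, timeout_seconds=60.0):
        self.stock = dict(stock)
        self.timeout_seconds = timeout_seconds
        # order_id -> (product_id, quantity, expires_at)
        self.reservations = {}

    def reserve(self, order_id, product_id, quantity, now):
        """Accept the command and reduce stock, or reject it."""
        if self.stock.get(product_id, 0) < quantity:
            return False
        self.stock[product_id] -= quantity
        self.reservations[order_id] = (product_id, quantity, now + self.timeout_seconds)
        return True

    def confirm(self, order_id):
        # Stock was already reduced at reservation time; just drop the pending record.
        return self.reservations.pop(order_id, None) is not None

    def cancel(self, order_id):
        """Compensating action: give the reserved stock back."""
        reservation = self.reservations.pop(order_id, None)
        if reservation is None:
            return False
        product_id, quantity, _ = reservation
        self.stock[product_id] += quantity
        return True

    def expire(self, now):
        """Cancel reservations that were neither confirmed nor cancelled in time."""
        expired = [
            order_id
            for order_id, (_, _, expires_at) in self.reservations.items()
            if expires_at <= now
        ]
        for order_id in expired:
            self.cancel(order_id)
        return expired
```

Passing the current time in as an argument keeps the sketch deterministic; a real service would read a clock or use the stream's timestamps.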

RabbitMQ - Message order of delivery

I need to choose a new Queue broker for my new project.
This time I need a scalable queue that supports pub/sub, and keeping message ordering is a must.
I read Alexis' comment, in which he writes:
"Indeed, we think RabbitMQ provides stronger ordering than Kafka"
I read the message ordering section in the RabbitMQ docs:
"Messages can be returned to the queue using AMQP methods that feature
a requeue parameter (basic.recover, basic.reject and basic.nack), or due
to a channel closing while holding unacknowledged messages...With release
2.7.0 and later it is still possible for individual consumers to observe
messages out of order if the queue has multiple subscribers. This is due
to the actions of other subscribers who may requeue messages. From the
perspective of the queue the messages are always held in the publication
order."
If I need to handle messages in order, does that mean I can only use RabbitMQ with an exclusive queue for each consumer?
Is RabbitMQ still considered a good solution for ordered message queuing?
Well, let's take a closer look at the scenario you are describing above. I think it's important to paste the documentation immediately prior to the snippet in your question to provide context:
Section 4.7 of the AMQP 0-9-1 core specification explains the
conditions under which ordering is guaranteed: messages published in
one channel, passing through one exchange and one queue and one
outgoing channel will be received in the same order that they were
sent. RabbitMQ offers stronger guarantees since release 2.7.0.
Messages can be returned to the queue using AMQP methods that feature
a requeue parameter (basic.recover, basic.reject and basic.nack), or
due to a channel closing while holding unacknowledged messages. Any of
these scenarios caused messages to be requeued at the back of the
queue for RabbitMQ releases earlier than 2.7.0. From RabbitMQ release
2.7.0, messages are always held in the queue in publication order, even in the presence of requeueing or channel closure. (emphasis added)
So, it is clear that RabbitMQ, from 2.7.0 onward, is making a rather drastic improvement over the original AMQP specification with regard to message ordering.
With multiple (parallel) consumers, order of processing cannot be guaranteed.
The third paragraph (pasted in the question) goes on to give a disclaimer, which I will paraphrase: "if you have multiple processors in the queue, there is no longer a guarantee that messages will be processed in order." All they are saying here is that RabbitMQ cannot defy the laws of mathematics.
Consider a line of customers at a bank. This particular bank prides itself on helping customers in the order they came into the bank. Customers line up in a queue, and are served by the next of 3 available tellers.
This morning, it so happened that all three tellers became available at the same time, and the next 3 customers approached. Suddenly, the first of the three tellers became violently ill, and could not finish serving the first customer in the line. By the time this happened, teller 2 had finished with customer 2 and teller 3 had already begun to serve customer 3.
Now, one of two things can happen. (1) The first customer in line can go back to the head of the line or (2) the first customer can pre-empt the third customer, causing that teller to stop working on the third customer and start working on the first. This type of pre-emption logic is not supported by RabbitMQ, nor any other message broker that I'm aware of. In either case, the first customer actually does not end up getting helped first - the second customer does, being lucky enough to get a good, fast teller off the bat. The only way to guarantee customers are helped in order is to have one teller helping customers one at a time, which will cause major customer service issues for the bank.
With multiple consumers, it is not possible to ensure that messages get handled in order in every possible case. It doesn't matter if you have multiple queues, multiple exclusive consumers, different brokers, etc. - there is no way to guarantee a priori that messages are handled in order with multiple consumers. But RabbitMQ will make a best effort.
Message ordering is preserved in Kafka, but only within a partition rather than globally. If your data needs both global ordering and multiple partitions, this does make things difficult. However, if you just need all events for the same entity (the same user, for example) to end up in the same partition so that they are properly ordered, you can do that. The producer decides which partition each message is written to, so if you are able to logically partition your data this may be preferable.
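To illustrate the per-key idea, here is a minimal sketch of key-based partitioning. It is a toy in-memory model, not the Kafka client API: the class name, the "key:payload" event format and the use of String.hashCode() are assumptions made for the example (as far as I know Kafka's default partitioner hashes the serialized key bytes with murmur2 instead).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of key-based partitioning; hypothetical, not the Kafka client API. */
final class KeyPartitioningSketch {

    /** Maps a key to a partition; the same key always yields the same partition. */
    static int partitionFor(String key, int numPartitions) {
        // Assumption: String.hashCode() stands in for the real partitioner's hash.
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    /** Appends each "key:payload" event to the partition chosen by its key. */
    static List<List<String>> route(List<String> events, int numPartitions) {
        List<List<String>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (String event : events) {
            String key = event.substring(0, event.indexOf(':'));
            partitions.get(partitionFor(key, numPartitions)).add(event);
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<String> events = List.of(
                "user-1:opened", "user-2:opened", "user-1:deposited", "user-1:closed");
        List<List<String>> partitions = route(events, 3);
        for (int i = 0; i < partitions.size(); i++) {
            System.out.println("partition " + i + " -> " + partitions.get(i));
        }
    }
}
```

Whatever partition user-1 lands in, its three events appear there in the order they were produced. Nothing is guaranteed about how they interleave with user-2's events in another partition.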
I think this question mixes up two different things: consumption order and processing order.
Message queues can, to a degree, guarantee that messages will be consumed in order. They cannot, however, give you any guarantee about the order in which processing completes.
The main difference here is that there are some aspects of message processing which cannot be determined at consumption time, for example:
As mentioned a consumer can fail while processing, here the message's consumption order was correct, however, the consumer failed to process it correctly, which will make it go back to the queue. At this point the consumption order is intact, but the processing order is not.
If by "processing" we mean that the message is now discarded and finished processing completely, then consider the case when your processing time is not linear, in other words processing one message takes longer than the other. For example, if message 3 takes longer to process than usual, then messages 4 and 5 might get consumed and finish processing before message 3 does.
So even if you managed to get the message back to the front of the queue (which by the way violates the consumption order) you still cannot guarantee they will also be processed in order.
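The slow-message case can be sketched with a small deterministic simulation. This is only a model under stated assumptions: the durations are made-up numbers, each message is handed in queue order to whichever worker becomes free first, and no real broker is involved.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Deterministic simulation: messages are taken in queue order by the next free worker. */
final class ProcessingOrderSketch {

    /**
     * Returns the 1-based message numbers in the order they finish processing.
     * durations[i] is the hypothetical processing time of message i + 1.
     */
    static List<Integer> completionOrder(int[] durations, int workers) {
        // Time at which each worker becomes free; the earliest-free worker takes the next message.
        PriorityQueue<Long> freeAt = new PriorityQueue<>();
        for (int w = 0; w < workers; w++) {
            freeAt.add(0L);
        }
        List<long[]> finished = new ArrayList<>(); // each entry is {finishTime, messageNumber}
        for (int i = 0; i < durations.length; i++) {
            long start = freeAt.poll();
            long end = start + durations[i];
            freeAt.add(end);
            finished.add(new long[] {end, i + 1});
        }
        // Sort by finish time, breaking ties by message number.
        finished.sort(Comparator.<long[]>comparingLong(f -> f[0]).thenComparingLong(f -> f[1]));
        List<Integer> order = new ArrayList<>();
        for (long[] f : finished) {
            order.add((int) f[1]);
        }
        return order;
    }

    public static void main(String[] args) {
        int[] durations = {10, 10, 100, 10, 10}; // message 3 is the slow one
        System.out.println(completionOrder(durations, 2)); // [1, 2, 4, 5, 3]
        System.out.println(completionOrder(durations, 1)); // [1, 2, 3, 4, 5]
    }
}
```

With two workers the messages are consumed as 1, 2, 3, 4, 5 but finish as 1, 2, 4, 5, 3. With a single worker both orders coincide.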
If you want to process the messages in order:
Have only 1 consumer instance at all times, or a main consumer and several stand-by consumers.
Or don't use a messaging queue and do the processing in a synchronous blocking method, which might sound bad but in many cases and business requirements it is completely valid and sometimes even mission critical.
There are proper ways to guarantee the order of messages within RabbitMQ subscriptions.
If you use multiple consumers, they will process the messages using a shared ExecutorService. See also ConnectionFactory.setSharedExecutor(...). You could set it to an Executors.newSingleThreadExecutor().
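As a rough illustration of why a single-thread executor helps, the sketch below shows only the executor's own behaviour: tasks handed to Executors.newSingleThreadExecutor() run one at a time in submission order. It does not touch the RabbitMQ client at all; the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Shows that a single-thread executor runs tasks sequentially, in submission order. */
final class SingleThreadOrderSketch {

    /** Submits one task per message and returns the order in which they were handled. */
    static List<Integer> handleAll(int messageCount) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        List<Integer> handled = Collections.synchronizedList(new ArrayList<>());
        for (int i = 0; i < messageCount; i++) {
            final int message = i;
            // Stand-in for a consumer callback; it just records the message number.
            executor.execute(() -> handled.add(message));
        }
        executor.shutdown();
        try {
            if (!executor.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return handled;
    }

    public static void main(String[] args) {
        System.out.println(handleAll(5)); // [0, 1, 2, 3, 4]
    }
}
```

The trade-off is the one already discussed: with a single thread, one slow handler delays everything queued behind it.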
If you use one Consumer with a single queue, you can bind this queue using multiple bindingKeys (they may have wildcards). The messages will be placed into the queue in the same order that they were received by the message broker.
For example you have a single publisher that publishes messages where the order is important:
try (Connection connection2 = factory.newConnection();
Channel channel2 = connection2.createChannel()) {
// publish messages alternating to two different topics
for (int i = 0; i < messageCount; i++) {
final String routingKey = i % 2 == 0 ? routingEven : routingOdd;
channel2.basicPublish(exchange, routingKey, null, ("Hello" + i).getBytes(UTF_8));
}
}
You now might want to receive messages from both topics in a queue in the same order that they were published:
// declare a queue for the consumer
final String queueName = channel.queueDeclare().getQueue();
// we bind to queue with the two different routingKeys
final String routingEven = "even";
final String routingOdd = "odd";
channel.queueBind(queueName, exchange, routingEven);
channel.queueBind(queueName, exchange, routingOdd);
channel.basicConsume(queueName, true, new DefaultConsumer(channel) { ... });
The Consumer will now receive the messages in the order that they were published, regardless of the fact that you used different topics.
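If you want to convince yourself of this without a broker, here is a toy in-memory model of the setup: one queue bound with two routing keys to an exchange that routes on exact key match. It is purely hypothetical code, not the RabbitMQ client, and it ignores everything a real broker does besides routing and FIFO queueing.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Toy in-memory exchange with exact-match routing; hypothetical, not the RabbitMQ client API. */
final class DirectExchangeSketch {
    private final Map<String, List<Queue<String>>> bindings = new HashMap<>();

    /** Binds a queue to this exchange under the given routing key. */
    void bind(Queue<String> queue, String routingKey) {
        bindings.computeIfAbsent(routingKey, k -> new ArrayList<>()).add(queue);
    }

    /** Appends the message to every queue bound with exactly this routing key. */
    void publish(String routingKey, String message) {
        for (Queue<String> queue : bindings.getOrDefault(routingKey, List.of())) {
            queue.add(message);
        }
    }

    /** Publishes alternately to "even" and "odd" and returns the single queue's contents. */
    static List<String> demo(int messageCount) {
        DirectExchangeSketch exchange = new DirectExchangeSketch();
        Queue<String> queue = new ArrayDeque<>();
        exchange.bind(queue, "even");
        exchange.bind(queue, "odd");
        for (int i = 0; i < messageCount; i++) {
            exchange.publish(i % 2 == 0 ? "even" : "odd", "Hello" + i);
        }
        return new ArrayList<>(queue); // head-to-tail, i.e. delivery order
    }

    public static void main(String[] args) {
        System.out.println(demo(4)); // [Hello0, Hello1, Hello2, Hello3]
    }
}
```

Because both bindings feed the same FIFO queue, the routing key only decides whether a message is enqueued, not where in the queue it goes.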
There are some good 5-Minute Tutorials in the RabbitMQ documentation that might be helpful:
https://www.rabbitmq.com/tutorials/tutorial-five-java.html