How to configure Kafka RPC caller topic and group - apache-kafka

I'm trying to implement an RPC architecture using Kafka as a message broker. The decision of using Kafka instead of another message broker solution is dictated by the current context.
The actual implementation consists of two different types of service:
The receiver: this service consumes messages from a Kafka topic, processes them, and then publishes the response message to a response topic;
The caller: this service receives HTTP requests, publishes messages to the receiver topic, consumes the receiver's response topic for the response message, then returns it as an HTTP response.
The request/response messages published in the topics are related by the message key.
The receiver implementation was fairly simple: at startup, it creates the "request" and "response" topics, then starts consuming the request topic with the service group id (many instances of the receiver share the same group id so that requests are balanced between them). When a request arrives, the service processes it and then publishes the response to the response topic.
My problem is with the caller implementation, in particular while consuming the response from the response queue.
With the following assumptions:
The HTTP requests must be managed concurrently;
There could be more than one instance of this caller service.
Every single thread/service must receive all the messages in the response topic, in order to find the message with the corresponding request key.
As an example, imagine that two caller services produce two messages with keys 1 and 2 respectively. These messages will be published in the receiver topic and processed. The responses will then be published in the topic receiver-responses. If the two caller services share the same group id, response 1 could arrive at the service that published message 2 and vice versa, resulting in an HTTP timeout.
To avoid this problem, I've come up with these possible solutions:
Creating a new group for every request (EDIT: but a group cannot be deleted via code, hence another service would be needed to clean these groups out of ZooKeeper);
Creating a new topic for every request, then delete it afterwards.
Hoping that I made myself sufficiently clear - I must admit I am a beginner to Kafka - my question would be:
Which solution is more costly than the other? Or is there another topic/group configuration that could satisfy assumption 3?
Thanks.

I think I've found a possible solution. A group will be automatically deleted by ZooKeeper when its offset isn't updated for a period of time, determined by the configuration offsets.retention.minutes.
How often the offset update time is checked can be set with the configuration offsets.retention.check.interval.ms.
This way, when a consumer connects to the response topic searching for the reply message, the created group can be abandoned, and it will be deleted by ZooKeeper later.
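A different way to satisfy assumption 3 without one group per request can be sketched as follows. This is not the approach described above, only a minimal illustration under stated assumptions: each caller instance uses its own group id (for example one generated at startup), runs a single consumer loop over the response topic, and hands each record to the HTTP handler waiting on that key. The class and method names are hypothetical, and the Kafka client itself is left out; on_record stands for whatever the consumer loop calls for each record.

```python
import threading
from concurrent.futures import Future, TimeoutError as FutureTimeout


class ResponseDispatcher:
    """Matches response records to waiting HTTP handlers by message key."""

    def __init__(self):
        self._pending = {}            # key -> Future awaiting that response
        self._lock = threading.Lock()

    def register(self, key):
        """Call before publishing the request, so a fast reply is not missed."""
        future = Future()
        with self._lock:
            self._pending[key] = future
        return future

    def on_record(self, key, value):
        """Call from the single consumer loop for every response record."""
        with self._lock:
            future = self._pending.pop(key, None)
        if future is not None:        # keys owned by other instances are ignored
            future.set_result(value)

    def wait(self, key, future, timeout):
        """Block the handler thread until the reply arrives or the timeout expires."""
        try:
            return future.result(timeout=timeout)
        except FutureTimeout:
            with self._lock:
                self._pending.pop(key, None)   # forget the abandoned request
            return None
```

With this shape, only one group exists per caller instance rather than per request, and concurrent HTTP handlers never compete for records because a single loop reads the topic on their behalf.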

Related

Kafka with multiple instances of microservices and end-users

This is more of a design/architecture question.
We have a microservice A (MSA) with multiple instances (say 2) running of it behind LB.
The purpose of this microservice is to get the messages from a Kafka topic and send them to end users/clients. Both instances use the same consumer group id for a particular client/user so that messages are not duplicated. The Kafka topic has 2 partitions (or as many as there are instances).
End users/clients connect to LB to fetch the message from MSA. Long polling is used here.
A request from a client can land on any instance. If it lands on MSA1, it will pull the data from Kafka partition1, and if it lands on MSA2, it will pull the data from partition2.
Now, a producer is producing the messages; we don't have a high message count. So, let's say the producer produces msg1 and it goes to partition1. The end user/client will not get this message unless its request lands on MSA1, which might not always happen as there are other requests coming to the LB.
We want to solve this issue. We want the client to get the message in near real time.
One solution could be a distributed persistent queue (e.g. ActiveMQ) where both MSA1 and MSA2 keep putting the messages after reading them from Kafka, and the client just fetches the messages from the queue. But this would require a separate queue for every end user/client/group id.
Is this a good solution, and can we go ahead with it? Is there anything we should change here? We are deploying our system on AWS, so could any AWS managed service help here, e.g. an SNS+SQS combination?
Some statistics:
~1000 users, one group id per user
2-4 instances of microservice
long polling every few seconds (~20s)
average message size ~10KB
Broadly you have three possible approaches:
You can dispense with using Kafka's consumer group functionality and allow each instance to consume from all partitions.
You can make the instances of each service aware of each other. For example, an instance which gets a request which can be fulfilled by another instance will forward the request there. This is most effective if the messages can be partitioned by client on the producer end (so that a request from a given client only needs to be routed to one instance). Even then, the consumer group functionality introduces some extra difficulty (rebalances mean that the consumer currently responsible for a given partition might not have seen all the messages in the partition). You may want to implement your own variant of the consumer group coordination protocol, except that on rebalance the instance starts from some suitably early point regardless of where the previous consumer got to.
If you can't reliably partition by client in the producer (e.g. the client is requesting a stream of all messages matching arbitrary criteria) then Kafka is really not going to be a fit and you probably want a database (with all the expense and complexity that implies).
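The first approach can be sketched with an in-memory model. This is only an illustration under assumptions that are not in the answer: every instance reads every partition, keeps the records per user, and the client sends back a cursor (partition to next wanted offset) with each long poll so that it does not matter which instance the load balancer picks. The Instance class and its methods are hypothetical, and the Kafka consumer that would call on_record is omitted.

```python
from collections import defaultdict


class Instance:
    """One microservice instance that consumes ALL partitions (no consumer group)."""

    def __init__(self):
        # user -> partition -> list of (offset, payload), in arrival order
        self.log = defaultdict(lambda: defaultdict(list))

    def on_record(self, partition, offset, user, payload):
        """Called for every record read from any partition."""
        self.log[user][partition].append((offset, payload))

    def fetch(self, user, cursor):
        """Return the user's unseen payloads and the advanced cursor.

        `cursor` maps partition -> next offset the client still needs.
        """
        new_cursor = dict(cursor)
        payloads = []
        for partition, records in self.log[user].items():
            start = cursor.get(partition, 0)
            for offset, payload in records:
                if offset >= start:
                    payloads.append(payload)
                    new_cursor[partition] = offset + 1
        return payloads, new_cursor
```

Because the cursor travels with the client, two consecutive polls that land on different instances still return each message once. A real service would also need to evict old records from memory.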

KafkaRestProxy multiple instances issue

I have a microservice architecture where each service's producer writes to the same topic. I have two instances of KafkaRestProxy, each listening to that topic, but the problem is this:
Suppose a request comes to instance-1 of the REST proxy, which forwards it to the microservice. That service finishes the job and writes the response to the topic, but the response is consumed by the second instance of the REST proxy, say instance-2.
What should I do to solve this? Is there any kind of application_id we can attach to the request, so that when the microservice is done with the job and another instance of the REST proxy consumes the response, we can redirect the response to the instance of the REST proxy that received the request?
Your proxies form a Kafka consumer group, just like any other application. When you request records, you give both the consumer group and the consumer instance name (such as a host of the HTTP client): GET /consumers/(string:group_name)/instances/(string:instance)/records
You should generally not try to strictly control which consumers get which information beyond assigning a unique instance to each request, to allow for parallel consumption (assuming this is what you want).
Also, the rest proxy isn't consuming anything unless you have another application that's requesting that information, e.g. the GET request above.

How to handle HTTP request to a Message Broker Producer/Consumer?

Let's say you have a POST request with some product as the payload. Traditionally, your HttpRequest lifecycle should end with an HttpResponse carrying the requested action's result, in our case a response saying "Product created" might be enough.
But with a message broker, things might turn like this:
1. The request handler creates the appropriate message, CreateProduct(...), and produces it to a topic in the message broker.
2. Then what???
3. A consumer retrieves and processes the message by actually creating the product in a persistent database.
4. Then what???
What should happen at step 2?
If we send a response saying "Your product should be created very soon, keep waiting, we'll keep you posted":
How can the client be notified after a response has already been sent?
Are we forced to use WebSocket so we can keep the link open?
What should happen at step 4?
I have my opinion but I would like to know how you handle it in production.
The app that actually created the product can produce a message saying "Product created" to a status topic in the message broker, so the original message's producer can consume it and then notify the client somehow. The only way I can see this being possible is through a WebSocket connection.
So I would like to know whether WebSocket is the only way to do HTTP request/response involving a message broker, and whether it is reasonable to use a message broker for HTTP request/response.
You could think of this in a fully asynchronous fashion (no WebSocket needed then).
You make an HTTP POST request, and this creates a unique ID associated with your job. This ID is also stored in a database, with a status like 'processing'.
The ID is also returned to your client.
Your job ID (and its payload parameters) travels through Kafka and finally reaches a consumer. This consumer processes the job and commits its results to an external DB (or whatever).
When the job is done, you update the job status to 'done' or something like this.
In the meantime, on the client side, you poll an endpoint that asks the job DB whether the job is finished or not.
This is a very common way to cover your needs.
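The flow above can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: the dict jobs stands in for the job database, queue.Queue stands in for the Kafka topic, and the function names are hypothetical.

```python
import queue
import uuid

jobs = {}                # stands in for the job database: job_id -> status
topic = queue.Queue()    # stands in for the Kafka topic


def handle_post(payload):
    """HTTP POST handler: record the job, publish it, return the ID at once."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = "processing"
    topic.put((job_id, payload))
    return job_id


def consume_one():
    """Consumer side: process one job, then mark it as done."""
    job_id, payload = topic.get_nowait()
    # ... create the product in the persistent database here ...
    jobs[job_id] = "done"


def handle_status(job_id):
    """HTTP GET handler that the client polls with the ID it was given."""
    return jobs.get(job_id, "unknown")
```

The client calls the POST endpoint once, keeps the returned ID, and polls the status endpoint until it reads 'done'.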
Yannick

Spring Cloud Stream Kafka - How to implement idempotency to support distributed transaction management (eventual consistency)

I have the following typical scenario:
An order service used to purchase products. Acts as the commander of the distributed transaction.
A product service with the list of products and its stock.
A payment service.
   Orders DB           Products DB
       |                    |
---------------     ----------------     ----------------
| OrderService |   | ProductService |   | PaymentService |
---------------     ----------------     ----------------
       |                    |                    |
       |         ---------------------           |
       ----------| Kafka orders topic |-----------
                 ---------------------
The normal flow would be:
The user orders a product.
Order service creates an order in DB and publishes a message in Kafka topic "orders" to reserve a product (PRODUCT_RESERVE_REQUEST).
Product service decreases the product stock one unit in its DB and publishes a message in "orders" saying PRODUCT_RESERVED
Order service gets the PRODUCT_RESERVED message and orders the payment publishing a message PAYMENT_REQUESTED
Payment service orders the payment and answers with a message PAYED
Order service reads the PAYED message and marks the order as COMPLETED, finishing the transaction.
I am having trouble dealing with error cases. For example, let's assume this:
Payment service fails to charge for the product, so it publishes a message PAYMENT_FAILED
Order service reacts publishing a message UNDO_PRODUCT_RESERVATION
Product service increases the stock in the DB to cancel the reservation and publishes PRODUCT_UNRESERVATION_COMPLETED
Order service finishes the transaction saving the final state of the order as CANCELLED_PAYMENT_FAILED.
In this scenario imagine that for whatever reason, order service publishes a UNDO_PRODUCT_RESERVATION message but doesn't receive the PRODUCT_UNRESERVATION_COMPLETED message, so it retries publishing another UNDO_PRODUCT_RESERVATION message.
Now, imagine that those two UNDO_PRODUCT_RESERVATION messages for the same order end up arriving to ProductService. If I process both of them I could end up setting an invalid stock for the product.
In this scenario how can I implement idempotency?
UPDATE:
Following Artem's instructions I can now detect duplicated messages (by checking the message header) and ignore them but there may still be situations like the following where I shouldn't ignore the duplicated messages:
Order Service sends UNDO_PRODUCT_RESERVATION
Product service gets the message and starts processing it but crashes before updating the stock.
Order Service doesn't get a response so it retries sending UNDO_PRODUCT_RESERVATION
Product service knows this is a duplicated message BUT, in this case it should repeat the processing again.
Can you help me come up with a way to support this scenario as well? How could I distinguish when I should discard the message or reprocess it?
We used spring-integration-kafka to produce and consume messages with Kafka in our microservices. In our case, we send org.springframework.messaging.Message objects to topics and get the same type back from topics after deserialization from a byte array. Besides the message payload, which is the actual object that you want to transfer from one microservice to another, the Message entity carries header values such as message-id and sent-time. We use the unique message-id value to implement idempotency. On the producer side, you must implement some logic to ensure that the message-id of the Message is the same when it is produced multiple times. This is actually related to your produce logic. In our case, we use Publishing Events Using Local Transactions, which is very well described in the blog https://www.nginx.com/blog/event-driven-data-management-microservices/ by Chris Richardson. With this approach we can recreate the Message object with the same message-id on the producer side. On the consumer side, we persist all the consumed message id values to a database and check these ids before processing the received messages. If we see a message whose id is in our persistent store, we simply ignore it.
In your case, to implement idempotency:
You should keep a unique identifier with the messages,
On the producer side, you must generate the same identifier when the message is produced multiple times,
On the consumer side, you must check the received id to detect whether it has been consumed before or not.
Regarding the second scenario, which is described in the UPDATE:
I think you should change your approach a little. If you want to implement a publish-subscribe mechanism, which is more suitable in a microservices architecture, you shouldn't wait for a response on the producer side. In your scenario, you wait for another message to learn whether the consumer consumed the message or not, and if it was not consumed, you send it again.
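The three points above can be sketched as follows. This is a minimal illustration under assumptions that are not in the answer: the identifier is derived deterministically from the order id and the action with uuid5, the consumer uses an in-memory SQLite database as its persistent store, and the processed-id insert shares one transaction with the stock update. All table, function, and constant names are hypothetical.

```python
import sqlite3
import uuid

# Arbitrary fixed namespace (an assumption); it only has to stay constant.
NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


def message_id(order_id, action):
    """Producer side: the same order and action always give the same id."""
    return str(uuid.uuid5(NAMESPACE, f"{order_id}:{action}"))


def open_store():
    """Create the consumer's store with one product holding 9 units."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE processed (message_id TEXT PRIMARY KEY)")
    db.execute("CREATE TABLE stock (product TEXT PRIMARY KEY, units INTEGER)")
    db.execute("INSERT INTO stock VALUES ('widget', 9)")
    db.commit()
    return db


def undo_reservation(db, msg_id, product):
    """Consumer side: return True if stock changed, False for a duplicate."""
    try:
        with db:  # one transaction: commits on success, rolls back on error
            db.execute("INSERT INTO processed VALUES (?)", (msg_id,))
            db.execute(
                "UPDATE stock SET units = units + 1 WHERE product = ?",
                (product,),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # id already stored, so the message was handled before
```

Because the id and the stock change are committed together, a crash before the commit leaves neither behind, so a retried message is processed again rather than discarded. That only holds when both live in the same database.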
How about the implementation below;
On the producer side, you send messages to Kafka within a transaction in the producer. You should provide a mechanism here to send messages to Kafka only when the transaction on the producer side is committed. This is an atomicity issue, and I gave a link above which shows how to solve it.
On the consumer side, you poll messages from the Kafka topic one by one, in order, and you get the next message only when the current message has been consumed. If it has not been consumed, you shouldn't get the next message, because the next message might be related to the current one, and consuming it may corrupt the consistency of your data. It's not the producer's concern when a message is not consumed. On the consumer side, you should provide retry and replay mechanisms to consume messages.
I think you shouldn't wait for a response on the producer side. Kafka is a very smart tool, and with its offset commit capability, as a consumer you don't have to consume messages when you poll them from a topic. If you have a problem while processing a message, you simply don't commit the offset that would move you to the next message.
With the implementation described above, you don't have a problem like "How could I distinguish when I should discard the message or reprocess it?"
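The consumer-side rule (advance only after the current message has been processed) can be sketched without a Kafka client. This is a minimal model under stated assumptions: a Python list stands in for one partition, an integer stands in for the committed offset, and run_consumer and max_attempts are hypothetical names.

```python
def run_consumer(log, committed, process, max_attempts=3):
    """Consume `log` in order from `committed`; return the new committed offset."""
    offset = committed
    while offset < len(log):
        for _attempt in range(max_attempts):
            try:
                process(log[offset])
                break                  # processed; stop retrying this message
            except Exception:
                continue               # retry the same message
        else:
            # Every attempt failed: stop here WITHOUT advancing the offset,
            # so the same message is delivered again on the next run.
            return offset
        offset += 1                    # "commit" only after success
    return offset
```

A message that keeps failing blocks the ones behind it, which is the ordering guarantee described above; the handler still has to be idempotent, since a crash between processing and committing replays the message.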
Regards...
Actually, because of the complications you mentioned about organizing transactions across multiple microservices over Apache Kafka, I developed another concept and wrote a blog post about it.
If you reach a point where the Kafka solution is no longer feasible, you might find it an interesting read. It is too long to explain here, but basically it uses a J2EE container that fully follows microservice principles, with full transaction support between the microservices, with the help of Spring Boot + Netflix.
Micro Services Fanout and Transaction Problems and Solutions with Spring Boot and Netflix

Does Kafka support request response messaging

I am investigating Kafka 9 as a hobby project and completed a few "Hello World" type examples.
I have got to thinking about Real World Kafka applications based on request response messaging in general and more specifically how to link a Kafka request message to its response message.
I was thinking along the lines of using a generated UUID as the request message key and employ this request UUID as the associated response message key. Much the same type of mechanism that WebSphere MQ has message correlation id.
My end 2 end process would be.
1). Kafka client generates a random UUID and sends a single Kafka request message.
2). The server would consume this request message, then extract & store the request UUID value.
3). complete a Business Process using the message payload.
4). Respond with a response message that employs the stored UUID value from the request message as response message Key.
5). the Kafka client polls the response topic until it either timeouts or retrieves a message with the original request UUID value.
What I am concerned about is that the Kafka consumer polling will remove other clients' messages from the response topic and increment the offsets, making other clients fail.
Am I trying to apply Kafka in a use case it was never designed for?
Is it possible to implement request/response messaging in Kafka?
Even though Kafka provides convenience methods to persist the committed offsets for a given consumer group, you're not required to use that behavior and can write your own if you feel the need. Even so, the use of Kafka the way you've described it is a bit awkward for the use case as each client needs to repeatedly search the topic for a specific response. That's inefficient at best.
You could break the problem into two parts, continuing to use Kafka to deliver requests to and responses from your server. The only piece you'd need to add would be some sort of API layer that your clients talk to and which encapsulates the Kafka-specific logic from your clients. This layer would need a local DB (relational or NoSQL) that could store responses by uuid making it very fast and easy for the API to answer whether a response is available for a specific uuid.
Easier! You can just write to ZooKeeper that UUID X should be answered on partition Y, and make the producer that sent that UUID consume partition Y... Does that make sense?
I think you need a well defined shard key of the service that invokes the request. Your request should contain this shard key and the name of the topic where to post response. Also you should create some sort of state machine and when a message regarding your task comes you transition to some state... this would be for strict async design
In theory, you could
assign an ID to each request and message that is supposed to get a result message;
create a hash function that would map this ID to the identifier of a partition,
when sending the result message, use the same hash function to get the identifier of the partition to send it to,
in the producer of the request, you then only need to observe that given partition.
That would reduce the need to crawl many messages in that topic to filter out the result required by the waiting request handler.
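The steps above can be sketched with a stable hash. This is a minimal illustration under assumptions that are not in the answer: SHA-256 is used so that every process computes the same mapping, the responder sets the partition explicitly when sending (Kafka's default partitioner hashes the key differently), and each requester draws IDs until one maps to the partition it watches. The function names are hypothetical.

```python
import hashlib
import uuid


def partition_for(request_id, num_partitions):
    """Map a request ID to a partition number, identically in every process."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def new_request_id(my_partition, num_partitions):
    """Draw random IDs until one hashes to the partition this requester watches."""
    while True:
        request_id = str(uuid.uuid4())
        if partition_for(request_id, num_partitions) == my_partition:
            return request_id
```

The responder calls partition_for on the incoming ID to choose where to send the result, and the requester only has to read its own partition, filtering by ID among the few requests it has in flight.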