Kafka topic filtering vs. ephemeral topics for microservice request/reply pattern - apache-kafka

I'm trying to implement a request/reply pattern with Kafka. I am working with named services and unnamed clients that send messages to those services, and clients may expect a reply. Many clients (tens to hundreds) may interact with a single service, or with a consumer group of services.
Strategy one: filtering messages
The first thought was to have two topics per service: the "HelloWorld" service would consume the "HelloWorld" topic and produce replies to the "HelloWorld-Reply" topic. Clients would consume that reply topic and filter on unique message IDs to know which replies are relevant to them.
The drawback is that clients may have to do unnecessary work filtering out a potentially large number of irrelevant messages when many clients are interacting with one service.
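To make the filtering cost concrete, here is a minimal sketch of what each client would do under strategy one. It is plain Python with no Kafka client involved: the `Reply` type, its `correlation_id` field, and the `ReplyFilter` class are hypothetical names for illustration, and the list stands in for the shared reply topic.

```python
import uuid
from dataclasses import dataclass


@dataclass
class Reply:
    correlation_id: str  # hypothetical field the service copies from the request
    payload: str


class ReplyFilter:
    """Tracks the request IDs one client is waiting on and drops everything else."""

    def __init__(self):
        self.pending = set()

    def new_request_id(self):
        # The client generates a unique ID and attaches it to the outgoing request.
        request_id = str(uuid.uuid4())
        self.pending.add(request_id)
        return request_id

    def accept(self, reply):
        # Called for every message read from the shared reply topic.
        if reply.correlation_id in self.pending:
            self.pending.discard(reply.correlation_id)
            return True
        return False


client = ReplyFilter()
my_id = client.new_request_id()

# Stand-in for the shared "HelloWorld-Reply" topic: mostly other clients' replies.
reply_topic = [Reply(str(uuid.uuid4()), "not mine") for _ in range(5)]
reply_topic.insert(3, Reply(my_id, "hello back"))

mine = [r.payload for r in reply_topic if client.accept(r)]
print(mine)  # ['hello back']
```

Every client still reads and discards the five irrelevant replies, which is exactly the overhead the question is worried about.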
Strategy two: ephemeral topics
The second idea was to create a unique ID per client, and send that ID along with messages. Clients would consume their own unique topic "[ClientID]" and services would send to that topic when they have a reply. Clients would thus not have to filter irrelevant messages.
The drawback there is that clients may have a short lifespan, e.g. they may be single-use scripts, and they would have to create their topic beforehand and delete it afterward. Some extra process might be needed to purge unused client topics if a client dies during processing.
Which of these seems like a better idea?

We use Kafka in production for both event-based messages and request/response messages. We implement request/response with your first strategy because, with the second, the number of topics grows with the number of clients and many of those topics end up completely useless. Another reason for choosing the first strategy was our topic naming guideline: each service should belong to only one topic, for tracking. Kafka is not made for request/response messaging, but I recommend the first strategy because it gives you:
fewer topics
better service tracking
better topic naming
But you have to be careful with your consumer groups, which can cause data loss if set up wrongly (for example, clients that share a group ID split the partitions between them, so a reply may be delivered to a different client).
A better approach is to use the first strategy with many partitions in one topic per service, where each client sends and receives its messages with a unique key. Kafka guarantees that all messages with the same key go to the same partition. As long as each client ends up with a partition to itself, this approach needs no filtering of irrelevant messages, and it is arguably a combination of your two strategies.
Update:
As #ValBonn said, with the suggested approach you always have to make sure that the number of partitions >= the number of clients.
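A small sketch of why the partition count matters for the keyed approach. This is a simulation, not Kafka code: `partition_for` is a hypothetical helper, and CRC32 is only a stand-in for the murmur2 hash that Kafka's default partitioner applies to keys. With fewer partitions than clients, sharing is unavoidable. Even with enough partitions, two keys can hash to the same one, so a client may want to keep a filter on its key or have partitions assigned explicitly.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in hash: Kafka's default partitioner uses murmur2, not CRC32.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


num_partitions = 8
client_ids = [f"client-{i}" for i in range(12)]  # hypothetical client keys

by_partition = defaultdict(list)
for cid in client_ids:
    by_partition[partition_for(cid, num_partitions)].append(cid)

# Same key, same partition, every time.
stable = all(
    partition_for(cid, num_partitions) == partition_for(cid, num_partitions)
    for cid in client_ids
)

# Partitions that carry replies for more than one client.
shared = {p: ids for p, ids in by_partition.items() if len(ids) > 1}
print(stable, len(shared) > 0)  # True True (12 keys cannot fit 8 partitions one-to-one)
```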

Related

Kafka with multiple instances of microservices and end-users

This is more of a design/architecture question.
We have a microservice A (MSA) with multiple instances (say 2) running behind a load balancer (LB).
The purpose of this microservice is to get messages from a Kafka topic and send them to end users/clients. Both instances use the same consumer group ID for a particular client/user so that messages are not duplicated. The Kafka topic has 2 partitions (or as many as there are instances).
End users/clients connect to the LB to fetch messages from MSA. Long polling is used here.
A request from a client can land on any instance. If it lands on MSA1, it pulls data from Kafka partition1, and if it lands on MSA2, it pulls data from partition2.
Now, a producer is producing messages, and we don't have a high message count. So, let's say the producer produces msg1 and it goes to partition1. The end user/client will not get this message unless its request lands on MSA1, which might not always happen as there are other requests coming to the LB.
We want to solve this issue. We want the client to get the message in near real time.
One solution could be a distributed persistent queue (e.g. ActiveMQ) where both MSA1 and MSA2 keep putting messages after reading them from Kafka, and the client just fetches messages from the queue. But this would require a separate queue for every end user/client/group ID.
Is this a good solution, and can we go ahead with it? Is there anything we should change here? We are deploying our system on AWS, so could an AWS managed service help here, e.g. an SNS+SQS combination?
Some statistics:
~1000 users, one group id per user
2-4 instances of microservice
long polling every few seconds (~20s)
average message size ~10KB
Broadly you have three possible approaches:
You can dispense with using Kafka's consumer group functionality and allow each instance to consume from all partitions.
You can make the instances of each service aware of each other. For example, an instance which gets a request which can be fulfilled by another instance will forward the request there. This is most effective if the messages can be partitioned by client on the producer end (so that a request from a given client only needs to be routed to one instance). Even then, the consumer group functionality introduces some extra difficulty (rebalances mean that the consumer currently responsible for a given partition might not have seen all the messages in the partition). You may want to implement your own variant of the consumer group coordination protocol in which, on rebalance, the instance starts from some suitably early point regardless of where the previous consumer got to.
If you can't reliably partition by client in the producer (e.g. the client is requesting a stream of all messages matching arbitrary criteria) then Kafka is really not going to be a fit and you probably want a database (with all the expense and complexity that implies).

Can Kafka be used for real time notification?

I am trying to understand how Kafka can be used for real time notification. Let's say I have a kafka topic for alerting purposes. This topic is used by various services to send updates to the users.
There are 10 instances of notification service running and consuming messages from the topic.
Online users would be distributed among the 10 instances. For example, User1 might be connected to Instance 8 with a websocket connection.
So how do I ensure that users are notified correctly? That is, how do I ensure that only Instance 8 processes the messages for User1?
This problem needs to be addressed through multiple angles - let's look at each one...
First - the consumer side...
You'll need as many partitions as there are consumer application instances i.e. the notification service - in your case you've got 10 instances so 10 partitions (or a multiple of 10) to the topic. This will ensure none of the service instances are left idle. Also, they'll need to be a part of the same consumer group. Now, there are a few different partition assignment approaches available and you might need to look into these to find out the one that suits your situation - here's a good reference article.
An example - If you've got 100 users and user-1 to user-10 must be handled by notification-service-1, then StickyAssignor might suit you best.
Alternatively, you could even write your custom partition assignor and the reference article mentioned above does provide some information on this as well
Second - the producer side...
The producer applications writing data to the given Kafka topic should ensure that they send data related to a particular user to a certain partition.
As Kafka messages are made up of key-value pairs, you'll need to make sure that the keys are NOT null. The best would be to use some user-related-information as the key - this way you can make sure that messages in any partition are consumed by the designated consumer instance.
Lastly, please note that I've left out the part on which users (socket connections) are mapped to which notification service instance as it is beyond Kafka and I'm not sure if that part is designed to be strict or not.

Multi-tenancy support in apache-kafka

I have a scenario where multiple clients want to produce and consume messages. Is there a way to achieve multi-tenancy in Apache Kafka so that one client remains unaffected by a huge inflow from another client?
Basically I want a way to tag clients to brokers, so that all topics/partitions of a client fall under its tagged brokers.
Here each client produces and consumes its messages internally. They do this to achieve distributed processing, e.g. a parent message is broken down into children, and these children are produced and processed on any of the client's machines.
One reason a client could fail because of other clients is I/O saturation on the brokers caused by those other clients.
The way to protect against that would be to enforce quotas.
I'm not sure I understand what you mean by "tag", since you cannot control partition placement within a cluster. Sending data from particular clients to certain brokers is therefore not possible unless you first create a topic and then manually reassign the partition and replica placements.
By client do you mean a customer tenant? Regardless, if the concern is to have a clear load separation between clients (to avoid denial of service or starvation), one possibility is to have different topics for different clients. This separates load to some extent (per partition) but not fully, as various resources are still shared. It also has its own drawbacks:
Partitions are not shared. If one client has little load and its partitions see few reads/writes, that spare capacity cannot be used by another client in another topic.
The number of topics grows with the number of clients rather than with the message load, which is not good either.
Another approach, probably a better one for avoiding denial of service or starvation of a client, is to limit (throttle) the rate of messages produced by a client.
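To illustrate what throttling a producer means, here is a minimal token-bucket sketch. Note the assumption: Kafka's own quotas are enforced by the brokers per user or client ID, whereas this is a client-side limiter written purely for illustration. The `TokenBucket` class and its parameters are hypothetical, and time is passed in explicitly so the behaviour is deterministic.

```python
class TokenBucket:
    """Client-side throttle: bursts up to `capacity`, refills at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = now

    def try_acquire(self, now: float, cost: float = 1.0) -> bool:
        # Refill according to elapsed time, never exceeding the capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket(rate=2.0, capacity=3.0)

# Five sends at t=0: only the first three fit in the burst allowance.
burst = [bucket.try_acquire(now=0.0) for _ in range(5)]

# One second later, two tokens have been refilled.
later = [bucket.try_acquire(now=1.0) for _ in range(3)]

print(burst)  # [True, True, True, False, False]
print(later)  # [True, True, False]
```

A producer wrapper would call `try_acquire` before each send and either wait or reject when it returns False, so a noisy client slows itself down instead of saturating the brokers.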

Kafka instead of Rest for communication between microservices

I want to change the communication between (micro)-services from REST to Kafka.
I'm not sure about the topics and wanted to hear some opinions about that.
Consider the following setup:
I have an API-Gateway that provides CRUD functions via REST for web applications. So I have 4 endpoints which users can call.
The API-Gateway produces the requests and consumes the responses from the second service.
The second service consumes the requests, accesses the database to execute the CRUD operations, and produces the results.
How many topics should I create?
Do I have to create 8 (2 per endpoint (request/response)) or is there a better way to do it?
Would like to hear some experience or links to talks / documentation on that.
The short answer to this question is: it depends on your design.
You can use a single topic for all your operations, or you can use several topics for different operations. However, you must know the following.
You have to produce messages to Kafka in the order in which they were created, and you must consume them in the same order to preserve consistency. Messages sent to Kafka are ordered within a topic partition. Messages in different topic partitions are not ordered relative to each other. Let's say you created an item and then deleted it. If you try to consume the message for the delete operation before the message for the create operation, you get an error. In this scenario, you must send these two messages to the same topic partition to ensure that the delete message is consumed after the create message.
Please note that there is always a trade-off between consistency and throughput. In this scenario, if you send all your messages to a single topic partition, you get consistency but you cannot consume messages fast, because you read from that partition one message at a time and only get the next message once the previous one has been consumed. To increase throughput, you can use multiple topics or divide the topic into partitions. For both of these solutions you must implement some logic on the producer side to preserve consistency: you must send related messages to the same topic partition. For instance, you can give the topic one partition per entity type and send the CRUD messages of the same entity type to the same partition. I don't know whether that ensures consistency in your scenario, but it can be an alternative. You should find the logic that provides consistency with multiple topics or topic partitions; it depends on your case. If you can find that logic, you get both consistency and throughput.
For your case, I would use a single topic with multiple partitions, and on the producer side I would send related messages to the same topic partition.
--regards
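The point about sending related messages to the same partition can be sketched with a toy in-memory model. This is not Kafka code: each partition is just a Python list, `partition_for` is a hypothetical helper, CRC32 stands in for Kafka's real key hash (murmur2), and the entity IDs are made up. Keying by entity ID keeps each entity's create/update/delete sequence in order while different entities can spread across partitions.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in for Kafka's key hashing (the real default partitioner uses murmur2).
    return zlib.crc32(key.encode("utf-8")) % num_partitions


NUM_PARTITIONS = 4

# (entity id, operation) in the order the gateway produced them.
produced = [
    ("item-1", "create"),
    ("item-2", "create"),
    ("item-1", "update"),
    ("item-2", "delete"),
    ("item-1", "delete"),
]

# Each partition is an append-only list, so order within it is the produce order.
partitions = defaultdict(list)
for entity_id, op in produced:
    partitions[partition_for(entity_id, NUM_PARTITIONS)].append((entity_id, op))


def ops_seen(entity_id):
    # Whoever consumes this entity's partition sees its operations in this order.
    log = partitions[partition_for(entity_id, NUM_PARTITIONS)]
    return [op for eid, op in log if eid == entity_id]


print(ops_seen("item-1"))  # ['create', 'update', 'delete']
print(ops_seen("item-2"))  # ['create', 'delete']
```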

apache-kafka with 100 million topics

I'm trying to replace RabbitMQ with apache-kafka, and while planning I bumped into several conceptual problems.
First, we are using RabbitMQ with a queue-per-user policy, meaning each user uses one queue. This suits our need because each user represents some job to be done for that particular user, and if that user causes a problem, the queue will never cause a problem for other users because queues are separated. (A problem here means the following: messages in the queue are dispatched to the users using HTTP requests. If a user refuses to receive a message (server down, perhaps?), it goes back into a retry queue, which results in no loss of messages, unless the queue goes down.)
Now, Kafka is fault tolerant and failure safe because it writes to disk.
And that is exactly why I am trying to bring Kafka into our structure.
But there are problems with my plans.
First, I was thinking of creating one topic per user, meaning each user would have their own topic. (What problem will this cause? My max estimate is that I will have around 1~5 million topics.)
Second, if I decide to go for topics based on operation and partition by a random hash of the user's ID, and there is a problem with one user not currently consuming messages, will all the users in the partition have to wait? What would be the best way to structure this situation?
So in conclusion: 1~5 million users. We do not want one user blocking a large number of other users from being processed. Having a topic per user would solve this issue, but it seems there might be an issue with ZooKeeper if such a large number of topics is created. (Is this true?)
What would be the best solution for structuring this, considering scalability?
"First, I was thinking of creating one topic per user, meaning each user would have their own topic. (What problem will this cause? My max estimate is that I will have around 1~5 million topics.)"
I would advise against modeling like this.
Google around for "kafka topic limits", and you will find the relevant considerations for this subject. I think you will find you won't want to make millions of topics.
"Second, if I decide to go for topics based on operation and partition by a random hash of the user's ID"
Yes, have a single topic for these messages and then route those messages based on the relevant field, like user_id or conversation_id. This field can be present as a field on the message and serves as the ProducerRecord key that is used to determine which partition in the topic this message is destined for. I would not include the operation in the topic name, but in the message itself.
"If there is a problem with one user not currently consuming messages, will all the users in the partition have to wait? What would be the best way to structure this situation?"
This depends on how the users are consuming messages. You could set up a timeout, after which the message is routed to some "failed" topic. Or send messages to users in a UDP-style, without acks. There are many ways to model this, and it's tough to offer advice without knowing how your consumers are forwarding messages to your clients.
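As a rough sketch of the "give up and route to a failed topic" idea, the loop below parks a message once delivery has failed a fixed number of times, so the messages behind it in the same partition keep moving. Everything here is an assumption for illustration: a retry count stands in for the timeout, plain lists stand in for the topics, and `dispatch` and `fake_deliver` are hypothetical names.

```python
def dispatch(messages, deliver, max_attempts=3):
    """Deliver each message; after max_attempts failures, park it and move on."""
    delivered, failed = [], []
    for msg in messages:
        for _ in range(max_attempts):
            if deliver(msg):
                delivered.append(msg)
                break
        else:
            # Retries exhausted: route to the "failed" list so later
            # messages in the same partition are not held up.
            failed.append(msg)
    return delivered, failed


def fake_deliver(msg):
    # Hypothetical HTTP push: user "bob" is unreachable, everyone else accepts.
    return msg["user"] != "bob"


partition = [
    {"user": "alice", "body": "a1"},
    {"user": "bob", "body": "b1"},
    {"user": "carol", "body": "c1"},
]

delivered, failed = dispatch(partition, fake_deliver)
print([m["user"] for m in delivered])  # ['alice', 'carol']
print([m["user"] for m in failed])     # ['bob']
```

A separate consumer could then retry the parked messages on its own schedule, which is roughly the role the retry queue plays in the RabbitMQ setup described in the question.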
Also, if you are using Kafka Streams, make note of the StreamPartitioner interface. This interface appears in KStream and KTable methods that materialize messages to a topic and may be useful in a chat application where you have clients idling on a specific TCP connection.