Kafka topic with Server-Sent Events (SSE) - apache-kafka

I have a requirement to use Kafka for SSE events. My application is a mid-tier application that connects mobile chat calls from the front end to the server side.
P.S.: I saw a post similar to my requirement, "https://stackoverflow.com/questions/65071532/replay-kafka-topic-with-server-sent-events", and I have a few questions about it.
If we go with Kafka, is it good practice to use multiple group IDs? The application is deployed on 200 machines, so creating one group ID per machine (named after the machine) would result in 200 group IDs.
Whenever we get a response from the server for a particular call, we need to route it back to the machine that handled that call. To achieve this, all 200 machines would have to consume every message and read its header to see whether it belongs to them. The machine whose name matches sends the response; every other machine that consumed the same message has to discard it after reading the header. Wouldn't that be overkill in terms of CPU consumption and memory usage?
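To put a rough number on the fan-out described above, here is a toy model in plain Python (no Kafka client involved). It assumes one consumer group per machine, so every machine receives its own copy of every response. The `target-machine` header name and the `machine-N` names are made up for illustration.

```python
from collections import Counter

def simulate_fanout(machines, responses):
    """Count, per machine, how many responses it keeps vs. discards when
    every machine (each in its own consumer group) sees every response."""
    kept = Counter()
    discarded = Counter()
    for headers, _payload in responses:
        for machine in machines:  # each group gets its own copy of the message
            if headers.get("target-machine") == machine:
                kept[machine] += 1
            else:
                discarded[machine] += 1  # header read, message thrown away
    return kept, discarded

machines = [f"machine-{i}" for i in range(200)]
# 1,000 responses spread evenly over the 200 machines (hypothetical load).
responses = [({"target-machine": machines[i % 200]}, b"reply") for i in range(1000)]

kept, discarded = simulate_fanout(machines, responses)
total_deliveries = sum(kept.values()) + sum(discarded.values())
print(total_deliveries, sum(kept.values()))  # 200000 1000
```

In this model 1,000 responses cause 200,000 deliveries, of which only 1,000 are useful, so the wasted work grows linearly with the number of machines.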

Related

Kafka - sending the reply exactly to the sender

I need some guidance wrt Kafka and Distributed Systems
I have created a distributed system using Kafka. Let us assume the following architecture and data flow
The load balancer sends the request to the Service Containers
The Service containers send the message to the Kafka Worker Queue and maintain a Set of the messages they have sent to Kafka
A single GPU worker gets the message, performs the work accordingly, and writes the answer to the answer queue. The request ID is in the answer.
All the services get the Answer and check whether the request ID of the answer belongs to their Set. If not, they drop the message. Otherwise they send the reply to the client.
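The data flow above can be sketched in plain Python, with lists standing in for the Kafka worker and answer queues. This only models the current design (every service sees every answer and filters by request ID). The class and function names, and the uppercase "work" done by the GPU worker, are hypothetical.

```python
import uuid

class ServiceInstance:
    def __init__(self, name):
        self.name = name
        self.pending = set()   # request IDs this instance has sent
        self.replies = []      # answers it would return to its clients

    def send_request(self, worker_queue, payload):
        request_id = str(uuid.uuid4())
        self.pending.add(request_id)
        worker_queue.append({"request_id": request_id, "payload": payload})
        return request_id

    def on_answer(self, answer):
        """Return True if the answer was ours, False if it was dropped."""
        request_id = answer["request_id"]
        if request_id not in self.pending:
            return False       # belongs to another instance: drop it
        self.pending.discard(request_id)
        self.replies.append(answer["result"])
        return True

def gpu_worker(worker_queue, answer_queue):
    # Stand-in for the single GPU worker; the "work" is just uppercasing.
    while worker_queue:
        msg = worker_queue.pop(0)
        answer_queue.append({"request_id": msg["request_id"],
                             "result": msg["payload"].upper()})

worker_queue, answer_queue = [], []
a, b = ServiceInstance("svc-a"), ServiceInstance("svc-b")
a.send_request(worker_queue, "hello")
b.send_request(worker_queue, "world")
gpu_worker(worker_queue, answer_queue)

# Every service receives every answer, as in the architecture described.
for answer in answer_queue:
    for svc in (a, b):
        svc.on_answer(answer)

print(a.replies, b.replies)  # ['HELLO'] ['WORLD']
```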
My questions are the following:
In this architecture the answer is received by all the services, and very few of the messages they get are useful to them. How can I improve it so that ONLY the service that sent the message gets the answer?
How can I do this in an autoscaling environment where the number of service nodes/containers/pods is dynamic?

Is a message queue like RabbitMQ the ideal solution for this application?

I have been working on a project that is basically an e-commerce platform. It's a multi-tenant application in which every client has its own domain and the website adjusts itself based on the client's configuration.
If the client already has software that manages their inventory, like an ERP, I would need a medium through which, when the e-commerce platform generates an order, external applications like the ERP can be notified that this has happened and take action in response. It would be like raising events across different applications.
I thought about storing these events in a database and having the client make requests in a short interval to fetch the data, but something about polling and using a REST Api for this seems hackish.
Then I thought about using Websockets, but if the client is offline for some reason when the event is generated, the delivery cannot be assured.
Then I encountered message queues, RabbitMQ to be specific. With a message queue, modeling the problem in a simplistic manner, the e-commerce platform would produce events on one end and push them to a queue that a client's worker would process as events arrive.
I don't know what the best approach is, to be honest, and would love for some of you experienced developers to give me a hand with this.
I do agree with Steve: using a message queue in your situation is ideal. Message queueing allows web servers to respond to requests quickly, instead of being forced to perform resource-heavy procedures on the spot. You can put your events onto the queue and let the consumer/worker handle them when it has time to do so.
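As a rough illustration of that decoupling, the sketch below uses Python's in-process `queue.Queue` as a stand-in for a broker such as RabbitMQ. Unlike a real broker it is not durable and not shared between machines. The function names, order IDs, and the "ERP notified" message are hypothetical.

```python
import queue
import threading

events = queue.Queue()   # stand-in for the broker queue
processed = []           # what the client's worker has handled so far

def handle_checkout(order_id):
    """Web request handler: enqueue the event and return immediately."""
    events.put({"type": "order_created", "order_id": order_id})
    return "202 Accepted"

def worker():
    """Client-side worker: drains events whenever it is running."""
    while True:
        event = events.get()
        if event is None:          # sentinel used here to stop the worker
            events.task_done()
            break
        processed.append(f"ERP notified of order {event['order_id']}")
        events.task_done()

# Orders arrive while the worker is still "offline"; they wait in the queue.
for order_id in (101, 102, 103):
    handle_checkout(order_id)

t = threading.Thread(target=worker)
t.start()
events.put(None)
t.join()

print(processed)
```

The events produced before the worker started are still delivered once it comes up, which is the property the question found missing with plain WebSockets.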
I recommend CloudAMQP for RabbitMQ, it's easy to try out and you can get started quickly. CloudAMQP is a hosted RabbitMQ service in the cloud. I also recommend this RabbitMQ guide: https://www.cloudamqp.com/blog/2015-05-18-part1-rabbitmq-for-beginners-what-is-rabbitmq.html
Your idea of using a message queue is a good one, better than a database or WebSockets for the reasons you describe. With the message queue approach (RabbitMQ, or another server/broker-based system such as Apache Qpid) you should consider putting a broker in a "DMZ" sort of network location, so that your internal e-commerce system can push events out to it and your external clients can reach into it without risking direct access to your core business systems. You could also run a separate broker per client.

Solutions of Kafka project to analyze HTTP requests on web server

Context:
A web server receives millions of HTTP requests every day. Of course, there must be a project (named handler) that is responsible for handling these requests and responding to them with some information.
Seen from the server side, I would like to use Kafka to extract some information from these requests and analyze it in real time (or at regular time intervals).
Question:
How can I use these requests as the producer for Kafka?
How do I build a Kafka consumer? (All this data needs to be analyzed and then returned, but Kafka is "just" a messaging system.)
Some ideas:
A1.1 Maybe I can let the "handler" project call the Kafka jar, so that it can trigger the producer code to send messages to Kafka.
A1.2 Maybe I can create another project that listens to all the HTTP requests at the server, but there are other HTTP requests at the server as well.
I have tried to think of many solutions, but I am not so sure about them. I would like to ask whether you already know of some mature approaches, or have some ideas on how to implement this.
You can use the ELK stack, with Kafka as the log broker.

Http Kafka producer

Our application receives events through an HAProxy server over HTTPS, and these events should be forwarded to and stored in a Kafka cluster.
What would be the best option for this?
This layer should receive events from HAProxy & produce them to Kafka cluster, in a reliable and efficient way (and should scale horizontally).
Please suggest.
I'd suggest writing a simple application in Java that just receives events and sends them to Kafka. The Java client for Kafka is the official client and is thus the most reliable. The other option is to use an arbitrary language together with the Kafka REST Proxy (from Confluent).
Every instance of the app should distribute its messages across all partitions based on some partition key. Then you can run multiple instances of the app, and they don't even need to know about each other.
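A minimal sketch of why key-based partitioning lets instances run independently: the partition is a pure function of the key, so any instance computes the same result without coordination. Kafka's Java client hashes the key with murmur2; `zlib.crc32` is used below only as a stand-in, and the function name, keys, and partition count are assumptions for illustration.

```python
import zlib

def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition deterministically (CRC32 as a stand-in hash)."""
    return zlib.crc32(key) % num_partitions

NUM_PARTITIONS = 6  # hypothetical topic size

events = [(b"user-1", "login"), (b"user-2", "login"), (b"user-1", "click")]

# Two independent app instances compute partitions for the same events.
instance_a = [choose_partition(key, NUM_PARTITIONS) for key, _ in events]
instance_b = [choose_partition(key, NUM_PARTITIONS) for key, _ in events]

print(instance_a == instance_b)        # True: no coordination needed
print(instance_a[0] == instance_a[2])  # True: same key, same partition
```

Because events with the same key always land in the same partition, their relative order is preserved for consumers of that partition, regardless of which instance produced them.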
Just write a simple application that consumes the messages from the proxy and sends what it receives to the producer, after setting the Kafka configuration (producer.data()). If the configuration is done correctly, you will be able to consume the messages from the proxy server you use and see the output in /tmp/kafka-logs/topicname/00000000000000.log.
Good Day
Keep Coding

Send broadcast message to 100K subscribers using ejabberd

I am creating a PubSub-like messaging application in which all subscribers of a particular channel receive a message when a publisher sends it to that channel. The maximum subscriber count is 100k.
Using ejabberd, what performance can I expect? That is, can ejabberd handle 100k subscribers and deliver a message to all of them?
Performance depends on many factors: payload size, push frequency, node configuration, the type of connection the online clients have (slow/fast), and machine specification.
However, you should be able to reach that level, indeed.