Kafka for API gateway to store messages - rest

I need to build a secure REST API for different services, where client services can post and receive messages from other clients (like a mailbox, but the messages are going to be in JSON form and should be persistent). I am expecting around 5000 client services, with around 50 messages per service per day.
My questions are:
Can I use Kafka for this? (I think I will need some wrapper over Kafka to manage other tasks.)
If yes, are the outbox and inbox going to be separate topics for each service? (2 topics per service, so 5000*2 topics. My plan is to create them dynamically as new clients join.)
What are the alternative technologies for writing this kind of gateway?
Any help will be appreciated.

You can't use Kafka to implement a REST API, because a REST API implies request/response while Kafka is just a message queue (Kafka doesn't provide a mechanism to respond to messages). You can use Kafka to produce messages to be consumed by other services. The idea of message queues is to decouple the producer from the consumer and vice versa. When a consumer receives a message it acts on it, and that's it. But when you say inbox/outbox, you imply that there's a response to each message, which means that the producers' and consumers' pace should be similar. That couples them, which is against the nature of message queues.
It seems that in your case it makes more sense to use HTTP request/response or even WebSockets. If you want to save the request/response data (making it persistent), you can save it in a database or in object storage (like S3), log it, or send each message to Kafka so that Kafka stores all of your messages. Writes to Kafka will actually be very fast, because Kafka is, roughly speaking, an append-only log. You can then search message values using ksqlDB.
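As a rough illustration of the "save it in a database" option, here is a minimal sketch of a mailbox backed by SQLite. The table layout and the function names (`open_mailbox`, `post_message`, `read_inbox`) are assumptions made up for this example, not part of any existing gateway; a real service would put HTTP handlers and authentication in front of them.

```python
import json
import sqlite3

# Back-of-the-envelope load from the question: 5000 services * 50 messages/day.
MESSAGES_PER_DAY = 5000 * 50  # 250000 messages/day, roughly 3 per second


def open_mailbox(path=":memory:"):
    """Open the database and create the (hypothetical) messages table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS messages (
               id        INTEGER PRIMARY KEY AUTOINCREMENT,
               sender    TEXT NOT NULL,
               recipient TEXT NOT NULL,
               body      TEXT NOT NULL  -- JSON document stored as text
           )"""
    )
    return conn


def post_message(conn, sender, recipient, payload):
    """Persist one JSON message and return its id."""
    cur = conn.execute(
        "INSERT INTO messages (sender, recipient, body) VALUES (?, ?, ?)",
        (sender, recipient, json.dumps(payload)),
    )
    conn.commit()
    return cur.lastrowid


def read_inbox(conn, recipient, after_id=0):
    """Return (id, sender, payload) for the recipient's messages newer than after_id."""
    rows = conn.execute(
        "SELECT id, sender, body FROM messages "
        "WHERE recipient = ? AND id > ? ORDER BY id",
        (recipient, after_id),
    )
    return [(mid, sender, json.loads(body)) for mid, sender, body in rows]
```

With one table keyed by recipient there is no need for 5000*2 dynamically created topics; a client simply remembers the last id it has seen and asks for anything newer.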

Related

Which messaging system for a web dashboard?

I would like to build a web dashboard system and I am facing a problem. I need to get a piece of information that is in the cache of one of the instances of my program. For this I had thought of doing pub/sub with Kafka; however, I don't know how to publish and get a response from one of my subscribers. Do you know a pattern that allows this, and a service that lets me do it?
EDIT: I would like to design an infrastructure that follows this pattern:
The attached diagram shows a simple request->response flow. Kafka is designed for different types of architecture, so IMHO you should not focus on Kafka in this case.
However, if you still want to use Kafka for some other reason, I can suggest two options:
Stick with the request->response flow and use ReplyingKafkaTemplate or AggregatingReplyingKafkaTemplate (both from Spring for Apache Kafka) to handle it. The second one is an extension of the first and adds the ability to handle more than one response. You can send a request to a Kafka topic from the Dashboard application, have one of the Bot instances poll the message and send a reply to a reply topic, and then process the reply in the Dashboard application.
Use Kafka to implement the Event-Carried State Transfer pattern: move state (mutual guilds data) from the Bot instances directly to the Dashboard application via a Kafka topic. You can use several tools to implement this:
Bot applications send events to a Kafka topic via a simple KafkaProducer or KafkaTemplate, then one of the Kafka Connect sink connectors saves the data in the Dashboard's database.
Bot applications send events to a Kafka topic via a simple KafkaProducer or KafkaTemplate. Run a Kafka Streams thread in the Dashboard application and build the state using Kafka Streams functionality (grouping, aggregating, etc.). Then read the state directly from the Kafka Streams internal RocksDB database.
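The first option (request topic, reply topic, matching replies to requests) can be sketched without any broker. The snippet below is only an in-memory illustration of the correlation-id idea, using two `queue.Queue` objects as stand-ins for the topics; it is not the Spring API, and the names `bot_instance` and `dashboard_request` are made up for the example.

```python
import queue
import threading
import uuid

request_topic = queue.Queue()  # stands in for the request topic
reply_topic = queue.Queue()    # stands in for the reply topic


def bot_instance(cache):
    """Consume one request, look the key up in the local cache, send a reply."""
    correlation_id, key = request_topic.get()
    reply_topic.put((correlation_id, cache.get(key)))


def dashboard_request(key, timeout=2.0):
    """Send a request and block until the reply with the same correlation id arrives.

    Raises queue.Empty if no reply shows up within the timeout.
    """
    correlation_id = str(uuid.uuid4())
    request_topic.put((correlation_id, key))
    while True:
        reply_id, value = reply_topic.get(timeout=timeout)
        if reply_id == correlation_id:
            return value
        # A reply to some other request: dropped in this single-requester sketch.
```

Usage, with one bot instance running in a background thread:

```python
worker = threading.Thread(target=bot_instance, args=({"guild-42": ["alice", "bob"]},))
worker.start()
print(dashboard_request("guild-42"))
worker.join()
```

The important part is the correlation id: the requester has to block (or register a callback) until a reply carrying the same id shows up, which is exactly the coupling that makes plain request/response a better fit for HTTP.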

Event broadcasting in Kafka?

Is there a way to have an event delivered to every subscriber of a topic regardless of consumer group? Think of a "refresh your local cache" kind of scenario.
As far as Kafka is concerned, you cannot subscribe to a topic without a consumer group.
Out of the box, this isn't a pattern of a Kafka consumer; there isn't a way to make all consumers in a group read all messages from all partitions. Within a group there cannot be more active consumers than partitions (any extra consumers sit idle, which makes "fan out" hard), and each message goes to only one consumer in the group before its offset gets committed and the group moves forward to consume later events.
You'd need a layer above the consumer to decouple yourself from the consumer-group limitations.
For example, with Interactive Queries, you'd consume a topic and build a StateStore from the data that comes in, effectively building a distributed cache. On top of that, you can add an RPC layer (mentioned in those docs) that lets external applications query and poll that data later over a protocol of your choice (e.g. HTTP). From an application that is polling the data, you would then have the option of forwarding "notification events" via any method of your choice.
As for a framework that already exposes most of this out of the box, check out Azkarra Streams (I have no affiliation).
Or you can use alternative solutions such as Kafka Connect and write data to Slack, Telegram, etc. "message boards" instead, where many people explicitly subscribe to those channel notifications.
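To make the consumer-group behaviour described above concrete, here is a tiny in-memory model of a single-partition topic with one committed offset per group. It is only a sketch (the class `TinyTopic` and its methods are invented for illustration, not a Kafka client), but it shows why consumers sharing a group split the records, while consumers that each use their own group id all see every record.

```python
class TinyTopic:
    """Single-partition, in-memory stand-in for a topic (not a Kafka client)."""

    def __init__(self):
        self.log = []      # append-only list of records
        self.offsets = {}  # group id -> next offset to read

    def produce(self, record):
        self.log.append(record)

    def poll(self, group_id, max_records=1):
        """Return up to max_records for the group and commit the new offset."""
        start = self.offsets.get(group_id, 0)
        batch = self.log[start:start + max_records]
        self.offsets[group_id] = start + len(batch)
        return batch


topic = TinyTopic()
topic.produce("refresh-1")
topic.produce("refresh-2")

# Two consumers in the same group share one offset, so each record is seen once.
first_consumer = topic.poll("cache-group")
second_consumer = topic.poll("cache-group")

# Consumers with distinct group ids each start from their own offset.
instance_1 = topic.poll("instance-1", max_records=10)
instance_2 = topic.poll("instance-2", max_records=10)
```

Under that model, a "refresh your local cache" broadcast amounts to giving every application instance its own group id, at the cost of tracking one set of offsets per instance.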

Process messages pushed through Kafka

I haven't used Kafka before and wanted to know: if messages are published through Kafka, what are the possible ways to capture that info?
Is the only way to receive that info via Kafka "Consumers", or can REST APIs also be used here?
While reading up, I found that Kafka needs ZooKeeper running too.
I don't need to publish info, just process the data received from a Kafka publisher.
Any pointers will help.
Kafka is a distributed streaming platform that allows you to process streams of records in near real-time.
Producers publish records/messages to Topics in the cluster.
Consumers subscribe to Topics and process those messages as they are available.
The Kafka docs are an excellent place to get up to speed on the core concepts: https://kafka.apache.org/intro
Is the only way to receive that info via Kafka "Consumers", or can REST APIs also be used here?
Kafka has its own TCP-based protocol, not a native HTTP interface (assuming that's what you actually mean by REST).
Consumers are the only way to get and subsequently process the data. However, plenty of external tooling exists so that you hardly have to write any code, if you don't want to, in order to work with that data.

Kafka vs JMS for event publishing

In our scenario we have a set of micro services which interact with other services by sending event messages. We anticipate millions of messages per day at the peak. Every message is targeted to one or more listener types.
Our requirements are as follows:
Zero lost messages.
Ability to dynamically register multiple listeners of a specific
type in order to increase throughput.
Listeners are not guaranteed to be alive when messages are
dispatched.
We consider 2 options:
Send each message to JMS main queue then listeners of that queue will route the messages to specific queues according to message content, and then target services will listen to those specific queues.
Send messages to a Kafka topic by message type then target services will subscribe to the relevant topic and consume the messages.
What are the cons and pros for using either JMS or Kafka for that purpose?
Your first requirement is "zero lost messages". However, if you want publish-subscribe semantics (i.e. topics in JMS) but listeners are not guaranteed to be alive when messages are dispatched, then plain (non-durable) JMS topic subscriptions are a non-starter, as those messages will simply be discarded (i.e. lost).
I would suggest going with Kafka, as it has a fault tolerance mechanism, and even if some message is lost or not captured by any listener, you can easily retrieve it from the Kafka cluster (as long as it is still within the topic's retention period).
Along with this, you can easily add a new listener, or another listener to a group, and Kafka along with ZooKeeper will take care of managing it very well.
In summary, Kafka is a distributed publish-subscribe messaging system that is designed to be fast, scalable, and durable. Like many publish-subscribe messaging systems, Kafka maintains feeds of messages in topics. Producers write data to topics and consumers read from topics.
It is also very easy to integrate.
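The difference for listeners that are not alive when a message is dispatched can be shown with two toy classes. This is a simplified sketch under stated assumptions: `FireAndForgetTopic` models a non-durable publish-subscribe topic, `RetainedLog` models a log that keeps records and remembers an offset per listener, and both names are invented here rather than taken from JMS or Kafka.

```python
class FireAndForgetTopic:
    """Delivers each message only to listeners connected at publish time."""

    def __init__(self):
        self.listeners = {}  # listener name -> list of messages it received

    def connect(self, name):
        self.listeners[name] = []
        return self.listeners[name]

    def disconnect(self, name):
        self.listeners.pop(name, None)

    def publish(self, message):
        for received in self.listeners.values():
            received.append(message)


class RetainedLog:
    """Keeps every record and tracks how far each listener has read."""

    def __init__(self):
        self.records = []
        self.committed = {}  # listener name -> next offset to read

    def publish(self, message):
        self.records.append(message)

    def catch_up(self, name):
        """Return everything the listener has not read yet and commit the offset."""
        start = self.committed.get(name, 0)
        pending = self.records[start:]
        self.committed[name] = len(self.records)
        return pending
```

A listener that is disconnected from the first class while a message is published never sees it, whereas a listener of the second class gets the backlog on its next `catch_up` call. Retention limits and acknowledgement of failed processing are left out of the sketch.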

Is RabbitMQ capable of "pushing" messages from a queue to a consumer?

With RabbitMQ, is there a way to "push" messages from a queue TO a consumer as opposed to having a consumer "poll and pull" messages FROM a queue?
This has been the cause of some debate on a project I'm currently on. The argument from one side is that having consumers (i.e. a Windows service) "poll" a queue until a new message arrives is somewhat inefficient and less desirable than having the message "pushed" automatically from the queue to the subscriber(s)/consumer(s).
I can only seem to find information supporting the idea of consumers "polling and pulling" messages off of a queue (e.g. using a Windows service to poll the queue for new messages). There isn't much information on the idea of "pushing" messages to a consumer/subscriber...
Having the server push messages to the client is one of the two ways to get messages to the client, and the preferred way for most applications. This is known as consuming messages via a subscription.
The client is connected. (The AMQP/RabbitMQ/most messaging systems model is that the client is always connected - except for network interruptions, of course.)
You use the client API to arrange that your channel consume messages by supplying a callback method. Then whenever a message is available the server sends it to the client over the channel and the client application gets it via an asynchronous callback (typically one thread per channel). You can set the "prefetch count" on the channel which controls the amount of pipelining your client can do over that channel. (For further parallelism an application can have multiple channels running over one connection, which is a common design that serves various purposes.)
The alternative is for the client to poll for messages one at a time, over the channel, via a get method.
You "push" messages from the producer to the exchange.
https://www.rabbitmq.com/tutorials/tutorial-three-python.html
BTW, this fits IoT scenarios very well. Devices produce messages and send them to an exchange. The queue handles persistence, FIFO ordering and other features, as well as delivery of messages to subscribers.
And, by the way, you never "poll" the queue. Instead, you always subscribe to the publisher, similar to the observer pattern. Generally, I would say it is a genius principle.
So it is similar to a post box or post office, except that it sends you a notification when a message is available.
Quoting from the docs here:
AMQP brokers either deliver messages to consumers subscribed to
queues, or consumers fetch/pull messages from queues on demand.
And from here:
Storing messages in queues is useless unless applications can consume
them. In the AMQP 0-9-1 Model, there are two ways for applications to
do this:
Have messages delivered to them ("push API")
Fetch messages as needed ("pull API")
With the "push API", applications have to indicate interest in
consuming messages from a particular queue. When they do so, we say
that they register a consumer or, simply put, subscribe to a queue. It
is possible to have more than one consumer per queue or to register an
exclusive consumer (excludes all other consumers from the queue while
it is consuming).
Each consumer (subscription) has an identifier called a consumer tag.
It can be used to unsubscribe from messages. Consumer tags are just
strings.
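The two consumption styles quoted above can be mimicked with a small in-memory queue. This is just a sketch to show the shape of a "push API" (register a callback, get back a consumer tag) next to a "pull API" (fetch one message on demand); `TinyQueue` and its method names are invented for the example and are not the RabbitMQ client API.

```python
import itertools
from collections import deque


class TinyQueue:
    """In-memory queue offering both a push-style and a pull-style interface."""

    def __init__(self):
        self.messages = deque()
        self.consumers = {}  # consumer tag -> callback
        self._tags = itertools.count(1)

    def subscribe(self, callback):
        """Push API: register a consumer and return its consumer tag."""
        tag = f"ctag-{next(self._tags)}"
        self.consumers[tag] = callback
        self._deliver()
        return tag

    def unsubscribe(self, tag):
        self.consumers.pop(tag, None)

    def publish(self, message):
        self.messages.append(message)
        self._deliver()

    def get(self):
        """Pull API: fetch one message on demand, or None if the queue is empty."""
        return self.messages.popleft() if self.messages else None

    def _deliver(self):
        # Simplification: always hand messages to the first registered consumer.
        while self.messages and self.consumers:
            callback = next(iter(self.consumers.values()))
            callback(self.messages.popleft())
```

Usage:

```python
q = TinyQueue()
received = []
tag = q.subscribe(received.append)  # messages are now pushed to the callback
q.publish("hello")
q.unsubscribe(tag)
q.publish("later")                  # stays queued until someone calls get()
print(received, q.get())
```

Either way the client initiates the relationship by connecting and subscribing; "push" only means that delivery happens as soon as a message arrives, without the client asking each time.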
The RabbitMQ broker is like a server that won't send data to a consumer unless the consumer client has registered itself with the server. But then the following question comes up:
Can RabbitMQ keep the consumer client's details and connect to the client when a message arrives?
The answer is no. So what is the alternative? You could write a plugin yourself that maintains client information in some kind of config. The plugin would pull from the RabbitMQ queue and push to the client.
Please have a look at this plugin; it might help:
https://www.rabbitmq.com/shovel.html
Frankly speaking, the client would need to implement the AMQP protocol to receive messages this way and would have to listen for connections on some port. That sounds like another server.
Regards,
Vishal