I’m evaluating Kafka-pixy REST proxy. The REST proxy is returning messages / events sent on a Kafka topic after the consumer joins the topic, but not the messages before that.
Is there an equivalent of —from-beginning in Kafka-pixy?
Related
I know that in Kafka, the consumer pulls messages off the broker topics (pull) ?
I get the feeling that Pulsar works the same way, considering that the receive method blocks. But I can't find a confirmation. Can someone point me to a reference or correct me ?
Thanks
Pulsar's Documentation clearly explains how message consumption works:
The Pulsar Consumer origin reads messages from one or more topics in
an Apache Pulsar cluster.
The Pulsar Consumer origin subscribes to Pulsar topics, processes
incoming messages, and then sends acknowledgements back to Pulsar as
the messages are read.
Messages can be received from brokers either synchronously (sync) or asynchronously (async).
receive method receives messages synchronously. The consumer process will be blocked until a message becomes available. For example,
Message msg = consumer.receive();
An asynchronous receive will return immediately with a value of type CompletableFuture that completes once a new message is available. For example,
CompletableFuture<Message> asyncMessage = consumer.receiveAsync();
In Pulsar document:
There is a queue at the consumer side to receive messages pushed from the broker. You can configure the queue size with the receiverQueueSize parameter. The default size is 1000). Each time consumer.receive() is called, a message is dequeued from the buffer.
So broker pushes messages to a queue on consumer side. When the receive method is invoked, a message will be dequeued and returned.
Pulsar consumer will regularly send permit request to Pulsar broker to ask for more message when the half of the queue is consumed. This is described here.
In short, as described here
Pulsar also uses a push-based approach but with an API that simulates consumer pulls.
I use a SpringBoot App to produce or consume/listen to Kafka
messages.
I produce a message in the topic and consume/listen to the
specific message by comparing the messageKey and then send the
consumed message for further processing.
I am stuck with what approach will be better suited to my requirement to get specific message i.e. Kafka Listener or Kafka Consumer what ?
KafkaListener is a Spring specific concept that wraps the Kafka Consumer API.
There is no way to get a message by a particular offset given the key. You must calculate the partition, then scan the entire offset
I wish to describe the following scenario:
I have a node.js backend application (It uses a single thread event loop).
This is the general architecture of the system:
Producer -> Kafka -> Consumer -> Database
Let's say that the producer sends a message to Kafka, and the purpose of this message is the make a certain query in database and retrieve the query result.
However, as we all know Kafka is an asynchronous system. If the producer sends a message to Kafka, it gets a response that the message has been accepted by a Kafka broker. Kafka broker doesn't wait until the consumer polls the message and processes it.
In this case, how can the producer get the query result operated on the database?
The flow using Kafka will look like this:
The only way of the Producer A be aware of what happened with the message consumed by the Consumer A is producing another message. Which will be handled accordingly by any other consumer available (in this case, Consumer B).
As you already mentioned, this flow is asynchronous. This can be useful when you have a very heavy processing on your query, like a report generation or something like that, and the second producer will notify an user inbox for example.
If that is not the case, perhaps you should use HTTP, which is synchronous and you will have the response at the end of processing.
You must generate new flow for communicate the query result:
Consumer (now its a producer) -> Kafka topic -> Producer (now its a consumer)
You should consider using another synchronous communication mechanism like HTTP.
During creating kafka producer, we can assign a client id. What is it used for? Can I get the producer client id in a consumer? For example, to see which producer produced the message?
No, a consumer cannot get the producer's client-id.
From the Kaka documentation, client-ids are:
An id string to pass to the server when making requests. The purpose
of this is to be able to track the source of requests beyond just
ip/port by allowing a logical application name to be included in
server-side request logging.
They are only used for identifying clients in the broker logs.
No, you'd have to pass it on as part of the key or value if you need it at the consumer side.
Kafka's philosophy is to decouple producers and consumers. A topic can be read by 0-n consumers and be written to by 0-n producers. Kafka is usually used for communication between (micro)service boundaries where services don't care about who produced a message, just about its contents.
I have one kafka producer and consumer.The kafka producer is publishing to one topic and the data is taken and some processing is done. The kafka consumer is reading from another topic about whether the processing of data from topic 1 was successful or not ie topic 2 has success or failure messages.Now Iam starting my consumer and then publishing the data to topic 1 .I want to make the producer and consumer synchronous ie once the producer publishes the data the consumer should read the success or failure message for that data and then the producer should proceed with the next set of data .
Apache Kafka and Publish/Subscribe messaging in general seeks to de-couple producers and consumers through the use of streaming async events. What you are describing is more like a batch job or a synchronous Remote Procedure Call (RPC) where the Producer and Consumer are explicitly coupled together. The standard Apache Kafka Producers/Consumer APIs do not support this Message Exchange Pattern but you can always write your own simple wrapper on top of the Kafka API's that uses Correlation IDs, Consumption ACKs, and Request/Response messages to make your own interface that behaves as you wish.
Short Answer : You can't do that, Kafka doesn't provide that support.
Long Answer: As Hans explained, Publish/Subscribe messaging model keeps Publish and subscribe completely unaware of each other and I believe that is where the power of this model lies. Producer can produce without worrying about if there is any consumer and consumer can consume without worrying about how many producers are there.
The closest you can do is, you can make your producer synchronous. Which means you can wait till your message is received and acknowledged by broker.
if you want to do that, flush after every send.