Kafka - is a message available directly after the producer receives ACK?

Kafka - is a message available directly after the producer receives ACK? (assuming that acks=all is set in the current configuration)
Scenario:
the consumer seeks to the end of partition 1
my producer produces a message on the Kafka topic (partition 1)
the acks=all acknowledgement is completed
the consumer polls partition 1
Is it possible that the consumer will not receive that new message, assuming that the poll duration is relatively small? Or does acks=all guarantee that the data is available to read the moment the request has been processed?
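A minimal sketch of that scenario with the plain Java clients (broker address, topic name, group id and the poll timeout are placeholder assumptions, not part of the original question):

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.TopicPartition;

    public class AckAllScenario {
        public static void main(String[] args) throws Exception {
            Properties producerProps = new Properties();
            producerProps.put("bootstrap.servers", "localhost:9092");
            producerProps.put("acks", "all"); // wait for all in-sync replicas before acknowledging
            producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            Properties consumerProps = new Properties();
            consumerProps.put("bootstrap.servers", "localhost:9092");
            consumerProps.put("group.id", "scenario-group");
            consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            TopicPartition partition1 = new TopicPartition("my-topic", 1);

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps);
                 KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {

                // 1. consumer seeks to the end of partition 1
                consumer.assign(Collections.singletonList(partition1));
                consumer.seekToEnd(Collections.singletonList(partition1));

                // 2. + 3. produce to partition 1 and block until the acks=all acknowledgement arrives
                producer.send(new ProducerRecord<>("my-topic", 1, "key", "value")).get();

                // 4. poll partition 1 with a short timeout
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
                System.out.println("records returned: " + records.count());
            }
        }
    }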

Related

Does Kafka send a notification to all consumers that a new message has arrived?

Imagine there is 1 producer and 1000 consumers with the same group id (the producer's and the consumers' group ids are not the same).
When a message arrives and Kafka places it in the queue, does Kafka send a notification to the 1000 consumers that a new message has arrived (and after that, only one consumer takes the message)?
If not, how does a consumer know that a new message has arrived?
Does Kafka send a notification to all consumers that a new message has arrived?
Kafka works differently.
In the case you describe, all consumers would regularly try to fetch messages from the brokers. Thus, it's not necessary for the broker to send a notification, because the consumers proactively poll for new messages anyway.
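For illustration, a minimal poll loop with the plain Java consumer (broker address, topic and group id are placeholders). Every one of the 1000 consumers would run a loop like this; because the topic's partitions are divided among the members of the group, each record is processed by only one of them:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class PollingConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "my-group"); // all 1000 consumers share this group id
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("my-topic"));
                while (true) {
                    // No push notification from the broker: the consumer asks for new records itself.
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    records.forEach(r -> System.out.printf("partition %d, offset %d: %s%n",
                            r.partition(), r.offset(), r.value()));
                }
            }
        }
    }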

Consumer timeout during rebalance

When a consumer drops from a group and a rebalance is triggered, I understand no messages are consumed -
But does an in-flight request for messages stay queued past the max wait time?
Or does Kafka send any payload back during the rebalance?
UPDATE
For clarification, I'm referring specifically to the consumer polling process.
From my understanding, when one of the consumers drops from the consumer group, a rebalance of the partitions to consumers is performed.
During the rebalance, will an error be sent back to the consumer if it has already polled and is waiting for the max wait time to pass?
Or does Kafka wait the max time and send an empty payload?
Or does Kafka queue the request past the max wait time until the rebalance is complete?
Bottom line - I'm trying to explain periodic timeouts from consumers.
This may be in the docs, but I'm not sure where to find it.
Kafka producers don't send messages directly to their consumers; rather, they send them to the brokers.
In-flight requests correspond to the producer, not to the consumer.
Whether or not a consumer leaves the group and a rebalance is triggered is immaterial to the behaviour of the producer.
Producer messages are queued in the buffer, batched, optionally compressed and sent to the Kafka broker as per the configuration.
max.in.flight.requests.per.connection sets the maximum number of unacknowledged requests the client will send on a single connection before blocking.
Note that when we say ack, it is acknowledgement by the broker and not by the consumer.
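For reference, all of the settings mentioned above live in the producer configuration; a minimal sketch with illustrative values (broker address and values are assumptions, not taken from the question):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;

    // Inside some setup method; values are illustrative.
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("acks", "all");                                 // "ack" = acknowledgement by the broker
    props.put("max.in.flight.requests.per.connection", "5");  // unacknowledged requests per connection
    props.put("batch.size", "16384");                         // bytes buffered per partition before a batch is sent
    props.put("linger.ms", "5");                              // wait up to 5 ms for a batch to fill
    props.put("compression.type", "lz4");                     // optional compression
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    KafkaProducer<String, String> producer = new KafkaProducer<>(props);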
Does Kafka send any payload back during the rebalance?
The Kafka broker doesn't notify its producers of any rebalance.

Kafka - Message versus Record versus offset

I am new to streaming brokers [like Kafka], coming from queueing messaging systems [like JMS, RabbitMQ].
I read in the Kafka docs that messages are stored in Kafka partitions as records, each at an offset, and that the consumer reads from an offset.
What is the difference between a message and a record [do multiple/partial messages constitute a record?]
When the consumer reads from an offset, is there a possibility that the consumer reads a partial message? Is there a need for the consumer to stitch these partial messages together based on some logic?
OR
1 message = 1 record = 1 offset
EDIT1:
The question came up because the "batch size" decides how many bytes of messages should be published to the broker. Let's say there are 2 messages with message1 = 100 bytes and message2 = 200 bytes, and batch.size is set to 150 bytes. Does this mean 100 bytes from message1 and 50 bytes from message2 are sent to the broker at once? If yes, how are these 2 messages stored at offsets?
In Kafka, a Producer sends messages or records (both terms can be used interchangeably) to Topics. A topic is divided into one or more Partitions that are distributed among the Kafka Cluster, which is generally composed of at least three Brokers.
A message/record is sent to a leader partition (which is owned by a single broker) and associated with an Offset. An Offset is a monotonically increasing numerical identifier used to uniquely identify a record inside a topic/partition, e.g. the first message stored in a topic partition will have the offset 0, and so on.
Offsets are used both to identify the position of a message in a topic/partition as well as for the position of a Consumer Group.
For optimisation purposes, a producer will batch messages per partition. A batch is considered ready when either the configured batch.size or linger.ms is reached. For example, if you have batch.size set to 200KB and you send two messages (150KB and 100KB), they will potentially be part of the same batch. But the producer will never fragment a single message into chunks.
No, a consumer cannot read partial messages.
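A small sketch of that batching behaviour (broker address, topic name and payload sizes are illustrative, and String.repeat requires Java 11+): each send() call yields exactly one record with its own offset, however the producer groups whole records into batches on the wire.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.clients.producer.RecordMetadata;

    public class BatchingSketch {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("batch.size", "204800"); // ~200KB buffered per partition before a batch is considered full
            props.put("linger.ms", "10");      // or wait up to 10 ms for more records
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                String message1 = "a".repeat(150 * 1024); // ~150KB payload
                String message2 = "b".repeat(100 * 1024); // ~100KB payload

                RecordMetadata m1 = producer.send(new ProducerRecord<>("my-topic", message1)).get();
                RecordMetadata m2 = producer.send(new ProducerRecord<>("my-topic", message2)).get();

                // Batching never splits a message: each record keeps exactly one offset.
                System.out.println("message1 offset: " + m1.offset());
                System.out.println("message2 offset: " + m2.offset());
            }
        }
    }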

Retry to consume messages from Kafka topic

I'm working on a module that consumes messages from a Kafka topic and publishes them to a downstream system. When the downstream system is unavailable, the consumer does not acknowledge the Kafka message. Because of this, the offsets of messages received while the downstream system is unavailable will not be committed. But if I receive a new message after the downstream system comes back up and acknowledge that message, the latest offset will be committed, and the consumer will never receive the messages that were left in the topic without an offset commit.
i.e. Let's say my consumer has consumed up to offset 4. The consumer receives two messages while the downstream system is unavailable and therefore doesn't commit their offsets. So the number of messages in the topic is now 6, but the committed offset is still 4. Now the downstream system becomes available and the consumer receives a new message (the 7th message). Since there is no issue downstream, the consumer acknowledges the 7th message and the committed offset of the topic will be set to 7.
Is there any way my consumer can receive the 5th and 6th messages before it receives the 7th message? I use Spring Cloud Stream in the implementation.
See this answer.
You need a SeekToCurrentErrorHandler and to throw an exception so that the offsets are reset.
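A hedged sketch of that setup with Spring for Apache Kafka (spring-kafka 2.3+ is assumed for the BackOff constructor; with Spring Cloud Stream the same error handler can be applied via the binder's ListenerContainerCustomizer). The topic name, retry values and the downstream client below are hypothetical:

    // Inside a @Configuration class.
    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
            ConsumerFactory<String, String> consumerFactory) {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory);
        // Re-seek the unprocessed offsets and retry: here every second, up to 3 delivery attempts.
        factory.setErrorHandler(new SeekToCurrentErrorHandler(new FixedBackOff(1000L, 3L)));
        return factory;
    }

    // Listener: throwing makes the error handler re-seek, so offsets 5 and 6 are retried
    // before offset 7 is ever delivered.
    @KafkaListener(topics = "my-topic")
    public void listen(String message) {
        if (!downstreamClient.publish(message)) {            // hypothetical downstream client
            throw new IllegalStateException("downstream system unavailable");
        }
    }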

Offset and Partition - Kafka Sink Processor

We are using Kafka Streams' Topology with forward() to send a record to a Kafka topic.
Earlier we were using a separate producer to publish the message, and we were able to grab the offset and partition of the message. Now we want to replace it with context.forward.
How can we get the offset and partition of the record sent by the Kafka Sink Processor using context.forward?
Publish the message to the topic in producer.type=sync mode. When you call the send() method, it will return all the details you are looking for.
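As a hedged sketch of that answer with the current Java producer (producer.type=sync belongs to the legacy client; with the modern client the same effect comes from blocking on the Future returned by send()). Broker address and topic are placeholders:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.clients.producer.RecordMetadata;

    public class SyncSendSketch {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Blocking on get() makes the send synchronous and returns the record metadata.
                RecordMetadata metadata =
                        producer.send(new ProducerRecord<>("output-topic", "key", "value")).get();
                System.out.printf("partition=%d, offset=%d%n", metadata.partition(), metadata.offset());
            }
        }
    }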