Does Kafka guarantee message ordering within a single partition with ANY config param values? - apache-kafka

If I set these Kafka config params on the producer:
1. retries = 3
2. max.in.flight.requests.per.connection = 5
then it's likely that messages within one partition may not be in send order.
Does Kafka take any extra step to make sure that messages within a partition remain in send order,
OR
with the above configuration, is it possible to have out-of-order messages within a partition?

Unfortunately, no.
With your current configuration, there is a chance messages will arrive unordered because of your retries and max.in.flight.requests.per.connection settings.
With the retries config set to greater than 0, you will lose ordering in the following case (just an example with arbitrary numbers):
You send a message/batch to partition 0 which is located on broker 0, and brokers 1 and 2 are ISR.
Broker 0 fails, broker 1 becomes leader.
Your message/batch returns a failure and needs to be retried.
Meanwhile, you send next message/batch to partition 0 which is now known to be on broker 1, and this happens before your previous batch actually gets retried.
Message/batch 2 gets acknowledged (succeeds).
Message/batch 1 is resent and now gets acknowledged too.
Order lost.
I might be wrong, but in this case reordering can probably happen even with max.in.flight.requests.per.connection set to 1: you can lose message order during a broker failover, e.g. the next batch can be sent to the new leader before the previously failed batch figures out it should go to that broker too.
Regarding max.in.flight.requests.per.connection and retries being set together it's even simpler - if you have multiple unacknowledged requests to a broker, the first one to fail will arrive unordered.
However, please take into account this is only relevant to situations where a message/batch fails to acknowledge for some reason (sent to wrong broker, broker died etc.)
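For completeness, here is a minimal sketch (not part of the original answer) of a producer configuration that trades some throughput for ordering by allowing only one in-flight request; the broker address, topic name and key are placeholders:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class OrderedProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");       // placeholder broker address
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        props.put("retries", 3);                                 // still retry transient failures
        props.put("max.in.flight.requests.per.connection", 1);   // only one unacknowledged request,
                                                                  // so a retried batch cannot be overtaken
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my-topic", "key", "value"));
        }
    }
}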
Hope this helps

Related

Best way to configure retries in Kafka Producer

With in-sync replicas configured via acks=all and min.insync.replicas = N,
I want to understand how retries should be configured for producer records that fail to be written.
Example:
Kafka fails to write the record because only N-1 in-sync replicas were online during processing, while the configured minimum was N replicas.
What is acks?
The acks parameter controls how many partition replicas must receive the record before the producer can consider the write successful.
There are 3 values for the acks parameter:
acks=0, the producer will not wait for a reply from the broker before assuming the message was sent successfully.
acks=1, the producer will receive a successful response from the broker the moment the leader replica received the message. If the message can't be written to the leader, the producer will receive an error response and can retry.
acks=all, the producer will receive a successful response from the broker once all in-sync replicas received the message.
In your case, acks=all, which is the safest option since you can make sure more than one broker has the message.
Retries:
If the producer receives an error response, the value of the retries property comes into the picture. You can use the retry.backoff.ms property to configure the time between retries. It is recommended to test how long it takes to recover from a crashed broker and to set the number of retries and the delay between them so that the total amount of time spent retrying is longer than the time it takes the Kafka cluster to recover from the crash.
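As a rough sketch of that recommendation, assuming a hypothetical measured recovery time of 30 seconds (you would measure this for your own cluster):

import java.util.Properties;

public class RetrySizingExample {
    public static void main(String[] args) {
        // Assumed value: measure how long your cluster takes to recover from a crashed broker.
        long measuredRecoveryMs = 30_000L;
        long retryBackoffMs = 500L;
        // Pick retries so that retries * retry.backoff.ms exceeds the recovery time.
        int retries = (int) (measuredRecoveryMs / retryBackoffMs) + 1;

        Properties props = new Properties();
        props.put("acks", "all");
        props.put("retries", retries);
        props.put("retry.backoff.ms", retryBackoffMs);
        // ... pass props to a KafkaProducer as usual
    }
}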
Also, check the below link,
https://www.confluent.io/blog/hands-free-kafka-replication-a-lesson-in-operational-simplicity/
For the above scenario, you will get a NotEnoughReplicasException. At this stage, Kafka retries the message.
The default retries value is 0 for Kafka <= 2.0, and it is set to a very high value for Kafka >= 2.1.
Also, there is another setting named retry.backoff.ms. Its default value is 100ms. The Kafka producer will retry every 100ms to produce the message until it succeeds.
To avoid retrying an infinite number of times, you can set delivery.timeout.ms to make sure the producer only keeps retrying a message for that many milliseconds. Its default value is 120000ms, i.e. 2 minutes: the producer will not retry after 2 minutes and will consider the message failed.
Also, you might need to consider the max.in.flight.requests.per.connection setting to make sure retried messages are processed in sequence when your Kafka messages have a key in them.
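A hedged sketch of how these settings fit together; the broker address, topic name and key are assumptions, and the callback simply logs when the retry budget is exhausted:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.errors.NotEnoughReplicasException;
import org.apache.kafka.common.serialization.StringSerializer;

public class BoundedRetryProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");        // placeholder
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        props.put("acks", "all");
        props.put("retry.backoff.ms", 100);        // wait 100ms between retries (the default)
        props.put("delivery.timeout.ms", 120000);  // stop retrying after 2 minutes (the default)
        props.put("max.in.flight.requests.per.connection", 1);   // keep keyed messages in order across retries

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my-topic", "key", "value"), (metadata, exception) -> {
                if (exception instanceof NotEnoughReplicasException) {
                    // The ISR count stayed below min.insync.replicas until delivery.timeout.ms expired.
                    System.err.println("Gave up after delivery.timeout.ms: " + exception.getMessage());
                }
            });
        }
    }
}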

If the partition a Kafka producer tries to send messages to goes offline, can the producer try to send to a different partition?

My Kafka cluster has 5 brokers and the replication factor is 3 for topics. At some time some partitions went offline but eventually they went back online. My questions are:
Given that there were offline partitions, how many brokers must have been down? I think, given the cluster setup above, I can afford to lose 2 brokers at the same time. However, if 2 brokers were down, some partitions would no longer have quorum; will those partitions go offline in that case?
If there are offline partitions, and a Kafka producer tries to send messages to them and fails, will the producer try a different partition that may be online? The messages have no key in them.
Not sure if I understood your question completely right, but I have the impression that you are mixing up partitions and replications. Or at least, your question cannot be looked at in isolation on the producer side. As soon as one broker is down, some things will happen on the cluster.
Each TopicPartition has one partition leader, and your clients (e.g. producer and consumer) only communicate with this one leader, independent of the number of replications.
In the case where two out of five brokers are not available, Kafka will move the partition leader as well as the replicas to a healthy broker. In that scenario you should therefore not get into trouble, although it might take some time and retries for the new leader to be elected and the new replicas to be created on the healthy brokers. Leader election can happen quickly because you have a replication factor of three, so even if two brokers go down, one broker should still have the complete data (assuming all partitions were in sync). However, creating two new replicas could take some time depending on the amount of data. For that scenario you need to look into the topic-level configuration min.insync.replicas and the KafkaProducer configuration acks (see below).
I think the following are the most important configurations for your KafkaProducer to handle such situation:
bootstrap.servers: If you are anticipating regular connection problems with your brokers, you should ensure that you list all five of them. Although it is sufficient to mention only one address (as one broker will then communicate with all other brokers in the cluster), it is safer to have them all listed in case one or even two brokers are not available.
acks: This defaults to 1 and defines the number of acknowledgments the producer requires the partition leader to have received before considering a request as successful. Possible values are 0, 1 and all.
retries: This value defaults to 2147483647 and will cause the client to resend any record whose send fails with a potentially transient error, until the time of delivery.timeout.ms is reached.
delivery.timeout.ms: An upper bound on the time to report success or failure after a call to send() returns. This limits the total time that a record will be delayed prior to sending, the time to await acknowledgement from the broker (if expected), and the time allowed for retriable send failures. The producer may report failure to send a record earlier than this config if either an unrecoverable error is encountered, the retries have been exhausted, or the record is added to a batch which reached an earlier delivery expiration deadline. The value of this config should be greater than or equal to the sum of request.timeout.ms and linger.ms.
You will find more details in the documentation on the producer configs.
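Putting those configs together, a sketch of a producer setup along those lines (the broker host names are placeholders for your five brokers):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.common.serialization.StringSerializer;

public class ResilientProducerConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        // List all five brokers so the client can bootstrap even if one or two are down.
        props.put("bootstrap.servers",
                  "broker1:9092,broker2:9092,broker3:9092,broker4:9092,broker5:9092"); // placeholders
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        props.put("acks", "all");                  // wait for all in-sync replicas
        props.put("retries", Integer.MAX_VALUE);   // the default since Kafka 2.1
        props.put("delivery.timeout.ms", 120000);  // upper bound on retrying a single send()

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        // ... use the producer, then producer.close();
    }
}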

A Kafka record is acknowledged but no data is returned to the consumer

There is a Kafka (version 2.2.0) cluster of 3 nodes. One node becomes artificially unavailable (network disconnection). Then we have the following behaviour:
We send a record to a producer with the given topic-partition (to the specific partition, let's say #0).
We receive a record metadata from the producer what confirms that it has been acknowledged.
Immediately after that we poll a consumer assigned to the same topic-partition and an offset taken from the record's metadata. The poll timeout was set to 30 seconds. No data is returned (an empty set is returned).
This happens inconsistently from time to time (under described circumstances with one Kafka node failure).
Essentially my question is: should data be immediately available to consumers once it is acknowledged? If not, what is a reasonable timeout for that?
UPD: some configuration details:
number of partitions for the topic: 1
default replication factor: 3
sync replication factor: 2
acks for producer: all
The default setting of acks on the producer is 1. This means the producer waits for the acknowledgement from the leader replica only. If the leader dies right after acknowledging, the message won't be delivered.
Should data be immediately available for consumers? Yes, in general there should be very little lag by default; it should effectively be in the millisecond range without load.
If you want to make sure that a message can't be lost, you have to configure the producer with acks=all in addition to min.insync.replicas=2. This will make sure all in-sync replicas acknowledge the message, and that a minimum of 2 nodes do. So you are still allowed to lose one node and be fine. Lose 2 nodes and you won't be able to send, but even then messages won't be lost.
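As an illustration only, here is a rough sketch of the produce-then-read check described in the question, assuming acks=all on the producer and min.insync.replicas=2 set on the topic; the broker addresses, topic name and group id are made up:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProduceThenReadCheck {
    public static void main(String[] args) throws Exception {
        Properties p = new Properties();
        p.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092"); // placeholders
        p.put("key.serializer", StringSerializer.class.getName());
        p.put("value.serializer", StringSerializer.class.getName());
        p.put("acks", "all"); // with min.insync.replicas=2 on the topic, at least 2 nodes have the record

        TopicPartition tp = new TopicPartition("my-topic", 0);
        RecordMetadata meta;
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(p)) {
            // Block until the record is acknowledged and note the offset from its metadata.
            meta = producer.send(new ProducerRecord<>(tp.topic(), tp.partition(), "key", "value")).get();
        }

        Properties c = new Properties();
        c.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092"); // placeholders
        c.put("key.deserializer", StringDeserializer.class.getName());
        c.put("value.deserializer", StringDeserializer.class.getName());
        c.put("group.id", "check-group"); // hypothetical group id
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(c)) {
            consumer.assign(Collections.singletonList(tp));
            consumer.seek(tp, meta.offset());                  // start at the acknowledged offset
            System.out.println(consumer.poll(Duration.ofSeconds(30)).count());
        }
    }
}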

Kafka broker message loss scenario on leadership change

I am trying to understand the following message-loss behavior in Kafka. Briefly: one broker dies early on and, after some further message processing, all the other brokers die. If the broker that died first starts up again, it does not catch up with the other brokers once they come back up. Instead, all the other brokers report errors and reset their offsets to match the first broker. Is this behavior expected, and what changes/settings ensure zero message loss?
Kafka version: 2.11-0.10.2.0
Reproducible steps
Started 1 zookeeper instance and 3 kafka brokers
Created one topic with replication factor of 3 and partition of 3
Attached a kafka-console-consumer to topic
Used Kafka-console-producer to produce 2 messages
Killed two brokers (1&2)
Sent two messages
Killed last remaining broker (0)
Bring up broker (1) who had not seen the last two messages
Bring up broker (2) who had seen the last two messages and it shows an error
[2017-06-16 14:45:20,239] INFO Truncating log my-second-topic-1 to offset 1. (kafka.log.Log)
[2017-06-16 14:45:20,253] ERROR [ReplicaFetcherThread-0-1], Current offset 2 for partition [my-second-topic,1] out of range; reset offset to 1 (kafka.server.ReplicaFetcherThread)
Finally connect kafka-console-consumer and it sees two messages instead of the four that were published.
Response here : https://kafka.apache.org/documentation/#producerconfigs
The number of acknowledgments the producer requires the leader to have received before considering a request complete. This controls the durability of records that are sent. The following settings are allowed:
acks=0 If set to zero then the producer will not wait for any acknowledgment from the server at all. The record will be immediately added to the socket buffer and considered sent. No guarantee can be made that the server has received the record in this case, and the retries configuration will not take effect (as the client won't generally know of any failures). The offset given back for each record will always be set to -1.
acks=1 This will mean the leader will write the record to its local log but will respond without awaiting full acknowledgement from all followers. In this case should the leader fail immediately after acknowledging the record but before the followers have replicated it then the record will be lost.
acks=all This means the leader will wait for the full set of in-sync replicas to acknowledge the record. This guarantees that the record will not be lost as long as at least one in-sync replica remains alive. This is the strongest available guarantee. This is equivalent to the acks=-1 setting.
By default acks=1 so set it to 'all' :
acks=all in your producer.properties file
Check if unclean.leader.election.enable = true and, if so, set it to false so that only replicas that are in sync can become the leader. If an out-of-sync replica is allowed to become leader, messages can get truncated and lost.
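As a sketch of how you might check and change that setting at the topic level with the AdminClient (this API needs a considerably newer client and broker than the 0.10.2 in the question; on older versions you would set unclean.leader.election.enable=false in server.properties instead), with the broker address and topic name as placeholders:

import java.util.Collection;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class DisableUncleanElection {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder

        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my-second-topic");

            // Check the current value of unclean.leader.election.enable for the topic.
            Config current = admin.describeConfigs(Collections.singleton(topic)).all().get().get(topic);
            System.out.println(current.get("unclean.leader.election.enable").value());

            // Set it to false so an out-of-sync replica can never become leader.
            Collection<AlterConfigOp> ops = Collections.singletonList(
                    new AlterConfigOp(new ConfigEntry("unclean.leader.election.enable", "false"),
                                      AlterConfigOp.OpType.SET));
            admin.incrementalAlterConfigs(Collections.singletonMap(topic, ops)).all().get();
        }
    }
}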

How is ordering guaranteed during failures in Kafka Async Producer?

If I am using Kafka Async producer, assume there are X number of messages in buffer.
When they are actually processed on the client, and the broker or a specific partition is down for some time, the Kafka client will retry. If a message fails, will it mark that specific message as failed and move on to the next message (which could lead to out-of-order messages)? Or will it fail the remaining messages in the batch in order to preserve order?
I need to maintain the ordering, so I would ideally want Kafka to fail the batch from the point where it failed, so that I can retry from the failure point. How would I achieve that?
As the Kafka documentation says about retries:
Setting a value greater than zero will cause the client to resend any
record whose send fails with a potentially transient error. Note that
this retry is no different than if the client resent the record upon
receiving the error. Allowing retries will potentially change the
ordering of records because if two records are sent to a single
partition, and the first fails and is retried but the second succeeds,
then the second record may appear first.
So, answering your title question: no, Kafka doesn't have ordering guarantees under async sends.
I am updating the answer based on Peter Davis' question.
I think that if you want to send in batch mode, the only way to secure it would be to set max.in.flight.requests.per.connection=1, but as the documentation says:
Note that if this setting is set to be greater than 1 and there are
failed sends, there is a risk of message re-ordering due to retries
(i.e., if retries are enabled).
Starting with Kafka 0.11.0, there is the enable.idempotence setting, as documented.
enable.idempotence: When set to true, the producer will ensure that
exactly one copy of each message is written in the stream. If false,
producer retries due to broker failures, etc., may write duplicates of
the retried message in the stream. Note that enabling idempotence
requires max.in.flight.requests.per.connection to be less than or
equal to 5, retries to be greater than 0 and acks must be all. If
these values are not explicitly set by the user, suitable values will
be chosen. If incompatible values are set, a ConfigException will be
thrown.
Type: boolean Default: false
This will guarantee that messages are ordered and that no loss occurs for the duration of the producer session. Unfortunately, the producer cannot set the sequence id, so Kafka can make these guarantees only per producer session.
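For illustration, a minimal sketch of an idempotent producer along the lines of that documentation; the broker address, topic name and key are placeholders:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class IdempotentProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");       // placeholder
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        props.put("enable.idempotence", true);  // implies acks=all, retries > 0,
                                                // and max.in.flight.requests.per.connection <= 5
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Retried batches keep their sequence numbers, so ordering within a
            // partition is preserved even with several in-flight requests.
            producer.send(new ProducerRecord<>("my-topic", "key", "value"));
        }
    }
}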
Have a look at Apache Pulsar if you need to set the sequence id, which would allow you to use an external sequence id, which would guarantee ordered and exactly-once messaging across both broker and producer failovers.