How does Apache Kafka use open file descriptors?

I wanted to know how Kafka uses open file descriptors. Why is it recommended to allow a large number of open file descriptors? Does it impact producer and consumer throughput?

Brokers create and maintain file handles for each log segment file and each network connection. The total can be very large if the broker hosts many partitions and each partition has many log segment files. The same applies to network connections.
I don't immediately see any performance decline caused by setting a large file-max, but page cache misses do matter.

Kafka keeps one file descriptor open for every segment file, and it fails miserably if the limit is too low. I don't know if it affects consumer throughput, but I assume it doesn't since Kafka appears to ignore the limit until it is reached.
The number of segment files is the number of partitions multiplied by a factor that depends on the retention policy. The default retention policy is to start a new segment after one week (or 1GB, whichever occurs first) and to delete a segment when all data in it is more than one week old.
(Disclaimer: this answer is for Kafka 1.0, based on what I have learned from one installation.)

You can estimate it as follows.
The number gets large if a broker hosts many partitions. For example, a Kafka broker needs at least the following number of file descriptors just to track log segment files:
(number of partitions) * (partition size / segment size)
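As a rough illustration, the formula above can be turned into a small calculation. This is only a sketch: the function name and the example figures (1000 partitions, 10 GiB per partition, 1 GiB segments) are made up for illustration, and the result is a lower bound that ignores network connections and any other files the broker opens.

```python
def min_segment_file_descriptors(num_partitions, partition_bytes, segment_bytes):
    """Lower bound on descriptors needed just for log segment files.

    Implements (number of partitions) * (partition size / segment size),
    rounding the segment count per partition up to a whole file.
    """
    segments_per_partition = -(-partition_bytes // segment_bytes)  # ceiling division
    return num_partitions * segments_per_partition


GIB = 1024 ** 3

# Hypothetical broker: 1000 partitions, each retaining 10 GiB in 1 GiB segments.
estimate = min_segment_file_descriptors(1000, 10 * GIB, 1 * GIB)
print(estimate)
```

Comparing this estimate against the limit reported by `ulimit -n` for the broker process gives a quick sanity check.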

Related

Kafka PersistentWindowStore rebalancing mechanics

I am creating a 30-minute de-duplication store for a Kafka Streams application loosely based upon this confluent code (to solve a different problem to Kafka's exactly-once processing guarantee), and want to minimise topology startup time.
This code makes use of a persistent window store, which requires that I specify the number of log segments to make use of. Assuming I want to use 2 segments, and am using the default segment size of 1GB, does this mean that during rebalancing, the client will have to read 2GB of data before the application launches?
The segment parameter configures something different in Kafka Streams -- it's not related to segments in the brokers (just the same name).
With a windowed store, the retention time of the store is divided by the number of segments. If all data in a segment is older than the retention time, the complete segment is dropped and a new empty segment is created. Those segments exist only client-side.
The number of records that need to be restored depends only on the retention time (and your input data rate). It is independent of segment size. Segment size only defines how fine-grained the expiry of older records is.
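To make the description above concrete, here is a minimal sketch of how a retention period could be split into time-based segments and how a whole segment expires. It follows the wording of this answer only; the function names are hypothetical, and the actual Kafka Streams implementation may compute the segment interval differently.

```python
def segment_interval_ms(retention_ms, num_segments):
    # Assumption taken from the answer: retention time divided by segment count.
    return retention_ms // num_segments


def segment_id(timestamp_ms, interval_ms):
    # Each record lands in the segment that covers its timestamp.
    return timestamp_ms // interval_ms


def is_expired(seg_id, now_ms, retention_ms, interval_ms):
    # A segment is dropped only when even its newest possible record
    # is older than the retention time.
    segment_end_ms = (seg_id + 1) * interval_ms
    return segment_end_ms <= now_ms - retention_ms


def estimated_restore_records(retention_ms, records_per_second):
    # Restore volume depends on retention time and input rate, not segment size.
    return retention_ms // 1000 * records_per_second


retention = 30 * 60 * 1000                    # the 30-minute store from the question
interval = segment_interval_ms(retention, 2)  # 2 segments -> 15 minutes each
print(interval, estimated_restore_records(retention, 100))
```

With fewer segments, each one covers a longer time span, so old records linger longer before the whole segment is dropped.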

How is Apache Kafka offset generated?

I went through: How is the kafka offset value computed?
From the kafka documentation on replication:
The purpose of adding replication in Kafka is for stronger durability and higher availability. We want to guarantee that any successfully published message will not be lost and can be consumed, even when there are server failures. Such failures can be caused by machine error, program error, or more commonly, software upgrades.
From the kafka documentation on Efficiency:
The message log maintained by the broker is itself just a directory of files, each populated by a sequence of message sets that have been written to disk in the same format used by the producer and consumer. Maintaining this common format allows optimization of the most important operation: network transfer of persistent log chunks.
I did not see any details anywhere on how the offset is generated for a topic. Are offsets generated by a single machine in the cluster (in which case there is one master), or does Kafka have distributed logging that relies on some kind of clock synchronization and generates messages in a consistent order among all the nodes?
Any pointers or additional information will be helpful.
Offsets are not generated explicitly for each message, and messages also do not store their offset.
A topic consists of partitions, and messages are written to partitions in chunks called segments (on the file system, there is a folder for each topic, with subfolders for each partition -- a segment corresponds to a file within a partition's folder).
Furthermore, an index is maintained per partition and stored along with the segment files. It uses the offset of the first message of each segment as the key and points to the segment. For all subsequent messages within a segment, the offset of a message can be computed from its logical position within the segment (plus the offset of the first message).
If you start a new topic, or actually a new partition, a first segment is generated and its start offset, zero, is inserted into the index. Messages get written to the segment until it's full. Then a new segment is started and its start offset gets added to the index -- the start offset of the new segment can easily be computed as the start offset of the latest segment plus the number of messages within that segment.
Thus, for each partition, the broker that hosts it (i.e., the leader for this partition) tracks the offsets by maintaining the index. If segments are deleted because the retention time has passed, the segment file gets deleted and its entry in the index is removed.
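The scheme described above can be sketched as a toy in-memory model. This is not Kafka's code: the class name, the "segment is full after N messages" rule, and the method names are all assumptions made for illustration. It only shows how a per-partition index of segment start offsets is enough to derive every message's offset.

```python
from bisect import bisect_right


class PartitionLog:
    """Toy model: one leader tracks offsets through a list of segment start offsets."""

    def __init__(self, max_segment_messages):
        self.max_segment_messages = max_segment_messages
        self.base_offsets = [0]   # the index: start offset of each segment
        self.segments = [[]]      # messages per segment, oldest first

    def append(self, message):
        if len(self.segments[-1]) >= self.max_segment_messages:
            # New segment starts at: latest start offset + messages in that segment.
            next_base = self.base_offsets[-1] + len(self.segments[-1])
            self.base_offsets.append(next_base)
            self.segments.append([])
        offset = self.base_offsets[-1] + len(self.segments[-1])
        self.segments[-1].append(message)
        return offset

    def read(self, offset):
        # Find the last segment whose start offset is <= the requested offset.
        i = bisect_right(self.base_offsets, offset) - 1
        if i < 0:
            raise KeyError(offset)  # segment already deleted by retention
        position = offset - self.base_offsets[i]
        if position >= len(self.segments[i]):
            raise KeyError(offset)  # not written yet
        return self.segments[i][position]

    def delete_oldest_segment(self):
        # Retention: drop the segment and its index entry; never the active one.
        if len(self.segments) > 1:
            self.base_offsets.pop(0)
            self.segments.pop(0)


log = PartitionLog(max_segment_messages=3)
offsets = [log.append(f"m{i}") for i in range(7)]
print(offsets, log.base_offsets)
```

Because only the partition leader appends, no clock synchronization between machines is needed to keep offsets consistent.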

Kafka optimal retention and deletion policy

I am fairly new to kafka so forgive me if this question is trivial. I have a very simple setup for purposes of timing tests as follows:
Machine A -> writes to topic 1 (Broker) -> Machine B reads from topic 1
Machine B -> writes message just read to topic 2 (Broker) -> Machine A reads from topic 2
Now I am sending messages of roughly 1400 bytes in an infinite loop filling up the space on my small broker very quickly. I'm experimenting with setting different values for log.retention.ms, log.retention.bytes, log.segment.bytes and log.segment.delete.delay.ms. First I set all of the values to the minimum allowed, but it seemed this degraded performance, then I set them to the maximum my broker could take before being completely full, but again the performance degrades when a deletion occurs. Is there a best practice for setting these values to get the absolute minimum delay?
Thanks for the help!
Apache Kafka uses a log data structure to manage its messages. A log is basically an ordered set of segments, where a segment is a collection of messages. Kafka applies retention at the segment level rather than the message level, so it keeps removing segments from the oldest end of the log as they violate the retention policies.
Apache Kafka provides us with the following retention policies -
Time Based Retention
Under this policy, we configure the maximum time a segment (and hence its messages) can live. Once a segment has outlived the configured retention time, it is marked for deletion or compaction, depending on the configured cleanup policy. The default retention time for segments is 7 days.
Here are the parameters (in decreasing order of priority) that you can set in your Kafka broker properties file. The values below all express the same 7-day retention:
# Configures retention time in milliseconds
log.retention.ms=604800000
# Used if log.retention.ms is not set
log.retention.minutes=10080
# Used if log.retention.minutes is not set
log.retention.hours=168
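The priority order among these three settings can be sketched as a small lookup. The function name and the use of a plain dict for the broker properties are assumptions for illustration; only the property names and the 168-hour default come from Kafka.

```python
def effective_retention_ms(config):
    """Resolve time-based retention: ms beats minutes, minutes beat hours."""
    if "log.retention.ms" in config:
        return int(config["log.retention.ms"])
    if "log.retention.minutes" in config:
        return int(config["log.retention.minutes"]) * 60 * 1000
    # Fall back to hours; 168 hours (7 days) is the default.
    return int(config.get("log.retention.hours", 168)) * 60 * 60 * 1000


# Hypothetical broker properties: minutes is set, so hours is ignored.
props = {"log.retention.minutes": "10", "log.retention.hours": "1"}
print(effective_retention_ms(props))
```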
Size based Retention
Under this policy, we configure the maximum size of the log for a topic partition. Once the log reaches this size, Kafka starts removing segments from its oldest end. This policy is not popular because it does not give good visibility into when messages expire. However, it can come in handy when we need to control the size of a log because of limited disk space.
Here are the parameters that you can set in your Kafka broker properties file:
# Configures maximum size of a log
log.retention.bytes=104857600
So for your use case, you should configure log.retention.bytes so that your disk does not fill up.
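One way to pick a value is to work backwards from the disk budget. The helper below is a hedged sketch, not an official sizing rule: the function name, the headroom percentage, and the example numbers are all made up. It relies on two points: log.retention.bytes applies per partition, and retention removes whole segments, so the sketch assumes each partition can overshoot its limit by about one segment.

```python
def retention_bytes_per_partition(disk_bytes, num_partitions, segment_bytes,
                                  headroom_percent=20):
    """Suggest a log.retention.bytes value that should keep the disk from filling.

    disk_bytes       -- capacity of the log volume
    num_partitions   -- partition replicas stored on this broker
    segment_bytes    -- the configured log.segment.bytes
    headroom_percent -- share of the disk left free as a safety margin (assumption)
    """
    usable = disk_bytes * (100 - headroom_percent) // 100
    per_partition = usable // num_partitions
    # Leave room for one extra segment per partition (assumed overshoot).
    return max(per_partition - segment_bytes, 0)


GIB = 1024 ** 3
MIB = 1024 ** 2

# Hypothetical small broker: 10 GiB disk, two partitions (topic 1 and topic 2).
suggested = retention_bytes_per_partition(10 * GIB, 2, 100 * MIB)
print(f"log.retention.bytes={suggested}")
```

A smaller log.segment.bytes makes deletion more fine-grained, at the cost of more segment files.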

Partitions and Replications for the Apache Kafka

I have read the entire documentation on http://kafka.apache.org/ and was not able to understand the hardware requirements.
1) I need clarification on how many partitions and replicas are required to collect a minimum of 50GB of data per day for a single topic.
2) It is stated that the 0000000000000.log file is able to store up to 100GB of data. Is it possible to reduce this log file size to reduce I/O usage?
If the data is ingested uniformly over the entire day, you need to ingest something like 600 KB per second. It all depends on the number of messages in those 600 KB (according to Jay Kreps's explanation here, you need to allow for something like 22 bytes of overhead per message). Keep in mind that the way you ACK the messages from the producer is also very important.
But with 1 topic and 1 partition you should be able to get this throughput from a producer.
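The estimate above is simple arithmetic, sketched here for checking. The 22-byte overhead is the figure quoted in the answer; the 1000-byte message size is purely an assumption for the example, since the question does not state one, and 50GB is taken as 50 * 10^9 bytes.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def required_bytes_per_second(bytes_per_day, message_bytes, overhead_per_message=22):
    """Average ingest rate, including a fixed per-message overhead."""
    messages_per_day = bytes_per_day / message_bytes
    total_bytes = bytes_per_day + messages_per_day * overhead_per_message
    return total_bytes / SECONDS_PER_DAY


payload_rate = 50e9 / SECONDS_PER_DAY  # payload only, no overhead
with_overhead = required_bytes_per_second(50e9, message_bytes=1000)
print(round(payload_rate), round(with_overhead))
```

Smaller messages raise the relative overhead, which is why the message count matters as much as the byte volume.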
1. Check this link; it explains how to choose the number of partitions:
http://www.confluent.io/blog/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/
2. Yes, it is possible to change the maximum size of a log file in Kafka. You have to set the property below on each of the brokers and then restart the brokers.
log.segment.bytes=1073741824
The line above sets the log segment size to 1GB.

Need help to understand Kafka storage

I am new to Kafka. From this link: http://notes.stephenholiday.com/Kafka.pdf
It is mentioned:
"Every time a producer publishes a message to a partition, the broker
simply appends the message to the last segment file. For better
performance, we flush the segment files to disk only after a
configurable number of messages have been published or a certain
amount of time has elapsed. A message is only exposed to the consumers
after it is flushed."
Now my questions are:
What is a segment file here?
When I create a topic with partitions, each partition has an index file and a .log file.
Is this .log file the segment file? If so, it is already on disk, so why does it say "For better performance, we flush the segment files to disk"? If it is flushing to disk, where on the disk is it flushing to?
It seems that until a message is flushed to disk, it is not available to the consumer. That adds some latency to reading the message, but why?
I would also like help understanding whether a consumer reads data from disk (partition, segment file) or whether there is some cache mechanism. If so, how and when is data persisted into the cache?
I am not sure whether all of these questions are valid, but it would help my understanding if anybody could clear them up.
You can think of this segment file as living in the OS page cache.
Kafka has a very simple storage layout. Each partition of a topic
corresponds to a logical log. Physically, a log is implemented as a
set of segment files of equal sizes. Every time a producer publishes a
message to a partition, the broker simply appends the message to the
last segment file. The segment file is flushed to disk after a configurable
number of messages has been published or after a certain amount of time.
Messages are exposed to consumers after they have been flushed.
And also please refer to document below.
http://kafka.apache.org/documentation/#appvsosflush
Kafka always immediately writes all data to the filesystem and
supports the ability to configure the flush policy that controls when
data is forced out of the OS cache and onto disk using the flush. This
flush policy can be controlled to force data to disk after a period of
time or after a certain number of messages has been written. There are
several choices in this configuration.
Don't get confused when you see the word "filesystem" there: writes to the filesystem land in the OS page cache first. Also, the link you mentioned is very outdated.
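The difference between "written to the filesystem" and "forced to disk" can be illustrated with ordinary file I/O. This is a minimal sketch, not Kafka's implementation: the class name and the fsync counter are hypothetical, and the message-count threshold only mimics the kind of flush policy described in the quoted documentation.

```python
import os
import tempfile


class SegmentWriter:
    """Appends go to the OS immediately; fsync happens only every N messages."""

    def __init__(self, path, flush_interval_messages):
        self._file = open(path, "ab")
        self._flush_interval = flush_interval_messages
        self._unflushed = 0
        self.fsync_count = 0  # exposed only so the example can be inspected

    def append(self, message: bytes):
        self._file.write(message)
        self._file.flush()  # hand the bytes to the OS (page cache), not the disk
        self._unflushed += 1
        if self._unflushed >= self._flush_interval:
            os.fsync(self._file.fileno())  # force cached data onto the disk
            self.fsync_count += 1
            self._unflushed = 0

    def close(self):
        self._file.close()


with tempfile.TemporaryDirectory() as tmp:
    segment_path = os.path.join(tmp, "00000000000000000000.log")
    writer = SegmentWriter(segment_path, flush_interval_messages=3)
    for _ in range(7):
        writer.append(b"payload\n")
    writer.close()
    size_on_filesystem = os.path.getsize(segment_path)
    fsyncs = writer.fsync_count

print(size_on_filesystem, fsyncs)
```

All seven messages are visible in the file right away, even though only two forced flushes have happened. Readers served from the page cache do not have to wait for the fsync.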