We have a Kafka cluster with 3 brokers (version 1.1.0) that had been running well for over 6 months.
After 2018/12/12 we increased the partition count from 3 to 48 for every topic, and since then the brokers have shut down every 5-10 days.
We then upgraded the brokers from 1.1.0 to 2.1.0, but they still keep shutting down every 5-10 days.
Each time, one broker shuts down after the following error log; several minutes later, the other 2 brokers shut down too, with the same error but for other partitions' log files.
[2019-01-11 17:16:36,572] INFO [ProducerStateManager partition=__transaction_state-11] Writing producer snapshot at offset 807760 (kafka.log.ProducerStateManager)
[2019-01-11 17:16:36,572] INFO [Log partition=__transaction_state-11, dir=/kafka/logs] Rolled new log segment at offset 807760 in 4 ms. (kafka.log.Log)
[2019-01-11 17:16:46,150] WARN Resetting first dirty offset of __transaction_state-35 to log start offset 194404 since the checkpointed offset 194345 is invalid. (kafka.log.LogCleanerManager$)
[2019-01-11 17:16:46,239] ERROR Failed to clean up log for __transaction_state-11 in dir /kafka/logs due to IOException (kafka.server.LogDirFailureChannel)
java.nio.file.NoSuchFileException: /kafka/logs/__transaction_state-11/00000000000000807727.log
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409)
at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
at java.nio.file.Files.move(Files.java:1395)
at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:809)
at org.apache.kafka.common.record.FileRecords.renameTo(FileRecords.java:222)
at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:488)
at kafka.log.Log.asyncDeleteSegment(Log.scala:1838)
at kafka.log.Log.$anonfun$replaceSegments$6(Log.scala:1901)
at kafka.log.Log.$anonfun$replaceSegments$6$adapted(Log.scala:1896)
at scala.collection.immutable.List.foreach(List.scala:388)
at kafka.log.Log.replaceSegments(Log.scala:1896)
at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:583)
at kafka.log.Cleaner.$anonfun$doClean$6(LogCleaner.scala:515)
at kafka.log.Cleaner.$anonfun$doClean$6$adapted(LogCleaner.scala:514)
at scala.collection.immutable.List.foreach(List.scala:388)
at kafka.log.Cleaner.doClean(LogCleaner.scala:514)
at kafka.log.Cleaner.clean(LogCleaner.scala:492)
at kafka.log.LogCleaner$CleanerThread.cleanLog(LogCleaner.scala:353)
at kafka.log.LogCleaner$CleanerThread.cleanFilthiestLog(LogCleaner.scala:319)
at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:300)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
Suppressed: java.nio.file.NoSuchFileException: /kafka/logs/__transaction_state-11/00000000000000807727.log -> /kafka/logs/__transaction_state-11/00000000000000807727.log.deleted
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:396)
at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
at java.nio.file.Files.move(Files.java:1395)
at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:806)
... 17 more
[2019-01-11 17:16:46,245] INFO [ReplicaManager broker=2] Stopping serving replicas in dir /kafka/logs (kafka.server.ReplicaManager)
[2019-01-11 17:16:46,314] INFO Stopping serving logs in dir /kafka/logs (kafka.log.LogManager)
[2019-01-11 17:16:46,326] ERROR Shutdown broker because all log dirs in /kafka/logs have failed (kafka.log.LogManager)
If you have not changed the log.retention.bytes, log.retention.hours, log.retention.minutes, or log.retention.ms configs, Kafka tries to delete logs after 7 days (the default). So, based on the exception, Kafka wants to clean up the file /kafka/logs/__transaction_state-11/00000000000000807727.log, but there is no such file in the Kafka log directory, and it throws an exception which causes the broker to shut down.
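For reference, one way to check which retention settings are actually in effect (a sketch assuming a stock install layout and a ZooKeeper reachable at localhost:2181; adjust for your environment):

# Broker-level retention defaults live in server.properties:
grep -E 'log\.retention\.(bytes|hours|minutes|ms)' config/server.properties
# Per-topic overrides, if any, can be listed with kafka-configs.sh:
bin/kafka-configs.sh --zookeeper localhost:2181 --describe \
  --entity-type topics --entity-name __transaction_state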
If you are able to shut down the cluster and ZooKeeper, do so, and clean up /kafka/logs/__transaction_state-11 manually.
Note: I don't know whether it is harmful or not, but you can follow the posts on safely removing a Kafka topic.
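In practice, the manual cleanup looks roughly like this (a sketch only; these are the stock Kafka scripts and the log dir from the error above, so adapt them to how your hosts are managed):

bin/kafka-server-stop.sh                      # stop every broker first
bin/zookeeper-server-stop.sh                  # then stop ZooKeeper
rm -rf /kafka/logs/__transaction_state-11     # remove the affected partition dir
bin/zookeeper-server-start.sh -daemon config/zookeeper.properties
bin/kafka-server-start.sh -daemon config/server.properties

Since __transaction_state is replicated, the broker should re-fetch the removed partition from the other replicas once it rejoins.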
Related
I am currently using the Confluent Platform community license. I started Zookeeper, Kafka and the schema-registry - all running in local mode. However, when starting the schema-registry for the first time, 50 messages are sent and stored inside the __consumer_offsets topic (__consumer_offsets-0 to __consumer_offsets-49). Those messages are stored in the kafka-logs, and when I try to start the services again, it fails. To be more precise: Zookeeper works, but Kafka fails with the error:
"ERROR Shutdown broker because all log dirs have failed".
As suggested in some other posts, I deleted the data directory referenced in the zookeeper.properties file (dataDir) and the log.dirs directory referenced in the server.properties file. After doing this I can start Kafka again without any error - but the 50 messages are stored in __consumer_offsets again when starting the schema-registry, and after stopping Kafka and trying to start it again, it fails with the same error.
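For reference, the cleanup described above amounts to something like this (a sketch; the real paths come from the dataDir and log.dirs entries in your own properties files, and the /tmp locations are only the common defaults):

# Find the two directories (paths relative to the confluent-6.0.0 install dir):
grep '^dataDir' etc/kafka/zookeeper.properties
grep '^log.dirs' etc/kafka/server.properties
# With all services stopped, remove both:
rm -rf /tmp/zookeeper /tmp/kafka-logs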
Any help is greatly appreciated. :)
EDIT:
Above that error there's another error saying:
"ERROR Failed to clean up log for _schemas-0 in dir /mnt/c/Users/Username/Desktop/Big_Data/confluent-6.0.0/kafka-logs due to IOException (kafka.server.LogDirFailureChannel) java.io.IOException: Invalid argument"
and also two warnings:
"WARN [ReplicaManager broker=0] Stopping serving replicas in dir /mnt/c/Users/Username/Desktop/Big_Data/confluent-6.0.0/kafka-logs (kafka.server.ReplicaManager)"
and
"WARN [ReplicaManager broker=0] Broker 0 stopped fetcher for partitions __consumer_offsets-22, ... (all of the 50 offsets are then listed)"
I am seeing the following exception in one of the broker log files.
Setup: 3 brokers.
I am OK with removing the files in the C:\tmp directory; however, I am a little curious why this broker got into this state.
log4j:ERROR Failed to rename [C:\confluent-5.5.0/logs/log-cleaner.log] to [C:\confluent-5.5.0/logs/log-cleaner.log.2020-06-18-09].
[2020-06-18 14:10:41,361] ERROR Failed to clean up log for __consumer_offsets-10 in dir C:\tmp\kafka-logs-3 due to IOException (kafka.server.LogDirFailureChannel)
java.nio.file.FileSystemException: C:\tmp\kafka-logs-3\__consumer_offsets-10\00000000000000000000.timeindex.cleaned -> C:\tmp\kafka-logs-3\__consumer_offsets-10\00000000000000000000.timeindex.swap: The process cannot access the file because it is being used by another process.
at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86)
at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:387)
at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:287)
at java.nio.file.Files.move(Files.java:1395)
at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:834)
at kafka.log.AbstractIndex.renameTo(AbstractIndex.scala:207)
at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:497)
at kafka.log.Log.$anonfun$replaceSegments$4(Log.scala:2288)
at kafka.log.Log.$anonfun$replaceSegments$4$adapted(Log.scala:2288)
at scala.collection.immutable.List.foreach(List.scala:392)
at kafka.log.Log.replaceSegments(Log.scala:2288)
at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:605)
at kafka.log.Cleaner.$anonfun$doClean$6(LogCleaner.scala:530)
at kafka.log.Cleaner.doClean(LogCleaner.scala:529)
at kafka.log.Cleaner.clean(LogCleaner.scala:503)
at kafka.log.LogCleaner$CleanerThread.cleanLog(LogCleaner.scala:372)
at kafka.log.LogCleaner$CleanerThread.cleanFilthiestLog(LogCleaner.scala:345)
at kafka.log.LogCleaner$CleanerThread.tryCleanFilthiestLog(LogCleaner.scala:325)
at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:314)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)
Suppressed: java.nio.file.FileSystemException: C:\tmp\kafka-logs-3\__consumer_offsets-10\00000000000000000000.timeindex.cleaned -> C:\tmp\kafka-logs-3\__consumer_offsets-10\00000000000000000000.timeindex.swap: The process cannot access the file because it is being used by another process.
at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86)
at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:301)
at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:287)
at java.nio.file.Files.move(Files.java:1395)
at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:831)
... 15 more
[2020-06-18 14:10:41,441] WARN [ReplicaManager broker=3] Stopping serving replicas in dir C:\tmp\kafka-logs-3 (kafka.server.ReplicaManager)
[2020-06-18 14:10:41,445] INFO [ReplicaFetcherManager on broker 3] Removed fetcher for partitions Set(__consumer_offsets-22, __consumer_offsets-4, stock-prices-2, __consumer_offsets-7, __consumer_offsets-46, stock-prices-1, __consumer_offsets-25, __consumer_offsets-49, __consumer_offsets-16, __consumer_offsets-28, __consumer_offsets-31, __consumer_offsets-37, stock-prices-0, __consumer_offsets-19, stock_topic-0, __consumer_offsets-13, __consumer_offsets-43, __consumer_offsets-1, __consumer_offsets-34, __consumer_offsets-10, __consumer_offsets-40) (kafka.server.ReplicaFetcherManager)
[2020-06-18 14:10:41,448] INFO [ReplicaAlterLogDirsManager on broker 3] Removed fetcher for partitions Set(__consumer_offsets-22, __consumer_offsets-4, stock-prices-2, __consumer_offsets-7, __consumer_offsets-46, stock-prices-1, __consumer_offsets-25, __consumer_offsets-49, __consumer_offsets-16, __consumer_offsets-28, __consumer_offsets-31, __consumer_offsets-37, stock-prices-0, __consumer_offsets-19, stock_topic-0, __consumer_offsets-13, __consumer_offsets-43, __consumer_offsets-1, __consumer_offsets-34, __consumer_offsets-10, __consumer_offsets-40) (kafka.server.ReplicaAlterLogDirsManager)
[2020-06-18 14:10:41,492] WARN [ReplicaManager broker=3] Broker 3 stopped fetcher for partitions __consumer_offsets-22,__consumer_offsets-4,stock-prices-2,__consumer_offsets-7,__consumer_offsets-46,stock-prices-1,__consumer_offsets-25,__consumer_offsets-49,__consumer_offsets-16,__consumer_offsets-28,__consumer_offsets-31,__consumer_offsets-37,stock-prices-0,__consumer_offsets-19,stock_topic-0,__consumer_offsets-13,__consumer_offsets-43,__consumer_offsets-1,__consumer_offsets-34,__consumer_offsets-10,__consumer_offsets-40 and stopped moving logs for partitions because they are in the failed log directory C:\tmp\kafka-logs-3. (kafka.server.ReplicaManager)
[2020-06-18 14:10:41,494] WARN Stopping serving logs in dir C:\tmp\kafka-logs-3 (kafka.log.LogManager)
[2020-06-18 14:10:41,576] ERROR Shutdown broker because all log dirs in C:\tmp\kafka-logs-3 have failed (kafka.log.LogManager)
[2019-10-29 10:09:36,903] INFO [ReplicaManager broker=0] Broker 0 stopped fetcher for partitions __consumer_offsets-30,__consumer_offsets-8,__consumer_offsets-4,__consumer_offsets-27,__consumer_offsets-7,__consumer_offsets-46,__consumer_offsets-33,__consumer_offsets-23,__consumer_offsets-49,__consumer_offsets-36,__consumer_offsets-42,topic-0,__consumer_offsets-17,__consumer_offsets-48,__consumer_offsets-11,__consumer_offsets-14,__consumer_offsets-20,__consumer_offsets-0,__consumer_offsets-39,__consumer_offsets-45,__consumer_offsets-1,__consumer_offsets-26,__consumer_offsets-29,__consumer_offsets-10 and stopped moving logs for partitions because they are in the failed log directory C:\tmp\kafka-logs. (kafka.server.ReplicaManager)
[2019-10-29 10:09:36,908] INFO Stopping serving logs in dir C:\tmp\kafka-logs (kafka.log.LogManager)
[2019-10-29 10:09:36,952] ERROR Shutdown broker because all log dirs in C:\tmp\kafka-logs have failed (kafka.log.LogManager)
I have started Zookeeper, Kafka and the producer. But when I tried to consume data, this error came up immediately on Windows.
Command: .\bin\windows\kafka-console-consumer.bat --bootstrap-server localhost:9092 --topic topic
I had a similar issue and had to resort to trial and error. What I eventually did was disable the other JRE versions and leave only one enabled (see the attached image). This seems to have resolved my problem, since my broker doesn't crash anymore.
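A quick sanity check after disabling the extra JREs, to confirm which Java the broker will actually pick up (the same command works in cmd and bash):

java -version    # should now report only the JRE you left enabled

Also make sure JAVA_HOME, if set, points at that same JRE.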
Below is the error I am getting in the console while trying to start the Kafka server with the kafka-server-start command in the command prompt.
ERROR Error while creating log for kafka_example-0 in dir C:\Users\user11\Softwares\kafka_2.12-2.3.0\kafka_logs (kafka.server.LogDirFailureChannel)
java.io.IOException: The requested operation cannot be performed on a file with a user-mapped section open
at java.io.RandomAccessFile.setLength(Native Method)
at kafka.log.AbstractIndex.$anonfun$resize$1(AbstractIndex.scala:188)
at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
at kafka.log.AbstractIndex.resize(AbstractIndex.scala:174)
at kafka.log.AbstractIndex.$anonfun$trimToValidSize$1(AbstractIndex.scala:240)
at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
at kafka.log.AbstractIndex.trimToValidSize(AbstractIndex.scala:240)
INFO [ReplicaManager broker=0] Stopping serving replicas in dir C:\Users\user11\Softwares\kafka_2.12-2.3.0\kafka_logs (kafka.server.ReplicaManager)
ERROR [ReplicaManager broker=0] Error while making broker the leader for partition Topic: kafka_example; Partition: 0; Leader: None; AllReplicas: ; InSyncReplicas: in dir None (kafka.server.ReplicaManager)
org.apache.kafka.common.errors.KafkaStorageException: Error while creating log for kafka_example-0 in dir C:\Users\user11\Softwares\kafka_2.12-2.3.0\kafka_logs
Caused by: java.io.IOException: The requested operation cannot be performed on a file with a user-mapped section open
at java.io.RandomAccessFile.setLength(Native Method)
at kafka.log.AbstractIndex.$anonfun$resize$1(AbstractIndex.scala:188)
at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
at kafka.log.AbstractIndex.resize(AbstractIndex.scala:174)
at kafka.log.AbstractIndex.$anonfun$trimToValidSize$1(AbstractIndex.scala:240)
at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
at kafka.log.AbstractIndex.trimToValidSize(AbstractIndex.scala:240)
at kafka.log.LogSegment.recover(LogSegment.scala:397)
at kafka.log.Log.recoverSegment(Log.scala:493)
at kafka.log.Log.recoverLog(Log.scala:608)
at kafka.log.Log.$anonfun$loadSegments$3(Log.scala:568)
INFO Replica loaded for partition --from-beginning-0 with initial high watermark 0 (kafka.cluster.Replica)
INFO [Partition --from-beginning-0 broker=0] --from-beginning-0 starts at Leader Epoch 0 from offset 0. Previous Leader Epoch was: -1 (kafka.cluster.Partition)
INFO [GroupMetadataManager brokerId=0] Scheduling loading of offsets and group metadata from __consumer_offsets-0 (kafka.coordinator.group.GroupMetadataManager)
INFO [ReplicaFetcherManager on broker 0] Removed fetcher for partitions Set(__consumer_offsets-0, --from-beginning-0, Kafka_Example-0) (kafka.server.ReplicaFetcherManager)
INFO [ReplicaAlterLogDirsManager on broker 0] Removed fetcher for partitions Set(__consumer_offsets-0, --from-beginning-0, Kafka_Example-0) (kafka.server.ReplicaAlterLogDirsManager)
INFO [ReplicaManager broker=0] Broker 0 stopped fetcher for partitions __consumer_offsets-0,--from-beginning-0,Kafka_Example-0 and stopped moving logs for partitions because they are in the failed log directory C:\Users\user11\Softwares\kafka_2.12-2.3.0\kafka_logs. (kafka.server.ReplicaManager)
INFO Stopping serving logs in dir C:\Users\user11\Softwares\kafka_2.12-2.3.0\kafka_logs (kafka.log.LogManager)
ERROR Shutdown broker because all log dirs in C:\Users\user11\Softwares\kafka_2.12-2.3.0\kafka_logs have failed (kafka.log.LogManager)
On my local machine I am using Java 8; the Kafka version is the one mentioned in the subject above.
I have a 3-node Zookeeper cluster (version 3.4.11) and a 2-node Kafka cluster (version 0.11.3). We wrote a producer that sends messages to specific topics and partitions of the Kafka cluster (I have done this before, and the producer is tested). Here are the broker configs:
broker.id=1
listeners=PLAINTEXT://node1:9092
num.partitions=24
delete.topic.enable=true
default.replication.factor=2
log.dirs=/data
zookeeper.connect=zoo1:2181,zoo2:2181,zoo3:2181
log.retention.hours=168
zookeeper.session.timeout.ms=40000
zookeeper.connection.timeout.ms=10000
offsets.topic.replication.factor=2
transaction.state.log.replication.factor=2
transaction.state.log.min.isr=2
In the beginning, there are no topics on the brokers; they are created automatically. When I start the producer, the Kafka cluster shows strange behavior:
1- It creates all the topics, but while the rate of producing data is only 10KB per second, in less than one minute the log directory of each broker goes from zero to 9.0 gigabytes of data, and all brokers shut down (because the log dir runs out of capacity)!
2- Just as data production starts, I try to consume data using the console consumer, and it just errors:
WARN Error while fetching metadata with correlation id 2 : {Topic1=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
3- Here is the error which repeatedly appears in the brokers' logs:
INFO Updated PartitionLeaderEpoch. New: {epoch:0, offset:0}, Current: {epoch:-1, offset-1} for Partition: Topic6-6. Cache now contains 0 entries. (kafka.server.epoch.LeaderEpochFileCache)
WARN Newly rolled segment file 00000000000000000000.log already exists; deleting it first (kafka.log.Log)
WARN Newly rolled segment file 00000000000000000000.index already exists; deleting it first (kafka.log.Log)
WARN Newly rolled segment file 00000000000000000000.timeindex already exists; deleting it first (kafka.log.Log)
ERROR [Replica Manager on Broker 1]: Error processing append operation on partition Topic6-6 (kafka.server.ReplicaManager)
kafka.common.KafkaException: Trying to roll a new log segment for topic partition Topic6-6 with start offset 0 while it already exists.
After many repetitions of the above log entries, we have:
ERROR [ReplicaManager broker=1] Error processing append operation on partition Topic24-10 (kafka.server.ReplicaManager)
org.apache.kafka.common.errors.InvalidOffsetException: Attempt to append an offset (402) to position 5 no larger than the last offset appended (402)
And at the end (when there is no space left in the log dir) it errors:
FATAL [Replica Manager on Broker 1]: Error writing to highwatermark file: (kafka.server.ReplicaManager)
java.io.FileNotFoundException: /data/replication-offset-checkpoint.tmp (No space left on device)
and shuts down!
4- I set up a new single-node Kafka (version 0.11.3) on another machine, and it works well with the same producer and the same Zookeeper cluster.
5- I turned one of the two Kafka brokers off; using just one broker (of the cluster), it behaves the same as the two-node Kafka cluster.
What is the problem?
UPDATE 1: I tried Kafka version 2.1.0, but got the same result!
UPDATE 2: I found the root of the problem. While producing, I create 25 topics, each of which has 24 partitions. Surprisingly, each topic, just after creation (using the kafka-topics.sh command, and before any data is stored), occupies 481MB of space! For example, in the log directory of topic "20", each partition directory contains the following files, totaling 21MB:
00000000000000000000.index (10MB)
00000000000000000000.log (0MB)
00000000000000000000.timeindex (10MB)
leader-epoch-checkpoint (4KB)
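For reference, these numbers can be measured on the broker like this (a sketch; /data is the log.dirs value from the config above, and topic20-* matches that topic's partition dirs):

du -sh /data/topic20-*                # each partition dir: ~21MB
du -csh /data/topic20-* | tail -1     # whole topic: ~481MB across 24 partitions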
Kafka also writes the following lines for each topic-partition in the server.log file:
[2019-02-05 10:10:54,957] INFO [Log partition=topic20-14, dir=/data] Loading producer state till offset 0 with message format version 2 (kafka.log.Log)
[2019-02-05 10:10:54,957] INFO [Log partition=topic20-14, dir=/data] Completed load of log with 1 segments, log start offset 0 and log end offset 0 in 1 ms (kafka.log.Log)
[2019-02-05 10:10:54,958] INFO Created log for partition topic20-14 in /data with properties {compression.type -> producer, message.format.version -> 2.1-IV2, file.delete.delay.ms -> 60000, max.message.bytes -> 1000012, min.compaction.lag.ms -> 0, message.timestamp.type -> CreateTime, message.downconversion.enable -> true, min.insync.replicas -> 1, segment.jitter.ms -> 0, preallocate -> false, min.cleanable.dirty.ratio -> 0.5, index.interval.bytes -> 4096, unclean.leader.election.enable -> false, retention.bytes -> -1, delete.retention.ms -> 86400000, cleanup.policy -> [delete], flush.ms -> 9223372036854775807, segment.ms -> 604800000, segment.bytes -> 1073741824, retention.ms -> 604800000, message.timestamp.difference.max.ms -> 9223372036854775807, segment.index.bytes -> 10485760, flush.messages -> 9223372036854775807}. (kafka.log.LogManager)
[2019-02-05 10:10:54,958] INFO [Partition topic20-14 broker=0] No checkpointed highwatermark is found for partition topic20-14 (kafka.cluster.Partition)
[2019-02-05 10:10:54,958] INFO Replica loaded for partition topic20-14 with initial high watermark 0 (kafka.cluster.Replica)
[2019-02-05 10:10:54,958] INFO [Partition topic20-14 broker=0] topic20-14 starts at Leader Epoch 0 from offset 0. Previous Leader Epoch was: -1 (kafka.cluster.Partition)
There is no error in the server log. I can even consume data if I produce to the topic. Since the total log directory space is 10GB and Kafka needs 12025MB (25 × 481MB) for the 25 topics in my scenario, which is greater than the total directory space, Kafka errors out and shuts down!
Just for a test, I set up another Kafka broker (namely Broker2) using the same Zookeeper cluster, and creating a new topic with 24 partitions there occupies just 100KB for all the empty partitions!
So I'm really confused! Broker1 and Broker2 run the same version of Kafka (0.11.3); just the OS and file system differ:
In the case of Broker1 (481MB occupied for the new topic):
OS: CentOS 7, file system: XFS
In the case of Broker2 (100KB occupied for a new topic):
OS: Ubuntu 16.04, file system: ext4
Why does Kafka preallocate 21MB for each partition?
This is normal behavior: the preallocated size of the indices is controlled by the server property segment.index.bytes, whose default value is 10485760 bytes (10MB). That's why the indices in each partition directory allocate 10MB each:
00000000000000000000.index (10MB)
00000000000000000000.log (0MB)
00000000000000000000.timeindex (10MB)
leader-epoch-checkpoint (4KB)
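If the preallocation itself is a problem, the size can be lowered, broker-wide or per topic. A sketch (the 1MB value and the topic name are just examples, not recommendations):

# Broker-wide default, in server.properties:
#   segment.index.bytes=1048576
# Per-topic override via kafka-configs.sh:
bin/kafka-configs.sh --zookeeper zoo1:2181 --alter \
  --entity-type topics --entity-name topic20 \
  --add-config segment.index.bytes=1048576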
On the other hand, the Kafka documentation says about that property:
We preallocate this index file and shrink it only after log rolls.
But in my case, it never shrank the indices. After a lot of searching, I found out that some builds of Java 8 (update 192 in my case) have a bug when working with many small files, which is fixed in update 202. So I updated my Java version to update 202, and that solved the problem.
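For completeness, the Java version can be confirmed before and after the update (the exact build string will differ per install):

java -version
# Builds before 1.8.0_202 (such as 1.8.0_192 here) show the small-files bug
# described above; 1.8.0_202 and later do not.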