Kafka Broker Not able to start - apache-kafka

I am having a 3 node Kafka Cluster. One of the broker is not starting, i am getting below error. I have tried deleting index files but still, same error coming. Please help to understand what is this issue and how can I recover.
INFO [2018-09-05 11:58:49,585] kafka.log.Log:[Logging$class:info:66] - [pool-4-thread-1] - [Log partition=Topic3-15, dir=/var/lib/kafka/kafka-logs] Completed load of log with 1 segments, log start offset 11547004 and log end offset 11559178 in 1552 ms
INFO [2018-09-05 11:58:49,589] kafka.log.Log:[Logging$class:info:66] - [pool-4-thread-1] - [Log partition=Topic3-13, dir=/var/lib/kafka/kafka-logs] Recovering unflushed segment 12399433
ERROR [2018-09-05 11:58:49,591] kafka.log.LogManager:[Logging$class:error:74] - [main] - There was an error in one of the threads during logs loading: java.lang.IllegalArgumentException: inconsistent range
WARN [2018-09-05 11:58:49,591] kafka.log.Log:[Logging$class:warn:70] - [pool-4-thread-1] - [Log partition=Topic3-35, dir=/var/lib/kafka/kafka-logs] Found a corrupted index file corresponding to log file /var/lib/kafka/kafka-logs/Topic3-35/00000000000011110038.log due to Corrupt time index found, time index file (/var/lib/kafka/kafka-logs/Topic3-35/00000000000011110038.timeindex) has non-zero size but the last timestamp is 0 which is less than the first timestamp 1536129815049}, recovering segment and rebuilding index files...
INFO [2018-09-05 11:58:49,594] kafka.log.ProducerStateManager:[Logging$class:info:66] - [pool-4-thread-1] - [ProducerStateManager partition=Topic3-35] Loading producer state from snapshot file '/var/lib/kafka/kafka-logs/Topic3-35/00000000000011110038.snapshot'
ERROR [2018-09-05 11:58:49,599] kafka.server.KafkaServer:[MarkerIgnoringBase:error:159] - [main] - [KafkaServer id=2] Fatal error during KafkaServer startup. Prepare to shutdown
java.lang.IllegalArgumentException: inconsistent range
at java.util.concurrent.ConcurrentSkipListMap$SubMap.(ConcurrentSkipListMap.java:2620)
at java.util.concurrent.ConcurrentSkipListMap.subMap(ConcurrentSkipListMap.java:2078)
at java.util.concurrent.ConcurrentSkipListMap.subMap(ConcurrentSkipListMap.java:2114)
at kafka.log.Log$$anonfun$12.apply(Log.scala:1561)
at kafka.log.Log$$anonfun$12.apply(Log.scala:1560)
at scala.Option.map(Option.scala:146)
at kafka.log.Log.logSegments(Log.scala:1560)
at kafka.log.Log.kafka$log$Log$$recoverSegment(Log.scala:358)
at kafka.log.Log.recoverLog(Log.scala:448)
at kafka.log.Log.loadSegments(Log.scala:421)
at kafka.log.Log.(Log.scala:216)
at kafka.log.Log$.apply(Log.scala:1747)
at kafka.log.LogManager.kafka$log$LogManager$$loadLog(LogManager.scala:255)
at kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$11$$anonfun$apply$15$$anonfun$apply$2.apply$mcV$sp(LogManager.scala:335)
at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:62)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
INFO [2018-09-05 11:58:49,606] kafka.server.KafkaServer:[Logging$class:info:66] - [main] - [KafkaServer id=2] shutting down

Related

To achieve Kafka-mule integration using Mule 4, I installed zookeeper&kafka on my machine. Both the servers connection goes off after sometime

I started by creating zookeeper and kafka servers respectively on my machine. The connection gets established for both the servers but goes off after a few mins, throwing some errors.
ZOOKEEPER ERROR LOG-
'''2020-05-11 14:32:36,908 [myid:] - INFO [SyncThread:0:FileTxnLog#284] - Creating new log file: log.1
2020-05-11 14:36:05,170 [myid:] - WARN [NIOWorkerThread-1:NIOServerCnxn#373] - Close of session 0x10000cc587b0003
java.io.IOException: An existing connection was forcibly closed by the remote host
at sun.nio.ch.SocketDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:43)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:197)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:324)
at org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522)
at org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-05-11 14:36:10,589 [myid:] - INFO [SessionTracker:ZooKeeperServer#600] - Expiring session 0x10000cc587b0003, timeout of 6000ms exceeded
'''
KAFKA ERROR LOG - Below are the kafka server error logs## Heading ##
'''[2020-05-11 14:33:41,077] INFO [KafkaServer id=0] started (kafka.server.KafkaServer)
[2020-05-11 14:36:04,762] ERROR Error while writing to checkpoint file C:\Kafka\Kafkakafka-logs\replication-offset-checkpoint (kafka.server.LogDirFailureChannel)
java.nio.file.FileAlreadyExistsException: C:\Kafka\Kafkakafka-logs\replication-offset-checkpoint.tmp -> C:\Kafka\Kafkakafka-logs\replication-offset-checkpoint
at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:81)
at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
.....
[2020-05-11 14:36:04,776] INFO [ReplicaManager broker=0] Stopping serving replicas in dir C:\Kafka\Kafkakafka-logs (kafka.server.ReplicaManager)
[2020-05-11 14:36:04,776] ERROR [ReplicaManager broker=0] Error while writing to highwatermark file in directory C:\Kafka\Kafkakafka-logs (kafka.server.ReplicaManager)
org.apache.kafka.common.errors.KafkaStorageException: Error while writing to checkpoint file C:\Kafka\Kafkakafka-logs\replication-offset-checkpoint
Caused by: java.nio.file.FileAlreadyExistsException: C:\Kafka\Kafkakafka-logs\replication-offset-checkpoint.tmp -> C:\Kafka\Kafkakafka-logs\replication-offset-checkpoint
at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:81)
'''

kafka_2.12-2.3.0 broker shutdown in windows 10

below is the error i am getting in the console while trying to start the kafka server with kafka-server-start command in command prompt .
ERROR Error while creating log for kafka_example-0 in dir
C:\Users\user11\Softwareskafka_2.12-2.3.0kafka_logs (kafka.server.LogDirFailureChannel)
java.io.IOException: The requested operation cannot be performed on a file with a user-mapped section open
at java.io.RandomAccessFile.setLength(Native Method)
at kafka.log.AbstractIndex.$anonfun$resize$1(AbstractIndex.scala:188)
at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
at kafka.log.AbstractIndex.resize(AbstractIndex.scala:174)
at kafka.log.AbstractIndex.$anonfun$trimToValidSize$1(AbstractIndex.scala:240)
at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
at kafka.log.AbstractIndex.trimToValidSize(AbstractIndex.scala:240)
INFO [ReplicaManager broker=0] Stopping serving replicas in dir
C:\Users\user11\Softwareskafka_2.12-2.3.0kafka_logs (kafka.server.ReplicaManager)
ERROR [ReplicaManager broker=0] Error while making broker the leader for partition Topic: kafka_example; Partition: 0; Leader: None; AllReplicas: ; InSyncReplicas: in dir None (kafka.server.ReplicaManager)
org.apache.kafka.common.errors.KafkaStorageException: Error while creating log for kafka_example-0 in dir C:\Users\user11\Softwareskafka_2.12-2.3.0kafka_logs
Caused by: java.io.IOException: The requested operation cannot be performed on a file with a user-mapped section open
at java.io.RandomAccessFile.setLength(Native Method)
at kafka.log.AbstractIndex.$anonfun$resize$1(AbstractIndex.scala:188)
at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
at kafka.log.AbstractIndex.resize(AbstractIndex.scala:174)
at kafka.log.AbstractIndex.$anonfun$trimToValidSize$1(AbstractIndex.scala:240)
at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
at kafka.log.AbstractIndex.trimToValidSize(AbstractIndex.scala:240)
at kafka.log.LogSegment.recover(LogSegment.scala:397)
at kafka.log.Log.recoverSegment(Log.scala:493)
at kafka.log.Log.recoverLog(Log.scala:608)
at kafka.log.Log.$anonfun$loadSegments$3(Log.scala:568)
INFO Replica loaded for partition --from-beginning-0 with initial high watermark 0 (kafka.cluster.Replica)
INFO [Partition --from-beginning-0 broker=0] --from-beginning-0 starts at Leader Epoch 0 from offset 0. Previous Leader Epoch was: -1 (kafka.cluster.Partition)
INFO [GroupMetadataManager brokerId=0] Scheduling loading of offsets and group metadata from __consumer_offsets-0 (kafka.coordinator.group.GroupMetadataManager)
INFO [ReplicaFetcherManager on broker 0] Removed fetcher for partitions Set(__consumer_offsets-0, --from-beginning-0, Kafka_Example-0) (kafka.server.ReplicaFetcherManager)
INFO [ReplicaAlterLogDirsManager on broker 0] Removed fetcher for partitions Set(__consumer_offsets-0, --from-beginning-0, Kafka_Example-0) (kafka.server.ReplicaAlterLogDirsManager)
INFO [ReplicaManager broker=0] Broker 0 stopped fetcher for partitions __consumer_offsets-0,--from-beginning-0,Kafka_Example-0 and stopped moving logs for partitions because they are in the failed log directory C:\Users\user11\Softwareskafka_2.12-2.3.0kafka_logs. (kafka.server.ReplicaManager)
INFO Stopping serving logs in dir C:\Users\user11\Softwareskafka_2.12-2.3.0kafka_logs (kafka.log.LogManager)
ERROR Shutdown broker because all log dirs in C:\Users\user11\Softwareskafka_2.12-2.3.0kafka_logs have failed (kafka.log.LogManager)
on my local using java 8 version and above subject mentioned is kafka version.

Kafka can not access the file because it is being used by another process

I have a problem installing Kafka on windows.
I installed kafka cluster of 3 instances in 3 differents servers (each server contain one kafka and zookeeper )
all work well, but when an instance kafka stop (or fail ) , when i try to restart this instance .
I have this error message:
`[2017-11-10 15:17:53,999] INFO [ThrottledRequestReaper-Fetch]: Starting
(kafka.server.ClientQuotaManager$ThrottledRequestReaper)
[2017-11-10 15:17:53,999] INFO [ThrottledRequestReaper-Produce]: Starting
(kafka.server.ClientQuotaManager$ThrottledRequestReaper)
[2017-11-10 15:17:53,999] INFO [ThrottledRequestReaper-Request]: Starting
(kafka.server.ClientQuotaManager$ThrottledRequestReaper)
[2017-11-10 15:17:54,109] INFO Loading logs. (kafka.log.LogManager)
[2017-11-10 15:17:54,171] WARN Found a corrupted index file due to
requirement failed: Corrupt index found, index file (C:\Tools\Kafka\kafka-
logs\hubone.dataCollect.orbiwise.ArchiveQueue-0\00000000000000000015.index)
has non-zero size but the last offset is 15 which is no larger than the base
offset 15.}. deleting C:\Tools\Kafka\kafka-
logs\hubone.dataCollect.orbiwise.Arch`iveQueue-
0\00000000000000000015.timeindex, C:\Tools\Kafka\kafka-
logs\hubone.dataCollect.orbiwise.ArchiveQueue-0\00000000000000000015.index,
and C:\Tools\Kafka\kafka-logs\hubone.dataCollect.orbiwise.ArchiveQueue-
0\00000000000000000015.txnindex and rebuilding index... (kafka.log.Log)
[2017-11-10 15:17:54,171] ERROR There was an error in one of the threads
**during logs loading: java.nio.file.FileSystemException:
C:\Tools\Kafka\kafka-logs\hubone.dataCollect.orbiwise.ArchiveQueue-
0\00000000000000000015.timeindex: The process cannot access the file because
it is being used by another process.**
(kafka.log.LogManager)
[2017-11-10 15:17:54,171] FATAL [Kafka Server 1], Fatal error during
KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
java.nio.file.FileSystemException: C:\Tools\Kafka\kafka-
logs\hubone.dataCollect.orbiwise.ArchiveQueue-
0\00000000000000000015.timeindex: The process cannot access the file because
it is being used by another process.
at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86)
at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102)
at sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269)
at sun.nio.fs.AbstractFileSystemProvider.deleteIfExists(AbstractFileSystemProvider.java:108)
at java.nio.file.Files.deleteIfExists(Files.java:1165)
at kafka.log.Log$$anonfun$loadSegmentFiles$3.apply(Log.scala:318)
at kafka.log.Log$$anonfun$loadSegmentFiles$3.apply(Log.scala:279)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
at kafka.log.Log.loadSegmentFiles(Log.scala:279)
at kafka.log.Log.loadSegments(Log.scala:383)
at kafka.log.Log.<init>(Log.scala:186)
at kafka.log.Log$.apply(Log.scala:1609)
at kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$5$$anonfun$apply$12$$anonfun$apply$1.apply$mcV$sp(LogManager.scala:172)
at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:57)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
when i delete kafka-logs , the offsets are corrupted .
could you help me to fix this problem ?
i opened an issue in kafka :
https://issues.apache.org/jira/browse/KAFKA-6200
I had a similar issue.
I simply deleted the files in ..\tmp**kafka-logs**\ directory.
Then i restarted the services and it worked like a charm.
In my case the issue was resolved after emptying the recycle bin on windows after deleting logs from tmp and log directories

Kafka Startup Error on topic delete

Kafka Version: 0.10.2.1 (Server)
Zookeeper: bundled with Kafka
Issue
If you delete the topic and then restart the broker - it fails. Broker is configured for delete.topic.enable=true
Delete Command:
./kafka-topics.sh --zookeeper localhost:2182 --delete --topic MY.TOPIC.NAME
Only way out for now is to go to log directory manually and remove the
topic dirs using rm-rf. Post that its ok.
Error:
[2017-06-09 12:24:43,359] ERROR There was an error in one of the threads during logs loading: java.lang.StringIndexOutOfBoundsException: String index out of range: -1 (kafka.log.LogManager)
[2017-06-09 12:24:43,360] FATAL [Kafka Server 101], Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
java.lang.StringIndexOutOfBoundsException: String index out of range: -1
at java.lang.String.substring(String.java:1967)
at kafka.log.Log$.parseTopicPartitionName(Log.scala:1146)
at kafka.log.LogManager.$anonfun$loadLogs$10(LogManager.scala:153)
at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:57)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[2017-06-09 12:24:43,363] INFO [Kafka Server 101], shutting down (kafka.server.KafkaServer)
If your topic indeed has the dot (".") in the name as shown, I believe you have been hit by this defect:
KAFKA-5232 Kafka broker fails to start if a topic containing dot in its name is marked for delete but hasn't been deleted during previous uptime

kafka 0.9.0.1 fails to start with fatal exception

I see deleting and rebuilding some index. Found that its expected in 0.9.0.1
but after that it fails saying unsafe memory access, any hints on this?
2016-03-16 22:14:01,113] WARN Found a corrupted index file, /kafka_data/kafkain-3655/00000000000000000000.index, deleting and rebuilding index... (kafka.log.Log)
[2016-03-16 22:14:01,137] WARN Found a corrupted index file, /kafka_data/kafkain-1172/00000000000000000000.index, deleting and rebuilding index... (kafka.log.Log)
[2016-03-16 22:14:01,151] WARN Found a corrupted index file, /kafka_data/kafkain-2362/00000000000000000000.index, deleting and rebuilding index... (kafka.log.Log)
[2016-03-16 22:14:01,152] ERROR There was an error in one of the threads during logs loading: java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code (kafka.log.LogManager)
[2016-03-16 22:14:01,154] FATAL Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
at java.io.RandomAccessFile.open0(Native Method)
at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
at kafka.log.OffsetIndex$$anonfun$resize$1.apply(OffsetIndex.scala:277)
at kafka.log.OffsetIndex$$anonfun$resize$1.apply(OffsetIndex.scala:276)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
at kafka.log.OffsetIndex.resize(OffsetIndex.scala:276)
at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply$mcV$sp(OffsetIndex.scala:265)
at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply(OffsetIndex.scala:265)
at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply(OffsetIndex.scala:265)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
at kafka.log.OffsetIndex.trimToValidSize(OffsetIndex.scala:264)
at kafka.log.LogSegment.recover(LogSegment.scala:199)
at kafka.log.Log$$anonfun$loadSegments$4.apply(Log.scala:188)
at kafka.log.Log$$anonfun$loadSegments$4.apply(Log.scala:160)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:778)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:777)
at kafka.log.Log.loadSegments(Log.scala:160)
at kafka.log.Log.<init>(Log.scala:90)
at kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$3$$anonfun$apply$10$$anonfun$apply$1.apply$mcV$sp(LogManager.scala:150)
at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:60)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[2016-03-16 22:14:01,158] INFO shutting down (kafka.server.KafkaServer)
This error could be due to the fact that the node is out of space in log.dirs. In itself, the removal and rebuilding of the index - it's not terrible, but if space is insufficient, the node can not be started. If replication factor allows it, you can simply remove the part of the log, then after they run normally all data replicated