Getting "Failure when accessing a lock file: java.io.IOException" with Artemis 2.17.0 and openjdk-1.8.0.302 - activemq-artemis

2022-03-01 01:28:16,860 WARN [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager] Failure when accessing a lock file: java.io.IOException: The system cannot find the file specified
at sun.nio.ch.FileDispatcherImpl.lock0(Native Method) [rt.jar:1.8.0_302]
at sun.nio.ch.FileDispatcherImpl.lock(FileDispatcherImpl.java:107) [rt.jar:1.8.0_302]
at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:1114) [rt.jar:1.8.0_302]
at java.nio.channels.FileChannel.tryLock(FileChannel.java:1155) [rt.jar:1.8.0_302]
at org.apache.activemq.artemis.core.server.impl.FileLockNodeManager.tryLock(FileLockNodeManager.java:389) [artemis-server-2.17.0.jar:2.17.0]
at org.apache.activemq.artemis.core.server.impl.FileLockNodeManager.lock(FileLockNodeManager.java:408) [artemis-server-2.17.0.jar:2.17.0]
at org.apache.activemq.artemis.core.server.impl.FileLockNodeManager.getState(FileLockNodeManager.java:349) [artemis-server-2.17.0.jar:2.17.0]
at org.apache.activemq.artemis.core.server.impl.FileLockNodeManager.access$500(FileLockNodeManager.java:36) [artemis-server-2.17.0.jar:2.17.0]
at org.apache.activemq.artemis.core.server.impl.FileLockNodeManager$MonitorLock.run(FileLockNodeManager.java:516) [artemis-server-2.17.0.jar:2.17.0]
at org.apache.activemq.artemis.core.server.ActiveMQScheduledComponent.runForExecutor(ActiveMQScheduledComponent.java:313) [artemis-commons-2.17.0.jar:2.17.0]
at org.apache.activemq.artemis.core.server.ActiveMQScheduledComponent.bookedRunForScheduler(ActiveMQScheduledComponent.java:328) [artemis-commons-2.17.0.jar:2.17.0]
at org.apache.activemq.artemis.core.server.ActiveMQScheduledComponent.runForScheduler(ActiveMQScheduledComponent.java:339) [artemis-commons-2.17.0.jar:2.17.0]
at org.apache.activemq.artemis.core.server.ActiveMQScheduledComponent.lambda$start$0(ActiveMQScheduledComponent.java:166) [artemis-commons-2.17.0.jar:2.17.0]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_302]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [rt.jar:1.8.0_302]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [rt.jar:1.8.0_302]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [rt.jar:1.8.0_302]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [rt.jar:1.8.0_302]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [rt.jar:1.8.0_302]
at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118) [artemis-commons-2.17.0.jar:2.17.0]
After getting the above error artemis server is going down with java.io.IOException: An unexpected network error occurred:
2022-03-01 02:00:10,690 WARN [org.apache.activemq.artemis.core.server] AMQ222010: Critical IO Error, shutting down the server. file=NIOSequentialFile \\shared-path\activemq-bindings-2.bindings, message=An unexpected network error occurred: ActiveMQIOErrorException[errorType=IO_ERROR message=An unexpected network error occurred]
at org.apache.activemq.artemis.core.io.nio.NIOSequentialFile.internalWrite(NIOSequentialFile.java:410) [artemis-journal-2.17.0.jar:2.17.0]
at org.apache.activemq.artemis.core.io.nio.NIOSequentialFile.writeDirect(NIOSequentialFile.java:374) [artemis-journal-2.17.0.jar:2.17.0]
at org.apache.activemq.artemis.core.io.AbstractSequentialFile.write(AbstractSequentialFile.java:239) [artemis-journal-2.17.0.jar:2.17.0]
at org.apache.activemq.artemis.core.io.AbstractSequentialFile.write(AbstractSequentialFile.java:252) [artemis-journal-2.17.0.jar:2.17.0]
at org.apache.activemq.artemis.core.journal.impl.JournalImpl.appendRecord(JournalImpl.java:2939) [artemis-journal-2.17.0.jar:2.17.0]
at org.apache.activemq.artemis.core.journal.impl.JournalImpl.access$100(JournalImpl.java:92) [artemis-journal-2.17.0.jar:2.17.0]
at org.apache.activemq.artemis.core.journal.impl.JournalImpl$7.run(JournalImpl.java:1255) [artemis-journal-2.17.0.jar:2.17.0]
at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42) [artemis-commons-2.17.0.jar:2.17.0]
at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31) [artemis-commons-2.17.0.jar:2.17.0]
at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:65) [artemis-commons-2.17.0.jar:2.17.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [rt.jar:1.8.0_302]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [rt.jar:1.8.0_302]
at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118) [artemis-commons-2.17.0.jar:2.17.0]
Caused by: java.io.IOException: An unexpected network error occurred
at sun.nio.ch.FileDispatcherImpl.write0(Native Method) [rt.jar:1.8.0_302]
at sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:75) [rt.jar:1.8.0_302]
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) [rt.jar:1.8.0_302]
at sun.nio.ch.IOUtil.write(IOUtil.java:51) [rt.jar:1.8.0_302]
at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:211) [rt.jar:1.8.0_302]
at org.apache.activemq.artemis.core.io.nio.NIOSequentialFile.doInternalWrite(NIOSequentialFile.java:434) [artemis-journal-2.17.0.jar:2.17.0]
at org.apache.activemq.artemis.core.io.nio.NIOSequentialFile.internalWrite(NIOSequentialFile.java:406) [artemis-journal-2.17.0.jar:2.17.0]
... 12 more

These exceptions are coming from the underlying java.nio.channels.FileChannel provided by the JVM. There's nothing that ActiveMQ Artemis can do about them except report the failures and respond accordingly. From what I can tell everything is working as designed from the broker's point of view. Disk related failures like the ones you are experiencing are generally considered to be "critical" which means the broker will shut itself down when it encounters such an error.
Since the java.io.IOException is reporting "An unexpected network error occurred" it looks like you're using a network-based filesystem (e.g. NFS). If so, I recommend you ensure the integrity of this filesystem and of the network that allows connectivity from the broker.

Related

How to enable automatic retries for Mongo Socket exception?

I'm running a spring boot gradle project. Spring boot version is 2.3.2.RELEASE, MongoDB version is 4.0.5.
Per below site, retryable writes are enabled by default from MongoDB 3.6+ https://docs.mongodb.com/manual/core/retryable-writes/
I saw the following exceptions in the project, MongoReadException and MongoWriteException while reading/saving to the mongo database. I would like to know if these exceptions are automatically retried starting Spring boot 2.3.2.RELEASE.
Below are the exceptions I wanted to retry,
Caused by: org.springframework.data.mongodb.UncategorizedMongoDbException: Exception sending message; nested exception is com.mongodb.MongoSocketWriteException: Exception sending message
Caused by: org.springframework.data.mongodb.UncategorizedMongoDbException: Prematurely reached end of stream; nested exception is com.mongodb.MongoSocketReadException: Prematurely reached end of stream
To try it out, I created a spring boot demo app in my local that tries to save a document to a database whose uri is provided with invalid value in the properties file. I get MongoTimeoutException on save document which is as expected. However, I don't see any retries on the save operation.
I also tried to add the property retryWrites=true to the connection uri, it didn't work
spring.data.mongodb.uri=mongodb://XXX:XXXX#invalid-server:5555/dummyDB?&retrywrites=true
Does anyone know how to enable the retries ? How do I test it locally ?
Please help. Thanks in advance.
Below is the exception stacktrace with MongoSocketOpenException. I would like to test if this exception is auto retried, however, it's not.
nested exception is org.springframework.dao.DataAccessResourceFailureException: Timed out after 30000 ms while waiting to connect. Client view of cluster state is {type=UNKNOWN, servers=[{address=localhost:27017, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketOpenException: Exception opening socket}, caused by {java.net.ConnectException: Connection refused}}]; nested exception is com.mongodb.MongoTimeoutException: Timed out after 30000 ms while waiting to connect. Client view of cluster state is {type=UNKNOWN, servers=[{address=localhost:27017, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketOpenException: Exception opening socket}, caused by {java.net.ConnectException: Connection refused}}]] with root cause
com.mongodb.MongoTimeoutException: Timed out after 30000 ms while waiting to connect. Client view of cluster state is {type=UNKNOWN, servers=[{address=localhost:27017, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketOpenException: Exception opening socket}, caused by {java.net.ConnectException: Connection refused}}]
at com.mongodb.internal.connection.BaseCluster.getDescription(BaseCluster.java:177) ~[mongodb-driver-core-4.0.5.jar:na]
at com.mongodb.internal.connection.SingleServerCluster.getDescription(SingleServerCluster.java:41) ~[mongodb-driver-core-4.0.5.jar:na]
at com.mongodb.client.internal.MongoClientDelegate.getConnectedClusterDescription(MongoClientDelegate.java:147) ~[mongodb-driver-sync-4.0.5.jar:na]
at com.mongodb.client.internal.MongoClientDelegate.createClientSession(MongoClientDelegate.java:98) ~[mongodb-driver-sync-4.0.5.jar:na]
at com.mongodb.client.internal.MongoClientDelegate$DelegateOperationExecutor.getClientSession(MongoClientDelegate.java:278) ~[mongodb-driver-sync-4.0.5.jar:na]
at com.mongodb.client.internal.MongoClientDelegate$DelegateOperationExecutor.execute(MongoClientDelegate.java:202) ~[mongodb-driver-sync-4.0.5.jar:na]
at com.mongodb.client.internal.MongoCollectionImpl.executeSingleWriteRequest(MongoCollectionImpl.java:1008) ~[mongodb-driver-sync-4.0.5.jar:na]
at com.mongodb.client.internal.MongoCollectionImpl.executeInsertOne(MongoCollectionImpl.java:469) ~[mongodb-driver-sync-4.0.5.jar:na]
at com.mongodb.client.internal.MongoCollectionImpl.insertOne(MongoCollectionImpl.java:452) ~[mongodb-driver-sync-4.0.5.jar:na]
at com.mongodb.client.internal.MongoCollectionImpl.insertOne(MongoCollectionImpl.java:446) ~[mongodb-driver-sync-4.0.5.jar:na]
at org.springframework.data.mongodb.core.MongoTemplate.lambda$insertDocument$15(MongoTemplate.java:1442) ~[spring-data-mongodb-3.0.2.RELEASE.jar:3.0.2.RELEASE]
at org.springframework.data.mongodb.core.MongoTemplate.execute(MongoTemplate.java:566) ~[spring-data-mongodb-3.0.2.RELEASE.jar:3.0.2.RELEASE]
at org.springframework.data.mongodb.core.MongoTemplate.insertDocument(MongoTemplate.java:1436) ~[spring-data-mongodb-3.0.2.RELEASE.jar:3.0.2.RELEASE]
at org.springframework.data.mongodb.core.MongoTemplate.doInsert(MongoTemplate.java:1236) ~[spring-data-mongodb-3.0.2.RELEASE.jar:3.0.2.RELEASE]
at org.springframework.data.mongodb.core.MongoTemplate.insert(MongoTemplate.java:1168) ~[spring-data-mongodb-3.0.2.RELEASE.jar:3.0.2.RELEASE]
at org.springframework.data.mongodb.repository.support.SimpleMongoRepository.save(SimpleMongoRepository.java:85) ~[spring-data-mongodb-3.0.2.RELEASE.jar:3.0.2.RELEASE]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_101]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_101]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_101]
at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_101]
at org.springframework.data.repository.core.support.ImplementationInvocationMetadata.invoke(ImplementationInvocationMetadata.java:72) ~[spring-data-commons-2.3.2.RELEASE.jar:2.3.2.RELEASE]
at org.springframework.data.repository.core.support.RepositoryComposition$RepositoryFragments.invoke(RepositoryComposition.java:382) ~[spring-data-commons-2.3.2.RELEASE.jar:2.3.2.RELEASE]
at org.springframework.data.repository.core.support.RepositoryComposition.invoke(RepositoryComposition.java:205) ~[spring-data-commons-2.3.2.RELEASE.jar:2.3.2.RELEASE]
at org.springframework.data.repository.core.support.RepositoryFactorySupport$ImplementationMethodExecutionInterceptor.invoke(RepositoryFactorySupport.java:549) ~[spring-data-commons-2.3.2.RELEASE.jar:2.3.2.RELEASE]
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) ~[spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
at org.springframework.data.repository.core.support.QueryExecutorMethodInterceptor.doInvoke(QueryExecutorMethodInterceptor.java:155) ~[spring-data-commons-2.3.2.RELEASE.jar:2.3.2.RELEASE]
at org.springframework.data.repository.core.support.QueryExecutorMethodInterceptor.invoke(QueryExecutorMethodInterceptor.java:130) ~[spring-data-commons-2.3.2.RELEASE.jar:2.3.2.RELEASE]
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) ~[spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
at org.springframework.data.projection.DefaultMethodInvokingMethodInterceptor.invoke(DefaultMethodInvokingMethodInterceptor.java:80) ~[spring-data-commons-2.3.2.RELEASE.jar:2.3.2.RELEASE]
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) ~[spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:95) ~[spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) ~[spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212) ~[spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
at com.sun.proxy.$Proxy52.save(Unknown Source) ~[na:na]
Retries apply to operations like reads and writes.
Operations are only sent once the driver successfully connects to the server(s) you indicate in connection string/client configuration.
Your server is not accessible by the client therefore that connection isn't working therefore no operations can be performed therefore the concept of operation retries doesn't apply.
The monitoring is retried automatically, once you fix the connectivity issue between the client and the server (or start the server) the client will recognize this automatically.

NoClassDefFoundError: javax/sql/DataSource

Trying to run Keycloak 9 with RedHat SSO 7.4 (rh-sso-7/sso74-openshift-rhel8 image, version 7.4) and getting the following error:
15:40:43,022 ERROR [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0055: Caught exception during boot: org.jboss.as.controller.persistence.ConfigurationPersistenceException: WFLYCTL0085: Failed to parse configuration
at org.jboss.as.controller.persistence.XmlConfigurationPersister.load(XmlConfigurationPersister.java:143) [wildfly-controller-10.1.12.SP1-redhat-00001.jar:10.1.12.SP1-redhat-00001]
at org.jboss.as.server.ServerService.boot(ServerService.java:387) [wildfly-server-10.1.12.SP1-redhat-00001.jar:10.1.12.SP1-redhat-00001]
at org.jboss.as.controller.AbstractControllerService$1.run(AbstractControllerService.java:383) [wildfly-controller-10.1.12.SP1-redhat-00001.jar:10.1.12.SP1-redhat-00001]
at java.base/java.lang.Thread.run(Thread.java:834) [java.base:]
Caused by: javax.xml.stream.XMLStreamException: WFLYCTL0083: Failed to load module org.wildfly.extension.undertow
at org.jboss.as.controller.parsing.DeferredExtensionContext.load(DeferredExtensionContext.java:100) [wildfly-controller-10.1.12.SP1-redhat-00001.jar:10.1.12.SP1-redhat-00001]
at org.jboss.as.server.parsing.StandaloneXml_11.readServerElement(StandaloneXml_11.java:240) [wildfly-server-10.1.12.SP1-redhat-00001.jar:10.1.12.SP1-redhat-00001]
at org.jboss.as.server.parsing.StandaloneXml_11.readElement(StandaloneXml_11.java:140) [wildfly-server-10.1.12.SP1-redhat-00001.jar:10.1.12.SP1-redhat-00001]
at org.jboss.as.server.parsing.StandaloneXml.readElement(StandaloneXml.java:129) [wildfly-server-10.1.12.SP1-redhat-00001.jar:10.1.12.SP1-redhat-00001]
at org.jboss.as.server.parsing.StandaloneXml.readElement(StandaloneXml.java:52) [wildfly-server-10.1.12.SP1-redhat-00001.jar:10.1.12.SP1-redhat-00001]
at org.jboss.staxmapper.XMLMapperImpl.processNested(XMLMapperImpl.java:122) [staxmapper-1.3.0.Final-redhat-1.jar:1.3.0.Final-redhat-1]
at org.jboss.staxmapper.XMLMapperImpl.parseDocument(XMLMapperImpl.java:76) [staxmapper-1.3.0.Final-redhat-1.jar:1.3.0.Final-redhat-1]
at org.jboss.as.controller.persistence.XmlConfigurationPersister.load(XmlConfigurationPersister.java:126) [wildfly-controller-10.1.12.SP1-redhat-00001.jar:10.1.12.SP1-redhat-00001]
... 3 more
Caused by: java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: javax/sql/DataSource
at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122) [java.base:]
at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191) [java.base:]
at org.jboss.as.controller.parsing.DeferredExtensionContext.load(DeferredExtensionContext.java:92) [wildfly-controller-10.1.12.SP1-redhat-00001.jar:10.1.12.SP1-redhat-00001]
... 10 more
Caused by: java.lang.NoClassDefFoundError: javax/sql/DataSource
at org.jboss.as.clustering.controller.CommonUnaryRequirement.<clinit>(CommonUnaryRequirement.java:41)
at org.wildfly.extension.undertow.ApplicationSecurityDomainSingleSignOnDefinition$Attribute.<clinit>(ApplicationSecurityDomainSingleSignOnDefinition.java:63)
at org.wildfly.extension.undertow.UndertowSubsystemParser_9_0.<init>(UndertowSubsystemParser_9_0.java:357)
at org.jboss.as.controller.extension.ExtensionRegistry$ExtensionParsingContextImpl.attemptCurrentParserInitialization(ExtensionRegistry.java:508) [wildfly-controller-10.1.12.SP1-redhat-00001.jar:10.1.12.SP1-redhat-00001]
at org.jboss.as.controller.extension.ExtensionRegistry$ExtensionParsingContextImpl.access$200(ExtensionRegistry.java:434) [wildfly-controller-10.1.12.SP1-redhat-00001.jar:10.1.12.SP1-redhat-00001]
at org.jboss.as.controller.extension.ExtensionRegistry.initializeParsers(ExtensionRegistry.java:249) [wildfly-controller-10.1.12.SP1-redhat-00001.jar:10.1.12.SP1-redhat-00001]
at org.jboss.as.controller.parsing.DeferredExtensionContext.loadModule(DeferredExtensionContext.java:116) [wildfly-controller-10.1.12.SP1-redhat-00001.jar:10.1.12.SP1-redhat-00001]
at org.jboss.as.controller.parsing.DeferredExtensionContext.access$000(DeferredExtensionContext.java:44) [wildfly-controller-10.1.12.SP1-redhat-00001.jar:10.1.12.SP1-redhat-00001]
at org.jboss.as.controller.parsing.DeferredExtensionContext$1.call(DeferredExtensionContext.java:74) [wildfly-controller-10.1.12.SP1-redhat-00001.jar:10.1.12.SP1-redhat-00001]
at org.jboss.as.controller.parsing.DeferredExtensionContext$1.call(DeferredExtensionContext.java:71) [wildfly-controller-10.1.12.SP1-redhat-00001.jar:10.1.12.SP1-redhat-00001]
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) [java.base:]
at org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35) [jboss-threads-2.3.3.Final-redhat-00001.jar:2.3.3.Final-redhat-00001]
at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1982) [jboss-threads-2.3.3.Final-redhat-00001.jar:2.3.3.Final-redhat-00001]
at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486) [jboss-threads-2.3.3.Final-redhat-00001.jar:2.3.3.Final-redhat-00001]
at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1348) [jboss-threads-2.3.3.Final-redhat-00001.jar:2.3.3.Final-redhat-00001]
at java.base/java.lang.Thread.run(Thread.java:834) [java.base:]
at org.jboss.threads.JBossThread.run(JBossThread.java:485) [jboss-threads-2.3.3.Final-redhat-00001.jar:2.3.3.Final-redhat-00001]
Caused by: java.lang.ClassNotFoundException: javax.sql.DataSource from [Module "org.jboss.as.clustering.common" version 7.3.3.GA-redhat-00004 from local module loader #3fc79729 (finder: local module finder #34f6515b (roots: /opt/eap/modules,/opt/eap/modules/system/layers/openshift,/opt/eap/modules/system/layers/keycloak,/opt/eap/modules/system/layers/base/.overlays/jbeap20306,/opt/eap/modules/system/layers/base/.overlays/layer-base-jboss-eap-7.3.3.CP,/opt/eap/modules/system/layers/base))]
at org.jboss.modules.ModuleClassLoader.findClass(ModuleClassLoader.java:255) [jboss-modules.jar:1.10.0.Final-redhat-00001]
at org.jboss.modules.ConcurrentClassLoader.performLoadClassUnchecked(ConcurrentClassLoader.java:410) [jboss-modules.jar:1.10.0.Final-redhat-00001]
at org.jboss.modules.ConcurrentClassLoader.performLoadClass(ConcurrentClassLoader.java:398) [jboss-modules.jar:1.10.0.Final-redhat-00001]
at org.jboss.modules.ConcurrentClassLoader.loadClass(ConcurrentClassLoader.java:116) [jboss-modules.jar:1.10.0.Final-redhat-00001]
... 17 more
Please notice, that Java 11 is used.
I have also added the following param:
-Xbootclasspath/a:$JBOSS_HOME/jboss-modules.jar
Without this one, it was crashing on another error.

Getting so many Exceptions while running the kafka connect worker

Getting so many exceptions while running the Kafka Connect worker.
I have set all the worker properties and all the jar paths looks fine.
The exceptions are below:
2020-07-23 18:41:58 WARN Reflections:104 - could not create Dir
using jarFile from url
file:/kafka/bin/../clients/build/libs/kafka-clients*.jar. skipping.
java.lang.NullPointerException
at java.util.zip.ZipFile.<init>(ZipFile.java:213)
at java.util.zip.ZipFile.<init>(ZipFile.java:155)
at java.util.jar.JarFile.<init>(JarFile.java:166)
at java.util.jar.JarFile.<init>(JarFile.java:130)
at org.reflections.vfs.Vfs$DefaultUrlTypes$1.createDir(Vfs.java:216)
at org.reflections.vfs.Vfs.fromURL(Vfs.java:99)
at org.reflections.vfs.Vfs.fromURL(Vfs.java:91)
at org.reflections.Reflections.scan(Reflections.java:240)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader$InternalReflections.scan(DelegatingClassLoader.java:373)
at org.reflections.Reflections$1.run(Reflections.java:198)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-07-23 18:41:58 WARN Reflections:377 - could not create Vfs.Dir from url. ignoring the exception and continuing
org.reflections.ReflectionsException: Could not open url connection at org.reflections.vfs.JarInputDir$1$1.<init>(JarInputDir.java:37)
at org.reflections.vfs.JarInputDir$1.iterator(JarInputDir.java:33)
at org.reflections.Reflections.scan(Reflections.java:243)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader$InternalReflections.scan(DelegatingClassLoader.java:373)
at org.reflections.Reflections$1.run(Reflections.java:198)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: /kafka/bin/../clients/build/libs/kafka-clients*.jar (No such file or directory)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138) -
kafkaconnectsladev.log
If you are seeing WARN Reflections, then this is a warning, and not an error, and it is safe to ignore.
You can edit the log4j.properties file to silence the warnings, if you want. Using the Confluent Docker images, this is done via the CONNECT_LOG4J_LOGGERS variable
Thanks guys, yes these are all warnings and with newer version of the kafka client library, i am not seeing these.

Why would Spark Streaming application stall when consuming from Kafka on YARN?

I'm writing a Spark Streaming app in Scala. The goal of the app is to consume the latest records from Kafka and print them to stdout.
The app works perfectly when I run it locally using --master local[n]. However, when I run the app in YARN (and produce to the topic that I am consuming from), the app gets stuck at:
16/11/18 20:53:05 INFO JobScheduler: Added jobs for time 1479502385000 ms
After repeating the line above several times, Spark gives the following error:
16/11/18 20:54:47 WARN TaskSetManager: Lost task 0.0 in stage 9.0 (TID 9, r3d3.hadoop.REDACTED.REDACTED): java.net.ConnectException: Connection timed out
at sun.nio.ch.Net.connect0(Native Method)
at sun.nio.ch.Net.connect(Net.java:454)
at sun.nio.ch.Net.connect(Net.java:446)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648)
at kafka.network.BlockingChannel.connect(BlockingChannel.scala:57)
at kafka.consumer.SimpleConsumer.connect(SimpleConsumer.scala:44)
at kafka.consumer.SimpleConsumer.getOrMakeConnection(SimpleConsumer.scala:142)
at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:69)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:109)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:109)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:109)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:108)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:108)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:108)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:107)
at org.apache.spark.streaming.kafka.KafkaRDD$KafkaRDDIterator.fetchBatch(KafkaRDD.scala:150)
at org.apache.spark.streaming.kafka.KafkaRDD$KafkaRDDIterator.getNext(KafkaRDD.scala:162)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at org.apache.spark.util.NextIterator.foreach(NextIterator.scala:21)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at org.apache.spark.util.NextIterator.to(NextIterator.scala:21)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at org.apache.spark.util.NextIterator.toBuffer(NextIterator.scala:21)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at org.apache.spark.util.NextIterator.toArray(NextIterator.scala:21)
at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:927)
at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:927)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Error from the streaming UI:
org.apache.spark.streaming.dstream.DStream.print(DStream.scala:757)
com.REDACTED.bdp.Main$.main(Main.scala:88)
com.REDACTED.bdp.Main.main(Main.scala)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)
org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Errors from YARN application logs (stdout):
java.lang.NullPointerException
at org.apache.spark.streaming.kafka.KafkaRDD$KafkaRDDIterator.close(KafkaRDD.scala:158)
at org.apache.spark.util.NextIterator.closeIfNeeded(NextIterator.scala:66)
at org.apache.spark.streaming.kafka.KafkaRDD$KafkaRDDIterator$$anonfun$1.apply(KafkaRDD.scala:101)
at org.apache.spark.streaming.kafka.KafkaRDD$KafkaRDDIterator$$anonfun$1.apply(KafkaRDD.scala:101)
at org.apache.spark.TaskContextImpl$$anon$1.onTaskCompletion(TaskContextImpl.scala:60)
at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:79)
at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:77)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:77)
at org.apache.spark.scheduler.Task.run(Task.scala:91)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[2016-11-21 15:57:49,925] ERROR Exception in task 0.1 in stage 33.0 (TID 34) (org.apache.spark.executor.Executor)
org.apache.spark.util.TaskCompletionListenerException
at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:91)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Another error from YARN application logs:
[2016-11-21 15:52:32,264] WARN Exception encountered while connecting to the server : (org.apache.hadoop.ipc.Client)
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:375)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:558)
at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:373)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:727)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:723)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:722)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:373)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1493)
at org.apache.hadoop.ipc.Client.call(Client.java:1397)
at org.apache.hadoop.ipc.Client.call(Client.java:1358)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2116)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1315)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1311)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1311)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424)
at org.apache.spark.deploy.yarn.Client$.org$apache$spark$deploy$yarn$Client$$sparkJar(Client.scala:1195)
at org.apache.spark.deploy.yarn.Client$.populateClasspath(Client.scala:1333)
at org.apache.spark.deploy.yarn.ExecutorRunnable.prepareEnvironment(ExecutorRunnable.scala:290)
at org.apache.spark.deploy.yarn.ExecutorRunnable.env$lzycompute(ExecutorRunnable.scala:61)
at org.apache.spark.deploy.yarn.ExecutorRunnable.env(ExecutorRunnable.scala:61)
at org.apache.spark.deploy.yarn.ExecutorRunnable.startContainer(ExecutorRunnable.scala:80)
at org.apache.spark.deploy.yarn.ExecutorRunnable.run(ExecutorRunnable.scala:68)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
The weird part is that about 5% of the time, the app reads from Kafka successfully, for whatever reason.
The cluster and YARN seem to be working properly.
The cluster is secured using Kerberos.
What might be the source of this error?
tl;dr The answer does not offer an answer and merely suggests a possible next step.
My understanding of when the Lost task event could be reported for a streaming job is when the job was executed and it could not finish which in your case is the connection issue between a Spark executor and a Kafka broker.
16/11/18 20:54:47 WARN TaskSetManager: Lost task 0.0 in stage 9.0 (TID 9, r3d3.hadoop.REDACTED.REDACTED): java.net.ConnectException: Connection timed out
at sun.nio.ch.Net.connect0(Native Method)
at sun.nio.ch.Net.connect(Net.java:454)
at sun.nio.ch.Net.connect(Net.java:446)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648)
at kafka.network.BlockingChannel.connect(BlockingChannel.scala:57)
at kafka.consumer.SimpleConsumer.connect(SimpleConsumer.scala:44)
at kafka.consumer.SimpleConsumer.getOrMakeConnection(SimpleConsumer.scala:142)
at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:69)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:109)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:109)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:109)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:108)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:108)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:108)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:107)
at org.apache.spark.streaming.kafka.KafkaRDD$KafkaRDDIterator.fetchBatch(KafkaRDD.scala:150)
The pattern of the error message is as follows:
Lost task [id] in stage [taskSetId] (TID [tid], [host], executor [executorId]): [reason]
that translates to your case as having the Spark executor running on host r3d3.hadoop.REDACTED.REDACTED.
The reason for the failure is what follows which says:
java.net.ConnectException: Connection timed out
at sun.nio.ch.Net.connect0(Native Method)
at sun.nio.ch.Net.connect(Net.java:454)
at sun.nio.ch.Net.connect(Net.java:446)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648)
at kafka.network.BlockingChannel.connect(BlockingChannel.scala:57)
at kafka.consumer.SimpleConsumer.connect(SimpleConsumer.scala:44)
at kafka.consumer.SimpleConsumer.getOrMakeConnection(SimpleConsumer.scala:142)
at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:69)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:109)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:109)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:109)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:108)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:108)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:108)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:107)
And I would ask myself when could a Kafka broker be unavailable for a client (which in your case is a Spark Streaming application which may or may not contribute to understand the root cause of the issue).
I think it might be unrelated to Apache Spark and would look for more answers in Kafka circles.

kafka.common.KafkaStorageException: I/O exception in append to log

I get some big problems of kafka,when I shutdown my consumer application then change a groupId and restart it,my kafka brokers will stop working, this is the stack trace I get
[2016-07-11 17:02:47,314] INFO [Group Metadata Manager on Broker 0]: Loading offsets and group metadata from [__consumer_offsets,0] (kafka.coordinator.GroupMetadataManager)
[2016-07-11 17:02:47,955] FATAL [Replica Manager on Broker 0]: Halting due to unrecoverable I/O error while handling produce request: (kafka.server.ReplicaManager)
kafka.common.KafkaStorageException: I/O exception in append to log '__consumer_offsets-38'
at kafka.log.Log.append(Log.scala:318)
at kafka.cluster.Partition$$anonfun$9.apply(Partition.scala:442)
at kafka.cluster.Partition$$anonfun$9.apply(Partition.scala:428)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:268)
at kafka.cluster.Partition.appendMessagesToLeader(Partition.scala:428)
at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:401)
at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:386)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.Map$Map1.foreach(Map.scala:109)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:386)
at kafka.server.ReplicaManager.appendMessages(ReplicaManager.scala:322)
at kafka.coordinator.GroupMetadataManager.store(GroupMetadataManager.scala:228)
at kafka.coordinator.GroupCoordinator$$anonfun$handleCommitOffsets$9.apply(GroupCoordinator.scala:429)
at kafka.coordinator.GroupCoordinator$$anonfun$handleCommitOffsets$9.apply(GroupCoordinator.scala:429)
at scala.Option.foreach(Option.scala:236)
at kafka.coordinator.GroupCoordinator.handleCommitOffsets(GroupCoordinator.scala:429)
at kafka.server.KafkaApis.handleOffsetCommitRequest(KafkaApis.scala:280)
at kafka.server.KafkaApis.handle(KafkaApis.scala:76)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.FileNotFoundException: /tmp/kafka-logs/__consumer_offsets-38/00000000000000000000.index (No such file or directory)
at java.io.RandomAccessFile.open0(Native Method)
at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
at kafka.log.OffsetIndex$$anonfun$resize$1.apply(OffsetIndex.scala:277)
at kafka.log.OffsetIndex$$anonfun$resize$1.apply(OffsetIndex.scala:276)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
at kafka.log.OffsetIndex.resize(OffsetIndex.scala:276)
at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply$mcV$sp(OffsetIndex.scala:265)
at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply(OffsetIndex.scala:265)
at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply(OffsetIndex.scala:265)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
at kafka.log.OffsetIndex.trimToValidSize(OffsetIndex.scala:264)
Probably your /tmp is automatically cleaned up i.e. systemd-tmpfiles.
https://www.freedesktop.org/software/systemd/man/systemd-tmpfiles.html