I have a Storm cluster with 2 nodes and 1 ZooKeeper. One of the worker dies because of the following error. Does any one have an idea on why stormconf.ser file is getting deleted?
I am using 0.9.2 Storm and 3.4.6 ZK version.
o.a.c.f.s.ConnectionStateManager [INFO] State change: CONNECTED
2015-01-31 01:23:06 o.a.c.f.s.ConnectionStateManager [WARN] There are no ConnectionStateListeners registered.
2015-01-31 01:23:07 b.s.d.worker [ERROR] Error on initialization of server mk-worker
java.io.FileNotFoundException: File '/home/Programs/apache-storm-0.9.2-incubating/stormtmp/supervisor/stormdist/storm-topology-1-1422602934/stormconf.ser' does not exist
at org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:299) ~[commons-io-2.4.jar:2.4]
at org.apache.commons.io.FileUtils.readFileToByteArray(FileUtils.java:1763) ~[commons-io-2.4.jar:2.4]
at backtype.storm.config$read_supervisor_storm_conf.invoke(config.clj:212) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
at backtype.storm.daemon.worker$worker_data.invoke(worker.clj:180) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
at backtype.storm.daemon.worker$fn__5940$exec_fn__1396__auto____5941.invoke(worker.clj:356) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
at clojure.lang.AFn.applyToHelper(AFn.java:185) [clojure-1.5.1.jar:na]
at clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.5.1.jar:na]
at clojure.core$apply.invoke(core.clj:617) ~[clojure-1.5.1.jar:na]
at backtype.storm.daemon.worker$fn__5940$mk_worker__5996.doInvoke(worker.clj:347) [storm-core-0.9.2-incubating.jar:0.9.2-incubating]
at clojure.lang.RestFn.invoke(RestFn.java:512) [clojure-1.5.1.jar:na]
at backtype.storm.daemon.worker$_main.invoke(worker.clj:454) [storm-core-0.9.2-incubating.jar:0.9.2-incubating]
at clojure.lang.AFn.applyToHelper(AFn.java:172) [clojure-1.5.1.jar:na]
at clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.5.1.jar:na]
at backtype.storm.daemon.worker.main(Unknown Source) [storm-core-0.9.2-incubating.jar:0.9.2-incubating]
2015-01-31 01:23:07 b.s.util [INFO] Halting process: ("Error on initialization")
That was known issue in the old releases of Storm. Usually what you need to do is to clear the directories that Storm is using. Check for the name of those files in Storm's configuration file conf/storm.yaml.
Just to make it clear: this issue appears to be fixed in Storm 0.9.4.
Details in the issue linked by #tousif: https://issues.apache.org/jira/browse/STORM-130
Downgrade the version of hdp to 2.5 or less
Use storm verion 1.0.1
It solved the issue...
clearing the directory is useless idea.
Related
I am trying to deploy my confluent Kafka Connect for S3 in a distributed mode. But I am encountering the following error :-
(org.eclipse.jetty.server.HttpChannel) [qtp1620643420-22]
java.lang.AbstractMethodError: javax.ws.rs.core.UriBuilder.uri(Ljava/lang/String;)Ljavax/ws/rs/core/UriBuilder;
at javax.ws.rs.core.UriBuilder.fromUri(UriBuilder.java:96)
at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:275)
at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:205)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:852)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:544)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1581)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1307)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:482)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1549)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1204)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221)
at org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:173)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.Server.handle(Server.java:494)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:374)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:268)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:135)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:782)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:918)
at java.lang.Thread.run(Thread.java:748)
I can see the following version of javax.ws.rs-api-2.1.1.jar available in lib folder still it does not solve the issue. I tried importing glassfish jars but that didn't helped too.
Not sure what is the issue has anyone faced this issue can help me ?
Version which I am using
Confluent Kafka S3 Connect version - 5.5.1
I try to configure kafka authentification using sasl mechanism (OAUTHBEARER)(using flink 1.9.2, kafka-client 2.2.0).
When using Flink with SASL authentification I got the exception bellow.
Kafka is shaded in a fat jar with the application.
After a remote debugging I found that my callback handler has a ChildFirstClassloader and
org.apache.kafka.common.security.auth.AuthenticateCallbackHandler belongs to another ChildFirstClassloader so the instance of the following test is failing (OAuthBearerSaslClientFactory) :
if (!(Objects.requireNonNull(callbackHandler) instanceof AuthenticateCallbackHandler))
throw new IllegalArgumentException(String.format(
"Callback handler must be castable to %s: %s",
AuthenticateCallbackHandler.class.getName(), callbackHandler.getClass().getName()));
I have no idea why these two classes have two different classloader.
Any idea? Any workaround?
Thanks for the help.
Caused by: org.apache.kafka.common.errors.SaslAuthenticationException: Failed to configure SaslClientAuthenticator
Caused by: java.lang.IllegalArgumentException: Callback handler must be castable to org.apache.kafka.common.security.auth.AuthenticateCallbackHandler: org.apache.kafka.common.security.oauthbearer.internals.OAuthBearerSaslClientCallbackHandler
at org.apache.kafka.common.security.oauthbearer.internals.OAuthBearerSaslClient$OAuthBearerSaslClientFactory.createSaslClient(OAuthBearerSaslClient.java:182)
at javax.security.sasl.Sasl.createSaslClient(Sasl.java:420)
at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.lambda$createSaslClient$0(SaslClientAuthenticator.java:180)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.createSaslClient(SaslClientAuthenticator.java:176)
at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.<init>(SaslClientAuthenticator.java:168)
at org.apache.kafka.common.network.SaslChannelBuilder.buildClientAuthenticator(SaslChannelBuilder.java:254)
at org.apache.kafka.common.network.SaslChannelBuilder.lambda$buildChannel$1(SaslChannelBuilder.java:202)
at org.apache.kafka.common.network.KafkaChannel.<init>(KafkaChannel.java:140)
at org.apache.kafka.common.network.SaslChannelBuilder.buildChannel(SaslChannelBuilder.java:210)
at org.apache.kafka.common.network.Selector.buildAndAttachKafkaChannel(Selector.java:334)
at org.apache.kafka.common.network.Selector.registerChannel(Selector.java:325)
at org.apache.kafka.common.network.Selector.connect(Selector.java:257)
at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:920)
at org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:287)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.trySend(ConsumerNetworkClient.java:474)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:255)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:236)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:215)
at org.apache.kafka.clients.consumer.internals.Fetcher.getTopicMetadata(Fetcher.java:292)
at org.apache.kafka.clients.consumer.KafkaConsumer.partitionsFor(KafkaConsumer.java:1803)
at org.apache.kafka.clients.consumer.KafkaConsumer.partitionsFor(KafkaConsumer.java:1771)
at org.apache.flink.streaming.connectors.kafka.internal.KafkaPartitionDiscoverer.getAllPartitionsForTopics(KafkaPartitionDiscoverer.java:77)
at org.apache.flink.streaming.connectors.kafka.internals.AbstractPartitionDiscoverer.discoverPartitions(AbstractPartitionDiscoverer.java:131)
at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.open(FlinkKafkaConsumerBase.java:508)
at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:102)
at org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:552)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:416)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:705)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:530)
at java.lang.Thread.run(Thread.java:748)
I'm not sure if you've solved this already, but I wrestled with this exact same scenario for quite a while. What ended up working for me was copying the kafka-clients jar into Flink's lib/ directory.
Sorry forgot to post the solution, but yes i solved it in the same way, by copying the kafka-client in flink lib.
I am using the Apache Drill (1.14) JDBC driver in my application which consumes the data from the Kafka. The application works just fine for some time and after few iterations it fails to execute due to the following Too many files open issue. I made sure there are no file handle leaks in my code but still nor sure why this issue is happening?
It looks like the issue is happening from with-in the Apache drill libraries when constructing the Kafka consumer. Can any one please guide me help this problem fixed?
The problem perishes when I restart my Apache drillbit but very soon it happens again. I did check the file descriptor count on my unix machine using ulimit -a | wc -l & lsof -a -p <PID> | wc -l before and after the drill process restart and it seems the drill process is considerably taking a lot of file descriptors. I tried increasing the file descriptor count on the system but still no luck.
I have followed the Apache Drill storage plugin documentation in configuring the Kafka plugin into Apache Drill at https://drill.apache.org/docs/kafka-storage-plugin/
Any help on this issue is highly appreciated. Thanks.
JDBC URL: jdbc:drill:drillbit=localhost:31010;schema=kafka
NOTE: I am pushing down the filters in my query
SELECT * FROM myKafkaTopic WHERE kafkaMsgTimestamp > 1560210931626
org.apache.drill.common.exceptions.UserException: DATA_READ ERROR: Failed to fetch start/end offsets of the topic myKafkaTopic
Failed to construct kafka consumer
[Error Id: 73f896a7-09d4-425b-8cd5-f269c3a6e69a ]
at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) ~[drill-common-1.14.0.jar:1.14.0]
at org.apache.drill.exec.store.kafka.KafkaGroupScan.init(KafkaGroupScan.java:198) [drill-storage-kafka-1.14.0.jar:1.14.0]
at org.apache.drill.exec.store.kafka.KafkaGroupScan.<init>(KafkaGroupScan.java:98) [drill-storage-kafka-1.14.0.jar:1.14.0]
at org.apache.drill.exec.store.kafka.KafkaStoragePlugin.getPhysicalScan(KafkaStoragePlugin.java:83) [drill-storage-kafka-1.14.0.jar:1.14.0]
at org.apache.drill.exec.store.AbstractStoragePlugin.getPhysicalScan(AbstractStoragePlugin.java:111) [drill-java-exec-1.14.0.jar:1.14.0]
at org.apache.drill.exec.planner.logical.DrillTable.getGroupScan(DrillTable.java:99) [drill-java-exec-1.14.0.jar:1.14.0]
at org.apache.drill.exec.planner.logical.DrillScanRel.<init>(DrillScanRel.java:89) [drill-java-exec-1.14.0.jar:1.14.0]
at org.apache.drill.exec.planner.logical.DrillScanRel.<init>(DrillScanRel.java:69) [drill-java-exec-1.14.0.jar:1.14.0]
at org.apache.drill.exec.planner.logical.DrillScanRel.<init>(DrillScanRel.java:62) [drill-java-exec-1.14.0.jar:1.14.0]
at org.apache.drill.exec.planner.logical.DrillScanRule.onMatch(DrillScanRule.java:38) [drill-java-exec-1.14.0.jar:1.14.0]
at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:212) [calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:652) [calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:368) [calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:429) [drill-java-exec-1.14.0.jar:1.14.0]
at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:369) [drill-java-exec-1.14.0.jar:1.14.0]
at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRawDrel(DefaultSqlHandler.java:255) [drill-java-exec-1.14.0.jar:1.14.0]
at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:318) [drill-java-exec-1.14.0.jar:1.14.0]
at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:180) [drill-java-exec-1.14.0.jar:1.14.0]
at org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:145) [drill-java-exec-1.14.0.jar:1.14.0]
at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:83) [drill-java-exec-1.14.0.jar:1.14.0]
at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:567) [drill-java-exec-1.14.0.jar:1.14.0]
at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:266) [drill-java-exec-1.14.0.jar:1.14.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_181]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
Caused by: org.apache.kafka.common.KafkaException: Failed to construct kafka consumer
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:765) ~[kafka-clients-0.11.0.1.jar:na]
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:633) ~[kafka-clients-0.11.0.1.jar:na]
at org.apache.drill.exec.store.kafka.KafkaGroupScan.init(KafkaGroupScan.java:168) [drill-storage-kafka-1.14.0.jar:1.14.0]
... 23 common frames omitted
Caused by: org.apache.kafka.common.KafkaException: java.io.IOException: Too many open files
at org.apache.kafka.common.network.Selector.<init>(Selector.java:129) ~[kafka-clients-0.11.0.1.jar:na]
at org.apache.kafka.common.network.Selector.<init>(Selector.java:156) ~[kafka-clients-0.11.0.1.jar:na]
at org.apache.kafka.common.network.Selector.<init>(Selector.java:160) ~[kafka-clients-0.11.0.1.jar:na]
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:701) ~[kafka-clients-0.11.0.1.jar:na]
... 25 common frames omitted
Caused by: java.io.IOException: Too many open files
at sun.nio.ch.EPollArrayWrapper.epollCreate(Native Method) ~[na:1.8.0_181]
at sun.nio.ch.EPollArrayWrapper.<init>(EPollArrayWrapper.java:130) ~[na:1.8.0_181]
at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:69) ~[na:1.8.0_181]
at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:36) ~[na:1.8.0_181]
at java.nio.channels.Selector.open(Selector.java:227) ~[na:1.8.0_181]
at org.apache.kafka.common.network.Selector.<init>(Selector.java:127) ~[kafka-clients-0.11.0.1.jar:na]```
This is because the kafka storage plugin never explicitly closes its connections to kafka.
It relies on the idle timeout at the server ( connections.max.idle.ms ) which defaults to 10 minutes.
I saw exactly the same issue when running queries for a few minutes - and we reduced the default timeout to verify that was the case.
I have run into an issue with Kafka on Windows where it attempts to delete log segments, but it cannot due to another process having access to the files. This is caused by Kafka holding access to the file itself and trying to delete a file it has open. The bug is below for reference.
I have found two JIRA bugs that have been filed on this issue https://issues.apache.org/jira/browse/KAFKA-1194 and https://issues.apache.org/jira/browse/KAFKA-2170. The first being logged under version 0.8.1 and the second for version 0.10.1.
I have personally tried versions 0.10.1 and 0.10.2. Neither of them have the bug fixed in them.
My question is, does anyone know of patch that can fix this issue or know if the Kafka people have a fix for this that will be rolling out soon.
Thanks.
kafka.common.KafkaStorageException: Failed to change the log file suffix from to .deleted for log segment 6711351
at kafka.log.LogSegment.kafkaStorageException$1(LogSegment.scala:340)
at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:342)
at kafka.log.Log.kafka$log$Log$$asyncDeleteSegment(Log.scala:981)
at kafka.log.Log.kafka$log$Log$$deleteSegment(Log.scala:971)
at kafka.log.Log$$anonfun$deleteOldSegments$1.apply(Log.scala:673)
at kafka.log.Log$$anonfun$deleteOldSegments$1.apply(Log.scala:673)
at scala.collection.immutable.List.foreach(List.scala:381)
at kafka.log.Log.deleteOldSegments(Log.scala:673)
at kafka.log.Log.deleteRetentionSizeBreachedSegments(Log.scala:717)
at kafka.log.Log.deleteOldSegments(Log.scala:697)
at kafka.log.LogManager$$anonfun$cleanupLogs$3.apply(LogManager.scala:474)
at kafka.log.LogManager$$anonfun$cleanupLogs$3.apply(LogManager.scala:472)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
at kafka.log.LogManager.cleanupLogs(LogManager.scala:472)
at kafka.log.LogManager$$anonfun$startup$1.apply$mcV$sp(LogManager.scala:200)
at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:57)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.file.FileSystemException: c:\kafka-logs\kafka-logs\metric-values-0\00000000000006711351.log -> c:\kafka-logs\kafka-logs\metric-values-0\00000000000006711351.log.deleted: The process cannot access the file because it is being used by another process.
at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86)
at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:387)
at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:287)
at java.nio.file.Files.move(Files.java:1395)
at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:711)
at org.apache.kafka.common.record.FileRecords.renameTo(FileRecords.java:210)
... 28 more
Suppressed: java.nio.file.FileSystemException: c:\kafka-logs\kafka-logs\metric-values-0\00000000000006711351.log -> c:\kafka-logs\kafka-logs\metric-values-0\00000000000006711351.log.deleted: The process cannot access the file because it is being used by another process.
at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86)
at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:301)
at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:287)
at java.nio.file.Files.move(Files.java:1395)
at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:708)
... 29 more
Im having a similar issue running kafka on local , kafka server seems stopping on execution upon failure to delete the log file. For the solution to avoid this to happen i have to increase the log retention for the log to avoid auto deletion.
# The minimum age of a log file to be eligible for deletion due to age
log.retention.hours=500
setting the log to xxx hours will avoid this upon running on local, but for production I think for linux based system , this should not happen.
In case you need to delete the log file , delete it manually where your logs located ,then restart the kafka .
A task that works in spark local mode is not working for standalone cluster running on the same machine.
The only difference is:
local[*]
vs
spark://<host>.local:7077
for the master
I am able to run spark pi against the master at the above address and also use the spark gui: so the master address is generally working for spark.
Here is the (normal) spark init code:
val sconf = new SparkConf().setMaster(master).setAppName("EpisCatalog")
val sc = new SparkContext(sconf)
Here is the stacktrace from running the program:
15/12/03 03:39:04.746 main WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/12/03 03:39:07.706 main WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
15/12/03 03:39:27.739 appclient-registration-retry-thread ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[appclient-registration-retry-thread,5,main]
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask#b649f0b rejected from java.util.concurrent.ThreadPoolExecutor#5ef7a52b[Running, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = 0]
at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1.apply(AppClient.scala:103)
at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1.apply(AppClient.scala:102)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
at org.apache.spark.deploy.client.AppClient$ClientEndpoint.tryRegisterAllMasters(AppClient.scala:102)
at org.apache.spark.deploy.client.AppClient$ClientEndpoint.org$apache$spark$deploy$client$AppClient$ClientEndpoint$$registerWithMaster(AppClient.scala:128)
at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2$$anonfun$run$1.apply$mcV$sp(AppClient.scala:139)
at org.apache.spark.util.Utils$.tryOrExit(Utils.scala:1130)
at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2.run(AppClient.scala:131)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
I am running spark 1.6.0-SNAPSHOT. It has been "installed" to local maven repo and I have verified that the client is using the latest local maven repo version.
I had the same problem. It could be solved by using the full host url (can be found on the Master Web UI, port 18080) instead of just the hostname or localhost.
So I had to use mymachine.mycompany.org instead of mymachine
I got the same problem and in my case there was version mismatch. I had Spark Driver written on 1.5.1 version and Spark Cluster setup on 1.6.0.
Maybe you deploy cluster on stable version which was on that time 1.5.1.