Why is Cassandra crashing whenever I try to run DataStax Kafka Connector?

Goal: My goal is to use Kafka to send messages to a Cassandra sink using Kafka Connect.
I've deployed Kafka and Cassandra and I am able to work with each of them individually - I have no problem sending data to Kafka, using producers to pass messages, and using consumers to consume them. I have no problem using cqlsh to create tables and insert data into them. However, whenever I try to deploy the DataStax Apache Kafka Connector, Cassandra seems to crash.
I am trying to learn Kafka Connect with a minimal setup: one Kafka producer, one broker, and one Cassandra keyspace, running in standalone mode. I've configured both connect-standalone.properties and cassandra-sink-standalone.properties following the instructions from DataStax: https://docs.datastax.com/en/kafka/doc/kafka/kafkaStringJson.html
connect-standalone.properties
bootstrap.servers=localhost:9092
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=false
value.converter.schemas.enable=false
offset.storage.file.filename=/tmp/connect.offsets
offset.flush.interval.ms=10000
plugin.path= *install_location*/kafka-connect-cassandra-sink-1.4.0.jar
cassandra-sink-standalone.properties
name=stocks-sink
connector.class=com.datastax.kafkaconnector.DseSinkConnector
tasks.max=1
topics=stocks_topic
topic.stocks_topic.stocks_keyspace.stocks_table.mapping = symbol=value.symbol, ts=value.ts, exchange=value.exchange, industry=value.industry, name=key, value=value.value
Then, the Kafka Connector is started using bin/connect-standalone.sh connect-standalone.properties cassandra-sink-standalone.properties.
About 95% of the time I attempt to launch Kafka Connector, Cassandra crashes. Running bin/nodetool status shows the message:
nodetool: Failed to connect to '127.0.0.1:7199' - ConnectException: 'Connection refused (Connection refused)'
In the system.log and debug.log logs, there is no indication that Cassandra has even crashed. The last line just remains as:
INFO [main] 2023-01-31 00:00:00,143 StorageService.java:2806 - Node localhost/127.0.0.1:7000 state jump to NORMAL
And in the Kafka Connect logs, the error message states:
[2023-01-31 15:24:47,803] INFO [plc-sink|task-0] DataStax Java driver for Apache Cassandra(R) (com.datastax.oss:java-driver-core) version 4.6.0 (com.datastax.oss.driver.internal.core.DefaultMavenCoordinates:37)
[2023-01-31 15:24:47,947] INFO [plc-sink|task-0] Could not register Graph extensions; this is normal if Tinkerpop was explicitly excluded from classpath (com.datastax.oss.driver.internal.core.context.InternalDriverContext:540)
[2023-01-31 15:24:47,948] INFO [plc-sink|task-0] Could not register Reactive extensions; this is normal if Reactive Streams was explicitly excluded from classpath (com.datastax.oss.driver.internal.core.context.InternalDriverContext:559)
[2023-01-31 15:24:47,997] INFO [plc-sink|task-0] Using native clock for microsecond precision (com.datastax.oss.driver.internal.core.time.Clock:40)
[2023-01-31 15:24:47,999] INFO [plc-sink|task-0] [s0] No contact points provided, defaulting to /127.0.0.1:9042 (com.datastax.oss.driver.internal.core.metadata.MetadataManager:134)
[2023-01-31 15:24:48,190] WARN [plc-sink|task-0] [s0] Error connecting to Node(endPoint=/127.0.0.1:9042, hostId=null, hashCode=3247c5e4), trying next node (ConnectionInitException: [s0|control|connecting...] Protocol initialization request, step 1 (OPTIONS): failed to send request (java.nio.channels.ClosedChannelException)) (com.datastax.oss.driver.internal.core.control.ControlConnection:34)
[2023-01-31 15:24:48,200] ERROR [plc-sink|task-0] WorkerSinkTask{id=plc-sink-0} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:196)
com.datastax.oss.driver.api.core.AllNodesFailedException: Could not reach any contact point, make sure you've provided valid addresses (showing first 1 nodes, use getAllErrors() for more): Node(endPoint=/127.0.0.1:9042, hostId=null, hashCode=3247c5e4): [com.datastax.oss.driver.api.core.connection.ConnectionInitException: [s0|control|connecting...] Protocol initialization request, step 1 (OPTIONS): failed to send request (java.nio.channels.ClosedChannelException)]
at com.datastax.oss.driver.api.core.AllNodesFailedException.copy(AllNodesFailedException.java:141)
at com.datastax.oss.driver.internal.core.util.concurrent.CompletableFutures.getUninterruptibly(CompletableFutures.java:149)
at com.datastax.oss.driver.api.core.session.SessionBuilder.build(SessionBuilder.java:612)
at com.datastax.oss.kafka.sink.state.LifeCycleManager.buildCqlSession(LifeCycleManager.java:518)
at com.datastax.oss.kafka.sink.state.LifeCycleManager.lambda$startTask$0(LifeCycleManager.java:113)
at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660)
at com.datastax.oss.kafka.sink.state.LifeCycleManager.startTask(LifeCycleManager.java:109)
at com.datastax.oss.kafka.sink.CassandraSinkTask.start(CassandraSinkTask.java:83)
at org.apache.kafka.connect.runtime.WorkerSinkTask.initializeAndStart(WorkerSinkTask.java:312)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:187)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:244)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Suppressed: com.datastax.oss.driver.api.core.connection.ConnectionInitException: [s0|control|connecting...] Protocol initialization request, step 1 (OPTIONS): failed to send request (java.nio.channels.ClosedChannelException)
at com.datastax.oss.driver.internal.core.channel.ProtocolInitHandler$InitRequest.fail(ProtocolInitHandler.java:342)
at com.datastax.oss.driver.internal.core.channel.ChannelHandlerRequest.writeListener(ChannelHandlerRequest.java:87)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:577)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:551)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:490)
at io.netty.util.concurrent.DefaultPromise.addListener(DefaultPromise.java:183)
at io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:95)
at io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:30)
at com.datastax.oss.driver.internal.core.channel.ChannelHandlerRequest.send(ChannelHandlerRequest.java:76)
at com.datastax.oss.driver.internal.core.channel.ProtocolInitHandler$InitRequest.send(ProtocolInitHandler.java:183)
at com.datastax.oss.driver.internal.core.channel.ProtocolInitHandler.onRealConnect(ProtocolInitHandler.java:118)
at com.datastax.oss.driver.internal.core.channel.ConnectInitHandler.lambda$connect$0(ConnectInitHandler.java:57)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:577)
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:570)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:549)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:490)
at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:615)
at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:608)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:321)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:337)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
... 1 more
Suppressed: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /127.0.0.1:9042
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.nio.channels.ClosedChannelException
at io.netty.channel.AbstractChannel$AbstractUnsafe.newClosedChannelException(AbstractChannel.java:957)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:921)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.flush0(AbstractNioChannel.java:354)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush(AbstractChannel.java:897)
at io.netty.channel.DefaultChannelPipeline$HeadContext.flush(DefaultChannelPipeline.java:1372)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:748)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:740)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:726)
at io.netty.channel.ChannelDuplexHandler.flush(ChannelDuplexHandler.java:127)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:748)
at io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:763)
at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:788)
at io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:756)
at io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:806)
at io.netty.channel.DefaultChannelPipeline.writeAndFlush(DefaultChannelPipeline.java:1025)
at io.netty.channel.AbstractChannel.writeAndFlush(AbstractChannel.java:294)
at com.datastax.oss.driver.internal.core.channel.ChannelHandlerRequest.send(ChannelHandlerRequest.java:75)
... 20 more
In the 5% of the time that Cassandra doesn't actually crash, the following message shows up in Kafka Connect's logs:
[2023-01-31 15:41:32,839] INFO [plc-sink|task-0] DataStax Java driver for Apache Cassandra(R) (com.datastax.oss:java-driver-core) version 4.6.0 (com.datastax.oss.driver.internal.core.DefaultMavenCoordinates:37)
[2023-01-31 15:41:32,981] INFO [plc-sink|task-0] Could not register Graph extensions; this is normal if Tinkerpop was explicitly excluded from classpath (com.datastax.oss.driver.internal.core.context.InternalDriverContext:540)
[2023-01-31 15:41:32,982] INFO [plc-sink|task-0] Could not register Reactive extensions; this is normal if Reactive Streams was explicitly excluded from classpath (com.datastax.oss.driver.internal.core.context.InternalDriverContext:559)
[2023-01-31 15:41:33,037] INFO [plc-sink|task-0] Using native clock for microsecond precision (com.datastax.oss.driver.internal.core.time.Clock:40)
[2023-01-31 15:41:33,040] INFO [plc-sink|task-0] [s0] No contact points provided, defaulting to /127.0.0.1:9042 (com.datastax.oss.driver.internal.core.metadata.MetadataManager:134)
[2023-01-31 15:41:33,254] INFO [plc-sink|task-0] [s0] Failed to connect with protocol DSE_V2, retrying with DSE_V1 (com.datastax.oss.driver.internal.core.channel.ChannelFactory:224)
[2023-01-31 15:41:33,263] INFO [plc-sink|task-0] [s0] Failed to connect with protocol DSE_V1, retrying with V4 (com.datastax.oss.driver.internal.core.channel.ChannelFactory:224)
[2023-01-31 15:41:34,091] INFO [plc-sink|task-0] WorkerSinkTask{id=plc-sink-0} Sink task finished initialization and start (org.apache.kafka.connect.runtime.WorkerSinkTask:313)
[2023-01-31 15:41:34,092] INFO [plc-sink|task-0] WorkerSinkTask{id=plc-sink-0} Executing sink task (org.apache.kafka.connect.runtime.WorkerSinkTask:198)
...
Versions:
Apache Cassandra 4.0.7
Apache Kafka 3.3.1
DataStax Apache Kafka Connector 1.4.0
I am currently using WSL2 Ubuntu 20.04.5 on Windows 11, with the following specs:
CPU: 4 Cores
Memory: 8GB RAM
Disk (SSD): 250 GB
Seeing that it actually works 5% of the time, I suspect that it's an OOM problem as outlined in https://community.datastax.com/questions/6947/index.html (and I sometimes just happen to have enough memory?). I've tried the solution in this article but it didn't help. How can I configure Cassandra / Kafka Connect to avoid this problem? Is this just a matter of needing a computer with more memory?

I think you're on the right track when you suggested that memory is an issue.
I have a "small" Windows Surface Pro I use to replicate issues like yours. I'm also running Ubuntu 20.04 with WSL2 on this laptop.
By default, Windows allocates half of system RAM to WSL2, so on my sub-8GB RAM installation, my Ubuntu installation gets 3.7GB of memory. A vanilla installation of Cassandra (out of the box, zero configuration) starts with 1.3GB of memory allocated to it, so there's only 2.4GB left for everything else.
What I suspect is happening is that when you start Kafka on the same node, Ubuntu runs out of memory and this triggers the Linux oom-killer. Although the end result is similar, the trigger is slightly different from what I described in the post you linked, so my recommendation to explicitly set disk_access_mode doesn't help in this situation.
As a workaround, configure Cassandra to only allocate 1GB of memory by setting the MAX_HEAP_SIZE in conf/cassandra-env.sh:
MAX_HEAP_SIZE="1G"
Kafka is configured with 1GB by default but if it isn't, set the following in bin/kafka-server-start.sh:
export KAFKA_HEAP_OPTS="-Xmx1G -Xms1G"
With both set, there should be over 1GB left for Ubuntu, which will hopefully allow you to run your tests. Cheers!
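To check whether the oom-killer is really what takes Cassandra down, the kernel log can be inspected right after a crash. This is a minimal sketch, assuming the kernel reports kills with a line containing "Killed process" (the usual wording on recent Linux kernels). The helper name oom_victims is made up for illustration and is not part of Cassandra or Kafka.

```shell
#!/bin/sh
# oom_victims is a hypothetical helper name, not part of Cassandra or Kafka.
# It reads kernel log text on stdin and prints only the lines in which the
# kernel reports that it killed a process (case-insensitive match).
oom_victims() {
  grep -i 'killed process'
}

# Usage (reading the kernel log may require sudo):
#   dmesg | oom_victims
#   free -h    # total and available memory inside the WSL2 VM
```

If a line naming a java process shows up with a timestamp close to the moment nodetool started failing, that would support the memory explanation. If nothing shows up, the cause is probably elsewhere.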

Related

Kafka Snowflake Connector - Stopping after connector error

I've been checking all the Kafka Snowflake connector posts, but none of them discuss the issue I'm having.
I installed Kafka locally, with ZooKeeper, and I also want to run a Snowflake connector to copy data from Kafka to Snowflake.
I run ZooKeeper, and everything looks right:
zookeeper log
Then I launch the kafka server, looks correct as well:
server log
However, when I launch the snowflake-kafka-connector:
sh connect-standalone.sh /usr/local/kafka/kafka_2.11-1.1.0/config/connect-standalone.properties /usr/local/kafka/kafka_2.11-1.1.0/config/SF_connect.properties
it breaks like this:
[2022-05-27 10:41:37,380] INFO Finished creating connector TEST_CONNECTOR (org.apache.kafka.connect.runtime.Worker:224)
[2022-05-27 10:41:37,380] INFO Skipping reconfiguration of connector kafkatest since it is not running (org.apache.kafka.connect.runtime.standalone.StandaloneHerder:285)
[2022-05-27 10:41:37,381] ERROR Stopping after connector error (org.apache.kafka.connect.cli.ConnectStandalone:113)
java.lang.NullPointerException: Cannot invoke "org.apache.kafka.connect.runtime.rest.entities.ConnectorInfo.name()" because the return value of "org.apache.kafka.connect.runtime.Herder$Created.result()" is null
at org.apache.kafka.connect.cli.ConnectStandalone$1.onCompletion(ConnectStandalone.java:104)
at org.apache.kafka.connect.cli.ConnectStandalone$1.onCompletion(ConnectStandalone.java:98)
at org.apache.kafka.connect.util.ConvertingFutureCallback.onCompletion(ConvertingFutureCallback.java:44)
at org.apache.kafka.connect.runtime.standalone.StandaloneHerder.putConnectorConfig(StandaloneHerder.java:185)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:107)
[2022-05-27 10:41:37,382] INFO Kafka Connect stopping (org.apache.kafka.connect.runtime.Connect:65)
[2022-05-27 10:41:37,382] INFO Stopping REST server (org.apache.kafka.connect.runtime.rest.RestServer:211)
I tried to find information on what's wrong, but I can't find anything. Can you please help me with that?
This is the sf_connector.properties file:
sf_connector.properties
Thanks!

Unable to connect IIDR CDC to kafka

When trying to connect the IIDR replication engine for Kafka to a Kafka cluster via ZooKeeper, I am getting the following error:
kafka.common.KafkaException: Failed to parse the broker info from zookeeper: {"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT","PLAINTEXT_HOST":"PLAINTEXT"},"endpoints":["PLAINTEXT://broker:29092","PLAINTEXT_HOST://localhost:9092"],"jmx_port":9101,"host":"broker","timestamp":"1598174513950","port":29092,"version":4}
at kafka.cluster.Broker$.createBroker(Broker.scala:101)
at kafka.utils.ZkUtils.getBrokerInfo(ZkUtils.scala:787)
at kafka.utils.ZkUtils$$anonfun$getAllBrokersInCluster$2.apply(ZkUtils.scala:162)
at kafka.utils.ZkUtils$$anonfun$getAllBrokersInCluster$2.apply(ZkUtils.scala:162)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at kafka.utils.ZkUtils.getAllBrokersInCluster(ZkUtils.scala:162)
at com.datamirror.ts.target.publication.KafkaTargetPublisherProxy.getBootstrapBrokers(KafkaTargetPublisherProxy.java:829)
at com.datamirror.ts.target.publication.KafkaTargetPublisherProxy.loadKafkaServicesInfo(KafkaTargetPublisherProxy.java:417)
at com.datamirror.ts.target.publication.KafkaTargetPublisherProxy.handleStartReplicateMessage(KafkaTargetPublisherProxy.java:139)
at com.datamirror.ts.enginemsg.MessageDispatcher.dispatchSwitch(MessageDispatcher.java:547)
at com.datamirror.ts.enginemsg.MessageDispatcher.dispatch(MessageDispatcher.java:142)
at com.datamirror.ts.engine.ReplicationSession.dispatchDynamicMessage(ReplicationSession.java:2816)
at com.datamirror.ts.engine.ReplicationSession.dispatchDataMessage(ReplicationSession.java:2910)
at com.datamirror.ts.target.publication.TargetDataChannelJob.moderateForTheTarget(TargetDataChannelJob.java:178)
at com.datamirror.ts.target.publication.TargetDataChannelJob.execute(TargetDataChannelJob.java:74)
at com.datamirror.ts.engine.component.PipelineThread.runThread(PipelineThread.java:217)
at com.datamirror.ts.util.TsThread.run(TsThread.java:130)
Caused by: java.lang.IllegalArgumentException: No enum constant org.apache.kafka.common.protocol.SecurityProtocol.PLAINTEXT_HOST
at java.lang.Enum.valueOf(Enum.java:249)
at org.apache.kafka.common.protocol.SecurityProtocol.valueOf(SecurityProtocol.java:28)
at org.apache.kafka.common.protocol.SecurityProtocol.forName(SecurityProtocol.java:89)
at kafka.cluster.EndPoint$.createEndPoint(EndPoint.scala:49)
at kafka.cluster.Broker$$anonfun$1.apply(Broker.scala:90)
at kafka.cluster.Broker$$anonfun$1.apply(Broker.scala:89)
at scala.collection.immutable.List.map(List.scala:277)
at kafka.cluster.Broker$.createBroker(Broker.scala:89)
It seems like there is a compatibility issue between IIDR and my kafka cluster. Appreciate any advice on how to overcome this.
System setup
- IIDR v11.4 kafka engine setup
- Quick start confluent kafka docker setup, https://github.com/confluentinc/cp-all-in-one, cd cp-all-in-one
The recommendation these days, for both Kafka and the IDR product, is to use bootstrap.servers and list the brokers in the kafkaproducer.properties and kafkaconsumer.properties files. Contact L2 regarding the ZooKeeper config if there is a need to pursue it, but we strongly recommend using the bootstrap.servers parameter, as per Apache Kafka recommendations.

Kafka connect doesn't find available brokers when volume attached

Symptom : A modified bitnami kafka image contains the kafka-connect jars, they work fine.
But once I add a volume for persistence, it can't find existing brokers.
Details:
I modded the bitnami image in a way to copy the connect jars and launching the connect-distributed.sh.
It works fine, connectors can consume and produce from/to the topics
But once I add a persistent volume to the Kafka image, the first startup is OK but subsequent ones are not. connect.log says:
"[2020-05-21 15:59:34,786] ERROR [Worker clientId=connect-1, groupId=my-group1] Uncaught exception in herder work thread, exiting: (org.apache.kafka.connect.runtime.distributed.DistributedHerder:297)
org.apache.kafka.common.KafkaException: Unexpected error fetching metadata for topic connect-offsets
at org.apache.kafka.clients.consumer.internals.Fetcher.getTopicMetadata(Fetcher.java:403)
at org.apache.kafka.clients.consumer.KafkaConsumer.partitionsFor(KafkaConsumer.java:1965)
at org.apache.kafka.clients.consumer.KafkaConsumer.partitionsFor(KafkaConsumer.java:1933)
at org.apache.kafka.connect.util.KafkaBasedLog.start(KafkaBasedLog.java:138)
at org.apache.kafka.connect.storage.KafkaOffsetBackingStore.start(KafkaOffsetBackingStore.java:109)
at org.apache.kafka.connect.runtime.Worker.start(Worker.java:186)
at org.apache.kafka.connect.runtime.AbstractHerder.startServices(AbstractHerder.java:123)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder.run(DistributedHerder.java:284)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor is below 1 or larger than the number of available brokers."
Kafka itself still works well: every topic is present (replication factor of 1) and I can consume/produce messages by hand. I can also launch the connector system by hand successfully.
edit: My guess is that without the PV it starts the connectors after Kafka is up, but with the PV it sees immediately that the connectors are already present and tries to load them before Kafka has started.
edit2:
modded image:
FROM bitnami/kafka
# copying connect jars..
ADD connect-distributed.properties /opt/prop/connect-distributed.properties
ADD modded-kafka-run.sh /opt/bitnami/scripts/kafka/run.sh
RUN chmod 755 /opt/bitnami/scripts/kafka/run.sh
modded run.sh(I just added the distributed.sh and curl to it):
info "** Starting Kafka **"
/opt/bitnami/kafka/bin/connect-distributed.sh -daemon /opt/prop/connect-distributed.properties
# .. adding the connectors with curl
if am_i_root; then
    exec gosu "$KAFKA_DAEMON_USER" "${START_COMMAND[@]}"
else
    exec "${START_COMMAND[@]}"
fi
original run.sh: https://github.com/bitnami/bitnami-docker-kafka/blob/master/2/debian-10/rootfs/opt/bitnami/scripts/kafka/run.sh
It's hard to tell what the issue is, but the ENTRYPOINT that starts Kafka actually runs after any RUN command.
It's not clear why you need to create your own Kafka Connect image when at least two already exist.
You should be using docker-compose to start ZooKeeper, Kafka, and Connect as three separate clusters.
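If the guess in the question is right and Connect comes up before the broker is reachable, one way to test it is to delay the Connect start until a broker command succeeds. This is a minimal sketch, not the bitnami scripts' own mechanism: retry_until is a hypothetical helper, and the broker address localhost:9092 and the location of kafka-topics.sh next to connect-distributed.sh are assumptions.

```shell
#!/bin/sh
# retry_until is a hypothetical helper, not part of the bitnami scripts.
# Usage: retry_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds. Returns 0 on the first success, or 1 once
# MAX_ATTEMPTS attempts have all failed.
retry_until() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    "$@" && return 0
    attempt=$((attempt + 1))
    # Sleep only if another attempt is still to come.
    [ "$attempt" -le "$max_attempts" ] && sleep "$delay"
  done
  return 1
}

# Usage inside the modified run.sh (paths and address are assumptions).
# The wait has to run in the background, because the broker itself is only
# started by the final `exec` line of the script:
#   (
#     retry_until 30 2 /opt/bitnami/kafka/bin/kafka-topics.sh \
#         --bootstrap-server localhost:9092 --list >/dev/null 2>&1 &&
#     /opt/bitnami/kafka/bin/connect-distributed.sh -daemon /opt/prop/connect-distributed.properties
#   ) &
```

Running Connect in its own container, as suggested above, avoids the ordering problem inside a single run.sh, although Connect may still need to wait or retry until the broker is reachable.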

Confluent Start -> Schema Registry Failed to Start

When I start Confluent, Schema-registry fails, preventing the process from completing successfully. This is the response I get:
Starting zookeeper
zookeeper is [UP]
Starting kafka
kafka is [UP]
Starting schema-registry
Schema Registry failed to start
schema-registry is [DOWN]
Starting kafka-rest
Kafka Rest failed to start
kafka-rest is [DOWN]
Starting connect
connect is [UP]
When I tried to run the processes individually, zookeeper ran without problems. However, when I launched kafka, zookeeper displayed the following error:
Error Path:/brokers Error:KeeperErrorCode = NodeExists for /brokers (org.apache.zookeeper.server.PrepRequestProcessor)
Then, when I attempted to run Schema registry, I was hit with a massive list of errors. I'm sure the errors all point to one small thing. Here are some of the errors (many repeat in the same long message):
1.
WARNING: HK2 service reification failed for [org.glassfish.jersey.message.internal.DataSourceProvider] with an exception:
MultiException stack 1 of 2
java.lang.NoClassDefFoundError: javax/activation/DataSource
2.
MultiException stack 2 of 2
java.lang.IllegalArgumentException: Errors were discovered while reifying SystemDescriptor
3.
java.lang.IllegalArgumentException: While attempting to resolve the dependencies of org.glassfish.jersey.server.validation.internal.ValidationBinder$ConfiguredValidatorProvider errors were found
4.
java.lang.NoClassDefFoundError: javax/xml/bind/ValidationException
Some of the errors vary slightly based on location, but for the most part, these 4 errors are printed out dozens of times.
I did my best to make sure no ports were being used by other processes. I also stopped and destroyed all instances of confluent that I've created before. I've played around with Kafka on this computer before, so I theorize that that could have something to do with it, but I've made sure to close all past zookeeper and kafka instances.
I've tried to run confluent on a different computer and didn't run into any issues. Does anyone know what could be the problem? I can send the entire error message and provide any additional details.
Thanks in advance!
Remove Java 9.
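I had both Java 9 and Java 8 on my computer. It turns out that Confluent was attempting to use Java 9, which it isn't compatible with. (Java 9 no longer makes the javax.xml.bind and javax.activation classes available by default, which matches the NoClassDefFoundError messages above.) When I deleted everything related to Java 9, Confluent started using Java 8, which solved the problem.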
As BluePhantom pointed out, using Java 7 will also do the trick.
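Before uninstalling anything, it can help to confirm which Java version the shell actually picks up. This is a minimal sketch: java_major is a hypothetical helper, and it assumes the two common version string schemes ("1.8.0_292" for Java 8 and earlier, "9.0.4" or "11.0.2" for later releases).

```shell
#!/bin/sh
# java_major is a hypothetical helper for illustration.
# It prints the major version for a Java version string:
# "1.8.0_292" -> 8 (old 1.x scheme), "9.0.4" -> 9, "11.0.2" -> 11.
java_major() {
  case "$1" in
    1.*) rest=${1#1.}; echo "${rest%%.*}" ;;   # strip the leading "1."
    *)   echo "${1%%[.-]*}" ;;                 # cut at the first "." or "-"
  esac
}

# Usage: check the Java that is first on the PATH.
#   version=$(java -version 2>&1 | sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1)
#   java_major "$version"
```

If this prints 9 while Java 8 is also installed, pointing JAVA_HOME at the Java 8 installation may be an alternative to removing Java 9. Kafka's kafka-run-class.sh uses $JAVA_HOME/bin/java when JAVA_HOME is set; whether every Confluent launcher script does the same is an assumption worth verifying.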

Spring websockets across Wildfly 10.1 cluster

Using Wildfly 10.1 standalone-full-ha.xml with the included ActiveMQ Artemis 1.1.0. Enabled STOMP by adding this to the activemq config:
<acceptor name="stomp-acceptor" factory-class="org.apache.activemq.artemis.core.remoting.impl.netty.NettyAcceptorFactory">
<param name="protocols" value="STOMP"/>
<param name="port" value="${stomp.port:61613}"/>
</acceptor>
Deployed a sample spring websockets war based on https://spring.io/guides/gs/messaging-stomp-websocket/
I have this running on 2 separate servers which are able to form a cluster. I am able to connect to both servers and send messages across, however when I disconnect one of the websockets, the following exception is thrown on the other server:
16:49:02,377 ERROR [org.apache.activemq.artemis.core.server] (Thread-7 (ActiveMQ-client-global-threads-926891052)) AMQ224037: cluster connection Failed to handle message: java.lang.IllegalStateException: Cannot find binding for ffaa538e-77c3-11e7-ba4b-7b9cb7ee40e2b7e6493a-77c3-11e7-ba4b-7b9cb7ee40e2
at org.apache.activemq.artemis.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.doConsumerClosed(ClusterConnectionImpl.java:1319)
at org.apache.activemq.artemis.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.handleNotificationMessage(ClusterConnectionImpl.java:1005)
at org.apache.activemq.artemis.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.onMessage(ClusterConnectionImpl.java:974)
at org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl.callOnMessage(ClientConsumerImpl.java:1018)
at org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl.access$400(ClientConsumerImpl.java:48)
at org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl$Runner.run(ClientConsumerImpl.java:1145)
at org.apache.activemq.artemis.utils.OrderedExecutorFactory$OrderedExecutor$ExecutorTask.run(OrderedExecutorFactory.java:103)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
I have also tested this with Wildfly 11 Alpha and the included ActiveMQ Artemis 1.5.3 but I get the same error every time I disconnect a websocket.
I also get the following errors when I shut down a server, but I am less concerned about these since they only happen during shutdown:
17:11:03,747 ERROR [org.apache.activemq.artemis.core.server] (Thread-27 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$2@5669232b-14602972)) AMQ224051: Failed to call notification listener: java.lang.IllegalStateException: No queue 0110e86e-77c4-11e7-a745-252881f696b4
at org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl.onNotification(PostOfficeImpl.java:387)
at org.apache.activemq.artemis.core.server.management.impl.ManagementServiceImpl.sendNotification(ManagementServiceImpl.java:580)
at org.apache.activemq.artemis.core.server.impl.ServerConsumerImpl.close(ServerConsumerImpl.java:437)
at org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.doClose(ServerSessionImpl.java:354)
at org.apache.activemq.artemis.core.server.impl.ServerSessionImpl$1.done(ServerSessionImpl.java:1191)
at org.apache.activemq.artemis.core.persistence.impl.journal.OperationContextImpl.executeOnCompletion(OperationContextImpl.java:161)
at org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.close(ServerSessionImpl.java:1185)
at org.apache.activemq.artemis.core.protocol.stomp.StompProtocolManager$1.run(StompProtocolManager.java:276)
at org.apache.activemq.artemis.utils.OrderedExecutorFactory$OrderedExecutor$ExecutorTask.run(OrderedExecutorFactory.java:103)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
17:11:03,784 ERROR [io.netty.util.concurrent.DefaultPromise.rejectedExecution] (globalEventExecutor-1-2) Failed to submit a listener notification task. Event loop shut down?: java.util.concurrent.RejectedExecutionException: event executor terminated
at io.netty.util.concurrent.SingleThreadEventExecutor.reject(SingleThreadEventExecutor.java:821)
at io.netty.util.concurrent.SingleThreadEventExecutor.offerTask(SingleThreadEventExecutor.java:327)
at io.netty.util.concurrent.SingleThreadEventExecutor.addTask(SingleThreadEventExecutor.java:320)
at io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:746)
at io.netty.util.concurrent.DefaultPromise.safeExecute(DefaultPromise.java:760)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:428)
at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
at io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82)
at io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:1058)
at io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:686)
at io.netty.channel.AbstractChannel$AbstractUnsafe.access$700(AbstractChannel.java:419)
at io.netty.channel.AbstractChannel$AbstractUnsafe$5.run(AbstractChannel.java:646)
at io.netty.util.concurrent.GlobalEventExecutor$TaskRunner.run(GlobalEventExecutor.java:233)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
at java.lang.Thread.run(Thread.java:745)
17:11:03,797 WARN [io.netty.channel.AbstractChannel] (globalEventExecutor-1-2) Can't invoke task later as EventLoop rejected it: java.util.concurrent.RejectedExecutionException: event executor terminated
at io.netty.util.concurrent.SingleThreadEventExecutor.reject(SingleThreadEventExecutor.java:821)
at io.netty.util.concurrent.SingleThreadEventExecutor.offerTask(SingleThreadEventExecutor.java:327)
at io.netty.util.concurrent.SingleThreadEventExecutor.addTask(SingleThreadEventExecutor.java:320)
at io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:746)
at io.netty.channel.AbstractChannel$AbstractUnsafe.invokeLater(AbstractChannel.java:931)
at io.netty.channel.AbstractChannel$AbstractUnsafe.access$900(AbstractChannel.java:419)
at io.netty.channel.AbstractChannel$AbstractUnsafe$5.run(AbstractChannel.java:649)
at io.netty.util.concurrent.GlobalEventExecutor$TaskRunner.run(GlobalEventExecutor.java:233)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
at java.lang.Thread.run(Thread.java:745)
Is it an issue with the spring websocket example?
Do I need to change something in the default standalone-full-ha.xml to get this to work properly across a cluster, without throwing exceptions on client disconnect?
The version of Artemis in Wildfly is a bit behind the latest upstream (1.5.3 vs 2.2.0). Have you tried this with a cluster of standalone Artemis 2.2.0 nodes?