Confluent Control Center failure: Unable to fetch consumer offsets for cluster id

I am running Confluent Platform (version 6.1.1). I have deployed the following components: 3 brokers, 3 ZooKeeper nodes, Schema Registry, 3 Kafka Connect workers, KSQL and Confluent Control Center (CCC).
The CCC has entered a failed state and I am having difficulty bringing it back.
To make things cleaner, I created another EC2 instance (m4.2xlarge) where I configured a new CCC, with the aim of connecting it to the current cluster. The new CCC has exactly the same configuration as the failed one, but with a different confluent.controlcenter.id.
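For reference, the relevant part of the new instance's control-center.properties looks roughly like this (a sketch; addresses and paths are placeholders, and confluent.controlcenter.id is the only value that differs from the failed instance):
bootstrap.servers=10.251.xx.xx:9093,10.251.xx.xx:9093,10.251.xx.xx:9093
confluent.controlcenter.id=4
confluent.controlcenter.data.dir=/var/lib/confluent/control-center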
I start the CCC and it runs. I can access the CCC UI, but it is not working properly: the pages take very long to load, the reported state of the Connect cluster keeps flipping (sometimes healthy, sometimes not), and the reported state of the brokers keeps flipping in the same way.
After running for a certain amount of time, it is automatically restarted, and it keeps restarting every 5-7 minutes.
When it starts, I see a bunch of new topics created in the Kafka cluster.
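They can be listed with something like this (kafka-topics here is the Confluent-packaged CLI, and client-ssl.properties is a placeholder for the SSL client settings):
kafka-topics --bootstrap-server 10.251.xx.xx:9093 --command-config client-ssl.properties --list | grep _eim_c3_non_prod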
After that, in control-center.log I see:
INFO [main] Setting offsets for topic=_confluent-monitoring (io.confluent.controlcenter.KafkaHelper)
INFO [main] found 12 topicPartitions for topic=_confluent-monitoring (io.confluent.controlcenter.KafkaHelper)
INFO [main] Setting offsets for topic=_confluent-metrics (io.confluent.controlcenter.KafkaHelper)
INFO [main] found 12 topicPartitions for topic=_confluent-metrics (io.confluent.controlcenter.KafkaHelper)
INFO [main] action=starting topology=command (io.confluent.controlcenter.ControlCenter)
INFO [main] waiting for streams to be in running state REBALANCING (io.confluent.command.CommandStore)
INFO [main] Streams state RUNNING (io.confluent.command.CommandStore)
INFO [main] action=started topology=command (io.confluent.controlcenter.ControlCenter)
INFO [main] action=starting operation=command-migration (io.confluent.controlcenter.ControlCenter)
INFO [main] action=completed operation=command-migration (io.confluent.controlcenter.ControlCenter)
INFO [main] action=starting topology=monitoring (io.confluent.controlcenter.ControlCenter)
INFO [main] action=started topology=monitoring (io.confluent.controlcenter.ControlCenter)
INFO [main] Starting Health Check (io.confluent.controlcenter.ControlCenter)
INFO [main] Starting Alert Manager (io.confluent.controlcenter.ControlCenter)
INFO [main] Starting Consumer Offsets Fetch (io.confluent.controlcenter.ControlCenter)
INFO [control-center-heartbeat-0] current clusterId=lCRehAk0RqmLR04nhXKHtA (io.confluent.controlcenter.healthcheck.HealthCheck)
INFO [control-center-heartbeat-0] broker id set has changed new={1001=[10.251.xx.xx:9093 (id: 1001 rack: null)], 1002=[10.251.xx.xx:9093 (id: 1002 rack: null)], 1003=[10.251.xx.xx:9093 (id: 1003 rack: null)]} removed={} (io.confluent.controlcenter.healthcheck.HealthCheck)
INFO [control-center-heartbeat-0] new controller=10.251.xx.xx:9093 (id: 1002 rack: null) (io.confluent.controlcenter.healthcheck.HealthCheck)
INFO [main] Initial capacity 128, increased by 64, maximum capacity 2147483647. (io.confluent.rest.ApplicationServer)
INFO [main] Adding listener: http://0.0.0.0:9021 (io.confluent.rest.ApplicationServer)
INFO [main] x509=X509#3a8ead9(ip-44-135-xx-xx.eu-central-1.compute.internal,h=[ip-44-135-xx-xx.eu-central-1.compute.internal],w=[]) for Server#7c8b37a8[provider=null,keyStore=file:///var/kafka-ssl/server.keystore.jks,trustStore=file:///var/kafka-ssl/client.truststore.jks] (org.eclipse.jetty.util.ssl.SslContextFactory)
INFO [main] x509=X509#3831f4c2(caroot,h=[eu-central-1.compute.internal],w=[]) for Server#7c8b37a8[provider=null,keyStore=file:///var/kafka-ssl/server.keystore.jks,trustStore=file:///var/kafka-ssl/client.truststore.jks] (org.eclipse.jetty.util.ssl.SslContextFactory)
INFO [main] jetty-9.4.38.v20210224; built: 2021-02-24T20:25:07.675Z; git: 288f3cc74549e8a913bf363250b0744f2695b8e6; jvm 11.0.13+8-LTS (org.eclipse.jetty.server.Server)
INFO [main] DefaultSessionIdManager workerName=node0 (org.eclipse.jetty.server.session)
INFO [main] No SessionScavenger set, using defaults (org.eclipse.jetty.server.session)
INFO [main] node0 Scavenging every 660000ms (org.eclipse.jetty.server.session)
INFO [main] Started o.e.j.s.ServletContextHandler#1ef5cde4{/,[jar:file:/usr/share/java/acl/acl-6.1.1.jar!/io/confluent/controlcenter/rest/static],AVAILABLE} (org.eclipse.jetty.server.handler.ContextHandler)
INFO [main] Started o.e.j.s.ServletContextHandler#5401c6a8{/ws,null,AVAILABLE} (org.eclipse.jetty.server.handler.ContextHandler)
INFO [main] Started NetworkTrafficServerConnector#5d6b5d3d{HTTP/1.1, (http/1.1)}{0.0.0.0:9021} (org.eclipse.jetty.server.AbstractConnector)
INFO [main] Started #36578ms (org.eclipse.jetty.server.Server)
INFO [_eim_c3_non_prod-4-b6c9d6bd-717d-4559-bcfe-a4c9be647b7f-StreamThread-1] name=monitoring-input-topic-progress-.count type=monitoring cluster= value=0.0 (io.confluent.controlcenter.util.StreamProgressReporter)
INFO [_eim_c3_non_prod-4-b6c9d6bd-717d-4559-bcfe-a4c9be647b7f-StreamThread-1] name=monitoring-input-topic-progress-.rate type=monitoring cluster= value=0.0 (io.confluent.controlcenter.util.StreamProgressReporter)
INFO [_eim_c3_non_prod-4-b6c9d6bd-717d-4559-bcfe-a4c9be647b7f-StreamThread-1] name=monitoring-input-topic-progress-.timestamp type=monitoring cluster= value=NaN (io.confluent.controlcenter.util.StreamProgressReporter)
INFO [_eim_c3_non_prod-4-b6c9d6bd-717d-4559-bcfe-a4c9be647b7f-StreamThread-1] name=monitoring-input-topic-progress-.min type=monitoring cluster= value=1.7976931348623157E308 (io.confluent.controlcenter.util.StreamProgressReporter)
INFO [_eim_c3_non_prod-4-b6c9d6bd-717d-4559-bcfe-a4c9be647b7f-StreamThread-1] name=metrics-input-topic-progress-lCRehAk0RqmLR04nhXKHtA.count type=metrics cluster=lCRehAk0RqmLR04nhXKHtA value=0.0 (io.confluent.controlcenter.util.StreamProgressReporter)
INFO [_eim_c3_non_prod-4-b6c9d6bd-717d-4559-bcfe-a4c9be647b7f-StreamThread-1] name=metrics-input-topic-progress-lCRehAk0RqmLR04nhXKHtA.rate type=metrics cluster=lCRehAk0RqmLR04nhXKHtA value=0.0 (io.confluent.controlcenter.util.StreamProgressReporter)
INFO [_eim_c3_non_prod-4-b6c9d6bd-717d-4559-bcfe-a4c9be647b7f-StreamThread-1] name=metrics-input-topic-progress-lCRehAk0RqmLR04nhXKHtA.timestamp type=metrics cluster=lCRehAk0RqmLR04nhXKHtA value=NaN (io.confluent.controlcenter.util.StreamProgressReporter)
INFO [_eim_c3_non_prod-4-b6c9d6bd-717d-4559-bcfe-a4c9be647b7f-StreamThread-1] name=metrics-input-topic-progress-lCRehAk0RqmLR04nhXKHtA.min type=metrics cluster=lCRehAk0RqmLR04nhXKHtA value=1.7976931348623157E308 (io.confluent.controlcenter.util.StreamProgressReporter)
WARN [control-center-heartbeat-0] misconfigured topic=_confluent-command config=segment.bytes value=1073741824 expected=134217728 (io.confluent.controlcenter.healthcheck.HealthCheck)
WARN [control-center-heartbeat-0] misconfigured topic=_confluent-command config=delete.retention.ms value=86400000 expected=259200000 (io.confluent.controlcenter.healthcheck.HealthCheck)
INFO [control-center-heartbeat-0] misconfigured topic=_confluent-metrics config=min.insync.replicas value=1 expected=2 (io.confluent.controlcenter.healthcheck.HealthCheck)
WARN [control-center-heartbeat-1] Unable to fetch consumer offsets for cluster id lCRehAk0RqmLR04nhXKHtA (io.confluent.controlcenter.data.ConsumerOffsetsFetcher)
java.util.concurrent.TimeoutException
at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:108)
at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:272)
at io.confluent.controlcenter.data.ConsumerOffsetsDao.getAllConsumerGroupDescriptions(ConsumerOffsetsDao.java:220)
at io.confluent.controlcenter.data.ConsumerOffsetsDao.getAllConsumerGroupOffsets(ConsumerOffsetsDao.java:58)
at io.confluent.controlcenter.data.ConsumerOffsetsFetcher.run(ConsumerOffsetsFetcher.java:73)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
WARN [kafka-admin-client-thread | adminclient-3] failed fetching description for consumerGroup=_confluent-ksql-eim_ksql_non_prodquery_CSAS_SDL_STMTS_GG_347 (io.confluent.controlcenter.data.ConsumerOffsetsDao)
org.apache.kafka.common.errors.TimeoutException: Call(callName=describeConsumerGroups, deadlineMs=1654853629184, tries=1, nextAllowedTryMs=1654853629324) timed out at 1654853629224 after 1 attempt(s)
Caused by: org.apache.kafka.common.errors.DisconnectException: Cancelled describeConsumerGroups request with correlation id 168 due to node 1001 being disconnected
WARN [kafka-admin-client-thread | adminclient-3] failed fetching description for consumerGroup=connect-mongo-dci-grid-partner-test11 (io.confluent.controlcenter.data.ConsumerOffsetsDao)
org.apache.kafka.common.errors.TimeoutException: Call(callName=describeConsumerGroups, deadlineMs=1654853629184, tries=1, nextAllowedTryMs=1654853629324) timed out at 1654853629224 after 1 attempt(s)
Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment. Call: describeConsumerGroups
WARN [kafka-admin-client-thread | adminclient-3] failed fetching description for consumerGroup=_confluent-ksql-eim_ksql_non_prodquery_CSAS_SDL_STMTS_UPWARD_GG_355 (io.confluent.controlcenter.data.ConsumerOffsetsDao)
org.apache.kafka.common.errors.TimeoutException: Call(callName=describeConsumerGroups, deadlineMs=1654853629184, tries=1, nextAllowedTryMs=1654853629324) timed out at 1654853629224 after 1 attempt(s)
Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting to send the call. Call: describeConsumerGroups
WARN [kafka-admin-client-thread | adminclient-3] failed fetching description for consumerGroup=_eim_c3_non_prod-4 (io.confluent.controlcenter.data.ConsumerOffsetsDao)
org.apache.kafka.common.errors.TimeoutException: Call(callName=describeConsumerGroups, deadlineMs=1654853629184, tries=1, nextAllowedTryMs=1654853629324) timed out at 1654853629224 after 1 attempt(s)
Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting to send the call. Call: describeConsumerGroups
and so on...
WARN [control-center-heartbeat-1] Unable to fetch consumer offsets for cluster id lCRehAk0RqmLR04nhXKHtA (io.confluent.controlcenter.data.ConsumerOffsetsFetcher)
java.util.concurrent.TimeoutException
at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:108)
at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:272)
at io.confluent.controlcenter.data.ConsumerOffsetsDao.getAllConsumerGroupDescriptions(ConsumerOffsetsDao.java:220)
at io.confluent.controlcenter.data.ConsumerOffsetsDao.getAllConsumerGroupOffsets(ConsumerOffsetsDao.java:58)
at io.confluent.controlcenter.data.ConsumerOffsetsFetcher.run(ConsumerOffsetsFetcher.java:73)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
and so on...
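Side note: the misconfigured topic warnings above should be fixable by aligning the topic configs with what Control Center expects, along the lines of this sketch (client-ssl.properties again stands in for the SSL client settings), though these warnings may be unrelated to the restarts:
kafka-configs --bootstrap-server 10.251.xx.xx:9093 --command-config client-ssl.properties --entity-type topics --entity-name _confluent-command --alter --add-config segment.bytes=134217728,delete.retention.ms=259200000
kafka-configs --bootstrap-server 10.251.xx.xx:9093 --command-config client-ssl.properties --entity-type topics --entity-name _confluent-metrics --alter --add-config min.insync.replicas=2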
In the control-center-kafka.log I see:
INFO [control-center-heartbeat-1] Kafka version: 6.1.1-ce (org.apache.kafka.common.utils.AppInfoParser)
INFO [control-center-heartbeat-1] Kafka commitId: 73deb3aeb1f8647c (org.apache.kafka.common.utils.AppInfoParser)
INFO [control-center-heartbeat-1] Kafka startTimeMs: 1654853610852 (org.apache.kafka.common.utils.AppInfoParser)
INFO [kafka-coordinator-heartbeat-thread | _eim_c3_non_prod-4] [Consumer clientId=_eim_c3_non_prod-4-b6c9d6bd-717d-4559-bcfe-a4c9be647b7f-StreamThread-5-consumer, groupId=_eim_c3_non_prod-4] Resetting offset for partition _eim_c3_non_prod-4-monitoring-message-rekey-store-7 to position FetchPosition{offset=0, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[10.251.6.2:9093 (id: 1002 rack: null)], epoch=0}}. (org.apache.kafka.clients.consumer.internals.SubscriptionState)
INFO [kafka-coordinator-heartbeat-thread | _eim_c3_non_prod-4] [Consumer clientId=_eim_c3_non_prod-4-b6c9d6bd-717d-4559-bcfe-a4c9be647b7f-StreamThread-5-consumer, groupId=_eim_c3_non_prod-4] Resetting offset for partition _eim_c3_non_prod-4-monitoring-trigger-event-rekey-7 to position FetchPosition{offset=0, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[10.251.6.2:9093 (id: 1002 rack: null)], epoch=0}}. (org.apache.kafka.clients.consumer.internals.SubscriptionState)
INFO [kafka-coordinator-heartbeat-thread | _eim_c3_non_prod-4] [Consumer clientId=_eim_c3_non_prod-4-b6c9d6bd-717d-4559-bcfe-a4c9be647b7f-StreamThread-5-consumer, groupId=_eim_c3_non_prod-4] Resetting offset for partition _eim_c3_non_prod-4-MonitoringStream-ONE_MINUTE-repartition-7 to position FetchPosition{offset=0, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[10.251.6.2:9093 (id: 1002 rack: null)], epoch=0}}. (org.apache.kafka.clients.consumer.internals.SubscriptionState)
INFO [kafka-coordinator-heartbeat-thread | _eim_c3_non_prod-4] [Consumer clientId=_eim_c3_non_prod-4-b6c9d6bd-717d-4559-bcfe-a4c9be647b7f-StreamThread-5-consumer, groupId=_eim_c3_non_prod-4] Resetting offset for partition _eim_c3_non_prod-4-aggregatedTopicPartitionTableWindows-ONE_MINUTE-repartition-7 to position FetchPosition{offset=0, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[10.251.6.1:9093 (id: 1001 rack: null)], epoch=0}}. (org.apache.kafka.clients.consumer.internals.SubscriptionState)
and so on ...
INFO [kafka-coordinator-heartbeat-thread | _eim_c3_non_prod-4] [Consumer clientId=_eim_c3_non_prod-4-b6c9d6bd-717d-4559-bcfe-a4c9be647b7f-StreamThread-10-consumer, groupId=_eim_c3_non_prod-4] Error sending fetch request (sessionId=INVALID, epoch=INITIAL) to node 1003: (org.apache.kafka.clients.FetchSessionHandler)
org.apache.kafka.common.errors.DisconnectException
INFO [_eim_c3_non_prod-4-b6c9d6bd-717d-4559-bcfe-a4c9be647b7f-StreamThread-3] [Consumer clientId=_eim_c3_non_prod-4-b6c9d6bd-717d-4559-bcfe-a4c9be647b7f-StreamThread-3-consumer, groupId=_eim_c3_non_prod-4] Error sending fetch request (sessionId=INVALID, epoch=INITIAL) to node 1002: (org.apache.kafka.clients.FetchSessionHandler)
org.apache.kafka.common.errors.DisconnectException
INFO [kafka-coordinator-heartbeat-thread | _eim_c3_non_prod-4] [Consumer clientId=_eim_c3_non_prod-4-b6c9d6bd-717d-4559-bcfe-a4c9be647b7f-StreamThread-3-consumer, groupId=_eim_c3_non_prod-4] Error sending fetch request (sessionId=INVALID, epoch=INITIAL) to node 1001: (org.apache.kafka.clients.FetchSessionHandler)
org.apache.kafka.common.errors.DisconnectException
INFO [_eim_c3_non_prod-4-b6c9d6bd-717d-4559-bcfe-a4c9be647b7f-StreamThread-10] [Consumer clientId=_eim_c3_non_prod-4-b6c9d6bd-717d-4559-bcfe-a4c9be647b7f-StreamThread-10-consumer, groupId=_eim_c3_non_prod-4] Error sending fetch request (sessionId=INVALID, epoch=INITIAL) to node 1002: (org.apache.kafka.clients.FetchSessionHandler)
org.apache.kafka.common.errors.DisconnectException
INFO [kafka-coordinator-heartbeat-thread | _eim_c3_non_prod-4] [Consumer clientId=_eim_c3_non_prod-4-b6c9d6bd-717d-4559-bcfe-a4c9be647b7f-StreamThread-5-consumer, groupId=_eim_c3_non_prod-4] Error sending fetch request (sessionId=1478925475, epoch=1) to node 1003: (org.apache.kafka.clients.FetchSessionHandler)
org.apache.kafka.common.errors.DisconnectException
INFO [kafka-coordinator-heartbeat-thread | _eim_c3_non_prod-4] [Consumer clientId=_eim_c3_non_prod-4-b6c9d6bd-717d-4559-bcfe-a4c9be647b7f-StreamThread-6-consumer, groupId=_eim_c3_non_prod-4] Error sending fetch request (sessionId=1947312909, epoch=1) to node 1002: (org.apache.kafka.clients.FetchSessionHandler)
org.apache.kafka.common.errors.DisconnectException
and so on ...
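To check whether the describeConsumerGroups timeouts happen outside Control Center too, the same listConsumerGroups/describeConsumerGroups sequence can be reproduced with a standalone AdminClient; a minimal sketch, assuming the same bootstrap address and SSL settings as the CCC (all values are placeholders):
import java.util.Properties;
import java.util.stream.Collectors;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ConsumerGroupListing;

public class DescribeGroupsCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Same bootstrap/SSL settings that Control Center uses (placeholder values)
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "10.251.xx.xx:9093");
        props.put(AdminClientConfig.SECURITY_PROTOCOL_CONFIG, "SSL");
        props.put("ssl.truststore.location", "/var/kafka-ssl/client.truststore.jks");
        props.put("ssl.truststore.password", "changeit");
        props.put(AdminClientConfig.REQUEST_TIMEOUT_MS_CONFIG, "30000");

        try (AdminClient admin = AdminClient.create(props)) {
            // List all consumer groups, then describe them: the same two steps
            // that ConsumerOffsetsDao performs before it times out.
            var groupIds = admin.listConsumerGroups().all().get().stream()
                    .map(ConsumerGroupListing::groupId)
                    .collect(Collectors.toList());
            admin.describeConsumerGroups(groupIds).all().get()
                    .forEach((id, desc) -> System.out.println(id + " -> " + desc.state()));
        }
    }
}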
Any ideas what could be wrong here?

Related

KAFKA connect EOFException when running connect-standalone with confluent broker and source connector

I am trying to run a connect-standalone application with a sample source connector (FileSource or JDBC connector).
I am getting constantly repeating error messages like
[2022-11-14 18:33:09,641] DEBUG [local-file-source|task-0] [Producer clientId=connector-producer-local-file-source-0] Connection with xxxx.westeurope.azure.confluent.cloud/ (channelId=-1) disconnected (org.apache.kafka.common.network.Selector:606)
java.io.EOFException
at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:97)
at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:452)
at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:402)
at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:674)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:576)
at org.apache.kafka.common.network.Selector.poll(Selector.java:481)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:560)
at org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:328)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:243)
at java.base/java.lang.Thread.run(Thread.java:1589)
[2022-11-14 18:33:09,643] INFO [local-file-source|task-0] [Producer clientId=connector-producer-local-file-source-0] Node -1 disconnected. (org.apache.kafka.clients.NetworkClient:935)
[2022-11-14 18:33:09,645] DEBUG [local-file-source|task-0] [Producer clientId=connector-producer-local-file-source-0] Cancelled in-flight API_VERSIONS request with correlation id 0 due to node -1 being disconnected (elapsed time since creation: 34ms, elapsed time since send: 34ms, request timeout: 30000ms): ApiVersionsRequestData(clientSoftwareName='apache-kafka-java', clientSoftwareVersion='3.2.1') (org.apache.kafka.clients.NetworkClient:335)
[2022-11-14 18:33:09,647] WARN [local-file-source|task-0] [Producer clientId=connector-producer-local-file-source-0] Bootstrap broker xxxx.westeurope.azure.confluent.cloud:9092 (id: -1 rack: null) disconnected (org.apache.kafka.clients.NetworkClient:1063)
[2022-11-14 18:33:09,757] DEBUG [local-file-source|task-0] [Producer clientId=connector-producer-local-file-source-0] Initialize connection to node xxxx.westeurope.azure.confluent.cloud:9092 (id: -1 rack: null) for sending metadata request (org.apache.kafka.clients.NetworkClient:1160)
[2022-11-14 18:33:09,758] DEBUG [local-file-source|task-0] Resolved host xxxx.westeurope.azure.confluent.cloud as (org.apache.kafka.clients.ClientUtils:113)
[2022-11-14 18:33:09,758] DEBUG [local-file-source|task-0] [Producer clientId=connector-producer-local-file-source-0] Initiating connection to node xxxx.westeurope.azure.confluent.cloud:9092 (id: -1 rack: null) using address xxxx.westeurope.azure.confluent.cloud/ (org.apache.kafka.clients.NetworkClient:989)
[2022-11-14 18:33:09,787] DEBUG [local-file-source|task-0] [Producer clientId=connector-producer-local-file-source-0] Created socket with SO_RCVBUF = 32768, SO_SNDBUF = 131072, SO_TIMEOUT = 0 to node -1 (org.apache.kafka.common.network.Selector:531)
[2022-11-14 18:33:09,788] DEBUG [local-file-source|task-0] [Producer clientId=connector-producer-local-file-source-0] Completed connection to node -1. Fetching API versions. (org.apache.kafka.clients.NetworkClient:951)
[2022-11-14 18:33:09,789] DEBUG [local-file-source|task-0] [Producer clientId=connector-producer-local-file-source-0] Initiating API versions fetch from node -1. (org.apache.kafka.clients.NetworkClient:965)
[2022-11-14 18:33:09,789] DEBUG [local-file-source|task-0] [Producer clientId=connector-producer-local-file-source-0] Sending API_VERSIONS request with header RequestHeader(apiKey=API_VERSIONS, apiVersion=3, clientId=connector-producer-local-file-source-0, correlationId=1) and timeout 30000 to node -1: ApiVersionsRequestData(clientSoftwareName='apache-kafka-java', clientSoftwareVersion='3.2.1') (org.apache.kafka.clients.NetworkClient:521)
[2022-11-14 18:33:09,817] DEBUG [local-file-source|task-0] [Producer clientId=connector-producer-local-file-source-0] Connection with xxxx.westeurope.azure.confluent.cloud/ (channelId=-1) disconnected
I can create a topic with the kafka-topics.sh command, write messages to the topic through the console producer and read from the topic through the console consumer, as well as with connect-standalone with sink connectors.
If I run the Kafka server and ZooKeeper locally, everything seems to work fine.
Command line:
bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties
connect-standalone.properties
bootstrap.servers=pkc-pj9zy.westeurope.azure.confluent.cloud:9092
offset.storage.file.filename=/tmp/connect.offsets
offset.flush.interval.ms=10000
ssl.endpoint.identification.algorithm=https
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="name" password="passphrase";
ssl.protocol=TLSv1.2
ssl.enabled.protocols=TLSv1.2
plugin.path=./plugins,./libs
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.storage.StringConverter
connect-file-source.properties
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=test.txt
topic=test_ksql_af_file_source-test
auto.create=true
auto.evolve=true
Got it sorted out.
I was missing the properties
producer.security.protocol=SASL_SSL
producer.sasl.mechanism=PLAIN
producer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="name" password="passphrase";
in my settings.
Unfortunately, the log did not give a hint about this.
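For completeness: the worker's top-level security settings are not passed on to the clients that Connect creates for connectors, which is why source connectors need the producer.-prefixed overrides above. A sink connector would need the analogous consumer.-prefixed ones in the worker config, e.g. (a sketch mirroring the producer settings):
consumer.security.protocol=SASL_SSL
consumer.sasl.mechanism=PLAIN
consumer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="name" password="passphrase";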

Running Kafka Connect in distributed mode, no obvious errors, but data does not end up in sink connector

Running Kafka Connect via this repo https://github.com/entechlog/kafka-examples/tree/master/kafka-connect-standalone except I have added extra configs for AWS MSK IAM authentication. I've also updated the .env file to use different variables, like the AWS MSK IAM jar file, AWS key/secret key credentials, and a few other minor things. Note that this repo runs in standalone mode, but I have updated the launch shell script to run in distributed mode: exec connect-distributed /etc/"${COMPONENT}"/"${COMPONENT}".properties. But I have NOT created a file called kafka-connect.properties.template, because when I do, I get a whole host of errors like Missing required configuration "group.id" which has no default value. This makes no sense to me, as I can see it in the docker-compose.yml file.
My goal is to get data from a third party Kafka cluster into BigQuery.
When I run my docker-compose file, I see no errors and a few warnings, but nothing that stands out to me. I get a lot of warnings like this: [2021-12-14 01:57:37,917] WARN The configuration 'camelcase.default.dataset' was supplied but isn't a known config. (org.apache.kafka.clients.consumer.ConsumerConfig:380), and this dataset is required to send data to BigQuery. It doesn't make sense why things like the dataset are not being used.
Here are the latest logs:
[2021-12-14 01:57:37,921] INFO Kafka version: 6.2.1-ce (org.apache.kafka.common.utils.AppInfoParser:119)
[2021-12-14 01:57:37,922] INFO Kafka commitId: 14770bfc4e973178 (org.apache.kafka.common.utils.AppInfoParser:120)
[2021-12-14 01:57:37,922] INFO Kafka startTimeMs: 1639447057921 (org.apache.kafka.common.utils.AppInfoParser:121)
[2021-12-14 01:57:39,332] INFO [Producer clientId=producer-3] Cluster ID: k2eIXxm_RkmWu2-R2d0N1Q (org.apache.kafka.clients.Metadata:279)
[2021-12-14 01:57:39,413] INFO [Consumer clientId=consumer-connect-kafka-connect-group-3, groupId=connect-kafka-connect-group] Cluster ID: k2eIXxm_RkmWu2-R2d0N1Q (org.apache.kafka.clients.Metadata:279)
[2021-12-14 01:57:39,428] INFO [Consumer clientId=consumer-connect-kafka-connect-group-3, groupId=connect-kafka-connect-group] Subscribed to partition(s): connect-configs-0 (org.apache.kafka.clients.consumer.KafkaConsumer:1123)
[2021-12-14 01:57:39,428] INFO [Consumer clientId=consumer-connect-kafka-connect-group-3, groupId=connect-kafka-connect-group] Seeking to EARLIEST offset of partition connect-configs-0 (org.apache.kafka.clients.consumer.internals.SubscriptionState:619)
[2021-12-14 01:57:40,727] INFO Finished reading KafkaBasedLog for topic connect-configs (org.apache.kafka.connect.util.KafkaBasedLog:228)
[2021-12-14 01:57:40,728] INFO Started KafkaBasedLog for topic connect-configs (org.apache.kafka.connect.util.KafkaBasedLog:230)
[2021-12-14 01:57:40,729] INFO Started KafkaConfigBackingStore (org.apache.kafka.connect.storage.KafkaConfigBackingStore:290)
[2021-12-14 01:57:40,729] INFO [Worker clientId=connect-1, groupId=connect-kafka-connect-group] Herder started (org.apache.kafka.connect.runtime.distributed.DistributedHerder:312)
[2021-12-14 01:57:45,059] INFO [Worker clientId=connect-1, groupId=connect-kafka-connect-group] Cluster ID: k2eIXxm_RkmWu2-R2d0N1Q (org.apache.kafka.clients.Metadata:279)
[2021-12-14 01:57:45,089] INFO [Worker clientId=connect-1, groupId=connect-kafka-connect-group] Discovered group coordinator <bootstrap server and port here> (id: 2147483643 rack: null) (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:848)
[2021-12-14 01:57:45,095] INFO [Worker clientId=connect-1, groupId=connect-kafka-connect-group] Rebalance started (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator:221)
[2021-12-14 01:57:45,095] INFO [Worker clientId=connect-1, groupId=connect-kafka-connect-group] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:538)
[2021-12-14 01:57:45,957] INFO [Worker clientId=connect-1, groupId=connect-kafka-connect-group] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:538)
[2021-12-14 01:57:49,038] INFO [Worker clientId=connect-1, groupId=connect-kafka-connect-group] Successfully joined group with generation Generation{generationId=1, memberId='connect-1-a4cc5355-60da-46c3-8228-bff2de664f2c', protocol='sessioned'} (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:594)
[2021-12-14 01:57:49,160] INFO [Worker clientId=connect-1, groupId=connect-kafka-connect-group] Successfully synced group in generation Generation{generationId=1, memberId='connect-1-a4cc5355-60da-46c3-8228-bff2de664f2c', protocol='sessioned'} (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:758)
[2021-12-14 01:57:49,161] INFO [Worker clientId=connect-1, groupId=connect-kafka-connect-group] Joined group at generation 1 with protocol version 2 and got assignment: Assignment{error=0, leader='connect-1-a4cc5355-60da-46c3-8228-bff2de664f2c', leaderUrl='http://kafka-connect:8083/', offset=10, connectorIds=[], taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0 (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1694)
[2021-12-14 01:57:49,162] WARN [Worker clientId=connect-1, groupId=connect-kafka-connect-group] Catching up to assignment's config offset. (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1119)
[2021-12-14 01:57:49,162] INFO [Worker clientId=connect-1, groupId=connect-kafka-connect-group] Current config state offset -1 is behind group assignment 10, reading to end of config log (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1183)
[2021-12-14 01:57:49,359] INFO [Worker clientId=connect-1, groupId=connect-kafka-connect-group] Finished reading to end of log and updated config snapshot, new config log offset: 10 (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1190)
[2021-12-14 01:57:49,359] INFO [Worker clientId=connect-1, groupId=connect-kafka-connect-group] Starting connectors and tasks using config offset 10 (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1244)
[2021-12-14 01:57:49,359] INFO [Worker clientId=connect-1, groupId=connect-kafka-connect-group] Finished starting connectors and tasks (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1272)
[2021-12-14 01:57:50,791] INFO [Worker clientId=connect-1, groupId=connect-kafka-connect-group] Session key updated (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1582)
And also this:
WARNING: A provider org.apache.kafka.connect.runtime.rest.resources.ConnectorsResource registered in SERVER runtime does not implement any provider interfaces applicable in the SERVER runtime. Due to constraint configuration problems the provider org.apache.kafka.connect.runtime.rest.resources.ConnectorsResource will be ignored.
WARNING: A provider org.apache.kafka.connect.runtime.rest.resources.ConnectorPluginsResource registered in SERVER runtime does not implement any provider interfaces applicable in the SERVER runtime. Due to constraint configuration problems the provider org.apache.kafka.connect.runtime.rest.resources.ConnectorPluginsResource will be ignored.
WARNING: The (sub)resource method createConnector in org.apache.kafka.connect.runtime.rest.resources.ConnectorsResource contains empty path annotation.
WARNING: The (sub)resource method listConnectors in org.apache.kafka.connect.runtime.rest.resources.ConnectorsResource contains empty path annotation.
WARNING: The (sub)resource method listConnectorPlugins in org.apache.kafka.connect.runtime.rest.resources.ConnectorPluginsResource contains empty path annotation.
And when I navigate to http://localhost:8083/connectors in my browser, I get an empty list [].
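Worth noting: the empty [] from http://localhost:8083/connectors means no connector has been created yet. In distributed mode, connectors are not started from a properties file; they have to be submitted through the Connect REST API, roughly like this sketch (the connector name and config keys are placeholders; the exact keys depend on the BigQuery sink connector in use):
curl -X POST -H "Content-Type: application/json" http://localhost:8083/connectors \
  -d '{
    "name": "bigquery-sink",
    "config": {
      "connector.class": "com.wepay.kafka.connect.bigquery.BigQuerySinkConnector",
      "topics": "my-topic",
      "tasks.max": "1"
    }
  }'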

Same offset and partition record is getting consumed twice causing duplicates

I am trying to consume records using an application written with spring-kafka. I am facing a very unusual condition and am not able to understand why it is happening.
My consumer application is running with a concurrency of 2, meaning 2 consumer threads subscribed to a topic that has two partitions. I am consuming records and placing them into a table using an upsert with the offset, partition and insert timestamp.
I am seeing duplicate values with the same offset and partition in the table, which should not occur. There is no difference in the timestamp value, meaning the inserts occurred at the same time. I am not sure how this is possible, and I don't see any issue in the log either. I am not sure what is happening at the producer end, but we can't have 2 values at the same offset anyway, so I am not sure whether this is an issue at the consumer end or the producer end. Any suggestion or thought which would help me triage this issue?
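For reference, the upsert is keyed on partition and offset, roughly like this sketch (class, table and column names here are hypothetical, and the SQL assumes a Postgres-style ON CONFLICT clause; this is an illustration of the idea, not the actual peStremUpsertTbl implementation):
import org.springframework.jdbc.core.JdbcTemplate;

public class PeStreamDao {
    private final JdbcTemplate jdbcTemplate;

    public PeStreamDao(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    public void upsert(String provTinNumber, int partition, long offset) {
        // With a unique constraint on (topic_partition, topic_offset), a redelivered
        // record updates the existing row rather than inserting a duplicate.
        jdbcTemplate.update(
            "INSERT INTO pe_enrollment (topic_partition, topic_offset, prov_tin_number, insert_ts) "
          + "VALUES (?, ?, ?, CURRENT_TIMESTAMP) "
          + "ON CONFLICT (topic_partition, topic_offset) DO UPDATE "
          + "SET prov_tin_number = EXCLUDED.prov_tin_number, insert_ts = CURRENT_TIMESTAMP",
            partition, offset, provTinNumber);
    }
}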
Kafka log:
I don't see anything unusual in the log either.
14:29:56.318 [main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka version: 2.4.0
14:29:56.318 [main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka commitId: 77a89fcf8d7fa018
14:29:56.318 [main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka startTimeMs: 1604154596318
14:29:56.319 [main] INFO org.apache.kafka.clients.consumer.KafkaConsumer - [Consumer clientId=consumer-csp-prov-emerald-test-2, groupId=csp-prov-emerald-test] Subscribed to topic(s): kaas.pe.enrollment.csp.ts2
14:29:57.914 [org.springframework.kafka.KafkaListenerEndpointContainer#0-1-C-1] INFO org.apache.kafka.clients.Metadata - [Consumer clientId=consumer-csp-prov-emerald-test-2, groupId=csp-prov-emerald-test] Cluster ID: 6cbv7QOaSW6j1vXrOCE4jA
14:29:57.914 [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] INFO org.apache.kafka.clients.Metadata - [Consumer clientId=consumer-csp-prov-emerald-test-1, groupId=csp-prov-emerald-test] Cluster ID: 6cbv7QOaSW6j1vXrOCE4jA
14:29:57.915 [org.springframework.kafka.KafkaListenerEndpointContainer#0-1-C-1] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=consumer-csp-prov-emerald-test-2, groupId=csp-prov-emerald-test] Discovered group coordinator apslp1563.uhc.com:9093 (id: 2147483574 rack: null)
14:29:57.915 [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=consumer-csp-prov-emerald-test-1, groupId=csp-prov-emerald-test] Discovered group coordinator apslp1563.uhc.com:9093 (id: 2147483574 rack: null)
14:29:57.923 [org.springframework.kafka.KafkaListenerEndpointContainer#0-1-C-1] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=consumer-csp-prov-emerald-test-2, groupId=csp-prov-emerald-test] (Re-)joining group
14:29:57.924 [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=consumer-csp-prov-emerald-test-1, groupId=csp-prov-emerald-test] (Re-)joining group
14:29:58.121 [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=consumer-csp-prov-emerald-test-1, groupId=csp-prov-emerald-test] (Re-)joining group
14:29:58.125 [org.springframework.kafka.KafkaListenerEndpointContainer#0-1-C-1] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=consumer-csp-prov-emerald-test-2, groupId=csp-prov-emerald-test] (Re-)joining group
14:30:13.127 [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-csp-prov-emerald-test-1, groupId=csp-prov-emerald-test] Finished assignment for group at generation 23: {consumer-csp-prov-emerald-test-1-19d92ba5-5dc3-433d-b967-3cf1ce1b4174=org.apache.kafka.clients.consumer.ConsumerPartitionAssignor$Assignment#d7e2a1f, consumer-csp-prov-emerald-test-2-5833c212-7031-4ab1-944b-7e26f7d7a293=org.apache.kafka.clients.consumer.ConsumerPartitionAssignor$Assignment#53c3aad3}
14:30:13.131 [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=consumer-csp-prov-emerald-test-1, groupId=csp-prov-emerald-test] Successfully joined group with generation 23
14:30:13.131 [org.springframework.kafka.KafkaListenerEndpointContainer#0-1-C-1] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=consumer-csp-prov-emerald-test-2, groupId=csp-prov-emerald-test] Successfully joined group with generation 23
14:30:13.134 [org.springframework.kafka.KafkaListenerEndpointContainer#0-1-C-1] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-csp-prov-emerald-test-2, groupId=csp-prov-emerald-test] Adding newly assigned partitions: kaas.pe.enrollment.csp.ts2-1
14:30:13.134 [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-csp-prov-emerald-test-1, groupId=csp-prov-emerald-test] Adding newly assigned partitions: kaas.pe.enrollment.csp.ts2-0
14:30:13.143 [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-csp-prov-emerald-test-1, groupId=csp-prov-emerald-test] Setting offset for partition kaas.pe.enrollment.csp.ts2-0 to the committed offset FetchPosition{offset=500387, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=apslp1559.uhc.com:9093 (id: 69 rack: null), epoch=37}}
14:30:13.143 [org.springframework.kafka.KafkaListenerEndpointContainer#0-1-C-1] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-csp-prov-emerald-test-2, groupId=csp-prov-emerald-test] Setting offset for partition kaas.pe.enrollment.csp.ts2-1 to the committed offset FetchPosition{offset=499503, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=apslp1562.uhc.com:9093 (id: 72 rack: null), epoch=36}}
Code:
@KafkaListener(topics = "${app.topic}", groupId = "${app.group_id_config}")
public void run(ConsumerRecord<String, GenericRecord> record, Acknowledgment acknowledgement) throws Exception {
    try {
        // prov_tin_number etc. are instance fields; dataexecutor performs the upsert
        prov_tin_number = record.value().get("providerTinNumber").toString();
        prov_tin_type = record.value().get("providerTINType").toString();
        enroll_type = record.value().get("enrollmentType").toString();
        vcp_prov_choice_ind = record.value().get("vcpProvChoiceInd").toString();
        error_flag = "";
        dataexecutor.peStremUpsertTbl(prov_tin_number, prov_tin_type, enroll_type, vcp_prov_choice_ind, error_flag,
                record.partition(), record.offset());
        acknowledgement.acknowledge();
    } catch (Exception ex) {
        System.out.println(record);
        System.out.println(ex.getMessage());
    }
}

Logstash & Kafka - java.io.EOFException: null. Node -1 disconnected

I am trying to set up the ELK stack with one of my Logstash filters on my local machine.
I have an input file that goes into a Kafka queue and is parsed by my filter, which outputs to Elasticsearch. When I run my test.sh, which runs the Logstash filter on the input file, these are the resulting errors in logstash --debug.
I am not sure what this error could be; all of my settings are localhost and default ports. Any guidance? This error does not tell me much.
[2019-01-23T15:09:27,263][DEBUG][org.apache.kafka.clients.NetworkClient] [Consumer clientId=logstash-0, groupId=test-retry] Initialize connection to node localhost:2181 (id: -1 rack: null) for sending metadata request
[2019-01-23T15:09:27,264][DEBUG][org.apache.kafka.clients.NetworkClient] [Consumer clientId=logstash-0, groupId=test-retry] Initiating connection to node localhost:2181 (id: -1 rack: null)
[2019-01-23T15:09:27,265][DEBUG][org.apache.kafka.common.network.Selector] [Consumer clientId=logstash-0, groupId=test-retry] Created socket with SO_RCVBUF = 342972, SO_SNDBUF = 146988, SO_TIMEOUT = 0 to node -1
[2019-01-23T15:09:27,265][DEBUG][org.apache.kafka.clients.NetworkClient] [Consumer clientId=logstash-0, groupId=test-retry] Completed connection to node -1. Fetching API versions.
[2019-01-23T15:09:27,265][DEBUG][org.apache.kafka.clients.NetworkClient] [Consumer clientId=logstash-0, groupId=test-retry] Initiating API versions fetch from node -1.
[2019-01-23T15:09:27,266][DEBUG][org.apache.kafka.common.network.Selector] [Consumer clientId=logstash-0, groupId=test-retry] Connection with localhost/127.0.0.1 disconnected
java.io.EOFException: null
at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:96) ~[kafka-clients-2.0.1.jar:?]
at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:335) ~[kafka-clients-2.0.1.jar:?]
at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:296) ~[kafka-clients-2.0.1.jar:?]
at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:562) ~[kafka-clients-2.0.1.jar:?]
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:498) [kafka-clients-2.0.1.jar:?]
at org.apache.kafka.common.network.Selector.poll(Selector.java:427) [kafka-clients-2.0.1.jar:?]
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:510) [kafka-clients-2.0.1.jar:?]
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:271) [kafka-clients-2.0.1.jar:?]
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:242) [kafka-clients-2.0.1.jar:?]
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:233) [kafka-clients-2.0.1.jar:?]
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:161) [kafka-clients-2.0.1.jar:?]
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:243) [kafka-clients-2.0.1.jar:?]
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:314) [kafka-clients-2.0.1.jar:?]
at org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1218) [kafka-clients-2.0.1.jar:?]
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1181) [kafka-clients-2.0.1.jar:?]
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1115) [kafka-clients-2.0.1.jar:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_172]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_172]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_172]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_172]
at org.jruby.javasupport.JavaMethod.invokeDirectWithExceptionHandling(JavaMethod.java:423) [jruby-complete-9.1.13.0.jar:?]
at org.jruby.javasupport.JavaMethod.invokeDirect(JavaMethod.java:290) [jruby-complete-9.1.13.0.jar:?]
at org.jruby.java.invokers.InstanceMethodInvoker.call(InstanceMethodInvoker.java:28) [jruby-complete-9.1.13.0.jar:?]
at org.jruby.java.invokers.InstanceMethodInvoker.call(InstanceMethodInvoker.java:90) [jruby-complete-9.1.13.0.jar:?]
at org.jruby.ir.targets.InvokeSite.invoke(InvokeSite.java:145) [jruby-complete-9.1.13.0.jar:?]
at usr.local.Cellar.logstash.$6_dot_5_dot_4.libexec.vendor.bundle.jruby.$2_dot_3_dot_0.gems.logstash_minus_input_minus_kafka_minus_8_dot_2_dot_1.lib.logstash.inputs.kafka.RUBY$block$thread_runner$1(/usr/local/Cellar/logstash/6.5.4/libexec/vendor/bundle/jruby/2.3.0/gems/logstash-input-kafka-8.2.1/lib/logstash/inputs/kafka.rb:253) [jruby-complete-9.1.13.0.jar:?]
at org.jruby.runtime.CompiledIRBlockBody.callDirect(CompiledIRBlockBody.java:145) [jruby-complete-9.1.13.0.jar:?]
at org.jruby.runtime.IRBlockBody.call(IRBlockBody.java:71) [jruby-complete-9.1.13.0.jar:?]
at org.jruby.runtime.Block.call(Block.java:124) [jruby-complete-9.1.13.0.jar:?]
at org.jruby.RubyProc.call(RubyProc.java:289) [jruby-complete-9.1.13.0.jar:?]
at org.jruby.RubyProc.call(RubyProc.java:246) [jruby-complete-9.1.13.0.jar:?]
at org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:104) [jruby-complete-9.1.13.0.jar:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_172]
[2019-01-23T15:09:27,267][DEBUG][org.apache.kafka.clients.NetworkClient] [Consumer clientId=logstash-0, groupId=test-retry] Node -1 disconnected.
[2019-01-23T15:09:27,267][DEBUG][org.apache.kafka.clients.NetworkClient] [Consumer clientId=logstash-0, groupId=test-retry] Give up sending metadata request since no node is available
[2019-01-23T15:09:27,318][DEBUG][org.apache.kafka.clients.NetworkClient] [Consumer clientId=logstash-0, groupId=test-retry] Give up sending metadata request since no node is available
[2019-01-23T15:09:27,373][DEBUG][org.apache.kafka.clients.NetworkClient] [Consumer clientId=logstash-0, groupId=test-retry] Give up sending metadata request since no node is available
(the "Give up sending metadata request since no node is available" message repeats many more times)
[2019-01-23T15:09:28,340][DEBUG][org.apache.kafka.clients.NetworkClient] [Consumer clientId=logstash-0, groupId=test-retry] Initialize connection to node localhost:2181 (id: -1 rack: null) for sending metadata request
You set Logstash to connect to ZooKeeper, not Kafka:
Initialize connection to node localhost:2181
Make sure the bootstrap server is localhost:9092 in your case.
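In the Logstash pipeline config that would look roughly like this (a sketch; the topic name is a placeholder):
input {
  kafka {
    bootstrap_servers => "localhost:9092"   # the Kafka broker, not ZooKeeper's 2181
    topics => ["my-topic"]
    group_id => "test-retry"
  }
}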

Issue while setting up nifi cluster

I followed the tutorial below to set up a NiFi cluster. I performed everything as mentioned, but I am receiving the error below.
https://pierrevillard.com/2016/08/13/apache-nifi-1-0-0-cluster-setup/
ZooKeeper was unable to elect a leader and I am getting the following error message.
2018-04-20 09:27:40,942 INFO [main] org.wali.MinimalLockingWriteAheadLog Successfully recovered 0 records in 3 milliseconds
2018-04-20 09:27:41,007 INFO [main] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog#27caa186 checkpointed with 0 Records and 0 Swap Files in 65 milliseconds (Stop-the-world time = 1 milliseconds, Clear Edit Logs time = 7 millis), max Transaction ID -1
2018-04-20 09:27:41,300 INFO [main] o.a.z.server.DatadirCleanupManager autopurge.snapRetainCount set to 30
2018-04-20 09:27:41,300 INFO [main] o.a.z.server.DatadirCleanupManager autopurge.purgeInterval set to 24
2018-04-20 09:27:41,311 INFO [main] o.a.n.c.s.server.ZooKeeperStateServer Starting Embedded ZooKeeper Peer
2018-04-20 09:27:41,319 INFO [PurgeTask] o.a.z.server.DatadirCleanupManager Purge task started.
2018-04-20 09:27:41,343 INFO [PurgeTask] o.a.z.server.DatadirCleanupManager Purge task completed.
2018-04-20 09:27:41,621 INFO [main] o.apache.nifi.controller.FlowController Checking if there is already a Cluster Coordinator Elected...
2018-04-20 09:27:41,907 INFO [main] o.a.c.f.imps.CuratorFrameworkImpl Starting
2018-04-20 09:27:49,882 WARN [main] o.a.n.c.l.e.CuratorLeaderElectionManager Unable to determine the Elected Leader for role 'Cluster Coordinator' due to org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /nifi/leaders/Cluster Coordinator; assuming no leader has been elected
2018-04-20 09:27:49,885 INFO [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl backgroundOperationsLoop exiting
2018-04-20 09:27:49,986 INFO [main] o.apache.nifi.controller.FlowController It appears that no Cluster Coordinator has been Elected yet. Registering for Cluster Coordinator Role.
2018-04-20 09:27:49,988 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=true] Registered new Leader Selector for role Cluster Coordinator; this node is an active participant in the election.
2018-04-20 09:27:49,988 INFO [main] o.a.c.f.imps.CuratorFrameworkImpl Starting
2018-04-20 09:27:49,997 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=false] Registered new Leader Selector for role Cluster Coordinator; this node is an active participant in the election.
2018-04-20 09:27:49,997 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=false] started
2018-04-20 09:27:49,997 INFO [main] o.a.n.c.c.h.AbstractHeartbeatMonitor Heartbeat Monitor started
2018-04-20 09:27:58,179 INFO [main] o.e.jetty.server.handler.ContextHandler Started o.e.j.w.WebAppContext#76d9fe26{/nifi-api,file:///root/nifi-1.6.0/work/jetty/nifi-web-api-1.6.0.war/webapp/,AVAILABLE}{./work/nar/framework/nifi-framework-nar-1.6.0.nar-unpacked/META-INF/bundled-dependencies/nifi-web-api-1.6.0.war}
2018-04-20 09:27:59,657 INFO [main] o.e.j.a.AnnotationConfiguration Scanning elapsed time=861ms
2018-04-20 09:27:59,834 INFO [main] o.e.j.C./nifi-content-viewer No Spring WebApplicationInitializer types detected on classpath
2018-04-20 09:27:59,840 INFO [main] o.e.jetty.server.handler.ContextHandler Started o.e.j.w.WebAppContext#43b41cb9{/nifi-content-viewer,file:///root/nifi-1.6.0/work/jetty/nifi-web-content-viewer-1.6.0.war/webapp/,AVAILABLE}{./work/nar/framework/nifi-framework-nar-1.6.0.nar-unpacked/META-INF/bundled-dependencies/nifi-web-content-viewer-1.6.0.war}
2018-04-20 09:27:59,863 INFO [main] o.e.jetty.server.handler.ContextHandler Started o.e.j.s.h.ContextHandler#49825659{/nifi-docs,null,AVAILABLE}
2018-04-20 09:28:00,079 INFO [main] o.e.j.a.AnnotationConfiguration Scanning elapsed time=69ms
2018-04-20 09:28:00,083 INFO [main] o.e.jetty.ContextHandler./nifi-docs No Spring WebApplicationInitializer types detected on classpath
2018-04-20 09:28:00,171 INFO [main] o.e.jetty.server.handler.ContextHandler Started o.e.j.w.WebAppContext#2ab0ca04{/nifi-docs,file:///root/nifi-1.6.0/work/jetty/nifi-web-docs-1.6.0.war/webapp/,AVAILABLE}{./work/nar/framework/nifi-framework-nar-1.6.0.nar-unpacked/META-INF/bundled-dependencies/nifi-web-docs-1.6.0.war}
2018-04-20 09:28:00,343 INFO [main] o.e.j.a.AnnotationConfiguration Scanning elapsed time=119ms
2018-04-20 09:28:00,344 INFO [main] org.eclipse.jetty.ContextHandler./ No Spring WebApplicationInitializer types detected on classpath
2018-04-20 09:28:00,454 INFO [main] o.e.jetty.server.handler.ContextHandler Started o.e.j.w.WebAppContext#67e255cf{/,file:///root/nifi-1.6.0/work/jetty/nifi-web-error-1.6.0.war/webapp/,AVAILABLE}{./work/nar/framework/nifi-framework-nar-1.6.0.nar-unpacked/META-INF/bundled-dependencies/nifi-web-error-1.6.0.war}
2018-04-20 09:28:00,541 INFO [main] o.eclipse.jetty.server.AbstractConnector Started ServerConnector#6d9f624{HTTP/1.1,[http/1.1]}{node-1:8080}
2018-04-20 09:28:00,559 INFO [main] org.eclipse.jetty.server.Server Started #98628ms
2018-04-20 09:28:00,599 INFO [main] org.apache.nifi.web.server.JettyServer Loading Flow...
2018-04-20 09:28:00,688 INFO [main] org.apache.nifi.io.socket.SocketListener Now listening for connections from nodes on port 9999
2018-04-20 09:28:00,916 INFO [main] o.apache.nifi.controller.FlowController Successfully synchronized controller with proposed flow
2018-04-20 09:28:01,337 INFO [main] o.a.nifi.controller.StandardFlowService Connecting Node: node-1:8080
2018-04-20 09:28:07,922 INFO [Curator-Framework-0] o.a.c.f.state.ConnectionStateManager State change: SUSPENDED
2018-04-20 09:28:07,927 INFO [Curator-ConnectionStateManager-0] o.a.n.c.l.e.CuratorLeaderElectionManager org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener#188207a7 Connection State changed to SUSPENDED
2018-04-20 09:28:07,930 ERROR [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl Background operation retry gave up
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:728)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:857)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64)
at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2018-04-19 15:56:55,209 ERROR [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl Background operation retry gave up
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:728)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:857)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64)
at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
I set the following properties on all 3 nodes:
mkdir ./state
mkdir ./state/zookeeper
echo 1 > ./state/zookeeper/myid (2 & 3 for other nodes)
in zookeeper.properties
server.1=node-1:2888:3888
server.2=node-2:2888:3888
server.3=node-3:2888:3888
in nifi.properties
nifi.state.management.embedded.zookeeper.start=true
nifi.zookeeper.connect.string=node-1:2181,node-2:2181,node-3:2181
nifi.cluster.protocol.is.secure=false
nifi.cluster.is.node=true
nifi.cluster.node.address=node-1(node-2 & node-3)
nifi.cluster.node.protocol.port=9999
nifi.cluster.node.protocol.threads=10
nifi.cluster.node.event.history.size=25
nifi.cluster.node.connection.timeout=5 sec
nifi.cluster.node.read.timeout=5 sec
nifi.cluster.firewall.file=
nifi.remote.input.host=node-1(node-2 & node-3)
nifi.remote.input.secure=false
nifi.remote.input.socket.port=9998
nifi.remote.input.http.enabled=true
nifi.remote.input.http.transaction.ttl=30 sec
nifi.web.http.host=node-1(node-2 & node-3)
After Bryan's comment, I was reminded that I forgot to disable iptables (I guess I could even just open the ports, but since this is for my practice I am disabling iptables), and it's working now.
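For anyone who wants to keep the firewall on instead, opening the cluster's ports on each node should work too; a sketch with iptables (port list taken from the properties above):
# ZooKeeper client/quorum/election ports, NiFi web UI, site-to-site and cluster protocol ports
iptables -I INPUT -p tcp -m multiport --dports 2181,2888,3888,8080,9998,9999 -j ACCEPT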