Confluent Schema Registry hangs on Kubernetes - kubernetes

I am trying to run a Schema Registry server on Kubernetes using the Helm charts from GitHub, but it hangs during startup when I deploy it; Kafka and ZooKeeper are already up. I tried adding DEBUG=true for more information, but nothing extra is printed. It was working fine before and I don't know what changed. After the hang, Kubernetes just restarts the application and the same thing happens again. How can I get more logs or information?
If I run the same stack with docker-compose there is no issue, so I suspect a Kubernetes configuration problem.
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
vaultify-trade-dev-v1-s-kafka-0 1/1 Running 0 5m
vaultify-trade-dev-v1-s-kafka-1 1/1 Running 0 4m
vaultify-trade-dev-v1-s-schema-registry-6b4c57f998-kq5vv 0/1 CrashLoopBackOff 5 5m
internal-controller-54cb494qdxg 1/1 Running 0 5m
internal-controller 1/1 Running 0 5m
vaultify-trade-dev-v1-s-zookeeper-0 1/1 Running 0 5m
$ kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 5d
vaultify-trade-dev-v1-s-kafka ClusterIP 10.109.226.220 <none> 9092/TCP 8m
vaultify-trade-dev-v1-s-kafka-headless ClusterIP None <none> 9092/TCP 8m
vaultify-trade-dev-v1-s-schema-registry ClusterIP 10.98.201.198 <none> 8081/TCP 8m
internal-controller LoadBalancer 10.100.119.227 localhost 80:31323/TCP,443:31073/TCP 8m
internal-backend ClusterIP 10.100.74.127 <none> 80/TCP 8m
vaultify-trade-dev-v1-s-zookeeper ClusterIP 10.109.184.236 <none> 2181/TCP 8m
vaultify-trade-dev-v1-s-zookeeper-headless ClusterIP None <none> 2181/TCP,3888/TCP,2888/TCP 8m
https://github.com/helm/charts/tree/master/incubator/schema-registry
===> Launching ...
===> Launching schema-registry ...
[2019-02-27 09:59:25,341] INFO SchemaRegistryConfig values:
resource.extension.class = []
metric.reporters = []
kafkastore.sasl.kerberos.kinit.cmd = /usr/bin/kinit
response.mediatype.default = application/vnd.schemaregistry.v1+json
resource.extension.classes = []
kafkastore.ssl.trustmanager.algorithm = PKIX
inter.instance.protocol = http
authentication.realm =
ssl.keystore.type = JKS
kafkastore.topic = _schemas
metrics.jmx.prefix = kafka.schema.registry
kafkastore.ssl.enabled.protocols = TLSv1.2,TLSv1.1,TLSv1
kafkastore.topic.replication.factor = 3
ssl.truststore.password = [hidden]
kafkastore.timeout.ms = 500
host.name = 10.1.2.67
kafkastore.bootstrap.servers = [PLAINTEXT://vaultify-trade-dev-v1-s-kafka-headless:9092]
schema.registry.zk.namespace = schema_registry
kafkastore.sasl.kerberos.ticket.renew.window.factor = 0.8
kafkastore.sasl.kerberos.service.name =
schema.registry.resource.extension.class = []
ssl.endpoint.identification.algorithm =
compression.enable = true
kafkastore.ssl.truststore.type = JKS
avro.compatibility.level = backward
kafkastore.ssl.protocol = TLS
kafkastore.ssl.provider =
kafkastore.ssl.truststore.location =
response.mediatype.preferred = [application/vnd.schemaregistry.v1+json, application/vnd.schemaregistry+json, application/json]
kafkastore.ssl.keystore.type = JKS
authentication.skip.paths = []
ssl.truststore.type = JKS
websocket.servlet.initializor.classes = []
kafkastore.ssl.truststore.password = [hidden]
access.control.allow.origin =
ssl.truststore.location =
ssl.keystore.password = [hidden]
port = 8081
access.control.allow.headers =
kafkastore.ssl.keystore.location =
metrics.tag.map = {}
master.eligibility = true
ssl.client.auth = false
kafkastore.ssl.keystore.password = [hidden]
rest.servlet.initializor.classes = []
websocket.path.prefix = /ws
kafkastore.security.protocol = PLAINTEXT
ssl.trustmanager.algorithm =
authentication.method = NONE
request.logger.name = io.confluent.rest-utils.requests
ssl.key.password = [hidden]
kafkastore.zk.session.timeout.ms = 30000
kafkastore.sasl.mechanism = GSSAPI
kafkastore.sasl.kerberos.ticket.renew.jitter = 0.05
kafkastore.ssl.key.password = [hidden]
zookeeper.set.acl = false
schema.registry.inter.instance.protocol =
authentication.roles = [*]
metrics.num.samples = 2
ssl.protocol = TLS
schema.registry.group.id = schema-registry
kafkastore.ssl.keymanager.algorithm = SunX509
kafkastore.connection.url =
debug = false
listeners = []
kafkastore.group.id = vaultify-trade-dev-v1-s
ssl.provider =
ssl.enabled.protocols = []
shutdown.graceful.ms = 1000
ssl.keystore.location =
ssl.cipher.suites = []
kafkastore.ssl.endpoint.identification.algorithm =
kafkastore.ssl.cipher.suites =
access.control.allow.methods =
kafkastore.sasl.kerberos.min.time.before.relogin = 60000
ssl.keymanager.algorithm =
metrics.sample.window.ms = 30000
kafkastore.init.timeout.ms = 60000
(io.confluent.kafka.schemaregistry.rest.SchemaRegistryConfig)
[2019-02-27 09:59:25,379] INFO Logging initialized #381ms to org.eclipse.jetty.util.log.Slf4jLog (org.eclipse.jetty.util.log)
[2019-02-27 09:59:25,614] WARN DEPRECATION warning: `listeners` configuration is not configured. Falling back to the deprecated `port` configuration. (io.confluent.rest.Application)
[2019-02-27 09:59:25,734] WARN DEPRECATION warning: `listeners` configuration is not configured. Falling back to the deprecated `port` configuration. (io.confluent.rest.Application)
[2019-02-27 09:59:25,734] INFO Initializing KafkaStore with broker endpoints: PLAINTEXT://vaultify-trade-dev-v1-s-kafka-headless:9092 (io.confluent.kafka.schemaregistry.storage.KafkaStore)
[2019-02-27 09:59:25,750] INFO AdminClientConfig values:
bootstrap.servers = [PLAINTEXT://vaultify-trade-dev-v1-s-kafka-headless:9092]
client.dns.lookup = default
client.id =
connections.max.idle.ms = 300000
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 120000
retries = 5
retry.backoff.ms = 100
sasl.client.callback.handler.class = null
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.login.callback.handler.class = null
sasl.login.class = null
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = https
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
(org.apache.kafka.clients.admin.AdminClientConfig)
[2019-02-27 09:59:25,813] WARN The configuration 'group.id' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig)
[2019-02-27 09:59:25,817] INFO Kafka version : 2.1.1-cp1 (org.apache.kafka.common.utils.AppInfoParser)
[2019-02-27 09:59:25,817] INFO Kafka commitId : 9aa84c2aaa91e392 (org.apache.kafka.common.utils.AppInfoParser)
[2019-02-27 09:59:25,973] INFO Validating schemas topic _schemas (io.confluent.kafka.schemaregistry.storage.KafkaStore)
[2019-02-27 09:59:25,981] WARN The replication factor of the schema topic _schemas is less than the desired one of 3. If this is a production environment, it's crucial to add more brokers and increase the replication factor of the topic. (io.confluent.kafka.schemaregistry.storage.KafkaStore)
[2019-02-27 09:59:26,010] INFO ProducerConfig values:
acks = -1
batch.size = 16384
bootstrap.servers = [PLAINTEXT://vaultify-trade-dev-v1-s-kafka-headless:9092]
buffer.memory = 33554432
client.dns.lookup = default
client.id =
compression.type = none
connections.max.idle.ms = 540000
delivery.timeout.ms = 120000
enable.idempotence = false
interceptor.classes = []
key.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
linger.ms = 0
max.block.ms = 60000
max.in.flight.requests.per.connection = 5
max.request.size = 1048576
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
receive.buffer.bytes = 32768
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retries = 0
retry.backoff.ms = 100
sasl.client.callback.handler.class = null
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.login.callback.handler.class = null
sasl.login.class = null
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = https
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
transaction.timeout.ms = 60000
transactional.id = null
value.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
(org.apache.kafka.clients.producer.ProducerConfig)
[2019-02-27 09:59:26,046] WARN The configuration 'group.id' was supplied but isn't a known config. (org.apache.kafka.clients.producer.ProducerConfig)
[2019-02-27 09:59:26,046] INFO Kafka version : 2.1.1-cp1 (org.apache.kafka.common.utils.AppInfoParser)
[2019-02-27 09:59:26,046] INFO Kafka commitId : 9aa84c2aaa91e392 (org.apache.kafka.common.utils.AppInfoParser)
[2019-02-27 09:59:26,062] INFO Registered kafka:type=kafka.Log4jController MBean (kafka.utils.Log4jControllerRegistration$)
[2019-02-27 09:59:26,098] INFO Kafka store reader thread starting consumer (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2019-02-27 09:59:26,107] INFO ConsumerConfig values:
auto.commit.interval.ms = 5000
auto.offset.reset = earliest
bootstrap.servers = [PLAINTEXT://vaultify-trade-dev-v1-s-kafka-headless:9092]
check.crcs = true
client.dns.lookup = default
client.id = KafkaStore-reader-_schemas
connections.max.idle.ms = 540000
default.api.timeout.ms = 60000
enable.auto.commit = false
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
group.id = vaultify-trade-dev-v1-s
heartbeat.interval.ms = 3000
interceptor.classes = []
internal.leave.group.on.close = true
isolation.level = read_uncommitted
key.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
max.partition.fetch.bytes = 1048576
max.poll.interval.ms = 300000
max.poll.records = 500
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retry.backoff.ms = 100
sasl.client.callback.handler.class = null
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.login.callback.handler.class = null
sasl.login.class = null
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
session.timeout.ms = 10000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = https
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
(org.apache.kafka.clients.consumer.ConsumerConfig)
[2019-02-27 09:59:26,154] INFO Kafka version : 2.1.1-cp1 (org.apache.kafka.common.utils.AppInfoParser)
[2019-02-27 09:59:26,154] INFO Kafka commitId : 9aa84c2aaa91e392 (org.apache.kafka.common.utils.AppInfoParser)
[2019-02-27 09:59:26,164] INFO Cluster ID: yST0jB3rQhmxVsWCEKf7mg (org.apache.kafka.clients.Metadata)
[2019-02-27 09:59:26,168] INFO Initialized last consumed offset to -1 (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2019-02-27 09:59:26,170] INFO [kafka-store-reader-thread-_schemas]: Starting (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2019-02-27 09:59:26,200] INFO [Consumer clientId=KafkaStore-reader-_schemas, groupId=vaultify-trade-dev-v1-s] Resetting offset for partition _schemas-0 to offset 0. (org.apache.kafka.clients.consumer.internals.Fetcher)
[2019-02-27 09:59:26,228] INFO Cluster ID: yST0jB3rQhmxVsWCEKf7mg (org.apache.kafka.clients.Metadata)
[2019-02-27 09:59:26,304] INFO Wait to catch up until the offset of the last message at 17 (io.confluent.kafka.schemaregistry.storage.KafkaStore)
[2019-02-27 09:59:26,359] INFO Joining schema registry with Kafka-based coordination (io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry)
[2019-02-27 09:59:26,366] INFO Kafka version : 2.1.1-cp1 (org.apache.kafka.common.utils.AppInfoParser)
[2019-02-27 09:59:26,366] INFO Kafka commitId : 9aa84c2aaa91e392 (org.apache.kafka.common.utils.AppInfoParser)
[2019-02-27 09:59:26,377] INFO Cluster ID: yST0jB3rQhmxVsWCEKf7mg (org.apache.kafka.clients.Metadata)
This is my Kubernetes deployment:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: vaultify-trade-dev-v1-s-schema-registry
  labels:
    app: schema-registry
    chart: schema-registry-1.1.2
    release: vaultify-trade-dev-v1-s
    heritage: Tiller
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: schema-registry
        release: vaultify-trade-dev-v1-s
    spec:
      containers:
        - name: schema-registry
          image: "confluentinc/cp-schema-registry:5.1.2"
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8081
            - containerPort: 5555
              name: jmx
          livenessProbe:
            httpGet:
              path: /
              port: 8081
            initialDelaySeconds: 10
            timeoutSeconds: 5
          readinessProbe:
            httpGet:
              path: /
              port: 8081
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 5
            successThreshold: 1
            failureThreshold: 3
          env:
            - name: SCHEMA_REGISTRY_HOST_NAME
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS
              value: PLAINTEXT://vaultify-trade-dev-v1-s-kafka-headless:9092
            - name: SCHEMA_REGISTRY_KAFKASTORE_GROUP_ID
              value: vaultify-trade-dev-v1-s
            - name: SCHEMA_REGISTRY_MASTER_ELIGIBILITY
              value: "true"
            - name: JMX_PORT
              value: "5555"
          resources:
            {}
          volumeMounts:
      volumes:
More..
If I tell Kubernetes not to restart the pod, I get this error:
[2019-02-27 10:29:07,601] INFO Wait to catch up until the offset of the last message at 8 (io.confluent.kafka.schemaregistry.storage.KafkaStore)
[2019-02-27 10:29:07,675] INFO Joining schema registry with Kafka-based coordination (io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry)
[2019-02-27 10:29:07,681] INFO Kafka version : 2.0.1-cp1 (org.apache.kafka.common.utils.AppInfoParser)
[2019-02-27 10:29:07,681] INFO Kafka commitId : 815feb8a888d39d9 (org.apache.kafka.common.utils.AppInfoParser)
[2019-02-27 10:29:07,696] INFO Cluster ID: HoNdEGzXTCqHb_Ba6_toaA (org.apache.kafka.clients.Metadata)
.
[2019-02-27 10:30:07,681] ERROR Error starting the schema registry (io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication)
io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryInitializationException: io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryTimeoutException: Timed out waiting for join group to complete
at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:220)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:63)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:41)
at io.confluent.rest.Application.createServer(Application.java:169)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain.main(SchemaRegistryMain.java:43)
Caused by: io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryTimeoutException: Timed out waiting for join group to complete
at io.confluent.kafka.schemaregistry.masterelector.kafka.KafkaGroupMasterElector.init(KafkaGroupMasterElector.java:202)
at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:215)
... 4 more
[2019-02-27 10:30:07,682] INFO Shutting down schema registry (io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry)
[2019-02-27 10:30:07,685] INFO [kafka-store-reader-thread-_schemas]: Shutting down (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2019-02-27 10:30:07,687] INFO [kafka-store-reader-thread-_schemas]: Stopped (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2019-02-27 10:30:07,688] INFO [kafka-store-reader-thread-_schemas]: Shutdown completed (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2019-02-27 10:30:07,692] INFO KafkaStoreReaderThread shutdown complete. (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2019-02-27 10:30:07,692] INFO [Producer clientId=producer-1] Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms. (org.apache.kafka.clients.producer.KafkaProducer)
[2019-02-27 10:30:07,710] ERROR Unexpected exception in schema registry group processing thread (io.confluent.kafka.schemaregistry.masterelector.kafka.KafkaGroupMasterElector)
org.apache.kafka.common.errors.WakeupException
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.maybeTriggerWakeup(ConsumerNetworkClient.java:498)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:284)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:242)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:218)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:230)
at io.confluent.kafka.schemaregistry.masterelector.kafka.SchemaRegistryCoordinator.ensureCoordinatorReady(SchemaRegistryCoordinator.java:207)
at io.confluent.kafka.schemaregistry.masterelector.kafka.SchemaRegistryCoordinator.poll(SchemaRegistryCoordinator.java:97)
at io.confluent.kafka.schemaregistry.masterelector.kafka.KafkaGroupMasterElector$1.run(KafkaGroupMasterElector.java:192)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

As tolga_kavukcu mentioned in the comments:
The default replication factor for topics is 3 in the Kafka Helm chart.
On a single-node cluster, Schema Registry cannot get its topic created on the Kafka side, so this error happens.
If that is the case here, just change the default replication factor.
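For reference, a minimal sketch of how that override could look on the Schema Registry side, using the same SCHEMA_REGISTRY_-prefixed environment-variable convention as the other entries in the Deployment above (the value "1" assumes a single-broker development cluster; use the real broker count elsewhere):

            - name: SCHEMA_REGISTRY_KAFKASTORE_TOPIC_REPLICATION_FACTOR
              value: "1"

This maps to the kafkastore.topic.replication.factor setting visible in the startup log, so the _schemas topic can be created with a replication factor the single broker can actually satisfy.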

Related

Timeout expired while fetching topic metadata - Kafka

We are trying to consume messages from a Kafka broker hosted on a standalone Windows machine.
The consumer is running in Kubernetes.
server.properties:
listeners=PLAINTEXT://:29092
advertised.listeners=PLAINTEXT://myhostname:29092
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
Consumer Error:
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka startTimeMs: 1627883198273
[main] INFO XXX.XXX.KafkaConsumerProperties - Kafka Topic Name : table-update
[main] INFO org.apache.kafka.clients.consumer.KafkaConsumer - [Consumer clientId=consumer-GroupConsumer-1, groupId=GroupConsumer] Subscribed to topic(s): table-update
[main] INFO XX.XX.XXXXX- Could not run Loader: Timeout expired while fetching topic metadata .
Consumer config values :
allow.auto.create.topics = true
auto.commit.interval.ms = 5000
auto.offset.reset = earliest
bootstrap.servers = [myhostname:29092]
check.crcs = true
client.dns.lookup = use_all_dns_ips
client.id = consumer-GroupConsumer-1
client.rack =
connections.max.idle.ms = 540000
default.api.timeout.ms = 60000
enable.auto.commit = false
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
group.id = GroupConsumer
group.instance.id = null
heartbeat.interval.ms = 3000
interceptor.classes = []
internal.leave.group.on.close = true
internal.throw.on.fetch.stable.offset.unsupported = false
isolation.level = read_uncommitted
key.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
max.partition.fetch.bytes = 52428800
max.poll.interval.ms = 2147483647
max.poll.records = 1000
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 120000
retry.backoff.ms = 100
sasl.client.callback.handler.class = null
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.login.callback.handler.class = null
sasl.login.class = null
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
security.providers = null
send.buffer.bytes = 131072
session.timeout.ms = 60000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2]
ssl.endpoint.identification.algorithm = https
ssl.engine.factory.class = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLSv1.2
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
value.deserializer = XXXX
Please help me in resolving this issue.
While fetching topic metadata there are several possible error conditions, such as an invalid topic name, a missing metadata leader, or offline partitions. You can get more information about the errors by logging at debug level; the client logs the error condition with:
log.debug("Topic metadata fetch included errors: {}", errors);

Azure Event hubs for Kafka : Attempt to join group failed due to unexpected error

I'm facing an issue related to Azure Event Hubs for Kafka.
I have a Kafka consumer processing messages from a topic named "ABC", with the config below:
ConsumerConfig values:
allow.auto.create.topics = true
auto.commit.interval.ms = 5000
auto.offset.reset = earliest
bootstrap.servers = [xxxx-eventhubs-namespace.servicebus.windows.net:9093]
check.crcs = true
client.dns.lookup = use_all_dns_ips
client.id = consumer-XXXGroupConsumer-3
client.rack =
connections.max.idle.ms = 540000
default.api.timeout.ms = 60000
enable.auto.commit = false
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
group.id = XXXGroupConsumer
group.instance.id = null
heartbeat.interval.ms = 3000
interceptor.classes = []
internal.leave.group.on.close = true
internal.throw.on.fetch.stable.offset.unsupported = false
isolation.level = read_uncommitted
key.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
max.partition.fetch.bytes = 52428800
max.poll.interval.ms = 2147483647
max.poll.records = 1000
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 120000
retry.backoff.ms = 100
sasl.client.callback.handler.class = null
sasl.jaas.config = [hidden]
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.login.callback.handler.class = null
sasl.login.class = null
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.mechanism = PLAIN
security.protocol = SASL_SSL
security.providers = null
send.buffer.bytes = 131072
session.timeout.ms = 100000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2]
ssl.endpoint.identification.algorithm = https
ssl.engine.factory.class = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLSv1.2
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
value.deserializer = class KafkaConsumerDeserializer
The consumer is able to subscribe to topic "ABC", but it fails with the error below:
"Join group failed with org.apache.kafka.common.KafkaException: Unexpected error in join group response: The request timed out."
[pool-2-thread-1] INFO org.apache.kafka.clients.consumer.KafkaConsumer - [Consumer clientId=consumer-XXXGroupConsumer-3, groupId=XXXGroupConsumer] Subscribed to topic(s): ABC
[pool-2-thread-1] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=consumer-XXXGroupConsumer-3, groupId=XXXGroupConsumer] Discovered group coordinator xxxx-eventhubs-namespace.servicebus.windows.net:9093 (id: 2147483647 rack: null)
[pool-2-thread-1] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=consumer-XXXGroupConsumer-3, groupId=XXXGroupConsumer] (Re-)joining group
[pool-2-thread-1] ERROR org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=consumer-XXXGroupConsumer-3, groupId=XXXGroupConsumer] **Attempt to join group failed due to unexpected error: The request timed out.**
[pool-2-thread-1] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=consumer-XXXGroupConsumer-3, groupId=XXXGroupConsumer] **Join group failed with org.apache.kafka.common.KafkaException: Unexpected error in join group response: The request timed out.**

How can I enable SASL in Kafka-Connect (within Cluster)

I have downloaded cp-kafka-connect and deployed it in my k8s cluster with a Kafka broker that accepts secure (SASL) connections.
I would like to enable security (SASL) for Kafka Connect.
I am using a ConfigMap to mount a configuration file named connect-distributed.properties into the cp-kafka-connect container (in /etc/kafka).
Here is the relevant part of the configuration file:
sasl.mechanism=SCRAM-SHA-256
# Configure SASL_SSL if SSL encryption is enabled, otherwise configure SASL_PLAINTEXT
security.protocol=SASL_SSL
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required
username="admin" password="password-secret";
But it fails to start with an error.
Here are the logs:
kubectl logs test-cp-kafka-connect-846f4b745f-hx2mp
===> ENV Variables ...
ALLOW_UNSIGNED=false
COMPONENT=kafka-connect
CONFLUENT_DEB_VERSION=1
CONFLUENT_PLATFORM_LABEL=
CONFLUENT_VERSION=5.5.0
CONNECT_BOOTSTRAP_SERVERS=PLAINTEXT://test-kafka:9092
CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR=3
CONNECT_CONFIG_STORAGE_TOPIC=test-cp-kafka-connect-config
CONNECT_GROUP_ID=test
CONNECT_INTERNAL_KEY_CONVERTER=org.apache.kafka.connect.json.JsonConverter
CONNECT_INTERNAL_VALUE_CONVERTER=org.apache.kafka.connect.json.JsonConverter
CONNECT_KEY_CONVERTER=io.confluent.connect.avro.AvroConverter
CONNECT_KEY_CONVERTER_SCHEMAS_ENABLE=false
CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL=http://test-cp-schema-registry:8081
CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR=3
CONNECT_OFFSET_STORAGE_TOPIC=test-cp-kafka-connect-offset
CONNECT_PLUGIN_PATH=/usr/share/java,/usr/share/confluent-hub-components
CONNECT_REST_ADVERTISED_HOST_NAME=10.233.85.127
CONNECT_REST_PORT=8083
CONNECT_STATUS_STORAGE_REPLICATION_FACTOR=3
CONNECT_STATUS_STORAGE_TOPIC=test-cp-kafka-connect-status
CONNECT_VALUE_CONVERTER=io.confluent.connect.avro.AvroConverter
CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE=false
CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL=http://test-cp-schema-registry:8081
CUB_CLASSPATH=/etc/confluent/docker/docker-utils.jar
HOME=/root
HOSTNAME=test-cp-kafka-connect-846f4b745f-hx2mp
KAFKA_ADVERTISED_LISTENERS=
KAFKA_HEAP_OPTS=-Xms512M -Xmx512M
KAFKA_JMX_PORT=5555
KAFKA_VERSION=
KAFKA_ZOOKEEPER_CONNECT=
KUBERNETES_PORT=tcp://10.233.0.1:443
KUBERNETES_PORT_443_TCP=tcp://10.233.0.1:443
KUBERNETES_PORT_443_TCP_ADDR=10.233.0.1
KUBERNETES_PORT_443_TCP_PORT=443
KUBERNETES_PORT_443_TCP_PROTO=tcp
KUBERNETES_SERVICE_HOST=10.233.0.1
KUBERNETES_SERVICE_PORT=443
KUBERNETES_SERVICE_PORT_HTTPS=443
LANG=C.UTF-8
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/
PYTHON_PIP_VERSION=8.1.2
PYTHON_VERSION=2.7.9-1
SCALA_VERSION=2.12
SHLVL=1
TEST_0_EXTERNAL_PORT=tcp://10.233.13.164:19092
TEST_0_EXTERNAL_PORT_19092_TCP=tcp://10.233.13.164:19092
TEST_0_EXTERNAL_PORT_19092_TCP_ADDR=10.233.13.164
TEST_0_EXTERNAL_PORT_19092_TCP_PORT=19092
TEST_0_EXTERNAL_PORT_19092_TCP_PROTO=tcp
TEST_0_EXTERNAL_SERVICE_HOST=10.233.13.164
TEST_0_EXTERNAL_SERVICE_PORT=19092
TEST_0_EXTERNAL_SERVICE_PORT_EXTERNAL_BROKER=19092
TEST_CP_KAFKA_CONNECT_PORT=tcp://10.233.38.137:8083
TEST_CP_KAFKA_CONNECT_PORT_8083_TCP=tcp://10.233.38.137:8083
TEST_CP_KAFKA_CONNECT_PORT_8083_TCP_ADDR=10.233.38.137
TEST_CP_KAFKA_CONNECT_PORT_8083_TCP_PORT=8083
TEST_CP_KAFKA_CONNECT_PORT_8083_TCP_PROTO=tcp
TEST_CP_KAFKA_CONNECT_SERVICE_HOST=10.233.38.137
TEST_CP_KAFKA_CONNECT_SERVICE_PORT=8083
TEST_CP_KAFKA_CONNECT_SERVICE_PORT_KAFKA_CONNECT=8083
TEST_KAFKA_EXPORTER_PORT=tcp://10.233.5.215:9308
TEST_KAFKA_EXPORTER_PORT_9308_TCP=tcp://10.233.5.215:9308
TEST_KAFKA_EXPORTER_PORT_9308_TCP_ADDR=10.233.5.215
TEST_KAFKA_EXPORTER_PORT_9308_TCP_PORT=9308
TEST_KAFKA_EXPORTER_PORT_9308_TCP_PROTO=tcp
TEST_KAFKA_EXPORTER_SERVICE_HOST=10.233.5.215
TEST_KAFKA_EXPORTER_SERVICE_PORT=9308
TEST_KAFKA_EXPORTER_SERVICE_PORT_KAFKA_EXPORTER=9308
TEST_KAFKA_MANAGER_PORT=tcp://10.233.7.186:9000
TEST_KAFKA_MANAGER_PORT_9000_TCP=tcp://10.233.7.186:9000
TEST_KAFKA_MANAGER_PORT_9000_TCP_ADDR=10.233.7.186
TEST_KAFKA_MANAGER_PORT_9000_TCP_PORT=9000
TEST_KAFKA_MANAGER_PORT_9000_TCP_PROTO=tcp
TEST_KAFKA_MANAGER_SERVICE_HOST=10.233.7.186
TEST_KAFKA_MANAGER_SERVICE_PORT=9000
TEST_KAFKA_MANAGER_SERVICE_PORT_KAFKA_MANAGER=9000
TEST_KAFKA_PORT=tcp://10.233.12.237:9092
TEST_KAFKA_PORT_8001_TCP=tcp://10.233.12.237:8001
TEST_KAFKA_PORT_8001_TCP_ADDR=10.233.12.237
TEST_KAFKA_PORT_8001_TCP_PORT=8001
TEST_KAFKA_PORT_8001_TCP_PROTO=tcp
TEST_KAFKA_PORT_9092_TCP=tcp://10.233.12.237:9092
TEST_KAFKA_PORT_9092_TCP_ADDR=10.233.12.237
TEST_KAFKA_PORT_9092_TCP_PORT=9092
TEST_KAFKA_PORT_9092_TCP_PROTO=tcp
TEST_KAFKA_SERVICE_HOST=10.233.12.237
TEST_KAFKA_SERVICE_PORT=9092
TEST_KAFKA_SERVICE_PORT_BROKER=9092
TEST_KAFKA_SERVICE_PORT_KAFKASHELL=8001
TEST_ZOOKEEPER_PORT=tcp://10.233.1.144:2181
TEST_ZOOKEEPER_PORT_2181_TCP=tcp://10.233.1.144:2181
TEST_ZOOKEEPER_PORT_2181_TCP_ADDR=10.233.1.144
TEST_ZOOKEEPER_PORT_2181_TCP_PORT=2181
TEST_ZOOKEEPER_PORT_2181_TCP_PROTO=tcp
TEST_ZOOKEEPER_SERVICE_HOST=10.233.1.144
TEST_ZOOKEEPER_SERVICE_PORT=2181
TEST_ZOOKEEPER_SERVICE_PORT_CLIENT=2181
ZULU_OPENJDK_VERSION=8=8.38.0.13
_=/usr/bin/env
appID=dAi5R82Pf9xC38kHkGeAFaOknIUImdmS-1589882527
cluster=test
datacenter=testx
namespace=mynamespace
workspace=8334431b-ef82-414f-9348-a8de032dfca7
===> User
uid=0(root) gid=0(root) groups=0(root)
===> Configuring ...
===> Running preflight checks ...
===> Check if Kafka is healthy ...
[main] INFO org.apache.kafka.clients.admin.AdminClientConfig - AdminClientConfig values:
bootstrap.servers = [PLAINTEXT://test-kafka:9092]
client.dns.lookup = default
client.id =
connections.max.idle.ms = 300000
default.api.timeout.ms = 60000
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retries = 2147483647
retry.backoff.ms = 100
sasl.client.callback.handler.class = null
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.login.callback.handler.class = null
sasl.login.class = null
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
security.providers = null
send.buffer.bytes = 131072
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = https
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka version: 5.5.0-ccs
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka commitId: 785a156634af5f7e
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka startTimeMs: 1589883940496
[kafka-admin-client-thread | adminclient-1] INFO org.apache.kafka.clients.admin.internals.AdminMetadataManager - [AdminClient clientId=adminclient-1] Metadata update failed
org.apache.kafka.common.errors.TimeoutException: Call(callName=fetchMetadata, deadlineMs=1589883970509) timed out at 1589883970510 after 281 attempt(s)
Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment.
The error is:
[kafka-admin-client-thread | adminclient-1] INFO org.apache.kafka.clients.admin.internals.AdminMetadataManager - [AdminClient clientId=adminclient-1] Metadata update failed
org.apache.kafka.common.errors.TimeoutException: Call(callName=fetchMetadata, deadlineMs=1589883970509) timed out at 1589883970510 after 281 attempt(s)
Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node
Refer to this approach:
sasl-scram-connect-workers
Can someone help me resolve this issue?
Change your bootstrapServers parameter to point to the SASL listener. For example:
SASL_SSL://test-kafka:9093
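A minimal sketch of what that could look like with the cp-kafka-connect image, which reads worker settings from CONNECT_-prefixed environment variables rather than a mounted connect-distributed.properties (the port 9093 is taken from the example above and the SCRAM credentials from the question; both are assumptions about your setup):

CONNECT_BOOTSTRAP_SERVERS=SASL_SSL://test-kafka:9093
CONNECT_SECURITY_PROTOCOL=SASL_SSL
CONNECT_SASL_MECHANISM=SCRAM-SHA-256
CONNECT_SASL_JAAS_CONFIG=org.apache.kafka.common.security.scram.ScramLoginModule required username="admin" password="password-secret";

The embedded producers and consumers may need the same SASL settings as well, via the CONNECT_PRODUCER_ and CONNECT_CONSUMER_ prefixes.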

@KafkaListener not recovering after DisconnectException

I have a Kafka consumer (Spring Boot) configured using @KafkaListener. It was running in production and all was well until the brokers were restarted as part of maintenance. Based on the docs, I expected the Kafka listener to recover once the broker was back up, but that is not what I observed in the logs. The logs stopped with the following exception:
2020-04-22 11:11:28,802|INFO|automator-consumer-app-id-0-C-1|org.apache.kafka.clients.FetchSessionHandler|[Consumer clientId=automator-consumer-app-id-0, groupId=automator-consumer-app-id] Node 10 was unable to process the fetch request with (sessionId=2138208872, epoch=348): FETCH_SESSION_ID_NOT_FOUND.
2020-04-22 11:24:23,798|INFO|automator-consumer-app-id-0-C-1|org.apache.kafka.clients.FetchSessionHandler|[Consumer clientId=automator-consumer-app-id-0, groupId=automator-consumer-app-id] Error sending fetch request (sessionId=499459047, epoch=314160) to node 7: org.apache.kafka.common.errors.DisconnectException.
2020-04-22 11:36:37,241|INFO|automator-consumer-app-id-0-C 1|org.apache.kafka.clients.FetchSessionHandler|[Consumer clientId=automator-consumer-app-id-0, groupId=automator-consumer-app-id] Error sending fetch request (sessionId=2033512553, epoch=342949) to node 4: org.apache.kafka.common.errors.DisconnectException.
Once the application was restarted, connectivity was re-established. I was wondering if this could be related to any of the consumer configuration below.
2020-04-22 12:46:59,681|INFO|main|org.apache.kafka.clients.consumer.ConsumerConfig|ConsumerConfig values:
allow.auto.create.topics = true
auto.commit.interval.ms = 5000
auto.offset.reset = latest
bootstrap.servers = [msk-e00-br1.int.bell.ca:9092]
check.crcs = true
client.dns.lookup = default
client.id = automator-consumer-app-id-0
client.rack =
connections.max.idle.ms = 540000
default.api.timeout.ms = 60000
enable.auto.commit = false
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
group.id = automator-consumer-app-id
group.instance.id = null
heartbeat.interval.ms = 3000
interceptor.classes = []
internal.leave.group.on.close = true
isolation.level = read_uncommitted
key.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
max.partition.fetch.bytes = 1048576
max.poll.interval.ms = 300000
max.poll.records = 500
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retry.backoff.ms = 100
sasl.client.callback.handler.class = null
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.login.callback.handler.class = null
sasl.login.class = null
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
session.timeout.ms = 10000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = https
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
value.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
Increase the value of max.incremental.fetch.session.cache.slots. The default value is 1K (1,000). You can refer to the answer here: How to check the actual number of incremental fetch session cache slots used in Kafka cluster?
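Note that this is a broker-side setting, so it goes into each broker's server.properties; a minimal sketch (the value 2000 is only illustrative, size it to how many concurrent fetch sessions your cluster actually has):

max.incremental.fetch.session.cache.slots=2000

The brokers need a restart for the change to take effect.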
Basically, your application is continuously listening for messages from the topic; if no messages are published to the topic for a while, you can get this type of exception:
org.apache.kafka.common.errors.DisconnectException: null
Disconnect Exception class
Once messages are sent to the topic again, the application will resume and consume them.
Here you need to increase the request timeout in your properties file:
consumer.request.timeout.ms:
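For a Spring Boot consumer, one way to raise that timeout (a sketch, assuming the standard spring.kafka.consumer.properties passthrough in application.properties; the 60000 ms value is only an example, not a recommendation):

spring.kafka.consumer.properties.request.timeout.ms=60000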

Kafka - Connection to node -1 could not be established

I'm trying to consume a Kafka topic using Apache Flink streaming, but I'm getting this issue:
2018-04-10 02:55:59,856|- ProducerConfig values:
acks = 1
batch.size = 16384
bootstrap.servers = [localhost:9092]
buffer.memory = 33554432
client.id =
compression.type = none
connections.max.idle.ms = 540000
enable.idempotence = false
interceptor.classes = null
key.serializer = class org.apache.kafka.common.serialization.StringSerializer
linger.ms = 1
max.block.ms = 60000
max.in.flight.requests.per.connection = 5
max.request.size = 1048576
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
receive.buffer.bytes = 32768
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retries = 2
retry.backoff.ms = 100
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
transaction.timeout.ms = 60000
transactional.id = null
value.serializer = class org.apache.kafka.common.serialization.StringSerializer
2018-04-10 02:56:00,052|- The configuration 'auto.create.topics.enable' was supplied but isn't a known config.
2018-04-10 02:56:00,064|- Kafka version : 1.0.0
2018-04-10 02:56:00,064|- Kafka commitId : aaa7af6d4a11b29d
2018-04-10 02:56:40,064|- ConsumerConfig values:
auto.commit.interval.ms = 1000
auto.offset.reset = latest
bootstrap.servers = [localhost:9092]
check.crcs = true
client.id =
connections.max.idle.ms = 540000
enable.auto.commit = true
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
group.id = engine-kafka-consumer
heartbeat.interval.ms = 3000
interceptor.classes = null
internal.leave.group.on.close = true
isolation.level = read_uncommitted
key.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
max.partition.fetch.bytes = 1048576
max.poll.interval.ms = 300000
max.poll.records = 500
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 305000
retry.backoff.ms = 100
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
session.timeout.ms = 30000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
2018-04-10 02:56:40,073|- ConsumerConfig values:
auto.commit.interval.ms = 1000
auto.offset.reset = latest
bootstrap.servers = [localhost:9092]
check.crcs = true
client.id =
connections.max.idle.ms = 540000
enable.auto.commit = true
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
group.id = engine-kafka-consumer
heartbeat.interval.ms = 3000
interceptor.classes = null
internal.leave.group.on.close = true
isolation.level = read_uncommitted
key.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
max.partition.fetch.bytes = 1048576
max.poll.interval.ms = 300000
max.poll.records = 500
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 305000
retry.backoff.ms = 100
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
session.timeout.ms = 30000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
2018-04-10 02:56:40,079|- ConsumerConfig values:
auto.commit.interval.ms = 1000
auto.offset.reset = latest
bootstrap.servers = [localhost:9092]
check.crcs = true
client.id =
connections.max.idle.ms = 540000
enable.auto.commit = true
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
group.id = engine-kafka-consumer
heartbeat.interval.ms = 3000
interceptor.classes = null
internal.leave.group.on.close = true
isolation.level = read_uncommitted
key.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
max.partition.fetch.bytes = 1048576
max.poll.interval.ms = 300000
max.poll.records = 500
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 305000
retry.backoff.ms = 100
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
session.timeout.ms = 30000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
2018-04-10 02:56:40,082|- ConsumerConfig values:
auto.commit.interval.ms = 1000
auto.offset.reset = latest
bootstrap.servers = [localhost:9092]
check.crcs = true
client.id =
connections.max.idle.ms = 540000
enable.auto.commit = true
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
group.id = engine-kafka-consumer
heartbeat.interval.ms = 3000
interceptor.classes = null
internal.leave.group.on.close = true
isolation.level = read_uncommitted
key.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
max.partition.fetch.bytes = 1048576
max.poll.interval.ms = 300000
max.poll.records = 500
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 305000
retry.backoff.ms = 100
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
session.timeout.ms = 30000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
2018-04-10 02:56:40,783|- Kafka version : 1.0.0
2018-04-10 02:56:40,783|- Kafka commitId : aaa7af6d4a11b29d
2018-04-10 02:56:40,818|- Kafka version : 1.0.0
2018-04-10 02:56:40,819|- Kafka commitId : aaa7af6d4a11b29d
2018-04-10 02:56:40,819|- Kafka version : 1.0.0
2018-04-10 02:56:40,819|- Kafka commitId : aaa7af6d4a11b29d
2018-04-10 02:56:40,820|- Kafka version : 1.0.0
2018-04-10 02:56:40,820|- Kafka commitId : aaa7af6d4a11b29d
2018-04-10 02:56:41,906|- [Consumer clientId=consumer-2, groupId=engine-kafka-consumer] Connection to node -1 could not be established. Broker may not be available.
2018-04-10 02:56:41,925|- [Consumer clientId=consumer-4, groupId=engine-kafka-consumer] Connection to node -1 could not be established. Broker may not be available.
2018-04-10 02:56:41,931|- [Consumer clientId=consumer-1, groupId=engine-kafka-consumer] Connection to node -1 could not be established. Broker may not be available.
2018-04-10 02:56:41,948|- [Consumer clientId=consumer-3, groupId=engine-kafka-consumer] Connection to node -1 could not be established. Broker may not be available.
2018-04-10 02:56:42,013|- [Consumer clientId=consumer-2, groupId=engine-kafka-consumer] Connection to node -1 could not be established. Broker may not be available.
Below are the versions that I use.
// kafka
"org.apache.kafka" %% "kafka" % "1.0.0"
"net.manub" %% "scalatest-embedded-kafka" % "0.14.0" % Test
//flink
"org.apache.flink" %% "flink-scala" % "1.4.2"
"org.apache.flink" %% "flink-streaming-scala" % "1.4.2"
"org.apache.flink" %% "flink-connector-kafka-0.9" % "1.4.0"
"org.apache.flink" %% "flink-connector-cassandra" % "1.4.2"
I met the same problem. It happens when the consumer runs on a different machine from the Kafka server. Modify config/server.properties and set listeners=PLAINTEXT://serverip:9092.
ATTENTION: serverip must not be set to 127.0.0.1 or localhost; it should be set to an IP that the consumer can actually connect to.
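As a minimal sketch of the change in config/server.properties (192.168.1.50 is only a placeholder for the broker machine's address that the consumer can reach):

# before
#listeners=PLAINTEXT://localhost:9092
# after
listeners=PLAINTEXT://192.168.1.50:9092

Restart the broker after editing; if advertised.listeners is set, it should typically point at the same reachable address.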