KafkaConsumer: how to increase log level? - apache-kafka

When I run my Java application and instantiate the KafkaConsumer object (fed with the minimum required properties: key and value deserializers and group.id), I see lots of INFO messages on stdout. (If I provide unsupported properties, I also see WARNING messages.)
I want to see when fetch events take place. I assume that by raising the log level to DEBUG I will be able to see that. Unfortunately, I am not able to raise it.
I tried to supply the log4j.properties file in multiple ways: placing the file at specific paths and also passing it as a parameter (-Dlog4j.configuration). The output remains the same.
cd /Users/user/git/kafka/toys; JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_191.jdk/Contents/Home "/Applications/NetBeans/NetBeans 8.2.app/Contents/Resources/NetBeans/java/maven/bin/mvn" "-Dexec.args=-classpath %classpath ch.demo.toys.CarthusianConsumer" -Dexec.executable=/Library/Java/JavaVirtualMachines/jdk1.8.0_191.jdk/Contents/Home/bin/java -Dexec.classpathScope=runtime -DskipTests=true org.codehaus.mojo:exec-maven-plugin:1.2.1:exec
Running NetBeans Compile On Save execution. Phase execution is skipped and output directories of dependency projects (with Compile on Save turned on) will be used instead of their jar artifacts.
Scanning for projects...
------------------------------------------------------------------------
Building toys 1.0-SNAPSHOT
------------------------------------------------------------------------
--- exec-maven-plugin:1.2.1:exec (default-cli) @ toys ---
Jul 10, 2019 2:52:00 PM org.apache.kafka.common.config.AbstractConfig logAll
INFO: ConsumerConfig values:
allow.auto.create.topics = true
auto.commit.interval.ms = 5000
auto.offset.reset = earliest
bootstrap.servers = [kafka-server:9090, kafka-server:9091, kafka-server:9092]
check.crcs = true
client.dns.lookup = default
client.id =
client.rack =
connections.max.idle.ms = 540000
default.api.timeout.ms = 60000
enable.auto.commit = true
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
group.id = carthusian-consumer
group.instance.id = null
heartbeat.interval.ms = 3000
interceptor.classes = []
internal.leave.group.on.close = true
isolation.level = read_uncommitted
key.deserializer = class org.apache.kafka.common.serialization.IntegerDeserializer
max.partition.fetch.bytes = 1048576
max.poll.interval.ms = 300000
max.poll.records = 100000
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = DEBUG
metrics.sample.window.ms = 30000
partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retry.backoff.ms = 100
sasl.client.callback.handler.class = null
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.login.callback.handler.class = null
sasl.login.class = null
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
session.timeout.ms = 10000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = https
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
value.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
Jul 10, 2019 2:52:01 PM org.apache.kafka.common.utils.AppInfoParser$AppInfo <init>
INFO: Kafka version: 2.3.0
Jul 10, 2019 2:52:01 PM org.apache.kafka.common.utils.AppInfoParser$AppInfo <init>
INFO: Kafka commitId: fc1aaa116b661c8a
Jul 10, 2019 2:52:01 PM org.apache.kafka.common.utils.AppInfoParser$AppInfo <init>
INFO: Kafka startTimeMs: 1562763121219
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.consumer.KafkaConsumer subscribe
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] Subscribed to topic(s): sequence
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.Metadata update
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] Cluster ID: REIXp5FySKGPHlRyfTALLQ
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.consumer.internals.AbstractCoordinator$FindCoordinatorResponseHandler onSuccess
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] Discovered group coordinator kafka-tds:9091 (id: 2147483646 rack: null)
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.consumer.internals.ConsumerCoordinator onJoinPrepare
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] Revoking previously assigned partitions []
Revoke event: []
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.consumer.internals.AbstractCoordinator sendJoinGroupRequest
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] (Re-)joining group
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.consumer.internals.AbstractCoordinator sendJoinGroupRequest
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] (Re-)joining group
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.consumer.internals.AbstractCoordinator$1 onSuccess
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] Successfully joined group with generation 96
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.consumer.internals.ConsumerCoordinator onJoinComplete
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] Setting newly assigned partitions: sequence-1, sequence-0
Assignment event: [sequence-1, sequence-0]
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.consumer.internals.SubscriptionState lambda$requestOffsetReset$3
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] Seeking to EARLIEST offset of partition sequence-1
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.consumer.internals.SubscriptionState lambda$requestOffsetReset$3
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] Seeking to EARLIEST offset of partition sequence-0
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.consumer.internals.SubscriptionState maybeSeekUnvalidated
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] Resetting offset for partition sequence-0 to offset 0.
Jul 10, 2019 2:52:01 PM org.apache.kafka.clients.consumer.internals.SubscriptionState maybeSeekUnvalidated
INFO: [Consumer clientId=consumer-1, groupId=carthusian-consumer] Resetting offset for partition sequence-1 to offset 0.
Loaded 9804 records from [sequence-0] partitions
Loaded 9804 records from [sequence-1] partitions
Loaded 9799 records from [sequence-0] partitions
Loaded 9799 records from [sequence-1] partitions
Loaded 9799 records from [sequence-0] partitions
Loaded 9799 records from [sequence-1] partitions
Loaded 9799 records from [sequence-0] partitions
Loaded 9799 records from [sequence-1] partitions
Loaded 9799 records from [sequence-0] partitions

Solved by placing the following (simple) log4j.properties under src/main/resources and running the app straight from the console (rather than from the IDE). Fetch messages are now shown.
# Root logger option
log4j.rootLogger=DEBUG, stdout
# Direct log messages to stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1} - %m%n
At the moment I do not know which class is generating the messages I am looking for, hence the DEBUG setting on the rootLogger.
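The output in the question uses the two-line layout "Jul 10, 2019 2:52:00 PM <class> <method>" followed by "INFO: ...", which is the default format of java.util.logging (JUL). That suggests, though the post does not confirm it, that SLF4J was bound to JUL during the IDE run, in which case a log4j.properties file would be ignored. Under that assumption, a minimal sketch of raising the level from code, using only the JDK, might look like the following. The class name JulDebugSetup is hypothetical; the logger name is the package that appears in the log lines above, and SLF4J's JUL binding reports DEBUG as FINE.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class JulDebugSetup {
    // Package seen in the log lines above; scoping to it avoids DEBUG noise
    // from the rest of the application.
    static final String KAFKA_CONSUMER_INTERNALS =
            "org.apache.kafka.clients.consumer.internals";

    // JUL holds loggers weakly, so keep a strong reference or the level
    // configured here may be lost when the logger is garbage collected.
    private static final Logger KAFKA_LOGGER =
            Logger.getLogger(KAFKA_CONSUMER_INTERNALS);

    /** Lets FINE records (assumed here to carry SLF4J DEBUG) reach the console. */
    public static Logger enableDebug() {
        KAFKA_LOGGER.setLevel(Level.FINE);
        // The default console handler only passes INFO and above, so attach
        // a dedicated handler that accepts FINE as well.
        Handler handler = new ConsoleHandler();
        handler.setLevel(Level.FINE);
        KAFKA_LOGGER.addHandler(handler);
        // Keep records from also reaching the root handlers (duplicate lines).
        KAFKA_LOGGER.setUseParentHandlers(false);
        return KAFKA_LOGGER;
    }

    public static void main(String[] args) {
        // Call enableDebug() before constructing the KafkaConsumer.
        Logger logger = enableDebug();
        logger.fine("FINE-level records are now printed");
    }
}
```

Child loggers such as org.apache.kafka.clients.consumer.internals.Fetcher inherit the level, so the root logger can stay at INFO.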

Related

Kafka Avro Consumer (kafka-avro-console-consumer) Logging Level

Is there any way to turn off the "INFO" logging level in /usr/bin/kafka-avro-console-consumer?
Or, is there an alternate way to view data that is in an avro schema?
Right now I use the program (via Docker) and the output contains a large number of log messages (which I do not want) before it emits the data on the topic that I do want:
[2022-05-10 14:08:33,090] INFO Registered kafka:type=kafka.Log4jController MBean (kafka.utils.Log4jControllerRegistration$)
[2022-05-10 14:08:33,643] INFO ConsumerConfig values:
allow.auto.create.topics = true
auto.commit.interval.ms = 5000
auto.offset.reset = earliest
bootstrap.servers = [kafka:9092]
check.crcs = true
client.dns.lookup = use_all_dns_ips
client.id = consumer-console-consumer-25832-1
client.rack =
connections.max.idle.ms = 540000
default.api.timeout.ms = 60000
enable.auto.commit = false
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
group.id = console-consumer-25832
group.instance.id = null
heartbeat.interval.ms = 3000
interceptor.classes = []
internal.leave.group.on.close = true
internal.throw.on.fetch.stable.offset.unsupported = false
isolation.level = read_uncommitted
key.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
max.partition.fetch.bytes = 1048576
max.poll.interval.ms = 300000
max.poll.records = 500
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retry.backoff.ms = 100
sasl.client.callback.handler.class = null
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.login.callback.handler.class = null
sasl.login.class = null
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
security.providers = null
send.buffer.bytes = 131072
session.timeout.ms = 10000
socket.connection.setup.timeout.max.ms = 30000
socket.connection.setup.timeout.ms = 10000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.3]
ssl.endpoint.identification.algorithm = https
ssl.engine.factory.class = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.certificate.chain = null
ssl.keystore.key = null
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLSv1.3
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.certificates = null
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
(org.apache.kafka.clients.consumer.ConsumerConfig)
[2022-05-10 14:08:33,809] INFO Kafka version: 6.2.0-ce (org.apache.kafka.common.utils.AppInfoParser)
[2022-05-10 14:08:33,809] INFO Kafka commitId: 5c753752ae1445a1 (org.apache.kafka.common.utils.AppInfoParser)
[2022-05-10 14:08:33,809] INFO Kafka startTimeMs: 1652191713801 (org.apache.kafka.common.utils.AppInfoParser)
[2022-05-10 14:08:33,814] INFO [Consumer clientId=consumer-console-consumer-25832-1, groupId=console-consumer-25832] Subscribed to topic(s): dbserver1.public.x_account (org.apache.kafka.clients.consumer.KafkaConsumer)
[2022-05-10 14:08:34,471] INFO [Consumer clientId=consumer-console-consumer-25832-1, groupId=console-consumer-25832] Cluster ID: 9-BvG22VQrimBsWAceE02Q (org.apache.kafka.clients.Metadata)
[2022-05-10 14:08:34,473] INFO [Consumer clientId=consumer-console-consumer-25832-1, groupId=console-consumer-25832] Discovered group coordinator kafka:9092 (id: 2147483646 rack: null) (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
[2022-05-10 14:08:34,477] INFO [Consumer clientId=consumer-console-consumer-25832-1, groupId=console-consumer-25832] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
[2022-05-10 14:08:34,501] INFO [Consumer clientId=consumer-console-consumer-25832-1, groupId=console-consumer-25832] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
[2022-05-10 14:08:34,507] INFO [Consumer clientId=consumer-console-consumer-25832-1, groupId=console-consumer-25832] Successfully joined group with generation Generation{generationId=1, memberId='consumer-console-consumer-25832-1-d482e04b-8ed0-4d68-b533-a010dde3c99a', protocol='range'} (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
[2022-05-10 14:08:34,511] INFO [Consumer clientId=consumer-console-consumer-25832-1, groupId=console-consumer-25832] Finished assignment for group at generation 1: {consumer-console-consumer-25832-1-d482e04b-8ed0-4d68-b533-a010dde3c99a=Assignment(partitions=[dbserver1.public.x_account-0])} (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2022-05-10 14:08:34,524] INFO [Consumer clientId=consumer-console-consumer-25832-1, groupId=console-consumer-25832] Successfully synced group in generation Generation{generationId=1, memberId='consumer-console-consumer-25832-1-d482e04b-8ed0-4d68-b533-a010dde3c99a', protocol='range'} (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
[2022-05-10 14:08:34,525] INFO [Consumer clientId=consumer-console-consumer-25832-1, groupId=console-consumer-25832] Notifying assignor about the new Assignment(partitions=[dbserver1.public.x_account-0]) (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2022-05-10 14:08:34,529] INFO [Consumer clientId=consumer-console-consumer-25832-1, groupId=console-consumer-25832] Adding newly assigned partitions: dbserver1.public.x_account-0 (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2022-05-10 14:08:34,540] INFO [Consumer clientId=consumer-console-consumer-25832-1, groupId=console-consumer-25832] Found no committed offset for partition dbserver1.public.x_account-0 (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
The command that generates the above output is provided below:
docker exec -i [schema-registry-container] /usr/bin/kafka-avro-console-consumer \
--bootstrap-server kafka:9092 \
--topic [some-topic] \
--from-beginning \
--property schema.registry.url="http://schema-registry:8081"
For the Confluent Schema Registry image, you can add this environment variable to completely disable logs from the consumer package:
SCHEMA_REGISTRY_LOG4J_LOGGERS="org.apache.kafka.clients.consumer=OFF"
Comma-separate more packages to further configure log4j.
As for an alternate way to view data that is in an Avro schema: kcat, or tools like AKHQ, Conduktor, etc.

Kafka consumer does not poll records intermittently

I have written a simple utility in Scala to read Kafka messages as byte arrays.
The utility works on one machine but not on the other. Both run the same OS (CentOS 7) and talk to the same Kafka server (which is on another machine altogether).
However, Kafka Tool (www.kafkatool.com) works on the machine where the utility does not, so it is unlikely to be an accessibility issue.
Following is the essence of the consumer code:
import java.io.{BufferedOutputStream, FileOutputStream}
import java.util.Properties
import org.apache.kafka.clients.consumer.KafkaConsumer
import scala.collection.JavaConverters._
val outputFile = "output.txt"
val topic = "test_topic"
val server = "localhost:9092"
val id = "record-tool"
val props = new Properties()
props.put("bootstrap.servers", server)
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer")
props.put("auto.offset.reset", "earliest")
props.put("enable.auto.commit", "false")
props.put("max.partition.fetch.bytes", "104857600")
props.put("group.id", id)
val bos = new BufferedOutputStream(new FileOutputStream(outputFile))
val consumer = new KafkaConsumer[String, Array[Byte]](props)
consumer.subscribe(Seq(topic).asJava)
// Poll until the first empty batch, writing each record value to the output file
Stream.continually(consumer.poll(5000).asScala.toList).takeWhile(_.nonEmpty).flatten.foreach(c => bos.write(c.value))
consumer.close()
bos.close()
I don't see any errors in the logs either. The debug log follows:
[root@vm util]# bin/record-tool consume --server=kafka-server:9092 --topic=test_topic --asBin --debug
16:44:02.548 [main] INFO org.apache.kafka.clients.consumer.ConsumerConfig - ConsumerConfig values:
metric.reporters = []
metadata.max.age.ms = 300000
partition.assignment.strategy = [org.apache.kafka.clients.consumer.RangeAssignor]
reconnect.backoff.ms = 50
sasl.kerberos.ticket.renew.window.factor = 0.8
max.partition.fetch.bytes = 104857600
bootstrap.servers = [kafka-server:9092]
ssl.keystore.type = JKS
enable.auto.commit = false
sasl.mechanism = GSSAPI
interceptor.classes = null
exclude.internal.topics = true
ssl.truststore.password = null
client.id =
ssl.endpoint.identification.algorithm = null
max.poll.records = 2147483647
check.crcs = true
request.timeout.ms = 40000
heartbeat.interval.ms = 3000
auto.commit.interval.ms = 5000
receive.buffer.bytes = 65536
ssl.truststore.type = JKS
ssl.truststore.location = null
ssl.keystore.password = null
fetch.min.bytes = 1
send.buffer.bytes = 131072
value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
group.id = record-tool
retry.backoff.ms = 100
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
ssl.trustmanager.algorithm = PKIX
ssl.key.password = null
fetch.max.wait.ms = 500
sasl.kerberos.min.time.before.relogin = 60000
connections.max.idle.ms = 540000
session.timeout.ms = 30000
metrics.num.samples = 2
key.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
ssl.protocol = TLS
ssl.provider = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.keystore.location = null
ssl.cipher.suites = null
security.protocol = PLAINTEXT
ssl.keymanager.algorithm = SunX509
metrics.sample.window.ms = 30000
auto.offset.reset = earliest
16:44:02.550 [main] DEBUG org.apache.kafka.clients.consumer.KafkaConsumer - Starting the Kafka consumer
16:44:02.621 [main] DEBUG org.apache.kafka.clients.Metadata - Updated cluster metadata version 1 to Cluster(nodes = [kafka-server:9092 (id: -1 rack: null)], partitions = [])
16:44:02.632 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name connections-closed:
16:44:02.636 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name connections-created:
16:44:02.637 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name bytes-sent-received:
16:44:02.637 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name bytes-sent:
16:44:02.638 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name bytes-received:
16:44:02.638 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name select-time:
16:44:02.639 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name io-time:
16:44:02.649 [main] INFO org.apache.kafka.clients.consumer.ConsumerConfig - ConsumerConfig values:
metric.reporters = []
metadata.max.age.ms = 300000
partition.assignment.strategy = [org.apache.kafka.clients.consumer.RangeAssignor]
reconnect.backoff.ms = 50
sasl.kerberos.ticket.renew.window.factor = 0.8
max.partition.fetch.bytes = 104857600
bootstrap.servers = [kafka-server:9092]
ssl.keystore.type = JKS
enable.auto.commit = false
sasl.mechanism = GSSAPI
interceptor.classes = null
exclude.internal.topics = true
ssl.truststore.password = null
client.id = consumer-1
ssl.endpoint.identification.algorithm = null
max.poll.records = 2147483647
check.crcs = true
request.timeout.ms = 40000
heartbeat.interval.ms = 3000
auto.commit.interval.ms = 5000
receive.buffer.bytes = 65536
ssl.truststore.type = JKS
ssl.truststore.location = null
ssl.keystore.password = null
fetch.min.bytes = 1
send.buffer.bytes = 131072
value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
group.id = record-tool
retry.backoff.ms = 100
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
ssl.trustmanager.algorithm = PKIX
ssl.key.password = null
fetch.max.wait.ms = 500
sasl.kerberos.min.time.before.relogin = 60000
connections.max.idle.ms = 540000
session.timeout.ms = 30000
metrics.num.samples = 2
key.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
ssl.protocol = TLS
ssl.provider = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.keystore.location = null
ssl.cipher.suites = null
security.protocol = PLAINTEXT
ssl.keymanager.algorithm = SunX509
metrics.sample.window.ms = 30000
auto.offset.reset = earliest
16:44:02.657 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name heartbeat-latency
16:44:02.657 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name join-latency
16:44:02.657 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name sync-latency
16:44:02.659 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name commit-latency
16:44:02.663 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name bytes-fetched
16:44:02.664 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name records-fetched
16:44:02.664 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name fetch-latency
16:44:02.664 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name records-lag
16:44:02.664 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name fetch-throttle-time
16:44:02.666 [main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka version : 0.10.0.1
16:44:02.666 [main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka commitId : a7a17cdec9eaa6c5
16:44:02.668 [main] DEBUG org.apache.kafka.clients.consumer.KafkaConsumer - Kafka consumer created
16:44:02.680 [main] DEBUG org.apache.kafka.clients.consumer.KafkaConsumer - Subscribed to topic(s): test_topic
16:44:02.681 [main] DEBUG org.apache.kafka.clients.consumer.internals.AbstractCoordinator - Sending coordinator request for group record-tool to broker kafka-server:9092 (id: -1 rack: null)
16:44:02.695 [main] DEBUG org.apache.kafka.clients.NetworkClient - Initiating connection to node -1 at kafka-server:9092.
16:44:02.816 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name node--1.bytes-sent
16:44:02.817 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name node--1.bytes-received
16:44:02.818 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name node--1.latency
16:44:02.820 [main] DEBUG org.apache.kafka.clients.NetworkClient - Completed connection to node -1
16:44:02.902 [main] DEBUG org.apache.kafka.clients.NetworkClient - Sending metadata request {topics=[test_topic]} to node -1
16:44:02.981 [main] DEBUG org.apache.kafka.clients.Metadata - Updated cluster metadata version 2 to Cluster(nodes = [kafka-server.mydomain.com:9092 (id: 0 rack: null)], partitions = [Partition(topic = test_topic, partition = 0, leader = 0, replicas = [0,], isr = [0,]])
16:44:02.982 [main] DEBUG org.apache.kafka.clients.consumer.internals.AbstractCoordinator - Received group coordinator response ClientResponse(receivedTimeMs=1583225042982, disconnected=false, request=ClientRequest(expectResponse=true, callback=org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler@434a63ab, request=RequestSend(header={api_key=10,api_version=0,correlation_id=0,client_id=consumer-1}, body={group_id=record-tool}), createdTimeMs=1583225042692, sendTimeMs=1583225042904), responseBody={error_code=0,coordinator={node_id=0,host=kafka-server.mydomain.com,port=9092}})
16:44:02.983 [main] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - Discovered coordinator kafka-server.mydomain.com:9092 (id: 2147483647 rack: null) for group record-tool.
16:44:02.983 [main] DEBUG org.apache.kafka.clients.NetworkClient - Initiating connection to node 2147483647 at kafka-server.mydomain.com:9092.
16:44:02.986 [main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - Revoking previously assigned partitions [] for group record-tool
16:44:02.986 [main] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - (Re-)joining group record-tool
16:44:02.988 [main] DEBUG org.apache.kafka.clients.consumer.internals.AbstractCoordinator - Sending JoinGroup ({group_id=record-tool,session_timeout=30000,member_id=,protocol_type=consumer,group_protocols=[{protocol_name=range,protocol_metadata=java.nio.HeapByteBuffer[pos=0 lim=25 cap=25]}]}) to coordinator kafka-server.mydomain.com:9092 (id: 2147483647 rack: null)
16:44:03.051 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name node-2147483647.bytes-sent
16:44:03.051 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name node-2147483647.bytes-received
16:44:03.052 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name node-2147483647.latency
16:44:03.052 [main] DEBUG org.apache.kafka.clients.NetworkClient - Completed connection to node 2147483647
16:44:03.123 [main] DEBUG org.apache.kafka.clients.consumer.internals.AbstractCoordinator - Received successful join group response for group record-tool: {error_code=0,generation_id=3,group_protocol=range,leader_id=consumer-1-f82633ab-a06e-4474-8ddb-1ec096d6c7f2,member_id=consumer-1-f82633ab-a06e-4474-8ddb-1ec096d6c7f2,members=[{member_id=consumer-1-f82633ab-a06e-4474-8ddb-1ec096d6c7f2,member_metadata=java.nio.HeapByteBuffer[pos=0 lim=25 cap=25]}]}
16:44:03.123 [main] DEBUG org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - Performing assignment for group record-tool using strategy range with subscriptions {consumer-1-f82633ab-a06e-4474-8ddb-1ec096d6c7f2=Subscription(topics=[test_topic])}
16:44:03.124 [main] DEBUG org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - Finished assignment for group record-tool: {consumer-1-f82633ab-a06e-4474-8ddb-1ec096d6c7f2=Assignment(partitions=[test_topic-0])}
16:44:03.124 [main] DEBUG org.apache.kafka.clients.consumer.internals.AbstractCoordinator - Sending leader SyncGroup for group record-tool to coordinator kafka-server.mydomain.com:9092 (id: 2147483647 rack: null): {group_id=record-tool,generation_id=3,member_id=consumer-1-f82633ab-a06e-4474-8ddb-1ec096d6c7f2,group_assignment=[{member_id=consumer-1-f82633ab-a06e-4474-8ddb-1ec096d6c7f2,member_assignment=java.nio.HeapByteBuffer[pos=0 lim=33 cap=33]}]}
16:44:03.198 [main] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - Successfully joined group record-tool with generation 3
16:44:03.199 [main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - Setting newly assigned partitions [test_topic-0] for group record-tool
16:44:03.200 [main] DEBUG org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - Group record-tool fetching committed offsets for partitions: [test_topic-0]
16:44:03.268 [main] DEBUG org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - Group record-tool has no committed offset for partition test_topic-0
16:44:03.269 [main] DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - Resetting offset for partition test_topic-0 to earliest offset.
16:44:03.270 [main] DEBUG org.apache.kafka.clients.NetworkClient - Initiating connection to node 0 at kafka-server.mydomain.com:9092.
16:44:03.336 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name node-0.bytes-sent
16:44:03.337 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name node-0.bytes-received
16:44:03.337 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name node-0.latency
16:44:03.338 [main] DEBUG org.apache.kafka.clients.NetworkClient - Completed connection to node 0
16:44:03.407 [main] DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - Fetched offset 0 for partition test_topic-0
16:44:06.288 [main] DEBUG org.apache.kafka.clients.consumer.internals.AbstractCoordinator - Received successful heartbeat response for group record-tool
16:44:07.702 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name connections-closed:
16:44:07.702 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name connections-created:
16:44:07.702 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name bytes-sent-received:
16:44:07.702 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name bytes-sent:
16:44:07.703 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name bytes-received:
16:44:07.703 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name select-time:
16:44:07.704 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name io-time:
16:44:07.704 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name node--1.bytes-sent
16:44:07.705 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name node--1.bytes-received
16:44:07.705 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name node--1.latency
16:44:07.705 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name node-2147483647.bytes-sent
16:44:07.706 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name node-2147483647.bytes-received
16:44:07.706 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name node-2147483647.latency
16:44:07.706 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name node-0.bytes-sent
16:44:07.707 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name node-0.bytes-received
16:44:07.707 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name node-0.latency
16:44:07.707 [main] DEBUG org.apache.kafka.clients.consumer.KafkaConsumer - The Kafka consumer has closed.
What I noticed is that the list reaching takeWhile(_.nonEmpty) is empty.
Is there any mistake in the code? Thanks.
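In the debug log above, the consumer closes at 16:44:07.7, about five seconds after subscribing at 16:44:02.68, which is consistent with a single poll(5000) that returned no records; takeWhile(_.nonEmpty) then ends the stream on that first empty batch. Whether a longer wait would have returned data on this machine is not something the log shows. As a sketch only, written in Java with the poll call abstracted behind a Supplier so that it runs without a broker, a loop could tolerate a few consecutive empty polls before giving up. The names TolerantPollLoop, drain and maxEmptyPolls are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

public class TolerantPollLoop {
    /**
     * Collects batches from poll until maxEmptyPolls consecutive batches
     * are empty, instead of stopping at the very first empty batch.
     */
    public static <T> List<T> drain(Supplier<List<T>> poll, int maxEmptyPolls) {
        List<T> out = new ArrayList<>();
        int emptyInARow = 0;
        while (emptyInARow < maxEmptyPolls) {
            List<T> batch = poll.get();
            if (batch.isEmpty()) {
                emptyInARow++;          // count this empty poll and try again
            } else {
                emptyInARow = 0;        // data arrived, reset the counter
                out.addAll(batch);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Simulated polls: an empty first batch, then data, then silence.
        Iterator<List<String>> polls = Arrays.asList(
                Collections.<String>emptyList(),
                Arrays.asList("a", "b"),
                Arrays.asList("c"),
                Collections.<String>emptyList(),
                Collections.<String>emptyList()).iterator();
        Supplier<List<String>> poll =
                () -> polls.hasNext() ? polls.next() : Collections.<String>emptyList();
        System.out.println(drain(poll, 2)); // prints [a, b, c]
    }
}
```

In the utility, the supplier would wrap consumer.poll(5000) and convert the returned records to a list; the takeWhile version corresponds to stopping at the first empty batch.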

Flink application sink KafkaProducer is throwing java heap space error (outofmemory)

I have created a Flink app which takes a datastream of strings and sinks it to Kafka. The datastream is built from a simple collection of strings with fromCollection.
List<String> listOfStrings = new ArrayList<>();
listOfStrings.add("testkafka1");
listOfStrings.add("testkafka2");
listOfStrings.add("testkafka3");
listOfStrings.add("testkafka4");
DataStream<String> testStringStream = env.fromCollection(listOfStrings);
Flink runs on Kubernetes with parallelism 1 and 1 task manager. As soon as the Flink job starts, it fails with the following error.
ERROR org.apache.kafka.common.utils.KafkaThread - Uncaught exception in kafka-producer-network-thread | producer-1:
java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:97)
at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:75)
at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:203)
at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:167)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:381)
at org.apache.kafka.common.network.Selector.poll(Selector.java:326)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:433)
at org.apache.kafka.clients.NetworkClientUtils.awaitReady(NetworkClientUtils.java:71)
at org.apache.kafka.clients.producer.internals.Sender.awaitLeastLoadedNodeReady(Sender.java:409)
at org.apache.kafka.clients.producer.internals.Sender.maybeSendTransactionalRequest(Sender.java:337)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:204)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:162)
at java.lang.Thread.run(Thread.java:748)
The TaskManager config I have is (taken from the TaskManager logs):
Starting Task Manager
config file:
jobmanager.rpc.address: component-app-adb71002-tm-5c6f4d58bd-rtblz
jobmanager.rpc.port: 6123
jobmanager.heap.size: 1024m
taskmanager.heap.size: 1024m
taskmanager.numberOfTaskSlots: 2
parallelism.default: 1
jobmanager.execution.failover-strategy: region
blob.server.port: 6124
query.server.port: 6125
blob.server.port: 6125
fs.s3a.aws.credentials.provider: org.apache.flink.fs.s3base.shaded.com.amazonaws.auth.DefaultAWSCredentialsProviderChain
jobmanager.heap.size: 524288k
jobmanager.rpc.port: 6123
jobmanager.web.port: 8081
metrics.internal.query-service.port: 50101
metrics.reporter.dghttp.apikey: f52362263f032f2ebc3622cafc0171cd
metrics.reporter.dghttp.class: org.apache.flink.metrics.datadog.DatadogHttpReporter
metrics.reporter.dghttp.tags: componentingestion,dev
query.server.port: 6124
taskmanager.heap.size: 1048576k
taskmanager.numberOfTaskSlots: 1
web.upload.dir: /opt/flink
jobmanager.rpc.address: component-app-adb71002
taskmanager.host: 10.42.6.6
Starting taskexecutor as a console application on host component-app-adb71002-tm-5c6f4d58bd-rtblz.
2020-02-11 15:19:20,519 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - --------------------------------------------------------------------------------
2020-02-11 15:19:20,520 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Starting TaskManager (Version: 1.9.2, Rev:c9d2c90, Date:24.01.2020 # 08:44:30 CST)
2020-02-11 15:19:20,520 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - OS current user: flink
2020-02-11 15:19:20,520 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Current Hadoop/Kerberos user: <no hadoop dependency found>
2020-02-11 15:19:20,520 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - JVM: OpenJDK 64-Bit Server VM - Oracle Corporation - 1.8/25.242-b08
2020-02-11 15:19:20,521 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Maximum heap size: 922 MiBytes
2020-02-11 15:19:20,521 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - JAVA_HOME: /usr/local/openjdk-8
2020-02-11 15:19:20,521 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - No Hadoop Dependency available
2020-02-11 15:19:20,521 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - JVM Options:
2020-02-11 15:19:20,521 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - -XX:+UseG1GC
2020-02-11 15:19:20,521 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - -Xms922M
2020-02-11 15:19:20,521 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - -Xmx922M
2020-02-11 15:19:20,521 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - -XX:MaxDirectMemorySize=8388607T
2020-02-11 15:19:20,521 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - -Dlog4j.configuration=file:/opt/flink/conf/log4j-console.properties
2020-02-11 15:19:20,522 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - -Dlogback.configurationFile=file:/opt/flink/conf/logback-console.xml
2020-02-11 15:19:20,522 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Program Arguments:
2020-02-11 15:19:20,522 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - --configDir
2020-02-11 15:19:20,522 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - /opt/flink/conf
2020-02-11 15:19:20,522 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Classpath: /opt/flink/lib/flink-metrics-datadog-1.9.2.jar:/opt/flink/lib/flink-table-blink_2.12-1.9.2.jar:/opt/flink/lib/flink-table_2.12-1.9.2.jar:/opt/flink/lib/log4j-1.2.17.jar:/opt/flink/lib/slf4j-log4j12-1.7.15.jar:/opt/flink/lib/flink-dist_2.12-1.9.2.jar:::
The producer config that I have is:
acks = 1
batch.size = 16384
bootstrap.servers = [XXXXXXXXXXXXXXXX] ---I masked it intentionally
buffer.memory = 33554432
client.id =
compression.type = none
connections.max.idle.ms = 540000
enable.idempotence = false
interceptor.classes = null
key.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
linger.ms = 0
max.block.ms = 60000
max.in.flight.requests.per.connection = 5
max.request.size = 1048576
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
receive.buffer.bytes = 32768
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retries = 3
retry.backoff.ms = 100
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
transaction.timeout.ms = 60000
transactional.id = Source: Collection Source -> Sink: Unnamed-eb99017e0f9125fa6648bf56123bdcf7-19
value.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
Most of the producer config is default. Is there something I am missing here, or is something wrong with the config?
As Dominik suggested, the issue is not related to the heap.
This exception is thrown when the broker is set up with SSL authentication and the client is not configured for SSL.
This is a known Kafka bug:
https://issues.apache.org/jira/browse/KAFKA-4090
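To make the failure mode more concrete: a plaintext client treats the first four bytes of whatever comes back from the broker as a message length and allocates a buffer of that size. The sketch below assumes the TLS listener answers with an alert record starting with the bytes 0x15 0x03 0x03 0x00 (an assumption about the broker's reply, not something visible in the logs above), and then shows client properties that switch the producer to SSL. The class name, truststore path and password are hypothetical placeholders; the property keys are standard Kafka client configs.

```java
import java.nio.ByteBuffer;
import java.util.Properties;

final class SslClientSketch {

    // Interprets the first four bytes of a response as a big-endian length,
    // the way a reader of a size-prefixed protocol would.
    static int sizePrefix(byte[] firstBytes) {
        return ByteBuffer.wrap(firstBytes, 0, 4).getInt();
    }

    // Producer settings for a TLS listener. The values are placeholders
    // and must be replaced with the real truststore details.
    static Properties sslProducerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("security.protocol", "SSL");
        props.setProperty("ssl.truststore.location", "/path/to/client.truststore.jks");
        props.setProperty("ssl.truststore.password", "changeit");
        return props;
    }

    public static void main(String[] args) {
        // Assumed start of a TLS alert record: type 0x15, version 0x0303, length high byte 0x00.
        byte[] assumedTlsAlertStart = {0x15, 0x03, 0x03, 0x00};
        System.out.println("Bytes read as a size: " + sizePrefix(assumedTlsAlertStart));
        System.out.println(sslProducerProps("broker:9093"));
    }
}
```

Under that assumption the client would request several hundred megabytes in a single allocation, which would be consistent with an OutOfMemoryError on a 922 MiB heap even though the job itself only sends four short strings.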

Offset commit failed on partition

Kafka consumer continues to print error messages
I built a Kafka cluster (version 2.3.0) on 5 machines. It has a topic with a single partition (partition 0) and a replication factor of 3. When I consume using the kafka-clients API, the consumer keeps logging this exception:
Offset commit failed on partition test-0 at offset 1: The request timed out.
But reading from and writing to this topic work fine.
Consumer configuration:
auto.commit.interval.ms = 5000
auto.offset.reset = latest
bootstrap.servers = [qs-kfk-01:9092, qs-kfk-02:9092, qs-kfk-03:9092, qs-kfk-04:9092, qs-kfk-05:9092]
check.crcs = true
client.id =
connections.max.idle.ms = 540000
default.api.timeout.ms = 60000
enable.auto.commit = true
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
group.id = erp-sales
heartbeat.interval.ms = 3000
interceptor.classes = []
internal.leave.group.on.close = true
isolation.level = read_uncommitted
key.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
max.partition.fetch.bytes = 1048576
max.poll.interval.ms = 300000
max.poll.records = 500
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retry.backoff.ms = 100
sasl.client.callback.handler.class = null
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.login.callback.handler.class = null
sasl.login.class = null
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
session.timeout.ms = 10000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = https
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
value.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
Java code:
ConsumerRecords<K, V> consumerRecords = _kafkaConsumer.poll(50L);
for (ConsumerRecord<K, V> record : consumerRecords.records(topic)) {
    kafkaConsumer.receive(topic, record.key(), record.value(), record.partition(), record.offset());
}
I have tried the following:
Increased the request timeout to 5 minutes (did not help).
Switched to a different group id (this works): the problem only occurs as long as I use this particular group id.
Killed the machine where the group coordinator is located. After the group coordinator moved to another machine, the error remained.
I get continuous error output in the console:
2019-11-03 16:21:11.687 DEBUG org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-1, groupId=erp-sales] Sending asynchronous auto-commit of offsets {test-0=OffsetAndMetadata{offset=1, metadata=''}}
2019-11-03 16:21:11.704 ERROR org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-1, groupId=erp-sales] Offset commit failed on partition test-0 at offset 1: The request timed out.
2019-11-03 16:21:11.704 INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=consumer-1, groupId=erp-sales] Group coordinator qs-kfk-04:9092 (id: 2147483643 rack: null) is unavailable or invalid, will attempt rediscovery
2019-11-03 16:21:11.705 DEBUG org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-1, groupId=erp-sales] Asynchronous auto-commit of offsets {test-0=OffsetAndMetadata{offset=1, metadata=''}} failed due to retriable error: {}
org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit failed with a retriable exception. You should retry committing the latest consumed offsets.
Caused by: org.apache.kafka.common.errors.TimeoutException: The request timed out.
2019-11-03 16:21:11.708 DEBUG org.apache.kafka.clients.NetworkClient - [Consumer clientId=consumer-1, groupId=erp-sales] Manually disconnected from 2147483643. Removed requests: OFFSET_COMMIT.
2019-11-03 16:21:11.708 DEBUG org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient - [Consumer clientId=consumer-1, groupId=erp-sales] Cancelled request with header RequestHeader(apiKey=OFFSET_COMMIT, apiVersion=4, clientId=consumer-1, correlationId=42) due to node 2147483643 being disconnected
2019-11-03 16:21:11.708 DEBUG org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-1, groupId=erp-sales] Asynchronous auto-commit of offsets {test-0=OffsetAndMetadata{offset=1, metadata=''}} failed due to retriable error: {}
org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit failed with a retriable exception. You should retry committing the latest consumed offsets.
Caused by: org.apache.kafka.common.errors.DisconnectException: null
Why did the offset commit fail?
Why is the offset information in the Kafka cluster still correct when the commit fails?
Thank you for your help.
The main cause of this issue is that the group coordinator is unavailable.
This issue has already been raised in the Kafka community (KAFKA-7017).
You can work around it by deleting the internal offsets topic (__consumer_offsets) and restarting the cluster. Note that this discards the committed offsets of every consumer group.
The following post has a few more details:
https://www.stuckintheloop.com/posts/my_first_kafka_outage/
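The observation that only one group id is affected fits how a coordinator is chosen. As far as I know, the broker hashes the group id to pick a partition of the internal offsets topic, and the leader of that partition acts as the group coordinator. Below is a rough sketch of that mapping, assuming the default of 50 partitions for offsets.topic.num.partitions; the class and method names are made up for illustration.

```java
final class CoordinatorPartitionSketch {

    // Computes abs(groupId.hashCode()) % partitionCount, treating
    // Integer.MIN_VALUE as 0 because Math.abs cannot represent its negation.
    static int offsetsPartitionFor(String groupId, int partitionCount) {
        int hash = groupId.hashCode();
        int nonNegative = (hash == Integer.MIN_VALUE) ? 0 : Math.abs(hash);
        return nonNegative % partitionCount;
    }

    public static void main(String[] args) {
        int partitions = 50; // assumed default of offsets.topic.num.partitions
        for (String group : new String[] {"erp-sales", "fs-sales-be"}) {
            System.out.println(group + " -> partition " + offsetsPartitionFor(group, partitions));
        }
    }
}
```

If the two group ids land on different partitions, a problem with the leader or replicas of one partition would explain why switching the group id made the error go away.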
@Rohit Yadav, thanks for the answer. I have switched to a different consumer group for the time being and it is working well. But now the client has another problem; it continuously outputs:
2019-11-05 14:11:14.892 INFO org.apache.kafka.clients.FetchSessionHandler - [Consumer clientId=consumer-1, groupId=fs-sales-be] Error sending fetch request (sessionId=2035993355, epoch=1) to node 4: org.apache.kafka.common.errors.DisconnectException.
2019-11-05 14:11:14.892 INFO org.apache.kafka.clients.FetchSessionHandler - [Consumer clientId=consumer-1, groupId=fs-sales-be] Error sending fetch request (sessionId=181175071, epoch=INITIAL) to node 5: org.apache.kafka.common.errors.DisconnectException.
What causes this? The status of nodes 4 and 5 is OK.

Spring Kafka consumer doesn't commit to Kafka server after leader changed

I am using spring-kafka 2.1.10.RELEASE. I have a consumer with the following properties (almost all of them copied):
auto.commit.interval.ms = 5000
auto.offset.reset = earliest
bootstrap.servers = [kafka1.local:9093, kafka2.local:9093, kafka3.local:9093]
check.crcs = true
client.id = kafkaListener-0
connections.max.idle.ms = 540000
enable.auto.commit = true
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
group.id = kafkaLisneterContainer
heartbeat.interval.ms = 3000
interceptor.classes = null
internal.leave.group.on.close = true
isolation.level = read_uncommitted
max.poll.interval.ms = 300000
max.poll.records = 50
metadata.max.age.ms = 300000
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 305000
retry.backoff.ms = 100
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
session.timeout.ms = 10000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
The Apache Kafka version in production is 2.11-1.0.0-0pan4.
The cluster has 3 Kafka nodes.
I faced a serious problem that I cannot even reproduce locally. This is what happened:
1) I started my application with both a Kafka producer and a consumer inside.
2) Everything worked fine until the leader node for my topic changed at 2019-01-17 06:47:39:
2019-01-17/controller.2019-01-17-03.aaa-aa3.gz:2019-01-17 06:47:39,365 +0000 [controller-event-thread] [kafka.controller.KafkaController] INFO [Controller id=3] New leader and ISR for partition topic_name-0 is {"leader":1,"leader_epoch":3,"isr":[1,3]} (kafka.controller.KafkaController)
3) After that my consumer stopped committing offsets to Kafka. The last commit took place in the same hour and minute in which the leader changed: 17 January 2019, 06:47.
4) MOST MYSTERIOUS: in the application everything kind of works OK. The Spring consumer reads new messages and sends them to Kafka. I see logs like these. It seems the Spring consumer keeps its offset in memory and sends commits to the remote Kafka (no errors, etc.):
2019-01-23 14:03:20,975 +0000 [kafkaLisneterContainer-0-C-1] [Fetcher] DEBUG [Consumer clientId=kafkaListener-0, groupId=kafkaLisneterContainer] Fetch READ_UNCOMMITTED at offset 164871 for partition aaa-1 returned fetch data (error=NONE, highWaterMark=164871, lastStableOffset = -1, logStartOffset = 116738, abortedTransactions = null, recordsSizeInBytes=0)
2019-01-23 14:03:20,975 +0000 [externalbetting] [kafkaLisneterContainer-0-C-1] [Fetcher] DEBUG [Consumer clientId=kafkaListener-0, groupId=kafkaLisneterContainer] Added READ_UNCOMMITTED fetch request for partition eaaa-1 at offset 164871 to node aaa-aa1.local:9093 (id: 1 rack: null)
2019-01-23 14:03:20,975
5) Nevertheless, the lag in Kafka keeps growing. If I restart my application, the Spring consumer bean is re-created and loses its in-memory offset. It then reads the lagging records from Kafka and processes them a second time.
Please help me find the cause!
When you enable auto commit (Kafka's default), the commits are completely managed by kafka-clients and Spring has no control over them.
Setting it to false allows the listener container to commit the offsets, which it does after each batch of records (poll result) by default, or after every record if you set the container AckMode property to RECORD.
The container will also reliably commit any pending offsets when partitions are revoked due to a rebalance.
I generally recommend not using auto commit.
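A minimal sketch of the consumer properties this implies. Only enable.auto.commit = false comes from the advice above; the class and helper names and the example values are illustrative assumptions, and the map is plain Java so it runs without Spring on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

final class ManualCommitConsumerProps {

    // Builds consumer properties with auto commit disabled, so that the
    // listener container, not kafka-clients, decides when offsets are committed.
    static Map<String, Object> consumerProps(String bootstrapServers, String groupId) {
        Map<String, Object> props = new HashMap<>();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("group.id", groupId);
        props.put("enable.auto.commit", false);
        return props;
    }

    public static void main(String[] args) {
        // Example values taken from the question's configuration dump.
        System.out.println(consumerProps("kafka1.local:9093", "kafkaLisneterContainer"));
    }
}
```

In a Spring application a map like this would typically be passed to the consumer factory, with the AckMode (for example RECORD) set on the listener container's properties.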