Sending events with Reactor Kafka seems very slow - apache-kafka

My project uses Reactor Kafka 1.2.2.RELEASE to send events serialized with Avro to my Kafka broker. This works well, but sending events seems to be quite slow.
We attached a custom metric to the lifecycle of the Mono that sends the event through KafkaSender.send(), and noticed it took around 100 ms to deliver a message.
We did it this way:
send(event, code, getKafkaHeaders()).transform(withMetric("eventName"));
The send method just builds the record and sends it:
private Mono<Void> send(SpecificRecord value, String code, final List<Header> headers) {
    final var producerRecord = new ProducerRecord<>("myTopic", null, code, value, headers);
    final var record = Mono.just(SenderRecord.create(producerRecord, code));
    return Optional.ofNullable(kafkaSender.send(record)).orElseGet(Flux::empty)
            .switchMap(this::errorIfAnyException)
            .doOnNext(res -> log.info("Successfully sent event on topic {}", res.recordMetadata().topic()))
            .then();
}
And the withMetric transformer links a metric to the send mono lifecycle:
private Function<Mono<Void>, Mono<Void>> withMetric(final String methodName) {
    return mono -> Mono.justOrEmpty(this.metricProvider)
            .map(provider -> provider.buildMethodExecutionTimeMetric(methodName, "kafka"))
            .flatMap(metric -> mono.doOnSubscribe(subscription -> metric.start())
                    .doOnTerminate(metric::end));
}
It is this custom metric that reports an average of 100 ms.
We compared it to our Kafka producer metrics, and noticed that those reported an average of 40 ms to deliver a message (0 ms of queueing and 40 ms of request latency).
We have difficulty understanding the delta, and wonder if it could come from the way Reactor Kafka sends events.
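For reference, the producer-side figure can be read straight from the underlying producer through KafkaSender.doOnProducer; here is a minimal sketch (a hypothetical helper in the same class as send(), using the standard request-latency-avg metric from the kafka-clients producer-metrics group):
// Hypothetical helper, assuming the same kafkaSender field as in send() above.
// Reads the kafka-clients producer metric "request-latency-avg" (group "producer-metrics")
// so it can be compared with the Mono-lifecycle metric measured by withMetric().
private Mono<Double> requestLatencyAvg() {
    return kafkaSender.doOnProducer(producer ->
            producer.metrics().entrySet().stream()
                    .filter(e -> "request-latency-avg".equals(e.getKey().name())
                            && "producer-metrics".equals(e.getKey().group()))
                    .map(e -> (Double) e.getValue().metricValue())
                    .findFirst()
                    .orElse(Double.NaN));
}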
Can anybody help please?
UPDATE
Here's a sample of my producer config:
acks = all
batch.size = 16384
buffer.memory = 33554432
client.dns.lookup = default
compression.type = none
connections.max.idle.ms = 540000
delivery.timeout.ms = 120000
enable.idempotence = true
key.serializer = class org.apache.kafka.common.serialization.StringSerializer
linger.ms = 0
max.block.ms = 60000
max.in.flight.requests.per.connection = 5
max.request.size = 1048576
metadata.max.age.ms = 300000
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
receive.buffer.bytes = 32768
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retries = 2147483647
retry.backoff.ms = 100
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = https
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.trustmanager.algorithm = PKIX
ssl.truststore.type = JKS
transaction.timeout.ms = 60000
value.serializer = class io.confluent.kafka.serializers.KafkaAvroSerializer
Also, maxInFlight is 256 and the scheduler is single; I didn't configure anything special there.
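For completeness, here is a sketch of where those two knobs live in Reactor Kafka (the values shown are just the defaults mentioned above, and producerProps is assumed to hold the producer configuration from this update):
// Sketch only: SenderOptions with the defaults referred to above.
SenderOptions<String, SpecificRecord> senderOptions =
        SenderOptions.<String, SpecificRecord>create(producerProps)
                .maxInFlight(256)                 // max records in flight while acknowledgements are pending
                .scheduler(Schedulers.single());  // scheduler used to publish send results
KafkaSender<String, SpecificRecord> kafkaSender = KafkaSender.create(senderOptions);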

Related

Kafka Message loss when Kafkaserver goes down

I have a scenario where the Kafka server or ZooKeeper (or both) goes down while the producer application is trying to send a message to the Kafka broker.
When the Kafka broker or ZooKeeper comes back online, messages get lost, and I see the producer delivery result with Offset: -1001 and Partition: [Any].
Below is my configuration to reproduce the issue:
new ProducerConfig
{
    BootstrapServers = <BROKET_END_POINT>,
    MessageMaxBytes = 20971520,
    MessageCopyMaxBytes = 20971520,
    LogConnectionClose = false,
    BrokerAddressFamily = BrokerAddressFamily.V4,
    ConnectionsMaxIdleMs = 0,
    //https://github.com/Azure/azure-event-hubs-for-kafka/issues/139
    SocketKeepaliveEnable = true,
    MetadataMaxAgeMs = 180000,
    //https://learn.microsoft.com/en-us/azure/event-hubs/apache-kafka-configurations
    RequestTimeoutMs = 60000,
    MessageTimeoutMs = 0,
    MessageSendMaxRetries = int.MaxValue,
    RetryBackoffMs = 100,
    EnableIdempotence = true,
    Acks = Acks.All,
    BatchSize = 2000000,
    LingerMs = 0,
    CompressionType = CompressionType.Gzip,
    SocketNagleDisable = true
};
I used the above settings to produce messages using the Confluent.Kafka NuGet package version 1.8.2 from .NET code.
Appreciate your inputs in advance

How to use the useBulkCopyForBatchInsert on the JdbcSinkConnector?

I'm trying to bulk insert into an MSSQL DB table by adding "useBulkCopyForBatchInsert=true" to the connection.url option of the JdbcSinkConnector, as below.
"connection.url": "jdbc:sqlserver://...:1433;database=****;useBulkCopyForBatchInsert=true"
But data is not being inserted using bulk insert.
I will attach the connect log and reference document.
Using bulk copy API for batch insert operation
https://learn.microsoft.com/en-us/sql/connect/jdbc/use-bulk-copy-api-batch-insert-operation?view=sql-server-ver16
Connect Log
[2022-07-18 16:46:32,224] INFO JdbcSinkConfig values:
auto.create = false
auto.evolve = false
batch.size = 3000
connection.attempts = 3
connection.backoff.ms = 10000
connection.password = [hidden]
connection.url = jdbc:sqlserver://...:1433;database=****;useBulkCopyForBatchInsert=true
connection.user = ****
db.timezone = Asia/Seoul
delete.enabled = false
dialect.name =
fields.whitelist = []
insert.mode = insert
max.retries = 10
pk.fields = []
pk.mode = none
quote.sql.identifiers = ALWAYS
retry.backoff.ms = 3000
table.name.format = ****
table.types = [TABLE]
(io.confluent.connect.jdbc.sink.JdbcSinkConfig:361)
I'm not sure if this is the issue, but you're using the wrong JDBC URL for SQL Server. You should use jdbc:sqlserver://...:1433;databaseName=; instead of jdbc:sqlserver://...:1433;database=;.
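If that is the cause, the corrected property would look roughly like this (keeping the elided host and database placeholders from the question):
"connection.url": "jdbc:sqlserver://...:1433;databaseName=****;useBulkCopyForBatchInsert=true"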

Lagged offsets skipped after new event is published before max poll interval in KAFKA

Kafka v2.4 Consumer Configurations:-
kafka.consumer.auto.offset.reset=earliest
kafka.consumer.auto.commit=false
Kafka consumer container config:-
@Bean
public ConcurrentKafkaListenerContainerFactory<String, PayoutDto> kafkaPayoutStatusPoolListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, PayoutDto> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(kafkaConsumerFactoryForPayoutEvent());
    factory.getContainerProperties().setAckMode(AckMode.MANUAL_IMMEDIATE);
    factory.setMissingTopicsFatal(false);
    return factory;
}
Kafka consumer:-
@KafkaListener(id = "regularPayoutEventConsumer", topics = "${kafka.regular.payout.consumer.queuename}",
        containerFactory = "kafkaPayoutStatusPoolListenerContainerFactory", groupId = "${kafka.regular.payout.consumer.groupId}")
public void listen(ConsumerRecord<String, PayoutDto> consumerRecord, Acknowledgment ack) {
    StopWatch watch = new StopWatch();
    watch.start();
    String key = null;
    Long offset = null;
    try {
        PayoutDto payoutDto = consumerRecord.value();
        key = consumerRecord.key();
        offset = consumerRecord.offset();
        cpAccountsService.processPayoutEvent(payoutDto);
        ack.acknowledge();
    } catch (Exception e) {
        log.error("Exception occurred in RegularPayoutEventConsumer due to following issue {}", e);
    } finally {
        watch.stop();
        log.debug("total time taken by consumer for requestID:" + key + " on offset:" + offset + " is:"
                + watch.getTotalTimeMillis());
    }
}
Success Scenario:-
The consumer fails to acknowledge because of an exception, which creates a lag; let's say the last committed offset is 30 and the lag is now 4.
On the next auto poll cycle after the poll interval, the consumer resumes consuming from offset 30 up to 33 normally, and the lag is now 0.
Failed Scenario:-
Same as step 1 from the success scenario.
Now, before the consumer's poll interval elapses, the producer pushes a new message.
On this new producer event, the consumer pulls data and jumps directly to offset 33, skipping 30, 31 and 32 and clearing the lag to 0.
App startup logs of kafka:-
2021-04-14 10:38:06.132 INFO 10286 --- [ restartedMain] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-RegularPayoutEventGroupId-3, groupId=RegularPayoutEventGroupId] Subscribed to topic(s): InstantPayoutTransactionsEv
2021-04-14 10:38:06.132 INFO 10286 --- [ restartedMain] o.s.s.c.ThreadPoolTaskScheduler : Initializing ExecutorService
2021-04-14 10:38:06.133 INFO 10286 --- [ restartedMain] o.a.k.clients.consumer.ConsumerConfig : ConsumerConfig values:
allow.auto.create.topics = true
auto.commit.interval.ms = 5000
auto.offset.reset = earliest
bootstrap.servers = [localhost:9092]
check.crcs = true
client.dns.lookup = use_all_dns_ips
client.id = consumer-PayoutEventGroupId-4
client.rack =
connections.max.idle.ms = 540000
default.api.timeout.ms = 60000
enable.auto.commit = false
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
group.id = PayoutEventGroupId
group.instance.id = null
heartbeat.interval.ms = 3000
interceptor.classes = []
internal.leave.group.on.close = true
internal.throw.on.fetch.stable.offset.unsupported = false
isolation.level = read_uncommitted
key.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
max.partition.fetch.bytes = 1048576
max.poll.interval.ms = 30000
max.poll.records = 500
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retry.backoff.ms = 100
sasl.client.callback.handler.class = null
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.login.callback.handler.class = null
sasl.login.class = null
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
security.providers = null
send.buffer.bytes = 131072
session.timeout.ms = 10000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.3]
ssl.endpoint.identification.algorithm = https
ssl.engine.factory.class = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLSv1.3
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
value.deserializer = class com.cms.cpa.config.KafkaPayoutDeserializer
2021-04-14 10:38:06.137 INFO 10286 --- [ restartedMain] o.a.kafka.common.utils.AppInfoParser : Kafka version: 2.6.0
2021-04-14 10:38:06.137 INFO 10286 --- [ restartedMain] o.a.kafka.common.utils.AppInfoParser : Kafka commitId: 62abe01bee039651
Kafka maintains two values for a consumer/partition: the committed offset (where the consumer will start if it is restarted) and the position (which record will be returned on the next poll).
Not acknowledging a record will not move the position back.
It is working as-designed; if you want to re-process a failed record, you need to use acknowledgment.nack() with an optional sleep time, or throw an exception and configure a SeekToCurrentErrorHandler.
In those cases, the container will reposition the partitions so that the failed record is redelivered. With the error handler you can "recover" the failed record after the retries are exhausted. When using nack(), the listener has to keep track of the attempts.
See https://docs.spring.io/spring-kafka/docs/current/reference/html/#committing-offsets
and https://docs.spring.io/spring-kafka/docs/current/reference/html/#annotation-error-handling
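As a rough sketch of the error-handler route (Spring for Apache Kafka 2.3-2.6 era API; the back-off values are placeholders), configure a SeekToCurrentErrorHandler on the existing factory and let the listener rethrow instead of swallowing the exception, so the container re-seeks the partition and offsets 30-32 are redelivered rather than skipped:
// imports: org.springframework.kafka.listener.SeekToCurrentErrorHandler, org.springframework.util.backoff.FixedBackOff
@Bean
public ConcurrentKafkaListenerContainerFactory<String, PayoutDto> kafkaPayoutStatusPoolListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, PayoutDto> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(kafkaConsumerFactoryForPayoutEvent());
    factory.getContainerProperties().setAckMode(AckMode.MANUAL_IMMEDIATE);
    factory.setMissingTopicsFatal(false);
    // Retry each failed record twice, 1 second apart, then pass it to the (default) recoverer.
    factory.setErrorHandler(new SeekToCurrentErrorHandler(new FixedBackOff(1000L, 2L)));
    return factory;
}
Alternatively, keep the manual ack and call ack.nack(1000) in the catch block instead of only logging; as noted above, the listener then has to keep track of the attempts itself.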

Ksql avro format in quickstart gives error

Hey, I'm doing the KSQL quickstart example. The problem is that when I try to generate data in Avro format, it throws the error I list at the bottom.
The tutorial is at https://docs.ksqldb.io/en/latest/tutorials/basics-docker/
To reproduce the problem:
git clone https://github.com/confluentinc/ksql.git
cd ksql
git checkout 5.5.0-post
cd docs/tutorials/
docker-compose up -d
If you run docker ps, you should see the following containers running:
confluentinc/ksqldb-examples:5.5.0
confluentinc/cp-ksql-server:5.4.0
confluentinc/cp-schema-registry:5.4.0
confluentinc/cp-enterprise-kafka:5.4.0
confluentinc/cp-zookeeper:5.4.0
If I run the example in delimited format like this, the code works:
docker run --network tutorials_default --rm --name datagen-users \
confluentinc/ksqldb-examples:5.5.0 \
ksql-datagen \
bootstrap-server=kafka:39092 \
quickstart=users \
format=delimited \
topic=users \
msgRate=1
This is the Avro-format version, which is what I would like to use:
docker run --network tutorials_default --rm --name datagen-users \
confluentinc/ksqldb-examples:5.5.0 \
ksql-datagen \
bootstrap-server=kafka:39092 \
quickstart=users \
format=avro \
topic=users \
msgRate=1
When I use avro format it gives me the error that follows:
[2020-06-06 15:46:47,632] INFO AvroDataConfig values:
connect.meta.data = true
enhanced.avro.schema.support = false
schemas.cache.config = 1000
(io.confluent.connect.avro.AvroDataConfig:347)
[2020-06-06 15:46:47,650] INFO JsonSchemaDataConfig values:
decimal.format = BASE64
schemas.cache.size = 1000
(io.confluent.connect.json.JsonSchemaDataConfig:179)
[2020-06-06 15:46:47,651] INFO JsonSchemaDataConfig values:
decimal.format = BASE64
schemas.cache.size = 1000
(io.confluent.connect.json.JsonSchemaDataConfig:179)
[2020-06-06 15:46:47,654] INFO ProtobufDataConfig values:
schemas.cache.config = 1000
(io.confluent.connect.protobuf.ProtobufDataConfig:179)
[2020-06-06 15:46:47,672] INFO KsqlConfig values:
ksql.access.validator.enable = auto
ksql.authorization.cache.expiry.time.secs = 30
ksql.authorization.cache.max.entries = 10000
ksql.connect.url = http://localhost:8083
ksql.connect.worker.config =
ksql.extension.dir = ext
ksql.hidden.topics = [_confluent.*, __confluent.*, _schemas, __consumer_offsets, __transaction_state, connect-configs, connect-offsets, connect-status, connect-statuses]
ksql.insert.into.values.enabled = true
ksql.internal.topic.min.insync.replicas = 1
ksql.internal.topic.replicas = 1
ksql.metric.reporters = []
ksql.metrics.extension = null
ksql.metrics.tags.custom =
ksql.new.api.enabled = false
ksql.output.topic.name.prefix =
ksql.persistence.wrap.single.values = true
ksql.persistent.prefix = query_
ksql.pull.queries.enable = true
ksql.query.persistent.active.limit = 2147483647
ksql.query.pull.enable.standby.reads = false
ksql.query.pull.max.allowed.offset.lag = 9223372036854775807
ksql.readonly.topics = [_confluent.*, __confluent.*, _schemas, __consumer_offsets, __transaction_state, connect-configs, connect-offsets, connect-status, connect-statuses]
ksql.schema.registry.url = http://localhost:8081
ksql.security.extension.class = null
ksql.service.id = default_
ksql.sink.window.change.log.additional.retention = 1000000
ksql.streams.shutdown.timeout.ms = 300000
ksql.transient.prefix = transient_
ksql.udf.collect.metrics = false
ksql.udf.enable.security.manager = true
ksql.udfs.enabled = true
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = https
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
(io.confluent.ksql.util.KsqlConfig:347)
[2020-06-06 15:46:47,720] INFO AvroDataConfig values:
connect.meta.data = true
enhanced.avro.schema.support = false
schemas.cache.config = 1
(io.confluent.connect.avro.AvroDataConfig:347)
[2020-06-06 15:46:47,752] INFO ProcessingLogConfig values:
ksql.logging.processing.rows.include = false
ksql.logging.processing.stream.auto.create = false
ksql.logging.processing.stream.name = KSQL_PROCESSING_LOG
ksql.logging.processing.topic.auto.create = false
ksql.logging.processing.topic.name =
ksql.logging.processing.topic.partitions = 1
ksql.logging.processing.topic.replication.factor = 1
(io.confluent.ksql.logging.processing.ProcessingLogConfig:347)
[2020-06-06 15:46:47,767] INFO AvroConverterConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.connect.avro.AvroConverterConfig:179)
[2020-06-06 15:46:47,770] INFO KafkaAvroSerializerConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.kafka.serializers.KafkaAvroSerializerConfig:179)
[2020-06-06 15:46:47,771] INFO KafkaAvroDeserializerConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
specific.avro.reader = false
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.kafka.serializers.KafkaAvroDeserializerConfig:179)
[2020-06-06 15:46:47,771] INFO AvroDataConfig values:
connect.meta.data = false
enhanced.avro.schema.support = false
schemas.cache.config = 1000
(io.confluent.connect.avro.AvroDataConfig:347)
[2020-06-06 15:46:47,772] INFO AvroConverterConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.connect.avro.AvroConverterConfig:179)
[2020-06-06 15:46:47,772] INFO KafkaAvroSerializerConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.kafka.serializers.KafkaAvroSerializerConfig:179)
[2020-06-06 15:46:47,772] INFO KafkaAvroDeserializerConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
specific.avro.reader = false
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.kafka.serializers.KafkaAvroDeserializerConfig:179)
[2020-06-06 15:46:47,773] INFO AvroDataConfig values:
connect.meta.data = false
enhanced.avro.schema.support = false
schemas.cache.config = 1000
(io.confluent.connect.avro.AvroDataConfig:347)
[2020-06-06 15:46:47,774] INFO AvroConverterConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.connect.avro.AvroConverterConfig:179)
[2020-06-06 15:46:47,775] INFO KafkaAvroSerializerConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.kafka.serializers.KafkaAvroSerializerConfig:179)
[2020-06-06 15:46:47,775] INFO KafkaAvroDeserializerConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
specific.avro.reader = false
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.kafka.serializers.KafkaAvroDeserializerConfig:179)
[2020-06-06 15:46:47,776] INFO AvroDataConfig values:
connect.meta.data = false
enhanced.avro.schema.support = false
schemas.cache.config = 1000
(io.confluent.connect.avro.AvroDataConfig:347)
[2020-06-06 15:46:47,776] INFO AvroConverterConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.connect.avro.AvroConverterConfig:179)
[2020-06-06 15:46:47,776] INFO KafkaAvroSerializerConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.kafka.serializers.KafkaAvroSerializerConfig:179)
[2020-06-06 15:46:47,777] INFO KafkaAvroDeserializerConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
specific.avro.reader = false
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.kafka.serializers.KafkaAvroDeserializerConfig:179)
[2020-06-06 15:46:47,777] INFO AvroDataConfig values:
connect.meta.data = false
enhanced.avro.schema.support = false
schemas.cache.config = 1000
(io.confluent.connect.avro.AvroDataConfig:347)
[2020-06-06 15:46:47,817] WARN The configuration 'ksql.schema.registry.url' was supplied but isn't a known config. (org.apache.kafka.clients.producer.ProducerConfig:355)
[2020-06-06 15:46:47,817] WARN The configuration 'ksql.schema.registry.url' was supplied but isn't a known config. (org.apache.kafka.clients.producer.ProducerConfig:355)
[2020-06-06 15:46:48,038] ERROR Failed to send HTTP request to endpoint: http://localhost:8081/subjects/users-value/versions (io.confluent.kafka.schemaregistry.client.rest.RestService:268)
java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:607)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
at sun.net.www.http.HttpClient.New(HttpClient.java:339)
at sun.net.www.http.HttpClient.New(HttpClient.java:357)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1226)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:990)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1340)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1315)
at io.confluent.kafka.schemaregistry.client.rest.RestService.sendHttpRequest(RestService.java:264)
at io.confluent.kafka.schemaregistry.client.rest.RestService.httpRequest(RestService.java:352)
at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:495)
at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:486)
at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:459)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.registerAndGetId(CachedSchemaRegistryClient.java:206)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.register(CachedSchemaRegistryClient.java:268)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.register(CachedSchemaRegistryClient.java:244)
at io.confluent.kafka.serializers.AbstractKafkaAvroSerializer.serializeImpl(AbstractKafkaAvroSerializer.java:74)
at io.confluent.connect.avro.AvroConverter$Serializer.serialize(AvroConverter.java:138)
at io.confluent.connect.avro.AvroConverter.fromConnectData(AvroConverter.java:84)
at io.confluent.ksql.serde.connect.KsqlConnectSerializer.serialize(KsqlConnectSerializer.java:49)
at io.confluent.ksql.serde.tls.ThreadLocalSerializer.serialize(ThreadLocalSerializer.java:37)
at io.confluent.ksql.serde.GenericRowSerDe$GenericRowSerializer.serialize(GenericRowSerDe.java:281)
at io.confluent.ksql.serde.GenericRowSerDe$GenericRowSerializer.serialize(GenericRowSerDe.java:248)
at org.apache.kafka.common.serialization.Serializer.serialize(Serializer.java:62)
at org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:902)
at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:862)
at io.confluent.ksql.datagen.DataGenProducer.produceOne(DataGenProducer.java:122)
at io.confluent.ksql.datagen.DataGenProducer.populateTopic(DataGenProducer.java:91)
at io.confluent.ksql.datagen.DataGen.lambda$getProducerTask$1(DataGen.java:111)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
org.apache.kafka.common.errors.SerializationException: Error serializing message to topic: users
Caused by: org.apache.kafka.connect.errors.DataException: Failed to serialize Avro data from topic users :
at io.confluent.connect.avro.AvroConverter.fromConnectData(AvroConverter.java:87)
at io.confluent.ksql.serde.connect.KsqlConnectSerializer.serialize(KsqlConnectSerializer.java:49)
at io.confluent.ksql.serde.tls.ThreadLocalSerializer.serialize(ThreadLocalSerializer.java:37)
at io.confluent.ksql.serde.GenericRowSerDe$GenericRowSerializer.serialize(GenericRowSerDe.java:281)
at io.confluent.ksql.serde.GenericRowSerDe$GenericRowSerializer.serialize(GenericRowSerDe.java:248)
at org.apache.kafka.common.serialization.Serializer.serialize(Serializer.java:62)
at org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:902)
at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:862)
at io.confluent.ksql.datagen.DataGenProducer.produceOne(DataGenProducer.java:122)
at io.confluent.ksql.datagen.DataGenProducer.populateTopic(DataGenProducer.java:91)
at io.confluent.ksql.datagen.DataGen.lambda$getProducerTask$1(DataGen.java:111)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.kafka.common.errors.SerializationException: Error serializing Avro message
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:607)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
at sun.net.www.http.HttpClient.New(HttpClient.java:339)
at sun.net.www.http.HttpClient.New(HttpClient.java:357)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1226)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:990)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1340)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1315)
at io.confluent.kafka.schemaregistry.client.rest.RestService.sendHttpRequest(RestService.java:264)
at io.confluent.kafka.schemaregistry.client.rest.RestService.httpRequest(RestService.java:352)
at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:495)
at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:486)
at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:459)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.registerAndGetId(CachedSchemaRegistryClient.java:206)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.register(CachedSchemaRegistryClient.java:268)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.register(CachedSchemaRegistryClient.java:244)
at io.confluent.kafka.serializers.AbstractKafkaAvroSerializer.serializeImpl(AbstractKafkaAvroSerializer.java:74)
at io.confluent.connect.avro.AvroConverter$Serializer.serialize(AvroConverter.java:138)
at io.confluent.connect.avro.AvroConverter.fromConnectData(AvroConverter.java:84)
at io.confluent.ksql.serde.connect.KsqlConnectSerializer.serialize(KsqlConnectSerializer.java:49)
at io.confluent.ksql.serde.tls.ThreadLocalSerializer.serialize(ThreadLocalSerializer.java:37)
at io.confluent.ksql.serde.GenericRowSerDe$GenericRowSerializer.serialize(GenericRowSerDe.java:281)
at io.confluent.ksql.serde.GenericRowSerDe$GenericRowSerializer.serialize(GenericRowSerDe.java:248)
at org.apache.kafka.common.serialization.Serializer.serialize(Serializer.java:62)
at org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:902)
at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:862)
at io.confluent.ksql.datagen.DataGenProducer.produceOne(DataGenProducer.java:122)
at io.confluent.ksql.datagen.DataGenProducer.populateTopic(DataGenProducer.java:91)
at io.confluent.ksql.datagen.DataGen.lambda$getProducerTask$1(DataGen.java:111)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
By default, the registry URL is localhost:
schema.registry.url = [http://localhost:8081]
Datagen needs an additional schemaRegistryUrl parameter set to the address of the running registry container.
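For example (assuming the registry service in the tutorial's Compose file is named schema-registry and listens on port 8081; adjust to whatever docker ps shows):
docker run --network tutorials_default --rm --name datagen-users \
  confluentinc/ksqldb-examples:5.5.0 \
  ksql-datagen \
    bootstrap-server=kafka:39092 \
    quickstart=users \
    format=avro \
    topic=users \
    msgRate=1 \
    schemaRegistryUrl=http://schema-registry:8081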
It looks like the Compose file isn't exposing the Schema Registry's port correctly; try adding the ports mapping as follows:
schema-registry:
  image: <something>
  depends_on:
    - zookeeper
    - kafka
  ports:
    - "8081:8081"
  environment:
    SCHEMA_REGISTRY_HOST_NAME: schema-registry
    SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: zookeeper:32181

How to monitor kafka consumer lag for transactional consumers

There is a useful metric for monitoring Kafka consumer lag in spring-kafka called kafka_consumer_records_lag_max_records, but this metric is not working for transactional consumers. Is there a specific configuration to enable the lag metric for transactional consumers?
I have configured my consumer group to work with isolation level read_committed, and the metric reports kafka_consumer_records_lag_max_records{client_id="listener-1",} -Inf
What do you mean by "doesn't work"? I just tested it and it works fine...
@SpringBootApplication
public class So56540759Application {

    public static void main(String[] args) throws IOException {
        ConfigurableApplicationContext context = SpringApplication.run(So56540759Application.class, args);
        System.in.read();
        context.close();
    }

    private MetricName lagNow;

    private MetricName lagMax;

    @Autowired
    private MeterRegistry meters;

    @KafkaListener(id = "so56540759", topics = "so56540759", clientIdPrefix = "so56540759",
            properties = "max.poll.records=1")
    public void listen(String in, Consumer<?, ?> consumer) {
        Map<MetricName, ? extends Metric> metrics = consumer.metrics();
        Metric currentLag = metrics.get(this.lagNow);
        Metric maxLag = metrics.get(this.lagMax);
        System.out.println(in
                + " lag " + currentLag.metricName().name() + ":" + currentLag.metricValue()
                + " max " + maxLag.metricName().name() + ":" + maxLag.metricValue());
        Gauge gauge = meters.get("kafka.consumer.records.lag.max").gauge();
        System.out.println("lag-max in Micrometer: " + gauge.value());
    }

    @Bean
    public NewTopic topic() {
        return new NewTopic("so56540759", 1, (short) 1);
    }

    @Bean
    public ApplicationRunner runner(KafkaTemplate<String, String> template) {
        Set<String> tags = new HashSet<>();
        FetcherMetricsRegistry registry = new FetcherMetricsRegistry(tags, "consumer");
        MetricNameTemplate temp = registry.recordsLagMax;
        this.lagMax = new MetricName(temp.name(), temp.group(), temp.description(),
                Collections.singletonMap("client-id", "so56540759-0"));
        temp = registry.partitionRecordsLag;
        Map<String, String> tagsMap = new LinkedHashMap<>();
        tagsMap.put("client-id", "so56540759-0");
        tagsMap.put("topic", "so56540759");
        tagsMap.put("partition", "0");
        this.lagNow = new MetricName(temp.name(), temp.group(), temp.description(), tagsMap);
        return args -> IntStream.range(0, 10).forEach(i -> template.send("so56540759", "foo" + i));
    }

}
2019-06-11 12:13:45.803 INFO 32187 --- [ main] o.a.k.clients.consumer.ConsumerConfig : ConsumerConfig values:
auto.commit.interval.ms = 5000
auto.offset.reset = earliest
bootstrap.servers = [localhost:9092]
check.crcs = true
client.id = so56540759-0
connections.max.idle.ms = 540000
default.api.timeout.ms = 60000
enable.auto.commit = false
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
group.id = so56540759
heartbeat.interval.ms = 3000
interceptor.classes = []
internal.leave.group.on.close = true
isolation.level = read_committed
...
transaction.timeout.ms = 60000
...
2019-06-11 12:13:45.840 INFO 32187 --- [o56540759-0-C-1] o.s.k.l.KafkaMessageListenerContainer : partitions assigned: [so56540759-0]
foo0 lag records-lag:9.0 max records-lag-max:9.0
lag-max in Micrometer: 9.0
foo1 lag records-lag:8.0 max records-lag-max:9.0
lag-max in Micrometer: 9.0
foo2 lag records-lag:7.0 max records-lag-max:9.0
lag-max in Micrometer: 9.0
foo3 lag records-lag:6.0 max records-lag-max:9.0
lag-max in Micrometer: 9.0
foo4 lag records-lag:5.0 max records-lag-max:9.0
lag-max in Micrometer: 9.0
foo5 lag records-lag:4.0 max records-lag-max:9.0
lag-max in Micrometer: 9.0
foo6 lag records-lag:3.0 max records-lag-max:9.0
lag-max in Micrometer: 9.0
foo7 lag records-lag:2.0 max records-lag-max:9.0
lag-max in Micrometer: 9.0
foo8 lag records-lag:1.0 max records-lag-max:9.0
lag-max in Micrometer: 9.0
foo9 lag records-lag:0.0 max records-lag-max:9.0
lag-max in Micrometer: 9.0
EDIT2
I do see it going to -Infinity in the MBean if a transaction times out - i.e. if the listener doesn't exit within 60 seconds in my test.
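If the -Inf value is a problem for dashboards, one option is to skip non-finite readings when exporting the gauge; a small sketch (gauge as in the listener above):
double lagMax = gauge.value();
// records-lag-max can come back as -Infinity (as above, when a transaction times out),
// so only export finite values and treat the rest as "no sample yet".
if (Double.isFinite(lagMax)) {
    // export or alert on lagMax
}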