Kafka Connect with Debezium - PostgreSQL

I have set up an environment with a PostgreSQL database, using the Debezium connector with Kafka Connect and Kafka. There are three Kafka brokers, configured with a three-node ZooKeeper ensemble. The connections across the entire pipeline are working, but contrary to what the Debezium documentation describes, no topics are created automatically for the tables in the database. For example, if there are tables A and B inside some schema, I would expect two topics to be created implicitly in Kafka. The status of the connector and its task is RUNNING. Below is the configuration I used for the connector:
{
"name": "geo-connector",
"config": {
"connector.class": "io.debezium.connector.postgresql.PostgresConnector",
"tasks.max": "1",
"database.hostname": <dbHostName>,
"database.port": <dbPort>,
"database.user": <dbUser>,
"database.password":<dbPassword> ,
"database.dbname" : <dbName>,
"database.server.name": <logicalName>,
"database.history.kafka.bootstrap.servers":<>,
"database.history.kafka.topic": "schema-changes.inventory",
"plugin.name":"wal2json",
"config.storage.replication.factor": "3",
"offset.storage.replication.factor" : "3",
"auto.create.topics.enable" : "true",
"snapshot.mode" : "always"
}
}
The error I see in the Connect logs is:
2018-08-09 15:28:50,409 - DEBUG [KafkaBasedLog Work Thread - kconnect-offsets:Fetcher#199] - [Consumer clientId=consumer-1, groupId=1] Sending READ_UNCOMMITTED IncrementalFetchRequest(toSend=(), toForget=(), implied=(kconnect-offsets-10, kconnect-offsets-4, kconnect-offsets-16, kconnect-offsets-7, kconnect-offsets-19, kconnect-offsets-13, kconnect-offsets-22, kconnect-offsets-1)) to broker kafka-02.hotel02.pro06.eu.idealo.com:9092 (id: 2002 rack: pro06)
2018-08-09 15:28:50,465 - DEBUG [kafka-producer-network-thread | producer-6:NetworkClient$DefaultMetadataUpdater#927] - [Producer clientId=producer-6] Sending metadata request (type=MetadataRequest, topics=dbserver1.public.spatial_ref_sys) to node kafka-01.hotel02.pro05.eu.idealo.com:9092 (id: 2004 rack: pro05)
2018-08-09 15:28:50,467 - WARN [kafka-producer-network-thread | producer-6:NetworkClient$DefaultMetadataUpdater#882] - [Producer clientId=producer-6] Error while fetching metadata with correlation id 23856 : {dbserver1.public.spatial_ref_sys=UNKNOWN_TOPIC_OR_PARTITION}
2018-08-09 15:28:50,467 - DEBUG [kafka-producer-network-thread | producer-6:Metadata#270] - Updated cluster metadata version 23852 to Cluster(id = BwqlZApfT-ygzWr_wPcdng, nodes = [kafka-03.hotel02.pro05.eu.idealo.com:9092 (id: 2003 rack: pro05), kafka-01.hotel02.pro05.eu.idealo.com:9092 (id: 2004 rack: pro05), kafka-02.hotel02.pro06.eu.idealo.com:9092 (id: 2002 rack: pro06)], partitions = [])

The message you see in the logs is a warning, not an error. Could you please use the kafka-topics.sh utility to list the available topics?
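To compare the kafka-topics.sh listing against what the connector should produce, here is a minimal Python sketch. It assumes the Debezium PostgreSQL naming pattern of logical server name, schema and table joined by dots, which matches the dbserver1.public.spatial_ref_sys topic in the log above. The helper names and the second table are hypothetical. Note also that auto.create.topics.enable is a broker setting, so placing it in the connector config does not influence topic creation.

```python
def expected_topics(server_name, tables):
    """Build the change-event topic name for each (schema, table) pair."""
    return [f"{server_name}.{schema}.{table}" for schema, table in tables]


def missing_topics(expected, existing):
    """Return the expected topics that are absent from the broker's topic list."""
    existing_set = set(existing)
    return [topic for topic in expected if topic not in existing_set]


# Only spatial_ref_sys appears in the log; table_a is a made-up example.
expected = expected_topics(
    "dbserver1", [("public", "spatial_ref_sys"), ("public", "table_a")]
)

# Pretend this list was copied from the output of kafka-topics.sh --list.
existing = ["kconnect-offsets", "dbserver1.public.table_a"]

print(missing_topics(expected, existing))
```

Any topic reported as missing here is one the producer inside Kafka Connect keeps asking metadata for, which is what the UNKNOWN_TOPIC_OR_PARTITION warning reflects.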

Error deserializing message with Kafka Postgres Sink Connector

I'm trying to sink a Kafka topic into a Postgres table and it's driving me crazy. Here's my setup; I'm not sure what I'm doing wrong.
This is a typical message from the Kafka topic:
{
"flightId": "5cbc7ad25732ab0004c51c45",
"recordedAt": "2022-03-26T18:17:11.356Z",
"device": "iOS",
"platform": "A5",
"vehicleId": "621c12a9b12161009865bc5d"
}
Below is my docker-compose.yaml file
version: '3.7'
services:
  connector:
    image: custom-connector:latest
    environment:
      CONNECT_BOOTSTRAP_SERVERS: ${CONNECT_BOOTSTRAP_SERVERS}
      CONNECT_GROUP_ID: "kafka-connect-group-id"
      CONNECT_CONFIG_STORAGE_TOPIC: "kafka-connect-config"
      CONNECT_OFFSET_STORAGE_TOPIC: "kafka-connect-offsets"
      CONNECT_STATUS_STORAGE_TOPIC: "kafka-connect-status"
      CONNECT_REST_ADVERTISED_HOST_NAME: ${CONNECT_REST_ADVERTISED_HOST_NAME}
      CONNECT_SECURITY_PROTOCOL: ${CONNECT_SECURITY_PROTOCOL}
      CONNECT_SASL_MECHANISM: ${CONNECT_SASL_MECHANISM}
      CONNECT_REST_PORT: 8083
      CONNECT_LOG4J_ROOT_LOGLEVEL: "INFO"
      CONNECT_LOG4J_LOGGERS: "org.apache.kafka.connect.runtime.rest=WARN,org.reflections=ERROR"
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: "3"
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: "3"
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: "3"
      CONNECT_SSL_ENDPOINT_IDENTIFICATION_ALGORITHM: "https"
      CONNECT_REQUEST_TIMEOUT_MS: "20000"
      CONNECT_RETRY_BACKOFF_MS: "500"
      CONNECT_CONSUMER_SSL_ENDPOINT_IDENTIFICATION_ALGORITHM: "https"
      CONNECT_CONSUMER_SASL_MECHANISM: "PLAIN"
      CONNECT_CONSUMER_REQUEST_TIMEOUT_MS: "20000"
      CONNECT_CONSUMER_RETRY_BACKOFF_MS: "500"
      CONNECT_CONSUMER_SECURITY_PROTOCOL: ${CONNECT_SECURITY_PROTOCOL}
      CONNECT_PRODUCER_SSL_ENDPOINT_IDENTIFICATION_ALGORITHM: "https"
      CONNECT_PRODUCER_SASL_MECHANISM: "PLAIN"
      CONNECT_PRODUCER_REQUEST_TIMEOUT_MS: "20000"
      CONNECT_PRODUCER_RETRY_BACKOFF_MS: "500"
      CONNECT_PRODUCER_SECURITY_PROTOCOL: ${CONNECT_SECURITY_PROTOCOL}
      CONNECT_PLUGIN_PATH: /usr/share/java,/usr/share/confluent-hub-components,/u01/connectors
      CONNECT_SASL_JAAS_CONFIG: ${JAAS_CONFIG}
      CONNECT_CONSUMER_SASL_JAAS_CONFIG: ${JAAS_CONFIG}
      CONNECT_PRODUCER_SASL_JAAS_CONFIG: ${JAAS_CONFIG}
      CONNECT_VALUE_CONVERTER: io.confluent.connect.json.JsonSchemaConverter
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: 'http://schema-registry:8081'
      CONNECT_KEY_CONVERTER: org.apache.kafka.connect.storage.StringConverter
      CONNECT_KEY_IGNORE: 'true'
    ports:
      - "8083:8083"
  schema-registry:
    image: "confluentinc/cp-schema-registry:5.2.1"
    ports:
      - '8081:8081'
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: SASL_SSL://${CONNECT_BOOTSTRAP_SERVERS}
      SCHEMA_REGISTRY_KAFKASTORE_SECURITY_PROTOCOL: SASL_SSL
      SCHEMA_REGISTRY_KAFKASTORE_SASL_JAAS_CONFIG: ${JAAS_CONFIG}
      SCHEMA_REGISTRY_KAFKASTORE_SASL_MECHANISM: PLAIN
      SCHEMA_REGISTRY_LOG4J_ROOT_LOGLEVEL: INFO
My connector's config, sent in a PUT request to Kafka Connect:
{
"name": "test-postgres-sink-connector",
"config": {
"connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
"connection.url": "jdbc:postgresql://******:5432/db",
"connection.user": "******",
"connection.password": "******",
"topics": "test-topic",
"table.name.format": "kafka_sink_test",
"value.converter": "io.confluent.connect.json.JsonSchemaConverter",
"value.converter.schemas.enable": "true",
"value.converter.schema.registry.url": "http://schema-registry:8081",
"key.converter": "org.apache.kafka.connect.storage.StringConverter",
"key.ignore": "true",
"name": "test-postgres-sink-connector"
},
"tasks": [
{
"connector": "test-postgres-sink-connector",
"task": 0
}
],
"type": "sink"
}
From the logs, kafka-connect is complaining:
ERROR WorkerSinkTask{id=test-postgres-sink-connector-0} Error converting message value in topic 'test-topic' partition 2 at offset 0 and timestamp 1647927842369: Converting byte[] to Kafka Connect data failed due to serialization error of topic test-topic: (org.apache.kafka.connect.runtime.WorkerSinkTask)
org.apache.kafka.connect.errors.DataException: Converting byte[] to Kafka Connect data failed due to serialization error of topic test-topic:
at io.confluent.connect.json.JsonSchemaConverter.toConnectData(JsonSchemaConverter.java:119)
at org.apache.kafka.connect.storage.Converter.toConnectData(Converter.java:87)
at org.apache.kafka.connect.runtime.WorkerSinkTask.convertValue(WorkerSinkTask.java:560)
at org.apache.kafka.connect.runtime.WorkerSinkTask.lambda$convertAndTransformRecord$4(WorkerSinkTask.java:516)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:156)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:190)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execute(RetryWithToleranceOperator.java:132)
at org.apache.kafka.connect.runtime.WorkerSinkTask.convertAndTransformRecord(WorkerSinkTask.java:516)
at org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:493)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:332)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:234)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:203)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:188)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:243)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.kafka.common.errors.SerializationException: Error deserializing JSON message for id -1
at io.confluent.kafka.serializers.json.AbstractKafkaJsonSchemaDeserializer.deserialize(AbstractKafkaJsonSchemaDeserializer.java:177)
at io.confluent.kafka.serializers.json.AbstractKafkaJsonSchemaDeserializer.deserializeWithSchemaAndVersion(AbstractKafkaJsonSchemaDeserializer.java:235)
at io.confluent.connect.json.JsonSchemaConverter$Deserializer.deserialize(JsonSchemaConverter.java:165)
at io.confluent.connect.json.JsonSchemaConverter.toConnectData(JsonSchemaConverter.java:108)
... 18 more
Caused by: org.apache.kafka.common.errors.SerializationException: Unknown magic byte!
at io.confluent.kafka.serializers.AbstractKafkaSchemaSerDe.getByteBuffer(AbstractKafkaSchemaSerDe.java:250)
at io.confluent.kafka.serializers.json.AbstractKafkaJsonSchemaDeserializer.deserialize(AbstractKafkaJsonSchemaDeserializer.java:112)
... 21 more
[2022-03-26 18:11:31,779] ERROR WorkerSinkTask{id=test-postgres-sink-connector-0} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask)
org.apache.kafka.connect.errors.ConnectException: Tolerance exceeded in error handler
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:206)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execute(RetryWithToleranceOperator.java:132)
at org.apache.kafka.connect.runtime.WorkerSinkTask.convertAndTransformRecord(WorkerSinkTask.java:516)
at org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:493)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:332)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:234)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:203)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:188)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:243)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.kafka.connect.errors.DataException: Converting byte[] to Kafka Connect data failed due to serialization error of topic test-topic:
at io.confluent.connect.json.JsonSchemaConverter.toConnectData(JsonSchemaConverter.java:119)
at org.apache.kafka.connect.storage.Converter.toConnectData(Converter.java:87)
at org.apache.kafka.connect.runtime.WorkerSinkTask.convertValue(WorkerSinkTask.java:560)
at org.apache.kafka.connect.runtime.WorkerSinkTask.lambda$convertAndTransformRecord$4(WorkerSinkTask.java:516)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:156)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:190)
... 13 more
Caused by: org.apache.kafka.common.errors.SerializationException: Error deserializing JSON message for id -1
at io.confluent.kafka.serializers.json.AbstractKafkaJsonSchemaDeserializer.deserialize(AbstractKafkaJsonSchemaDeserializer.java:177)
at io.confluent.kafka.serializers.json.AbstractKafkaJsonSchemaDeserializer.deserializeWithSchemaAndVersion(AbstractKafkaJsonSchemaDeserializer.java:235)
at io.confluent.connect.json.JsonSchemaConverter$Deserializer.deserialize(JsonSchemaConverter.java:165)
at io.confluent.connect.json.JsonSchemaConverter.toConnectData(JsonSchemaConverter.java:108)
... 18 more
Caused by: org.apache.kafka.common.errors.SerializationException: Unknown magic byte!
at io.confluent.kafka.serializers.AbstractKafkaSchemaSerDe.getByteBuffer(AbstractKafkaSchemaSerDe.java:250)
at io.confluent.kafka.serializers.json.AbstractKafkaJsonSchemaDeserializer.deserialize(AbstractKafkaJsonSchemaDeserializer.java:112)
... 21 more
[2022-03-26 18:11:31,780] INFO Stopping task (io.confluent.connect.jdbc.sink.JdbcSinkTask)
[2022-03-26 18:11:31,781] INFO [Consumer clientId=connector-consumer-test-postgres-sink-connector-0, groupId=test-postgres-sink-connector] Revoke previously assigned partitions test-topic-0, test-topic-1, test-topic-2, test-topic-3, test-topic-4, test-topic-5 (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2022-03-26 18:11:31,781] INFO [Consumer clientId=test-postgres-sink-connector-0, groupId=test-postgres-sink-connector] Member test-postgres-sink-connector-0-89225797-cac6-41f5-9373-bbd16bc8a1b6 sending LeaveGroup request to coordinator b2-pkc-2396y.us-east-1.aws.confluent.cloud:9092 (id: 2147483645 rack: null) due to the consumer is being closed (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2022-03-26 18:11:31,783] INFO [Consumer clientId=test-postgres-sink-connector-0, groupId=test-postgres-sink-connector] Resetting generation due to: consumer pro-actively leaving the group (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2022-03-26 18:11:31,783] INFO [Consumer clientId=connector-test-postgres-sink-connector-0, groupId=connect-test-postgres-sink-connector] Request joining group due to: consumer pro-actively leaving the group (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2022-03-26 18:11:32,284] INFO Metrics scheduler closed (org.apache.kafka.common.metrics.Metrics)
[2022-03-26 18:11:32,285] INFO Closing reporter org.apache.kafka.common.metrics.JmxReporter (org.apache.kafka.common.metrics.Metrics)
[2022-03-26 18:11:32,286] INFO Metrics reporters closed (org.apache.kafka.common.metrics.Metrics)
[2022-03-26 18:11:32,316] INFO App info kafka.consumer for connector-test-postgres-sink-connector-0 unregistered (org.apache.kafka.common.utils.AppInfoParser)
Your data has no schema, so you cannot use JsonSchemaConverter. In addition, the JDBC Sink requires a schema (see JDBC Sink Deep Dive).
Because the data has no schema, and specifically was not produced with the JSON Schema serializer and the Confluent Schema Registry, you get the Unknown magic byte! error from that converter. Instead, you need to use the regular JsonConverter class (prefixed with org.apache.kafka, not io.confluent). But as stated, value.converter.schemas.enable must be true, so each message must carry its schema alongside the payload.
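To illustrate where Unknown magic byte! comes from, here is a small Python sketch of the framing the Schema Registry serializers put in front of each message: one zero byte followed by a 4-byte big-endian schema ID. The function name and the schema ID 42 are made up for the example; this is not the converter's actual code.

```python
import struct

MAGIC_BYTE = 0  # first byte of every Schema Registry framed message


def parse_confluent_header(raw: bytes):
    """Return the schema ID if raw carries the registry framing, else None."""
    if len(raw) < 5 or raw[0] != MAGIC_BYTE:
        return None
    (schema_id,) = struct.unpack(">I", raw[1:5])
    return schema_id


# Plain JSON, like the message in the question: starts with '{' (0x7B).
plain = b'{"flightId": "5cbc7ad25732ab0004c51c45"}'

# The same bytes with a hypothetical schema ID 42 in front.
framed = b"\x00" + struct.pack(">I", 42) + plain

print(parse_confluent_header(plain))   # None
print(parse_confluent_header(framed))  # 42
```

A plain JSON message fails the very first check, which is why the converter gives up before it ever looks at the payload.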
More info: Converter Deep Dive
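As a rough sketch of what the producer side would need to emit for JsonConverter with schemas.enable=true, the snippet below wraps the sample message in a schema/payload envelope. It assumes every field is an optional string, which fits the sample but may not fit the real table. The function name and the struct name flight_event are hypothetical.

```python
import json


def wrap_with_schema(payload: dict, name: str = "flight_event") -> str:
    """Wrap a flat dict in the schema/payload envelope JsonConverter reads."""
    # Assumption: every field is treated as an optional string.
    fields = [{"field": key, "type": "string", "optional": True} for key in payload]
    envelope = {
        "schema": {
            "type": "struct",
            "name": name,
            "optional": False,
            "fields": fields,
        },
        "payload": payload,
    }
    return json.dumps(envelope)


message = {
    "flightId": "5cbc7ad25732ab0004c51c45",
    "recordedAt": "2022-03-26T18:17:11.356Z",
    "device": "iOS",
    "platform": "A5",
    "vehicleId": "621c12a9b12161009865bc5d",
}

print(wrap_with_schema(message))
```

With the envelope in place, the JDBC sink can derive column names and types from the schema section instead of failing on schemaless data.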

Unable to use sink connector inside kafka connect

I am trying to use the S3 sink connector inside Kafka Connect. It starts, then fails later.
My config looks like this:
{
"name": "my-s3-sink3",
"config": {
"connector.class":"io.confluent.connect.s3.S3SinkConnector",
"tasks.max":"1",
"topics":"mysource.topic",
"s3.region":"us-east-1",
"s3.bucket.name": "topicbucket001",
"s3.part.size":"5242880",
"flush.size":"1",
"storage.class":"io.confluent.connect.s3.storage.S3Storage",
"format.class": "io.confluent.connect.s3.format.json.JsonFormat",
"partitioner.class":"io.confluent.connect.storage.partitioner.DefaultPartitioner",
"schema.compatibility":"NONE"
}
}
My connect-distributed.properties looks like this:
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=false
value.converter.schemas.enable=false
errors.tolerance = all
Complete error log:
[2021-04-06 10:59:04,398] INFO [Consumer clientId=connector-consumer-s3connect12-0, groupId=connect-s3connect12] Member connector-consumer-s3connect12-0-f1e48df8-76ba-49f9-9080-e10b0a34202b sending LeaveGroup request to coordinator **********.kafka.us-east-1.amazonaws.com:9092 (id: 2147483645 rack: null) due to the consumer is being closed (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
2021-04-06 16:29:04
[2021-04-06 10:59:04,397] ERROR WorkerSinkTask{id=s3connect12-0} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask)
2021-04-06 16:29:04
[2021-04-06 10:59:04,396] ERROR WorkerSinkTask{id=s3connect12-0} Error converting message key in topic 'quickstart-status' partition 3 at offset 0 and timestamp 1617706740956: Converting byte[] to Kafka Connect data failed due to serialization error: (org.apache.kafka.connect.runtime.WorkerSinkTask)
2021-04-06 16:29:04
[2021-04-06 10:59:04,393] INFO [Consumer clientId=connector-consumer-s3connect12-0, groupId=connect-s3connect12] Resetting offset for partition quickstart-status-3 to position FetchPosition{offset=0, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[***************.kafka.us-east-1.amazonaws.com:9092 (id: 1 rack: use1-az2)], epoch=absent}}. (org.apache.kafka.clients.consumer.internals.SubscriptionState)
Message type :
{
"registertime": 1511985752912,
"userid": "User_6",
"regionid": "Region_8",
"gender": "FEMALE"
}
New ERROR Log :
The problem is the Key SerDe. Per your screenshot the key data is a non-JSON string:
User_2
User_9
etc
So instead of
key.converter=org.apache.kafka.connect.json.JsonConverter
use
key.converter=org.apache.kafka.connect.storage.StringConverter
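A quick way to see why the key fails while the value does not: a bare string such as User_6 is not valid JSON, whereas the value is. The check below is only a Python stand-in for what a JSON-based converter has to do. The helper name is made up, and the key User_6 is taken from the userid field of the sample message.

```python
import json


def is_valid_json(raw: bytes) -> bool:
    """Return True if raw decodes as UTF-8 and parses as JSON."""
    try:
        json.loads(raw.decode("utf-8"))
        return True
    except (UnicodeDecodeError, json.JSONDecodeError):
        return False


key = b"User_6"
value = (
    b'{"registertime": 1511985752912, "userid": "User_6", '
    b'"regionid": "Region_8", "gender": "FEMALE"}'
)

print(is_valid_json(key))    # False
print(is_valid_json(value))  # True
```

StringConverter skips JSON parsing altogether, so it accepts the key as-is.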
Edit:
Try this for your connector config, specifying the converters explicitly (as suggested by @OneCricketeer)
{
"name": "my-s3-sink3",
"config": {
"connector.class" : "io.confluent.connect.s3.S3SinkConnector",
"tasks.max" : "1",
"topics" : "mysource.topic",
"s3.region" : "us-east-1",
"s3.bucket.name" : "topicbucket001",
"s3.part.size" : "5242880",
"flush.size" : "1",
"key.converter" : "org.apache.kafka.connect.storage.StringConverter",
"value.converter" : "org.apache.kafka.connect.json.JsonConverter",
"value.converter.schemas.enable": "false",
"storage.class" : "io.confluent.connect.s3.storage.S3Storage",
"format.class" : "io.confluent.connect.s3.format.json.JsonFormat",
"partitioner.class" : "io.confluent.connect.storage.partitioner.DefaultPartitioner",
"schema.compatibility" : "NONE"
}
}
I was able to resolve the issue. Specifying the converters explicitly fixed the deserialization error. I then hit an issue with S3 multipart upload, which I resolved by giving the Fargate task permission to the S3 bucket: I attached an S3 IAM policy to the ECS task definition.
Thanks, Robin Moffatt for the solution above!

Extremely slow startup of a Spring Cloud Stream Kafka application when using enable.idempotence true

My Spring Cloud Stream (SCS) application has two Kafka producers with this configuration:
spring:
  cloud:
    function:
      definition: myProducer1;myProducer2
    stream:
      bindings:
        myproducer1-out-0:
          destination: topic1
          producer:
            useNativeEncoding: true
        myproducer2-out-0:
          destination: topic2
          producer:
            useNativeEncoding: true
      kafka:
        binder:
          brokers: ${kafka.brokers:localhost}
          min-partition-count: 3
          replication-factor: 3
          producerProperties:
            enable:
              idempotence: false
            retries: 10000
            acks: all
            key:
              serializer: io.confluent.kafka.serializers.KafkaAvroSerializer
              subject:
                name:
                  strategy: io.confluent.kafka.serializers.subject.RecordNameStrategy
            value:
              serializer: io.confluent.kafka.serializers.KafkaAvroSerializer
              subject:
                name:
                  strategy: io.confluent.kafka.serializers.subject.RecordNameStrategy
            schema:
              registry:
                url: ${schema-registry.url:http://localhost:8081}
It starts in about 10 seconds:
o.s.c.s.m.DirectWithAttributesChannel : Channel 'my-app-1.myproducer2-out-0' has 1 subscriber(s).
o.s.b.web.embedded.netty.NettyWebServer : Netty started on port(s): 8084
e.p.i.m.MyAppApplicationKt : Started MyAppApplicationKt in 11.288 seconds (JVM running for 11.868)
I need my producers to be idempotent, so I set enable.idempotence: true. With this change, startup is about 7x slower (sometimes more than 10x):
o.s.c.s.m.DirectWithAttributesChannel : Channel 'my-app-1.myproducer2-out-0' has 1 subscriber(s).
o.s.b.web.embedded.netty.NettyWebServer : Netty started on port(s): 8084
e.p.i.m.MyAppApplicationKt : Started MyAppApplicationKt in 71.489 seconds (JVM running for 72.127)
How can I speed up the startup?
UPDATE:
I've found a problem during startup: the message "Proceeding to force close the producer since pending requests could not be completed within timeout 30000 ms." Sometimes it happens in one of the producers, sometimes in both, and sometimes in neither. When it doesn't show up, startup is as fast as it used to be.
In the following log, it happens only in one producer:
o.a.k.clients.producer.KafkaProducer : [Producer clientId=producer-1] Instantiated an idempotent producer.
o.a.k.c.s.authenticator.AbstractLogin : Successfully logged in.
o.a.kafka.common.utils.AppInfoParser : Kafka version: 2.3.1
o.a.kafka.common.utils.AppInfoParser : Kafka commitId: 18a913733fb71c01
o.a.kafka.common.utils.AppInfoParser : Kafka startTimeMs: 1586864007183
org.apache.kafka.clients.Metadata : [Producer clientId=producer-1] Cluster ID: lkc-nvqmv
o.a.k.clients.producer.KafkaProducer : [Producer clientId=producer-1] Closing the Kafka producer with timeoutMillis = 30000 ms.
o.a.k.c.p.internals.TransactionManager : [Producer clientId=producer-1] ProducerId set to 32029 with epoch 0
Then, after being stuck for 30 seconds at ProducerId set to 32029 with epoch 0, it logs the info message Proceeding to force close... and initializes the second producer without any problems:
o.a.k.clients.producer.KafkaProducer : [Producer clientId=producer-1] Proceeding to force close the producer since pending
o.s.c.s.m.DirectWithAttributesChannel : Channel 'my-app-1.myproducer1-out-0' has 1 subscriber(s).
o.s.c.s.b.k.p.KafkaTopicProvisioner : Using kafka topic for outbound: topic2
o.a.k.clients.admin.AdminClientConfig : AdminClientConfig values:
...
o.a.k.clients.producer.KafkaProducer : [Producer clientId=producer-2] Instantiated an idempotent producer.
o.a.k.c.s.authenticator.AbstractLogin : Successfully logged in.
o.a.kafka.common.utils.AppInfoParser : Kafka version: 2.3.1
o.a.kafka.common.utils.AppInfoParser : Kafka commitId: 18a913733fb71c01
o.a.kafka.common.utils.AppInfoParser : Kafka startTimeMs: 1586864038612
org.apache.kafka.clients.Metadata : [Producer clientId=producer-2] Cluster ID: lkc-nvqmv
o.a.k.clients.producer.KafkaProducer : [Producer clientId=producer-2] Closing the Kafka producer with timeoutMillis = 30000 ms.
o.a.k.c.p.internals.TransactionManager : [Producer clientId=producer-2] ProducerId set to 32030 with epoch 0
o.a.k.clients.producer.KafkaProducer : [Producer clientId=producer-2] Proceeding to force close the producer since pending
o.s.c.s.m.DirectWithAttributesChannel : Channel 'my-app-1.myproducer2-out-0' has 1 subscriber(s).
o.s.b.web.embedded.netty.NettyWebServer : Netty started on port(s): 8084
e.p.i.m.MetricsIngestorApplicationKt : Started MetricsIngestorApplicationKt in 66.834 seconds (JVM running for 67.544)
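As a sanity check on the numbers above, here is a small Python sketch. It assumes each producer that hits the force close adds one full 30-second close timeout on top of the roughly 11-second baseline; that model is an assumption for illustration, not something taken from the binder's code. The start times are the two Kafka startTimeMs values from the log.

```python
CLOSE_TIMEOUT_S = 30.0  # "timeoutMillis = 30000 ms" in the log


def estimated_startup(baseline_s, blocked_producers, close_timeout_s=CLOSE_TIMEOUT_S):
    """Baseline startup plus one close timeout per blocked producer."""
    return baseline_s + blocked_producers * close_timeout_s


# Baseline of 11.288 s taken from the run with idempotence disabled.
print(f"{estimated_startup(11.288, 2):.3f} s estimated with both producers blocked")

# Gap between the two producers' startTimeMs values in the log.
gap_s = (1586864038612 - 1586864007183) / 1000
print(f"{gap_s:.3f} s between producer-1 and producer-2 start")
```

The estimate for two blocked producers lands close to the 71.489 seconds observed, and the gap between the two producer start times is just over one close timeout.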
UPDATE 2:
I've debugged the logic behind this; it happens during the doBindProducer() method. It gets the partitions for the topic, for which it creates a ProducerFactory in KafkaMessageChannelBinder.
@Override
protected MessageHandler createProducerMessageHandler(
final ProducerDestination destination,
ExtendedProducerProperties<KafkaProducerProperties> producerProperties,
MessageChannel channel, MessageChannel errorChannel) throws Exception {
/*
* IMPORTANT: With a transactional binder, individual producer properties for
* Kafka are ignored; the global binder
* (spring.cloud.stream.kafka.binder.transaction.producer.*) properties are used
* instead, for all producers. A binder is transactional when
* 'spring.cloud.stream.kafka.binder.transaction.transaction-id-prefix' has text.
*/
final ProducerFactory<byte[], byte[]> producerFB = this.transactionManager != null
? this.transactionManager.getProducerFactory()
: getProducerFactory(null, producerProperties);
Collection<PartitionInfo> partitions = provisioningProvider.getPartitionsForTopic(
producerProperties.getPartitionCount(), false, () -> {
Producer<byte[], byte[]> producer = producerFB.createProducer();
List<PartitionInfo> partitionsFor = producer
.partitionsFor(destination.getName());
producer.close();
if (this.transactionManager == null) {
((DisposableBean) producerFB).destroy();
}
return partitionsFor;
}, destination.getName());
After correctly retrieving the list List<PartitionInfo> partitionsFor, it gets stuck in KafkaProducer.destroy() until the 30-second timeout expires.
Why does it block there? Could it be a bug in the binder?
I am not sure why the close is timing out, but that timeout ought to be configurable.
Please open an issue against the binder; it currently does not support reducing the close timeout from its default (30 seconds).

FlinkKafkaConsumer010 doesn't work when set with setStartFromTimestamp

I'm using Flink streaming and flink-connector-kafka to process data from Kafka. I configure FlinkKafkaConsumer010 with setStartFromTimestamp(1586852770000L). At that point, all data in Kafka topic A has a timestamp before 1586852770000L. I then send some messages to partition-0 and partition-4 of topic A (topic A has 6 partitions, and the current system time is already after 1586852770000L), but my Flink program doesn't consume any data from topic A. Is this a bug?
If I stop my Flink program and restart it, it can consume data from partition-0 and partition-4 of topic A, but it still won't consume data sent to the other 4 partitions unless I restart the program again.
The Kafka log is as follows:
2020-04-15 11:48:46,447 TRACE org.apache.kafka.clients.consumer.internals.Fetcher - Sending ListOffsetRequest (type=ListOffsetRequest, replicaId=-1, partitionTimestamps={TopicA-4=1586836800000}, minVersion=1) to broker server1:9092 (id: 185 rack: null)
2020-04-15 11:48:46,463 TRACE org.apache.kafka.clients.NetworkClient - Sending {replica_id=-1,topics=[{topic=TopicA,partitions=[{partition=0,timestamp=1586836800000}]}]} to node 184.
2020-04-15 11:48:46,466 TRACE org.apache.kafka.clients.NetworkClient - Completed receive from node 185, for key 2, received {responses=[{topic=TopicA,partition_responses=[{partition=4,error_code=0,timestamp=1586852770000,offset=4}]}]}
2020-04-15 11:48:46,467 TRACE org.apache.kafka.clients.consumer.internals.Fetcher - Received ListOffsetResponse {responses=[{topic=TopicA,partition_responses=[{partition=4,error_code=0,timestamp=1586852770000,offset=4}]}]} from broker server1:9092 (id: 185 rack: null)
2020-04-15 11:48:46,467 DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - Handling ListOffsetResponse response for TopicA-4. Fetched offset 4, timestamp 1586852770000
2020-04-15 11:48:46,448 TRACE org.apache.kafka.clients.consumer.internals.Fetcher - Sending ListOffsetRequest (type=ListOffsetRequest, replicaId=-1, partitionTimestamps={TopicA-0=1586836800000}, minVersion=1) to broker server2:9092 (id: 184 rack: null)
2020-04-15 11:48:46,463 TRACE org.apache.kafka.clients.NetworkClient - Sending {replica_id=-1,topics=[{topic=TopicA,partitions=[{partition=0,timestamp=1586836800000}]}]} to node 184.
2020-04-15 11:48:46,467 TRACE org.apache.kafka.clients.NetworkClient - Completed receive from node 184, for key 2, received {responses=[{topic=TopicA,partition_responses=[{partition=0,error_code=0,timestamp=1586863210000,offset=47}]}]}
2020-04-15 11:48:46,467 TRACE org.apache.kafka.clients.consumer.internals.Fetcher - Received ListOffsetResponse {responses=[{topic=TopicA,partition_responses=[{partition=0,error_code=0,timestamp=1586863210000,offset=47}]}]} from broker server2:9092 (id: 184 rack: null)
2020-04-15 11:48:46,467 DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - Handling ListOffsetResponse response for TopicA-0. Fetched offset 47, timestamp 1586863210000
2020-04-15 11:48:46,448 TRACE org.apache.kafka.clients.consumer.internals.Fetcher - Sending ListOffsetRequest (type=ListOffsetRequest, replicaId=-1, partitionTimestamps={TopicA-2=1586836800000}, minVersion=1) to broker server3:9092 (id: 183 rack: null)
2020-04-15 11:48:46,465 TRACE org.apache.kafka.clients.NetworkClient - Sending {replica_id=-1,topics=[{topic=TopicA,partitions=[{partition=2,timestamp=1586836800000}]}]} to node 183.
2020-04-15 11:48:46,468 TRACE org.apache.kafka.clients.NetworkClient - Completed receive from node 183, for key 2, received {responses=[{topic=TopicA,partition_responses=[{partition=2,error_code=0,timestamp=-1,offset=-1}]}]}
2020-04-15 11:48:46,468 TRACE org.apache.kafka.clients.consumer.internals.Fetcher - Received ListOffsetResponse {responses=[{topic=TopicA,partition_responses=[{partition=2,error_code=0,timestamp=-1,offset=-1}]}]} from broker server3:9092 (id: 183 rack: null)
2020-04-15 11:48:46,468 DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - Handling ListOffsetResponse response for TopicA-2. Fetched offset -1, timestamp -1
2020-04-15 11:48:46,481 INFO org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase - Consumer subtask 0 will start reading the following 2 partitions from timestamp 1586836800000: [KafkaTopicPartition{topic='TopicA', partition=4}, KafkaTopicPartition{topic='TopicA', partition=0}]
According to the log, the returned offset for every partition except partition-0 and partition-4 is -1. Why is the returned offset -1 instead of the latest offset?
In the Kafka client's code (Fetcher.java, lines 674-680):
// Handle v1 and later response
log.debug("Handling ListOffsetResponse response for {}. Fetched offset {}, timestamp {}",topicPartition, partitionData.offset, partitionData.timestamp);
if (partitionData.offset != ListOffsetResponse.UNKNOWN_OFFSET) {
OffsetData offsetData = new OffsetData(partitionData.offset, partitionData.timestamp);
timestampOffsetMap.put(topicPartition, offsetData);
}
The value of ListOffsetResponse.UNKNOWN_OFFSET is -1, so the other 4 partitions are filtered out, and the Kafka consumer will not consume data from them.
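The effect of that check can be reproduced with a few lines of Python. This is only a stand-in for the Java snippet above, fed with the three responses visible in the log; the function name and dictionary layout are made up for the example.

```python
UNKNOWN_OFFSET = -1  # mirrors ListOffsetResponse.UNKNOWN_OFFSET


def usable_offsets(responses):
    """Keep only partitions whose returned offset is known."""
    return {
        partition: (offset, timestamp)
        for partition, (offset, timestamp) in responses.items()
        if offset != UNKNOWN_OFFSET
    }


# (offset, timestamp) pairs taken from the ListOffsetResponse lines in the log.
responses = {
    "TopicA-4": (4, 1586852770000),
    "TopicA-0": (47, 1586863210000),
    "TopicA-2": (-1, -1),
}

print(sorted(usable_offsets(responses)))  # ['TopicA-0', 'TopicA-4']
```

TopicA-2 never makes it into the map, which matches the INFO line reporting that the subtask will read only partitions 4 and 0.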
My Flink version is 1.9.2 and the Flink Kafka connector is
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-kafka-0.10_2.11</artifactId>
<version>1.9.2</version>
</dependency>
The Flink Kafka connector documentation says:
setStartFromTimestamp(long): Start from the specified timestamp. For
each partition, the record whose timestamp is larger than or equal to
the specified timestamp will be used as the start position. If a
partition’s latest record is earlier than the timestamp, the partition
will simply be read from the latest record.
test program code:
import java.util.Properties
import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.streaming.api.TimeCharacteristic
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010
import org.junit.Test
class TestFlinkKafka {
@Test
def testFlinkKafkaDemo: Unit ={
//1. set up the streaming execution environment.
val env = StreamExecutionEnvironment.getExecutionEnvironment
env.setStreamTimeCharacteristic( TimeCharacteristic.ProcessingTime)
// To use fault tolerant Kafka Consumers, checkpointing needs to be enabled at the execution environment
env.enableCheckpointing(60000)
//2. kafka source
val topic = "message"
val schema = new SimpleStringSchema()
//server1:9092,server2:9092,server3:9092
val props = getKafkaConsumerProperties("localhost:9092","flink-streaming-client", "latest")
val consumer = new FlinkKafkaConsumer010(topic, schema, props)
//consume data from special timestamp's offset
//2020/4/14 20:0:0
//consumer.setStartFromTimestamp(1586865600000L)
//2020/4/15 20:0:0
consumer.setStartFromTimestamp(1586952000000L)
consumer.setCommitOffsetsOnCheckpoints(true)
//3. transform
val stream = env.addSource(consumer)
.map(x => x)
//4. sink
stream.print()
//5. execute
env.execute("testFlinkKafkaConsumer")
}
def getKafkaConsumerProperties(brokerList:String, groupId:String, offsetReset:String): Properties ={
val props = new Properties()
props.setProperty("bootstrap.servers", brokerList)
props.setProperty("group.id", groupId)
props.setProperty("auto.offset.reset", offsetReset)
props.setProperty("flink.partition-discovery.interval-millis", "30000")
props
}
}
set log level for kafka:
log4j.logger.org.apache.kafka=TRACE
create kafka topic:
kafka-topics --zookeeper localhost:2181/kafka --create --topic message --partitions 6 --replication-factor 1
send messages to the kafka topic:
kafka-console-producer --broker-list localhost:9092 --topic message
{"name":"tom"}
{"name":"michael"}
This problem was resolved by upgrading the Flink/Kafka connector to the newer, universal connector -- FlinkKafkaConsumer -- available from flink-connector-kafka_2.11. This version of the connector is recommended for all versions of Kafka from 1.0.0 forward. With Kafka 0.10.x or 0.11.x, it is better to use the version-specific flink-connector-kafka-0.10_2.11 or flink-connector-kafka-0.11_2.11 connectors. (And in all cases, substitute 2.12 for 2.11 if you are using Scala 2.12.)
See the Flink documentation for more information on Flink's Kafka connector.

Kafka Streams not writing to sink topic

I am trying to learn Kafka Streams using Confluent's test platform and the setup instructions here. I can start up and connect to my test broker, but the streams application never writes to my sink topic. Looking at the logs, Kafka Streams is constantly fetching and monitoring the offset (if I am reading the logs correctly), but it never actually reads or writes anything.
14:07:29.654 [katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1] DEBUG org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1-consumer, groupId=katanaTest] Received successful Heartbeat response
14:07:29.770 [katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1] DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1-consumer, groupId=katanaTest] Fetch READ_UNCOMMITTED at offset 4 for partition exportStatusUpdatesV2_rquinlivan-0 returned fetch data (error=NONE, highWaterMark=4, lastStableOffset = -1, logStartOffset = 0, abortedTransactions = null, recordsSizeInBytes=0)
14:07:29.770 [katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1] DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1-consumer, groupId=katanaTest] Added READ_UNCOMMITTED fetch request for partition exportStatusUpdatesV2_rquinlivan-0 at offset 4 to node localhost:29092 (id: 1 rack: null)
14:07:29.770 [katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1] DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1-consumer, groupId=katanaTest] Sending READ_UNCOMMITTED fetch for partitions [exportStatusUpdatesV2_rquinlivan-0] to broker localhost:29092 (id: 1 rack: null)
14:07:30.273 [katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1] DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1-consumer, groupId=katanaTest] Fetch READ_UNCOMMITTED at offset 4 for partition exportStatusUpdatesV2_rquinlivan-0 returned fetch data (error=NONE, highWaterMark=4, lastStableOffset = -1, logStartOffset = 0, abortedTransactions = null, recordsSizeInBytes=0)
14:07:30.273 [katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1] DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1-consumer, groupId=katanaTest] Added READ_UNCOMMITTED fetch request for partition exportStatusUpdatesV2_rquinlivan-0 at offset 4 to node localhost:29092 (id: 1 rack: null)
14:07:30.273 [katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1] DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1-consumer, groupId=katanaTest] Sending READ_UNCOMMITTED fetch for partitions [exportStatusUpdatesV2_rquinlivan-0] to broker localhost:29092 (id: 1 rack: null)
14:07:30.775 [katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1] DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1-consumer, groupId=katanaTest] Fetch READ_UNCOMMITTED at offset 4 for partition exportStatusUpdatesV2_rquinlivan-0 returned fetch data (error=NONE, highWaterMark=4, lastStableOffset = -1, logStartOffset = 0, abortedTransactions = null, recordsSizeInBytes=0)
14:07:30.776 [katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1] DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1-consumer, groupId=katanaTest] Added READ_UNCOMMITTED fetch request for partition exportStatusUpdatesV2_rquinlivan-0 at offset 4 to node localhost:29092 (id: 1 rack: null)
14:07:30.776 [katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1] DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1-consumer, groupId=katanaTest] Sending READ_UNCOMMITTED fetch for partitions [exportStatusUpdatesV2_rquinlivan-0] to broker localhost:29092 (id: 1 rack: null)
14:07:31.279 [katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1] DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1-consumer, groupId=katanaTest] Fetch READ_UNCOMMITTED at offset 4 for partition exportStatusUpdatesV2_rquinlivan-0 returned fetch data (error=NONE, highWaterMark=4, lastStableOffset = -1, logStartOffset = 0, abortedTransactions = null, recordsSizeInBytes=0)
14:07:31.279 [katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1] DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1-consumer, groupId=katanaTest] Added READ_UNCOMMITTED fetch request for partition exportStatusUpdatesV2_rquinlivan-0 at offset 4 to node localhost:29092 (id: 1 rack: null)
14:07:31.279 [katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1] DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1-consumer, groupId=katanaTest] Sending READ_UNCOMMITTED fetch for partitions [exportStatusUpdatesV2_rquinlivan-0] to broker localhost:29092 (id: 1 rack: null)
14:07:31.782 [katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1] DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1-consumer, groupId=katanaTest] Fetch READ_UNCOMMITTED at offset 4 for partition exportStatusUpdatesV2_rquinlivan-0 returned fetch data (error=NONE, highWaterMark=4, lastStableOffset = -1, logStartOffset = 0, abortedTransactions = null, recordsSizeInBytes=0)
14:07:31.782 [katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1] DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1-consumer, groupId=katanaTest] Added READ_UNCOMMITTED fetch request for partition exportStatusUpdatesV2_rquinlivan-0 at offset 4 to node localhost:29092 (id: 1 rack: null)
14:07:31.782 [katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1] DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1-consumer, groupId=katanaTest] Sending READ_UNCOMMITTED fetch for partitions [exportStatusUpdatesV2_rquinlivan-0] to broker localhost:29092 (id: 1 rack: null)
14:07:32.284 [katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1] DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1-consumer, groupId=katanaTest] Fetch READ_UNCOMMITTED at offset 4 for partition exportStatusUpdatesV2_rquinlivan-0 returned fetch data (error=NONE, highWaterMark=4, lastStableOffset = -1, logStartOffset = 0, abortedTransactions = null, recordsSizeInBytes=0)
14:07:32.284 [katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1] DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1-consumer, groupId=katanaTest] Added READ_UNCOMMITTED fetch request for partition exportStatusUpdatesV2_rquinlivan-0 at offset 4 to node localhost:29092 (id: 1 rack: null)
14:07:32.284 [katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1] DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1-consumer, groupId=katanaTest] Sending READ_UNCOMMITTED fetch for partitions [exportStatusUpdatesV2_rquinlivan-0] to broker localhost:29092 (id: 1 rack: null)
14:07:32.656 [kafka-coordinator-heartbeat-thread | katanaTest] DEBUG org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=katanaTest-a821659c-a994-4c6f-9714-fa48020b6378-StreamThread-1-consumer, groupId=katanaTest] Sending Heartbeat request to coordinator localhost:29092 (id: 2147483646 rack: null)
I don't understand from this log output what the issue is, and no error is ever logged. How can I debug why my streams application isn't working? What is the recommended method of debugging in Kafka Streams?
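One hedged way to read the fetch lines above is to compare the fetch offset with the highWaterMark reported in the same line. The sketch below is plain Java with a regular expression written against the log format shown here. The class FetchLogSketch and the method recordsAvailable are hypothetical helpers, not part of Kafka Streams or the Kafka client.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading the DEBUG fetch lines shown above.
// It is not part of Kafka; the pattern only targets this log format.
class FetchLogSketch {
    private static final Pattern FETCH_RESPONSE = Pattern.compile(
            "at offset (\\d+) for partition \\S+ returned fetch data \\(.*?highWaterMark=(\\d+)");

    // Returns highWaterMark minus the fetch offset, i.e. how many records
    // were still unread when the fetch was answered.
    // Returns -1 if the line is not a fetch-response line.
    static long recordsAvailable(String logLine) {
        Matcher matcher = FETCH_RESPONSE.matcher(logLine);
        if (!matcher.find()) {
            return -1L;
        }
        long fetchOffset = Long.parseLong(matcher.group(1));
        long highWaterMark = Long.parseLong(matcher.group(2));
        return highWaterMark - fetchOffset;
    }

    public static void main(String[] args) {
        String line = "Fetch READ_UNCOMMITTED at offset 4 for partition "
                + "exportStatusUpdatesV2_rquinlivan-0 returned fetch data (error=NONE, "
                + "highWaterMark=4, lastStableOffset = -1, logStartOffset = 0, "
                + "abortedTransactions = null, recordsSizeInBytes=0)";
        System.out.println(recordsAvailable(line)); // prints 0
    }
}
```

For the lines in this log the difference is 0 and recordsSizeInBytes is 0, which means the consumer's position already equals the end of the partition: each fetch succeeds but there are no new records to return.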