KSQL Avro format in quickstart gives error - apache-kafka

I'm working through the KSQL quickstart example. The problem is that when I try to generate data in Avro format, it throws the error I list at the bottom.
The tutorial is at https://docs.ksqldb.io/en/latest/tutorials/basics-docker/
To reproduce the problem:
git clone https://github.com/confluentinc/ksql.git
cd ksql
git checkout 5.5.0-post
cd docs/tutorials/
docker-compose up -d
If you run docker ps, you should see the following containers running:
confluentinc/ksqldb-examples:5.5.0
confluentinc/cp-ksql-server:5.4.0
confluentinc/cp-schema-registry:5.4.0
confluentinc/cp-enterprise-kafka:5.4.0
confluentinc/cp-zookeeper:5.4.0
If I run the example in delimited format like this, the code works:
docker run --network tutorials_default --rm --name datagen-users \
confluentinc/ksqldb-examples:5.5.0 \
ksql-datagen \
bootstrap-server=kafka:39092 \
quickstart=users \
format=delimited \
topic=users \
msgRate=1
The tutorial uses Avro format, which is what I want to use. Here is that command:
docker run --network tutorials_default --rm --name datagen-users \
confluentinc/ksqldb-examples:5.5.0 \
ksql-datagen \
bootstrap-server=kafka:39092 \
quickstart=users \
format=avro \
topic=users \
msgRate=1
When I use Avro format, I get the following error:
[2020-06-06 15:46:47,632] INFO AvroDataConfig values:
connect.meta.data = true
enhanced.avro.schema.support = false
schemas.cache.config = 1000
(io.confluent.connect.avro.AvroDataConfig:347)
[2020-06-06 15:46:47,650] INFO JsonSchemaDataConfig values:
decimal.format = BASE64
schemas.cache.size = 1000
(io.confluent.connect.json.JsonSchemaDataConfig:179)
[2020-06-06 15:46:47,651] INFO JsonSchemaDataConfig values:
decimal.format = BASE64
schemas.cache.size = 1000
(io.confluent.connect.json.JsonSchemaDataConfig:179)
[2020-06-06 15:46:47,654] INFO ProtobufDataConfig values:
schemas.cache.config = 1000
(io.confluent.connect.protobuf.ProtobufDataConfig:179)
[2020-06-06 15:46:47,672] INFO KsqlConfig values:
ksql.access.validator.enable = auto
ksql.authorization.cache.expiry.time.secs = 30
ksql.authorization.cache.max.entries = 10000
ksql.connect.url = http://localhost:8083
ksql.connect.worker.config =
ksql.extension.dir = ext
ksql.hidden.topics = [_confluent.*, __confluent.*, _schemas, __consumer_offsets, __transaction_state, connect-configs, connect-offsets, connect-status, connect-statuses]
ksql.insert.into.values.enabled = true
ksql.internal.topic.min.insync.replicas = 1
ksql.internal.topic.replicas = 1
ksql.metric.reporters = []
ksql.metrics.extension = null
ksql.metrics.tags.custom =
ksql.new.api.enabled = false
ksql.output.topic.name.prefix =
ksql.persistence.wrap.single.values = true
ksql.persistent.prefix = query_
ksql.pull.queries.enable = true
ksql.query.persistent.active.limit = 2147483647
ksql.query.pull.enable.standby.reads = false
ksql.query.pull.max.allowed.offset.lag = 9223372036854775807
ksql.readonly.topics = [_confluent.*, __confluent.*, _schemas, __consumer_offsets, __transaction_state, connect-configs, connect-offsets, connect-status, connect-statuses]
ksql.schema.registry.url = http://localhost:8081
ksql.security.extension.class = null
ksql.service.id = default_
ksql.sink.window.change.log.additional.retention = 1000000
ksql.streams.shutdown.timeout.ms = 300000
ksql.transient.prefix = transient_
ksql.udf.collect.metrics = false
ksql.udf.enable.security.manager = true
ksql.udfs.enabled = true
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = https
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
(io.confluent.ksql.util.KsqlConfig:347)
[2020-06-06 15:46:47,720] INFO AvroDataConfig values:
connect.meta.data = true
enhanced.avro.schema.support = false
schemas.cache.config = 1
(io.confluent.connect.avro.AvroDataConfig:347)
[2020-06-06 15:46:47,752] INFO ProcessingLogConfig values:
ksql.logging.processing.rows.include = false
ksql.logging.processing.stream.auto.create = false
ksql.logging.processing.stream.name = KSQL_PROCESSING_LOG
ksql.logging.processing.topic.auto.create = false
ksql.logging.processing.topic.name =
ksql.logging.processing.topic.partitions = 1
ksql.logging.processing.topic.replication.factor = 1
(io.confluent.ksql.logging.processing.ProcessingLogConfig:347)
[2020-06-06 15:46:47,767] INFO AvroConverterConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.connect.avro.AvroConverterConfig:179)
[2020-06-06 15:46:47,770] INFO KafkaAvroSerializerConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.kafka.serializers.KafkaAvroSerializerConfig:179)
[2020-06-06 15:46:47,771] INFO KafkaAvroDeserializerConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
specific.avro.reader = false
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.kafka.serializers.KafkaAvroDeserializerConfig:179)
[2020-06-06 15:46:47,771] INFO AvroDataConfig values:
connect.meta.data = false
enhanced.avro.schema.support = false
schemas.cache.config = 1000
(io.confluent.connect.avro.AvroDataConfig:347)
[2020-06-06 15:46:47,772] INFO AvroConverterConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.connect.avro.AvroConverterConfig:179)
[2020-06-06 15:46:47,772] INFO KafkaAvroSerializerConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.kafka.serializers.KafkaAvroSerializerConfig:179)
[2020-06-06 15:46:47,772] INFO KafkaAvroDeserializerConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
specific.avro.reader = false
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.kafka.serializers.KafkaAvroDeserializerConfig:179)
[2020-06-06 15:46:47,773] INFO AvroDataConfig values:
connect.meta.data = false
enhanced.avro.schema.support = false
schemas.cache.config = 1000
(io.confluent.connect.avro.AvroDataConfig:347)
[2020-06-06 15:46:47,774] INFO AvroConverterConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.connect.avro.AvroConverterConfig:179)
[2020-06-06 15:46:47,775] INFO KafkaAvroSerializerConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.kafka.serializers.KafkaAvroSerializerConfig:179)
[2020-06-06 15:46:47,775] INFO KafkaAvroDeserializerConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
specific.avro.reader = false
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.kafka.serializers.KafkaAvroDeserializerConfig:179)
[2020-06-06 15:46:47,776] INFO AvroDataConfig values:
connect.meta.data = false
enhanced.avro.schema.support = false
schemas.cache.config = 1000
(io.confluent.connect.avro.AvroDataConfig:347)
[2020-06-06 15:46:47,776] INFO AvroConverterConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.connect.avro.AvroConverterConfig:179)
[2020-06-06 15:46:47,776] INFO KafkaAvroSerializerConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.kafka.serializers.KafkaAvroSerializerConfig:179)
[2020-06-06 15:46:47,777] INFO KafkaAvroDeserializerConfig values:
bearer.auth.token = [hidden]
proxy.port = -1
schema.reflection = false
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
specific.avro.reader = false
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
schema.registry.url = [http://localhost:8081]
basic.auth.user.info = [hidden]
proxy.host =
schema.registry.basic.auth.user.info = [hidden]
bearer.auth.credentials.source = STATIC_TOKEN
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
(io.confluent.kafka.serializers.KafkaAvroDeserializerConfig:179)
[2020-06-06 15:46:47,777] INFO AvroDataConfig values:
connect.meta.data = false
enhanced.avro.schema.support = false
schemas.cache.config = 1000
(io.confluent.connect.avro.AvroDataConfig:347)
[2020-06-06 15:46:47,817] WARN The configuration 'ksql.schema.registry.url' was supplied but isn't a known config. (org.apache.kafka.clients.producer.ProducerConfig:355)
[2020-06-06 15:46:47,817] WARN The configuration 'ksql.schema.registry.url' was supplied but isn't a known config. (org.apache.kafka.clients.producer.ProducerConfig:355)
[2020-06-06 15:46:48,038] ERROR Failed to send HTTP request to endpoint: http://localhost:8081/subjects/users-value/versions (io.confluent.kafka.schemaregistry.client.rest.RestService:268)
java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:607)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
at sun.net.www.http.HttpClient.New(HttpClient.java:339)
at sun.net.www.http.HttpClient.New(HttpClient.java:357)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1226)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:990)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1340)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1315)
at io.confluent.kafka.schemaregistry.client.rest.RestService.sendHttpRequest(RestService.java:264)
at io.confluent.kafka.schemaregistry.client.rest.RestService.httpRequest(RestService.java:352)
at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:495)
at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:486)
at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:459)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.registerAndGetId(CachedSchemaRegistryClient.java:206)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.register(CachedSchemaRegistryClient.java:268)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.register(CachedSchemaRegistryClient.java:244)
at io.confluent.kafka.serializers.AbstractKafkaAvroSerializer.serializeImpl(AbstractKafkaAvroSerializer.java:74)
at io.confluent.connect.avro.AvroConverter$Serializer.serialize(AvroConverter.java:138)
at io.confluent.connect.avro.AvroConverter.fromConnectData(AvroConverter.java:84)
at io.confluent.ksql.serde.connect.KsqlConnectSerializer.serialize(KsqlConnectSerializer.java:49)
at io.confluent.ksql.serde.tls.ThreadLocalSerializer.serialize(ThreadLocalSerializer.java:37)
at io.confluent.ksql.serde.GenericRowSerDe$GenericRowSerializer.serialize(GenericRowSerDe.java:281)
at io.confluent.ksql.serde.GenericRowSerDe$GenericRowSerializer.serialize(GenericRowSerDe.java:248)
at org.apache.kafka.common.serialization.Serializer.serialize(Serializer.java:62)
at org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:902)
at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:862)
at io.confluent.ksql.datagen.DataGenProducer.produceOne(DataGenProducer.java:122)
at io.confluent.ksql.datagen.DataGenProducer.populateTopic(DataGenProducer.java:91)
at io.confluent.ksql.datagen.DataGen.lambda$getProducerTask$1(DataGen.java:111)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
org.apache.kafka.common.errors.SerializationException: Error serializing message to topic: users
Caused by: org.apache.kafka.connect.errors.DataException: Failed to serialize Avro data from topic users :
at io.confluent.connect.avro.AvroConverter.fromConnectData(AvroConverter.java:87)
at io.confluent.ksql.serde.connect.KsqlConnectSerializer.serialize(KsqlConnectSerializer.java:49)
at io.confluent.ksql.serde.tls.ThreadLocalSerializer.serialize(ThreadLocalSerializer.java:37)
at io.confluent.ksql.serde.GenericRowSerDe$GenericRowSerializer.serialize(GenericRowSerDe.java:281)
at io.confluent.ksql.serde.GenericRowSerDe$GenericRowSerializer.serialize(GenericRowSerDe.java:248)
at org.apache.kafka.common.serialization.Serializer.serialize(Serializer.java:62)
at org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:902)
at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:862)
at io.confluent.ksql.datagen.DataGenProducer.produceOne(DataGenProducer.java:122)
at io.confluent.ksql.datagen.DataGenProducer.populateTopic(DataGenProducer.java:91)
at io.confluent.ksql.datagen.DataGen.lambda$getProducerTask$1(DataGen.java:111)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.kafka.common.errors.SerializationException: Error serializing Avro message
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:607)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
at sun.net.www.http.HttpClient.New(HttpClient.java:339)
at sun.net.www.http.HttpClient.New(HttpClient.java:357)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1226)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:990)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1340)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1315)
at io.confluent.kafka.schemaregistry.client.rest.RestService.sendHttpRequest(RestService.java:264)
at io.confluent.kafka.schemaregistry.client.rest.RestService.httpRequest(RestService.java:352)
at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:495)
at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:486)
at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:459)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.registerAndGetId(CachedSchemaRegistryClient.java:206)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.register(CachedSchemaRegistryClient.java:268)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.register(CachedSchemaRegistryClient.java:244)
at io.confluent.kafka.serializers.AbstractKafkaAvroSerializer.serializeImpl(AbstractKafkaAvroSerializer.java:74)
at io.confluent.connect.avro.AvroConverter$Serializer.serialize(AvroConverter.java:138)
at io.confluent.connect.avro.AvroConverter.fromConnectData(AvroConverter.java:84)
at io.confluent.ksql.serde.connect.KsqlConnectSerializer.serialize(KsqlConnectSerializer.java:49)
at io.confluent.ksql.serde.tls.ThreadLocalSerializer.serialize(ThreadLocalSerializer.java:37)
at io.confluent.ksql.serde.GenericRowSerDe$GenericRowSerializer.serialize(GenericRowSerDe.java:281)
at io.confluent.ksql.serde.GenericRowSerDe$GenericRowSerializer.serialize(GenericRowSerDe.java:248)
at org.apache.kafka.common.serialization.Serializer.serialize(Serializer.java:62)
at org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:902)
at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:862)
at io.confluent.ksql.datagen.DataGenProducer.produceOne(DataGenProducer.java:122)
at io.confluent.ksql.datagen.DataGenProducer.populateTopic(DataGenProducer.java:91)
at io.confluent.ksql.datagen.DataGen.lambda$getProducerTask$1(DataGen.java:111)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

By default, the registry URL is localhost:
schema.registry.url = [http://localhost:8081]
Inside the datagen container, localhost is the container itself, which is why the connection is refused. Datagen needs an additional schemaRegistryUrl parameter set to the address of the running registry container.
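For example, the Avro command should work with something like the following, assuming the Schema Registry container is reachable under the hostname schema-registry inside the tutorials_default network (check the service name in the Compose file):
docker run --network tutorials_default --rm --name datagen-users \
confluentinc/ksqldb-examples:5.5.0 \
ksql-datagen \
bootstrap-server=kafka:39092 \
quickstart=users \
format=avro \
topic=users \
msgRate=1 \
schemaRegistryUrl=http://schema-registry:8081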

It looks like the Compose file isn't exposing the Schema Registry's port; try adding a ports mapping as follows:
schema-registry:
  image: <something>
  depends_on:
    - zookeeper
    - kafka
  ports:
    - "8081:8081"
  environment:
    SCHEMA_REGISTRY_HOST_NAME: schema-registry
    SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: zookeeper:32181
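Even with the port published, a container on the tutorials_default network should address the registry by its service name rather than localhost. A quick connectivity check before re-running datagen might look like this (the curlimages/curl image and the schema-registry service name are assumptions; adjust them to match your Compose file):
docker run --network tutorials_default --rm curlimages/curl \
curl -s http://schema-registry:8081/subjects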

Related

Cannot use csvSftpConnector with schema registry

I'm trying to configure a standalone producer to read CSV files from an SFTP server and send the data to a topic in Confluent Cloud.
So far I have succeeded in reading my CSV data from the file and parsing it according to my value.schema.
But now, instead of using a fixed configuration schema, I'd like to use the schema registry. So I configured an Avro schema for my test topic in Confluent Cloud, generated the API key/secret, and updated my config files.
I can see that the connection works fine (no authentication errors, and via the CLI I can access the test schema), but when I try to run the producer I get the following error:
[2021-09-20 16:39:53,442] INFO SftpCsvSourceConnectorConfig values:
batch.size = 1000
behavior.on.error = IGNORE
cleanup.policy = NONE
csv.case.sensitive.field.names = false
csv.escape.char = 92
csv.file.charset = UTF-8
csv.first.row.as.header = false
csv.ignore.leading.whitespace = true
csv.ignore.quotations = false
csv.keep.carriage.return = false
csv.null.field.indicator = NEITHER
csv.quote.char = 34
csv.rfc.4180.parser.enabled = false
csv.separator.char = 44
csv.skip.lines = 0
csv.strict.quotes = false
csv.verify.reader = true
empty.poll.wait.ms = 250
error.path = /home/alberto/opt/confluent-6.2.0/sftp2/error
file.minimum.age.ms = 0
finished.path = /home/alberto/opt/confluent-6.2.0/sftp2/finished
input.file.pattern = .*.csv
input.path = /home/alberto/opt/confluent-6.2.0/sftp2/data
kafka.topic = testSchema
kerberos.keytab.path =
kerberos.user.principal =
key.schema = {"name" : "com.example.users.UserKey","type" : "STRUCT","isOptional" : true,"fieldSchemas" : {"material" : {"type" : "STRING","isOptional" : true}}}
parser.timestamp.date.formats = [yyyy-MM-dd'T'HH:mm:ss, yyyy-MM-dd' 'HH:mm:ss]
parser.timestamp.timezone = UTC
processing.file.extension = .PROCESSING
proxy.password = [hidden]
proxy.username =
schema.generation.enabled = false
schema.generation.key.fields = []
schema.generation.key.name = defaultkeyschemaname
schema.generation.value.name = defaultvalueschemaname
sftp.host = 192.168.1.6
sftp.password = [hidden]
sftp.port = 22
sftp.proxy.url =
sftp.username = user
timestamp.field =
timestamp.mode = PROCESS_TIME
tls.passphrase = [hidden]
tls.pemfile =
tls.private.key = [hidden]
tls.public.key = [hidden]
value.schema =
...
Caused by: org.apache.kafka.common.config.ConfigException: Both configs key.schema and value.schema must be set if schema.generation.enabled is false, but key.schema was not null and value.schema was null.
at io.confluent.connect.sftp.source.SftpSourceConnectorConfig.validateSchema(SftpSourceConnectorConfig.java:181)
at io.confluent.connect.sftp.source.SftpSourceConnectorConfig.<init>(SftpSourceConnectorConfig.java:121)
at io.confluent.connect.sftp.source.SftpCsvSourceConnectorConfig.<init>(SftpCsvSourceConnectorConfig.java:157)
at io.confluent.connect.sftp.SftpCsvSourceConnector.start(SftpCsvSourceConnector.java:44)
at org.apache.kafka.connect.runtime.WorkerConnector.doStart(WorkerConnector.java:184)
at org.apache.kafka.connect.runtime.WorkerConnector.start(WorkerConnector.java:209)
at org.apache.kafka.connect.runtime.WorkerConnector.doTransitionTo(WorkerConnector.java:348)
at org.apache.kafka.connect.runtime.WorkerConnector.doTransitionTo(WorkerConnector.java:331)
... 7 more
If I set schema.generation.enabled to true, it seems that it creates an empty schema:
value.schema = {"type":"STRUCT","isOptional":false,"fieldSchemas":{}}
and then I get:
org.apache.kafka.common.config.ConfigException: Failed to access Avro data from topic testSchema : Schema being registered is incompatible with an earlier schema for subject "testSchema-value"; error code: 409; error code: 409
as if it's trying to register a schema, which is not what I want; I just need to fetch the schema from the registry and use it.
If anyone needs any additional information about the configuration, I'll be happy to provide it.
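As the exception states, with schema.generation.enabled set to false the connector expects both key.schema and value.schema to be set explicitly. A minimal sketch of what that could look like (the value schema here is purely hypothetical and only illustrates the shape the connector expects):
# Hypothetical example: both schemas set explicitly so the connector neither
# generates nor registers a schema of its own.
schema.generation.enabled=false
key.schema={"name":"com.example.users.UserKey","type":"STRUCT","isOptional":true,"fieldSchemas":{"material":{"type":"STRING","isOptional":true}}}
value.schema={"name":"com.example.users.User","type":"STRUCT","isOptional":false,"fieldSchemas":{"material":{"type":"STRING","isOptional":true}}}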

Lagged offsets skipped after a new event is published before max poll interval in Kafka

Kafka v2.4. Consumer configuration:
kafka.consumer.auto.offset.reset=earliest
kafka.consumer.auto.commit=false
Kafka consumer container config:
@Bean
public ConcurrentKafkaListenerContainerFactory<String, PayoutDto> kafkaPayoutStatusPoolListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, PayoutDto> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(kafkaConsumerFactoryForPayoutEvent());
    factory.getContainerProperties().setAckMode(AckMode.MANUAL_IMMEDIATE);
    factory.setMissingTopicsFatal(false);
    return factory;
}
Kafka consumer:
@KafkaListener(id = "regularPayoutEventConsumer", topics = "${kafka.regular.payout.consumer.queuename}", containerFactory = "kafkaPayoutStatusPoolListenerContainerFactory", groupId = "${kafka.regular.payout.consumer.groupId}")
public void listen(ConsumerRecord<String, PayoutDto> consumerRecord, Acknowledgment ack) {
    StopWatch watch = new StopWatch();
    watch.start();
    String key = null;
    Long offset = null;
    try {
        PayoutDto payoutDto = consumerRecord.value();
        key = consumerRecord.key();
        offset = consumerRecord.offset();
        cpAccountsService.processPayoutEvent(payoutDto);
        ack.acknowledge();
    } catch (Exception e) {
        log.error("Exception occurred in RegularPayoutEventConsumer due to following issue {}", e);
    } finally {
        watch.stop();
        log.debug("total time taken by consumer for requestID:" + key + " on offset:" + offset + " is:"
                + watch.getTotalTimeMillis());
    }
}
Success scenario:
The consumer fails to acknowledge because of an exception, which creates a lag; say the last committed offset is 30 and the lag is now 4.
On the next auto poll cycle after the poll interval, the consumer resumes from offset 30, consumes through 33 as normal, and the lag is now 0.
Failed scenario:
Same as step 1 of the success scenario.
Now, before the consumer's poll interval elapses, the producer pushes a new message.
On the new producer event, the consumer polls, jumps directly to offset 33, skips 30, 31 and 32, and clears the lag to 0.
Kafka app startup logs:
2021-04-14 10:38:06.132 INFO 10286 --- [ restartedMain] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-RegularPayoutEventGroupId-3, groupId=RegularPayoutEventGroupId] Subscribed to topic(s): InstantPayoutTransactionsEv
2021-04-14 10:38:06.132 INFO 10286 --- [ restartedMain] o.s.s.c.ThreadPoolTaskScheduler : Initializing ExecutorService
2021-04-14 10:38:06.133 INFO 10286 --- [ restartedMain] o.a.k.clients.consumer.ConsumerConfig : ConsumerConfig values:
allow.auto.create.topics = true
auto.commit.interval.ms = 5000
auto.offset.reset = earliest
bootstrap.servers = [localhost:9092]
check.crcs = true
client.dns.lookup = use_all_dns_ips
client.id = consumer-PayoutEventGroupId-4
client.rack =
connections.max.idle.ms = 540000
default.api.timeout.ms = 60000
enable.auto.commit = false
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
group.id = PayoutEventGroupId
group.instance.id = null
heartbeat.interval.ms = 3000
interceptor.classes = []
internal.leave.group.on.close = true
internal.throw.on.fetch.stable.offset.unsupported = false
isolation.level = read_uncommitted
key.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
max.partition.fetch.bytes = 1048576
max.poll.interval.ms = 30000
max.poll.records = 500
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retry.backoff.ms = 100
sasl.client.callback.handler.class = null
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.login.callback.handler.class = null
sasl.login.class = null
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
security.providers = null
send.buffer.bytes = 131072
session.timeout.ms = 10000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.3]
ssl.endpoint.identification.algorithm = https
ssl.engine.factory.class = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLSv1.3
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
value.deserializer = class com.cms.cpa.config.KafkaPayoutDeserializer
2021-04-14 10:38:06.137 INFO 10286 --- [ restartedMain] o.a.kafka.common.utils.AppInfoParser : Kafka version: 2.6.0
2021-04-14 10:38:06.137 INFO 10286 --- [ restartedMain] o.a.kafka.common.utils.AppInfoParser : Kafka commitId: 62abe01bee039651
Kafka maintains two values for a consumer/partition: the committed offset (where the consumer will start if it is restarted) and the position (which record will be returned on the next poll).
Not acknowledging a record does not cause the position to be moved back.
It is working as designed; if you want to re-process a failed record, you need to use acknowledgment.nack() with an optional sleep time, or throw an exception and configure a SeekToCurrentErrorHandler.
In those cases, the container repositions the partitions so that the failed record is redelivered. With the error handler you can "recover" the failed record after the retries are exhausted. When using nack(), the listener has to keep track of the attempts itself.
See https://docs.spring.io/spring-kafka/docs/current/reference/html/#committing-offsets
and https://docs.spring.io/spring-kafka/docs/current/reference/html/#annotation-error-handling
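A minimal sketch of the error-handler approach, assuming a spring-kafka version (2.3+) where SeekToCurrentErrorHandler accepts a BackOff (SeekToCurrentErrorHandler lives in org.springframework.kafka.listener, FixedBackOff in org.springframework.util.backoff): let the listener rethrow instead of swallowing the exception, and register the error handler on the existing container factory so the container re-seeks and redelivers the failed record.
@Bean
public ConcurrentKafkaListenerContainerFactory<String, PayoutDto> kafkaPayoutStatusPoolListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, PayoutDto> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(kafkaConsumerFactoryForPayoutEvent());
    factory.getContainerProperties().setAckMode(AckMode.MANUAL_IMMEDIATE);
    factory.setMissingTopicsFatal(false);
    // Re-seek and retry the failed record up to 3 times, 1 second apart, before giving up.
    factory.setErrorHandler(new SeekToCurrentErrorHandler(new FixedBackOff(1000L, 3L)));
    return factory;
}
Alternatively, keeping the existing catch block, calling ack.nack(1000) instead of only logging tells the container to pause briefly and redeliver from the failed offset; in that case the listener has to track the retry attempts itself.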

Sending events with Reactor Kafka seems very slow

My project uses Reactor Kafka 1.2.2.RELEASE to send events serialized with Avro to my Kafka broker. This works, but sending events seems to be quite slow.
We plugged a custom metric to the lifecycle of the Mono sending the event through KafkaSender.send(), and noticed it took around 100ms to deliver a message.
We did it this way:
send(event, code, getKafkaHeaders()).transform(withMetric("eventName"));
The send method just builds the record and sends it:
private Mono<Void> send(SpecificRecord value, String code, final List<Header> headers) {
    final var producerRecord = new ProducerRecord<>("myTopic", null, code, value, headers);
    final var record = Mono.just(SenderRecord.create(producerRecord, code));
    return Optional.ofNullable(kafkaSender.send(record)).orElseGet(Flux::empty)
            .switchMap(this::errorIfAnyException)
            .doOnNext(res -> log.info("Successfully sent event on topic {}", res.recordMetadata().topic()))
            .then();
}
And the withMetric transformer links a metric to the send mono lifecycle:
private Function<Mono<Void>, Mono<Void>> withMetric(final String methodName) {
    return mono -> Mono.justOrEmpty(this.metricProvider)
            .map(provider -> provider.buildMethodExecutionTimeMetric(methodName, "kafka"))
            .flatMap(metric -> mono.doOnSubscribe(subscription -> metric.start())
                    .doOnTerminate(metric::end));
}
It is this custom metric that reports an average of 100 ms.
We compared it to our Kafka producer metrics and noticed that those report an average of 40 ms to deliver a message (0 ms of queueing and 40 ms of request latency).
We have difficulty understanding the delta and wonder if it could come from the way Reactor Kafka sends events.
Can anybody help, please?
UPDATE
Here's a sample of my producer config:
acks = all
batch.size = 16384
buffer.memory = 33554432
client.dns.lookup = default
compression.type = none
connections.max.idle.ms = 540000
delivery.timeout.ms = 120000
enable.idempotence = true
key.serializer = class org.apache.kafka.common.serialization.StringSerializer
linger.ms = 0
max.block.ms = 60000
max.in.flight.requests.per.connection = 5
max.request.size = 1048576
metadata.max.age.ms = 300000
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
receive.buffer.bytes = 32768
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retries = 2147483647
retry.backoff.ms = 100
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = https
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.trustmanager.algorithm = PKIX
ssl.truststore.type = JKS
transaction.timeout.ms = 60000
value.serializer = class io.confluent.kafka.serializers.KafkaAvroSerializer
Also, maxInFlight is 256 and the scheduler is single; I didn't configure anything special here.

MACRO Debug for pivot table

I'm trying to create a macro to build a pivot table and I need help debugging. Here is the script:
Sheets.Add
newsheet = ActiveSheet.Name
ActiveWorkbook.PivotCaches.Create(SourceType:=xlDatabase, SourceData:= _
dataname, Version:=xlPivotTableVersion10). _
CreatePivotTable TableDestination:=newsheet & "!R3C1", TableName:="PivotTable2" _
, DefaultVersion:=xlPivotTableVersion10
Sheets(newsheet).Select
Cells(3, 1).Select
With ActiveSheet.PivotTables("PivotTable2")
.ColumnGrand = True
.HasAutoFormat = True
.DisplayErrorString = False
.DisplayNullString = True
.EnableDrilldown = True
.ErrorString = ""
.MergeLabels = False
.NullString = ""
.PageFieldOrder = 2
.PageFieldWrapCount = 0
.PreserveFormatting = True
.RowGrand = True
.SaveData = True
.PrintTitles = False
.RepeatItemsOnEachPrintedPage = True
.TotalsAnnotation = False
.CompactRowIndent = 1
.InGridDropZones = True
.DisplayFieldCaptions = True
.DisplayMemberPropertyTooltips = False
.DisplayContextTooltips = True
.ShowDrillIndicators = True
.PrintDrillIndicators = False
.AllowMultipleFilters = True
.SortUsingCustomLists = True
.FieldListSortAscending = False
.ShowValuesRow = True
.CalculatedMembersInFilters = False
.RowAxisLayout xlTabularRow
End With
With ActiveSheet.PivotTables("PivotTable2").PivotCache
.RefreshOnFileOpen = False
.MissingItemsLimit = xlMissingItemsDefault
End With
ActiveSheet.PivotTables("PivotTable2").RepeatAllLabels xlRepeatLabels
With ActiveSheet.PivotTables("PivotTable2").PivotFields("Status")
.Orientation = xlRowField
.Position = 1
End With
ActiveSheet.PivotTables("PivotTable2").AddDataField ActiveSheet.PivotTables( _
"PivotTable2").PivotFields("Status"), "Count of Status", xlCount
End Sub
The error message I receive is run-time error 1004:
application-defined or object-defined error
The line that is the issue is below:
.DisplayMemberPropertyTooltips = False
How do I correct this? Thanks!

Quartz : There is no DataSource named 'myDS'

When I use the datasource name "quartzDS" everything works fine, but when I change the datasource name to anything else, like "myDS", I get an error:
Caused by: java.sql.SQLException: There is no DataSource named 'myDS'
My quartz.properties file:
org.quartz.scheduler.instanceName = QuartzClusterScheduler
org.quartz.scheduler.instanceId = AUTO
org.quartz.threadPool.class = org.quartz.simpl.SimpleThreadPool
org.quartz.threadPool.threadCount = 100
org.quartz.threadPool.threadPriority = 8
org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.StdJDBCDelegate
org.quartz.jobStore.useProperties = false
org.quartz.jobStore.dataSource = myDS
org.quartz.jobStore.misfireThreshold = 60000
org.quartz.jobStore.tablePrefix = QRTZ_
org.quartz.jobStore.isClustered = true
org.quartz.jobStore.clusterCheckinInterval = 5000
org.quartz.dataSource.quartzDS.jndiURL = java:jboss/myDS
Resolved. The name referenced by org.quartz.jobStore.dataSource has to match the name used in the org.quartz.dataSource.<name>.* properties, so I changed
org.quartz.dataSource.quartzDS.jndiURL = java:jboss/myDS
to
org.quartz.dataSource.myDS.jndiURL = java:jboss/myDS
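For reference, these are the two properties that have to agree after the change (the jobStore entry is unchanged from the file above):
org.quartz.jobStore.dataSource = myDS
org.quartz.dataSource.myDS.jndiURL = java:jboss/myDS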