Kafka messages are not being inserted into the PostgreSQL database. I can see the messages in the consumer, but they are not inserted into the table. Any suggestions would be helpful.
kafka-avro-console-producer --broker-list localhost:9092 --topic Kafka_pg --property value.schema='{"type":"record","name":"kafka_sink_pg","fields":[{"name":"serial_no","type":"int"},{"name":"technology", "type": "string"}, {"name":"platform", "type": "string"}]}'
{"serial_no": 1, "technology": "ETL", "platform": "Informatica"}
{"serial_no": 2, "technology": "ETL", "platform": "Talend"}
Below are the error messages from the log file:
[2020-08-12 03:50:09,940] INFO Kafka Connect started (org.apache.kafka.connect.runtime.Connect:57)
[2020-08-12 03:50:09,943] ERROR Failed to create job for ../config/sink-quickstart-Postgres.properties (org.apache.kafka.connect.cli.ConnectStandalone:110)
[2020-08-12 03:50:09,952] ERROR Stopping after connector error (org.apache.kafka.connect.cli.ConnectStandalone:121)
java.util.concurrent.ExecutionException: org.apache.kafka.connect.errors.ConnectException: Failed to find any class that implements Connector and which name matches io.confluent.connect.jdbc.JdbcSinkConnector

The error occurs because Kafka Connect could not find the JDBC connector: the plugin path did not resolve to the directory that contains it.
I resolved the issue by providing the complete path to the plugins in the connect-avro-standalone.properties file.
Changed to:
# Provided the complete path
plugin.path=/usr/kafka/share/java
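A relative plugin.path entry is typically resolved against the directory the worker is started from, which is a common reason the connector class is not found. A minimal Python sketch, assuming a POSIX host, that flags such entries. The function name is hypothetical, and the value share/java is a made-up example of a relative entry, not taken from the original file:

```python
import posixpath
from typing import List


def relative_plugin_dirs(plugin_path: str) -> List[str]:
    """Return the entries of a comma-separated plugin.path value that are not absolute."""
    entries = [entry.strip() for entry in plugin_path.split(",") if entry.strip()]
    return [entry for entry in entries if not posixpath.isabs(entry)]


# Hypothetical relative entry: its meaning depends on the working directory.
print(relative_plugin_dirs("share/java"))
# Complete path: nothing to flag.
print(relative_plugin_dirs("/usr/kafka/share/java"))
```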


Kafka connect MongoDB sink connector using kafka-avro-console-producer

I'm trying to write some documents to MongoDB using the Kafka connect MongoDB connector. I've managed to set up all the components required and start up the connector but when I send the message to Kafka using the kafka-avro-console-producer, Kafka connect is giving me the following error:
org.apache.kafka.connect.errors.DataException: Error: `operationType` field is doc is missing.
I've tried adding this field to the message, but then Kafka Connect asks me to include a documentKey field. It seems I need to include some extra fields apart from the payload defined in my schema, but I can't find any comprehensive documentation. Does anyone have an example of a Kafka message payload (using kafka-avro-console-producer) that goes through a Kafka -> Kafka Connect -> MongoDB pipeline?
Below is an example of one of the messages I'm sending to Kafka (kafka-avro-console-consumer is able to consume the messages, by the way):
./kafka-avro-console-producer --broker-list kafka:9093 --topic sampledata --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"field1","type":"string"}]}'
{"field1": "value1"}
And here is the configuration of the sink connector:
{"name": "mongo-sink",
"config": {
"value.converter":"io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url":"http://schemaregistry:8081",
"change.data.capture.handler": "com.mongodb.kafka.connect.sink.cdc.mongodb.ChangeStreamHandler"
I've just managed to make the connector work. I deleted the change.data.capture.handler property from the connector configuration and it works now.
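The ChangeStreamHandler is meant for topics whose values are MongoDB change stream events, which is why it asks for fields such as operationType and documentKey. A plain record like {"field1": "value1"} is not such an event, so the handler rejects it. A small Python sketch of that distinction; the function name and the list of fields are illustrative assumptions based on the two errors reported above, not the connector's actual validation code:

```python
from typing import Dict, List

# Fields the errors above complained about; the real handler may check more.
CHANGE_EVENT_FIELDS = ("operationType", "documentKey")


def missing_change_event_fields(value: Dict[str, object]) -> List[str]:
    """List the change-event fields that are absent from a message value."""
    return [name for name in CHANGE_EVENT_FIELDS if name not in value]


plain_record = {"field1": "value1"}
print(missing_change_event_fields(plain_record))
```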

Error while consuming AVRO Kafka Topic from KSQL Stream

I created some dummy data as a stream in ksqlDB with
The setup runs on Docker Compose. I am running a Kafka broker, Schema Registry, ksqldb-cli, ksqldb-server, and ZooKeeper.
Now I want to consume these records from the topic.
My first and most recent approach was from the command line, with the following command:
docker run --net=host --rm confluentinc/cp-schema-registry:5.0.0 kafka-avro-console-consumer \
--bootstrap-server localhost:29092 --topic DXT --from-beginning --max-messages 10 \
--property print.key=true --property print.value=true \
--value-deserializer io.confluent.kafka.serializers.KafkaAvroDeserializer \
--key-deserializer org.apache.kafka.common.serialization.StringDeserializer
But that just returns the error
[2021-04-22 21:45:42,926] ERROR Unknown error when running consumer: (kafka.tools.ConsoleConsumer$:76)
org.apache.kafka.common.errors.SerializationException: Error deserializing Avro message for id -1
Caused by: org.apache.kafka.common.errors.SerializationException: Unknown magic byte!
I also tried different approaches in Java Spring, but to no avail. I just cannot consume the created topics.
If I need to define my own schema, where should I do that, and what would be the easiest way, given that I only created a stream in ksqlDB?
Is there an easy-to-follow example? I did not specify anything else when I created the stream, as in the quickstart example on ksqldb.io. (I added the Schema Registry to my deployment.)
As I am a noob who has been sitting here for almost 10 hours, any help would be appreciated.
Edit: I found that pure JSON does not need the Schema Registry with ksqlDB.
But how to deserialize it?
If you've written JSON data to the topic then you can read it with the kafka-console-consumer.
The error you're getting (Error deserializing Avro message for id -1…Unknown magic byte!) is because you're using the kafka-avro-console-consumer which attempts to deserialise the topic data as Avro - which it isn't, hence the error.
You can also use PRINT DXT; from within ksqlDB.
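Messages written by the Schema Registry serializers start with a five-byte header: a magic byte of 0 followed by a four-byte, big-endian schema ID. The Avro deserializer raises "Unknown magic byte!" when the first byte is anything else, as it is for plain JSON. A minimal Python sketch of that check; the function name is hypothetical and this is not the deserializer's actual code:

```python
import struct

MAGIC_BYTE = 0  # first byte written by the Schema Registry serializers


def describe_payload(payload: bytes) -> str:
    """Classify a raw Kafka message value by its leading bytes."""
    if len(payload) >= 5 and payload[0] == MAGIC_BYTE:
        # Bytes 1-4 hold the schema ID as a big-endian 32-bit integer.
        (schema_id,) = struct.unpack(">I", payload[1:5])
        return f"Schema Registry framing, schema id {schema_id}"
    return "unknown magic byte: not Schema Registry framing"


# A framed message (schema id 7) followed by one byte of Avro data.
print(describe_payload(b"\x00\x00\x00\x00\x07\x02"))
# Plain JSON starts with '{' (0x7B), so it fails the magic-byte check.
print(describe_payload(b'{"field1": "value1"}'))
```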

Configure Apache Kafka sink jdbc connector

I want to send the data that arrives on the topic to a PostgreSQL database. I followed this guide and configured the properties file like this:
I start the connector with
./bin/connect-standalone etc/schema-registry/connect-avro-standalone.properties etc/kafka-connect-jdbc/sink-quickstart-postgresql.properties
The sink-connector is created but does not start due to this error:
Caused by: org.apache.kafka.common.errors.SerializationException: Error deserializing Avro message for id -1
Caused by: org.apache.kafka.common.errors.SerializationException: Unknown magic byte!
The schema is in Avro format and registered, and I can send (produce) messages to the topic and read (consume) from it. But I can't seem to send them to the database.
This is my ./etc/schema-registry/connect-avro-standalone.properties:
This is a producer feeding the topic using the java-api:
Properties properties = new Properties();
// bootstrap.servers must also be set; its value is not shown in the original snippet
properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, KafkaAvroSerializer.class);
properties.put(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, "http://localhost:8081");
try (KafkaProducer<String, Transaction> producer = new KafkaProducer<>(properties)) {
    Transaction transaction = new Transaction();
    UUID uuid = UUID.randomUUID();
    final ProducerRecord<String, Transaction> record = new ProducerRecord<>(TOPIC, uuid.toString(), transaction);
    // The original snippet is cut off here; sending the record is the presumed next step
    producer.send(record);
}
I'm verifying data is properly serialized and deserialized using
./bin/kafka-avro-console-consumer --bootstrap-server localhost:9092 \
--property schema.registry.url=http://localhost:8081 \
--topic transactions \
--from-beginning --max-messages 1
The database is up and running.
This is not correct:
The unknown magic byte can be due to an id field not part of the schema
What that error means is that the message on the topic was not serialised using the Schema Registry Avro serialiser.
How are you putting data on the topic?
Maybe all the messages have the problem, maybe only some, but by default this will halt the Kafka Connect task.
You can set
to get it to ignore messages that it can't deserialise. But if none of them are correctly Avro-serialised this won't help, and you need to serialise them correctly or choose a different converter (e.g. if they're actually JSON, use the JsonConverter).
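The setting itself is missing from the text above. Kafka Connect's option for tolerating records that fail conversion is errors.tolerance, so that is presumably what was meant. A sketch for the connector properties file, offered as an assumption rather than the answer's original snippet:

```properties
# Assumed setting: skip records that fail conversion instead of stopping the task
errors.tolerance=all
# Optional: log each failure so that skipped records do not go unnoticed
errors.log.enable=true
```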
These references should help you more:
Edit:
If you are serialising the key with StringSerializer then you need to use this in your Connect config:
You can set it at the worker (a global property that applies to all connectors you run on it), or just for this connector (i.e. put it in the connector properties itself, where it overrides the worker settings).
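The referenced setting is not shown above. Since the key is written with StringSerializer, the matching Connect converter is presumably StringConverter. A sketch, assuming the Schema Registry URL from the producer code:

```properties
# Keys were written with StringSerializer, so read them back as plain strings
key.converter=org.apache.kafka.connect.storage.StringConverter
# Values were written with KafkaAvroSerializer, so they stay on the Avro converter
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://localhost:8081
```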

CORRUPT_MESSAGE when trying to run a Kafka JDBC source connector

I am trying to run a Kafka JDBC source connector with the following configuration:
"name": "source-mariadb-VIEW_GIORGOS",
"config": { "connector.class":"io.confluent.connect.jdbc.JdbcSourceConnector",
But Kafka Connect reports the following error:
WARN [Producer clientId=producer-8] Got error produce response with correlation id 1504 on topic-partition GIORGOS-VW_GIORGOS, retrying (2147483149 attempts left).
Error: CORRUPT_MESSAGE (org.apache.kafka.clients.producer.internals.Sender:526)
I figured out that this error was related to the topic's retention (cleanup) policy.
The compact policy requires every record to have a key. Since the rows read from a view have no key, the broker rejects the messages as corrupt. Changing the policy to delete fixed the issue for me.
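The rule can be pictured with a simplified Python model. This is an illustration of the behaviour described above, not broker code, and the function name is hypothetical:

```python
from typing import Optional


def broker_accepts(cleanup_policy: str, key: Optional[bytes]) -> bool:
    """Simplified model: a topic whose cleanup.policy includes compact rejects keyless records."""
    policies = {policy.strip() for policy in cleanup_policy.split(",")}
    return not ("compact" in policies and key is None)


print(broker_accepts("compact", None))  # keyless record on a compacted topic
print(broker_accepts("delete", None))   # same record after switching the policy
```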

Kafka-connect issue

I installed Apache Kafka (Confluent) on CentOS 7 and am trying to run the FileStream Kafka Connect connector in distributed mode, but I was getting the error below:
[2017-08-10 05:26:27,355] INFO Added alias 'ValueToKey' to plugin 'org.apache.kafka.connect.transforms.ValueToKey' (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:290)
Exception in thread "main" org.apache.kafka.common.config.ConfigException: Missing required configuration "internal.key.converter" which has no default value.
at org.apache.kafka.common.config.ConfigDef.parseValue(ConfigDef.java:463)
at org.apache.kafka.common.config.ConfigDef.parse(ConfigDef.java:453)
at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:62)
at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:75)
at org.apache.kafka.connect.runtime.WorkerConfig.<init>(WorkerConfig.java:197)
at org.apache.kafka.connect.runtime.distributed.DistributedConfig.<init>(DistributedConfig.java:289)
at org.apache.kafka.connect.cli.ConnectDistributed.main(ConnectDistributed.java:65)
This is now resolved by updating workers.properties as described in http://docs.confluent.io/current/connect/userguide.html#connect-userguide-distributed-config
Command used:
/home/arun/kafka/confluent-3.3.0/bin/connect-distributed.sh ../../../properties/file-stream-demo-distributed.properties
Filestream properties file (workers.properties):
I added the properties below and the command went through without any errors.
But now, when I run the consumer command, I am unable to see the messages in /tmp/demo-file.txt. Please let me know if there is a way to check whether the messages are published to Kafka topics and partitions.
kafka-console-consumer --zookeeper localhost:2181 --topic demo-2-distributed --from-beginning
I believe I am missing something really basic here. Can someone please help?
You need to define unique topics for the Kafka Connect framework to store its config, offsets, and status.
In your workers.properties file change these parameters to something like the following:
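The parameters are not listed above; in a distributed worker, the settings that name these three internal topics are config.storage.topic, offset.storage.topic and status.storage.topic. A sketch with hypothetical topic names:

```properties
# Hypothetical names; each Connect cluster needs its own set of internal topics
config.storage.topic=demo-connect-configs
offset.storage.topic=demo-connect-offsets
status.storage.topic=demo-connect-status
```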
These topics are used to store the state and configuration metadata of Connect itself, not the messages of any connector that runs on top of Connect. Do not run the console consumer on any of these three topics and expect to see your messages.
The messages are stored in the topic configured in the connector configuration json with the parameter called "topic".
Example file-sink-config.json file:
{
  "name": "MyFileSink",
  "config": {
    "topics": "mytopic",
    "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
    "tasks.max": 1,
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.storage.StringConverter",
    "file": "/tmp/demo-file.txt"
  }
}
Once the distributed worker is running you need to apply the config file to it using curl like so:
curl -X POST -H "Content-Type: application/json" --data @file-sink-config.json http://localhost:8083/connectors
After that, the config will be safely stored in the config topic you created, for all distributed workers to use. Make sure the config topic (and the status and offset topics) will not expire messages, or you will lose your connector configuration when they do.
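The same request can be made from a script. A minimal Python sketch using only the standard library, assuming the worker's REST API listens on http://localhost:8083 as in the curl call above. The function name is hypothetical, and the request is only built here, not sent:

```python
import json
import urllib.request


def build_create_request(connector: dict, base_url: str = "http://localhost:8083") -> urllib.request.Request:
    """Build (but do not send) the POST request that registers a connector."""
    body = json.dumps(connector).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Same settings as the example file-sink-config.json.
file_sink = {
    "name": "MyFileSink",
    "config": {
        "topics": "mytopic",
        "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
        "tasks.max": 1,
        "key.converter": "org.apache.kafka.connect.storage.StringConverter",
        "value.converter": "org.apache.kafka.connect.storage.StringConverter",
        "file": "/tmp/demo-file.txt",
    },
}

request = build_create_request(file_sink)
print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen sends it to a running worker.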