Is Schema registry mandatory for Kafka-connect? - apache-kafka

When I tried to run Kafka Connect without Schema Registry,
I got an error like:
CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL is required.
Command [/usr/local/bin/dub ensure CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL] FAILED !
Is Schema Registry mandatory for setting up Kafka Connect? I didn't find anything saying so in the official Confluent documentation.

Schema Registry is only required if you use Confluent converters that depend on it, such as their AvroConverter, ProtobufConverter, or JsonSchemaConverter; it is not required for Connect itself.
If you want to run Connect in a container with minimal dependencies, see my image here - https://github.com/OneCricketeer/apache-kafka-connect-docker
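As a rough illustration of that rule, the sketch below checks a worker config for converters that need a registry URL. The helper name `missing_registry_urls` and the two sample configs are hypothetical and exist only for this example; the converter class names and the `*.converter.schema.registry.url` property are the usual Confluent and Apache Kafka ones.

```python
# Converters shipped by Confluent that talk to Schema Registry.
REGISTRY_CONVERTERS = {
    "io.confluent.connect.avro.AvroConverter",
    "io.confluent.connect.protobuf.ProtobufConverter",
    "io.confluent.connect.json.JsonSchemaConverter",
}


def missing_registry_urls(worker_config):
    """Return the schema.registry.url keys a worker config still needs.

    Hypothetical helper for illustration; not part of Kafka Connect.
    """
    missing = []
    for side in ("key", "value"):
        converter = worker_config.get(f"{side}.converter")
        url_key = f"{side}.converter.schema.registry.url"
        if converter in REGISTRY_CONVERTERS and not worker_config.get(url_key):
            missing.append(url_key)
    return missing


# Plain Apache Kafka converters: no registry needed.
plain = {
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false",
}

# Avro values without a URL: the registry setting is missing.
avro = {
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
}

print(missing_registry_urls(plain))
print(missing_registry_urls(avro))
```

In the Confluent Docker image the same settings are passed as CONNECT_-prefixed environment variables, which is where the CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL name in the error comes from.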

Related

How to run a schema registry for Apache Kafka without the Confluent environment

I can't find any documentation that explains how to set up a schema registry for a plain Apache Kafka environment without Confluent. All the docs I find assume a Confluent environment.
See if this approach makes sense for your use case.
Apicurio Registry
You can host Apicurio in Kubernetes and have clients verify the schema before publishing messages to Kafka.
Some good resources outlining how to use it:
https://itnext.io/stream-processing-with-apache-spark-kafka-avro-and-apicurio-registry-on-amazon-emr-and-amazon-13080defa3be
https://www.youtube.com/watch?v=xthbYl7xC74

Retrieving specific subjects/schema from Confluent Kafka Schema Registry

I am using the Confluent Schema Registry and have some schemas/subjects defined.
A few examples:
dev.delivery.kafka.delivery-reason-value
dev.delivery.kafka.delivery-day-value
dev.travel.kafka.places-ice-value
When I use the Confluent CLI and connect to the registry, I run the commands below:
#this gives me all the subjects/schemas defined in the registry --perfectly fine :)
confluent schema-registry subject list --prefix ":*:"
But when I want to retrieve the schema for a specific topic, for instance only the subjects that have the word travel in their name:
#this gives me "No Subjects."
confluent schema-registry subject list --prefix ":travel:"
OR
confluent schema-registry subject list --prefix ":*travel*:"
Can anyone tell me what I am missing about wildcards within the prefix?
Thanks in advance.
You don't really need that CLI. This is easily done with Python, for example.
import requests

# List every subject in the registry, then filter client-side.
r = requests.get('http://registry:8081/subjects')
r.raise_for_status()
for subject in r.json():
    if 'travel' in subject:  # example
        print(subject)
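If shell-style wildcards are wanted, a variant of the snippet above can match subject names with Python's standard `fnmatch` module. This is only a minimal sketch: the `filter_subjects` helper is hypothetical, and the sample list reuses the subject names from the question instead of calling the registry.

```python
from fnmatch import fnmatchcase


def filter_subjects(subjects, pattern):
    """Return the subjects whose names match a shell-style wildcard pattern."""
    return [s for s in subjects if fnmatchcase(s, pattern)]


# In practice this list would come from GET /subjects on the registry.
subjects = [
    "dev.delivery.kafka.delivery-reason-value",
    "dev.delivery.kafka.delivery-day-value",
    "dev.travel.kafka.places-ice-value",
]

print(filter_subjects(subjects, "*travel*"))
```

Filtering on the client keeps the matching rules under your control, at the cost of fetching the full subject list first.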

Does kafka support schema registries out of the box, or is it a confluent platform feature?

I came across the following article on how to use the schema registry available in the confluent platform.
https://docs.confluent.io/platform/current/schema-registry/schema-validation.html
According to that article, we can specify confluent.schema.registry.url in server.properties to point Kafka to the schema registry.
My question: is it possible to point a Kafka cluster that is not part of a Confluent Platform deployment at a schema registry using confluent.schema.registry.url?
Server-side schema validation is part of Confluent Server, not Apache Kafka.
I will make sure that docs page gets updated to be clearer. Thanks for raising it.

Configuring Kafka Connect Postgres Debezium CDC plugin

I am trying to use Kafka Connect to read changes in a Postgres DB.
I have Kafka running on my local system, and I want to use the Kafka Connect API in standalone mode to read the Postgres server's DB changes.
connect-standalone.sh connect-standalone.properties dbezium.properties
I would appreciate it if someone could help me set up the configuration properties for the Debezium Postgres CDC connector.
https://www.confluent.io/connector/debezium-postgresql-cdc-connector/
I am following the docs below to construct the properties:
https://debezium.io/docs/connectors/postgresql/#how-the-postgresql-connector-works
The name of the Kafka topics takes by default the form
serverName.schemaName.tableName, where serverName is the logical name
of the connector as specified with the database.server.name
configuration property
Here is what I have come up with for dbezium.properties:
name=cdc_demo
connector.class=io.debezium.connector.postgresql.PostgresConnector
tasks.max=1
plugin.name=wal2json
slot.name=debezium
slot.drop_on_stop=false
database.hostname=localhost
database.port=5432
database.user=postgress
database.password=postgress
database.dbname=test
time.precision.mode=adaptive
database.sslmode=disable
Let's say I create a PG schema named demo and a table named suppliers.
So do I need to create a topic named test.demo.suppliers so that this plugin can push the data to it?
Also, can someone suggest a Docker image that has the Postgres server with a suitable replication plugin such as wal2json? I am having a hard time configuring Postgres and the CDC plugin myself.
Check out the tutorial with associated Docker Compose and sample config.
The topic you've come up with sounds correct, but if you have your Kafka broker configured to auto-create topics (which is the default behaviour IIRC) then it will get created for you and you don't need to pre-create it.
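To make the naming rule quoted from the Debezium docs concrete, here is a small sketch. The `topic_name` helper is hypothetical, and it assumes database.server.name is set to test; the properties file in the question does not set that property, so the actual prefix may differ.

```python
def topic_name(server_name, schema_name, table_name):
    """Build a topic name in the serverName.schemaName.tableName form."""
    return ".".join([server_name, schema_name, table_name])


# Assumption: database.server.name=test (not present in the question's config).
print(topic_name("test", "demo", "suppliers"))
```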

Can I access Kafka Connect Worker config from connector or task?

I am developing a custom Kafka source connector. I would like to access the worker configuration, such as key converter or value converter or schema registry url or zookeeper url etc., but I didn't find a way to do that. Any idea? Is that possible?
To be more specific: in my connector and task implementations, can I access the worker's configuration? As far as I can tell, the only thing I can get is a ConnectorContext from the connector implementation, and its only useful method requests reconfiguration, which is not what I want.