I have Confluent Kafka, ZooKeeper, Schema Registry and KSQL running in containers on a Kubernetes cluster. Kafka, ZooKeeper and Schema Registry work fine, and I can create topics and write data in Avro format, but when I try to check KSQL and create a stream with curl like:
curl -XPOST http://ksql-svc.someapp:8080/ksql -H "Content-Type: application/json" -d $'
{"ksql": "CREATE STREAM kawabanga_stream (log_id varchar, created_date varchar) WITH (kafka_topic = '\'kawabanga\'', value_format = '\'avro\'');","streamsProperties":{}}'
I get this error:
[{"error":{"statementText":"CREATE STREAM kawabanga_stream (log_id varchar, created_date varchar) WITH (kafka_topic = 'kawabanga', value_format = 'avro');","errorMessage":{"message":"Avro schema file path should be set for avro topics.","stackTrace":["io.confluent.ksql.ddl.commands.RegisterTopicCommand.extractTopicSerDe(RegisterTopicCommand.java:75)","io.confluent.ksql.ddl.commands.RegisterTopicCommand.<init>
Please find below my KSQL server config:
# cat /etc/ksql/ksqlserver.properties
bootstrap.servers=kafka-0.kafka-hs:9093,kafka-1.kafka-hs:9093,kafka-2.kafka-hs:9093
schema.registry.host=schema-svc
schema.registry.port=8081
ksql.command.topic.suffix=commands
listeners=http://0.0.0.0:8080
I also tried to start the server without the schema.registry lines, but with no luck.
You must set the configuration ksql.schema.registry.url (see KSQL v0.5 documentation).
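For example, a minimal sketch of the corrected ksqlserver.properties, reusing the schema-svc host and port from the question (adjust to your actual service name):

bootstrap.servers=kafka-0.kafka-hs:9093,kafka-1.kafka-hs:9093,kafka-2.kafka-hs:9093
ksql.schema.registry.url=http://schema-svc:8081
ksql.command.topic.suffix=commands
listeners=http://0.0.0.0:8080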
FYI: We will have better documentation for Avro usage in KSQL and for Confluent Schema registry integration with the upcoming GA release of KSQL in early April (as part of Confluent Platform 4.1).
I don't know how to check the version, but I'm using the confluentinc/ksql-cli Docker image (I think it's 0.5). For Kafka I used the Confluent Kafka Docker image 4.0.0.
The KSQL server shows the version on startup.
If you can't see the KSQL server startup message, you can also check the KSQL version by entering "VERSION" in the KSQL CLI prompt (ksql> VERSION).
According to the KSQLConfig class, you should use the ksql.schema.registry.url property to specify the location of the Schema Registry.
This looks to have been the case since at least v0.5.
It's also worth noting that using the RESTful API directly isn't currently supported, so you may find the API changes between releases.
I am trying to connect a running redpanda kafka cluster to a redpanda schema registry, so that the schema registry verifies incoming messages to the topic, and/or messages being read from the topic.
I am able to add a schema to the registry and read it back with curl requests, as well as add messages to the kafka topic I created in the redpanda cluster.
My question is: how do I tie the schema registry to a topic in a Kafka cluster? That is, how do I instruct the schema registry and/or the Kafka topic to validate incoming messages against the schema I added to the registry?
Thanks for your help or a point in the right direction!
Relevant info:
Cluster & topic creation:
https://vectorized.io/docs/guide-rpk-container
rpk container start -n 3
rpk topic create -p 6 -r 3 new-topic --brokers <broker1_address>,<broker2_address>...
Schema Registry creation:
https://vectorized.io/blog/schema_registry/
Command to add a schema:
curl -s \
-X POST \
"http://localhost:8081/subjects/sensor-value/versions" \
-H "Content-Type: application/vnd.schemaregistry.v1+json" \
-d '{"schema": "{\"type\":\"record\",\"name\":\"sensor_sample\",\"fields\":[{\"name\":\"timestamp\",\"type\":\"long\",\"logicalType\":\"timestamp-millis\"},{\"name\":\"identifier\",\"type\":\"string\",\"logicalType\":\"uuid\"},{\"name\":\"value\",\"type\":\"long\"}]}"}' \
| jq
The client is responsible for such integration. For example, the Confluent Schema Registry includes a KafkaAvroSerializer Java class that wraps an HTTP client and handles the schema registration and message validation. The broker doesn't handle "topic schemas", since schemas are really per-record. Unless RedPanda has something similar, broker-side validation is only offered by the Enterprise "Confluent Server."
RedPanda is primarily a server that exposes a Kafka-compatible API; I assume it is up to you to create (de)serializer interfaces for your respective client languages. There is a Python example on the Vectorized GitHub.
That being said, the Confluent Schema Registry should work with RedPanda as well, and so you can use its serializers and HTTP client libraries with it.
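As a minimal sketch of that client-side integration (Scala, using Confluent's serializer; the broker address, the topic name "sensor" and the simplified schema are assumptions based on the "sensor-value" subject above):

import java.util.Properties
import org.apache.avro.Schema
import org.apache.avro.generic.GenericRecordBuilder
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

val props = new Properties()
props.put("bootstrap.servers", "localhost:9092")          // a RedPanda broker (assumed address)
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer")
props.put("schema.registry.url", "http://localhost:8081") // the registry the schema was POSTed to

// Schema matching the one registered under the "sensor-value" subject (logical types omitted)
val schema = new Schema.Parser().parse(
  """{"type":"record","name":"sensor_sample","fields":[
    |  {"name":"timestamp","type":"long"},
    |  {"name":"identifier","type":"string"},
    |  {"name":"value","type":"long"}]}""".stripMargin)

val record = new GenericRecordBuilder(schema)
  .set("timestamp", System.currentTimeMillis())
  .set("identifier", java.util.UUID.randomUUID().toString)
  .set("value", 42L)
  .build()

// The serializer looks up / registers the schema under "<topic>-value" and refuses
// records that don't match it; validation happens here, not in the broker.
val producer = new KafkaProducer[String, AnyRef](props)
producer.send(new ProducerRecord[String, AnyRef]("sensor", record))
producer.close()

On the consuming side, the matching KafkaAvroDeserializer (configured with the same schema.registry.url) turns the bytes back into records.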
I am trying to use Kafka Connect to read changes in a Postgres DB.
I have Kafka running on my local system and I want to use the Kafka Connect API in standalone mode to read the Postgres server DB changes.
connect-standalone.sh connect-standalone.properties dbezium.properties
I would appreciate it if someone could help me with setting up the configuration properties for the Debezium Postgres CDC connector:
https://www.confluent.io/connector/debezium-postgresql-cdc-connector/
I am following the documentation below to construct the properties:
https://debezium.io/docs/connectors/postgresql/#how-the-postgresql-connector-works
The name of the Kafka topics takes by default the form
serverName.schemaName.tableName, where serverName is the logical name
of the connector as specified with the database.server.name
configuration property
and here is what I have come up with for dbezium.properties:
name=cdc_demo
connector.class=io.debezium.connector.postgresql.PostgresConnector
tasks.max=1
plugin.name=wal2json
slot.name=debezium
slot.drop_on_stop=false
database.hostname=localhost
database.port=5432
database.user=postgress
database.password=postgress
database.dbname=test
time.precision.mode=adaptive
database.sslmode=disable
Let's say I create a PG schema named demo and a table named suppliers.
So do I need to create a topic named test.demo.suppliers so that this plugin can push the data to it?
Also, can someone suggest a Docker image which has the Postgres server plus a suitable replication plugin such as wal2json? I am having a hard time configuring Postgres and the CDC plugin myself.
Check out the tutorial with associated Docker Compose and sample config.
The topic you've come up with sounds correct, but if you have your Kafka broker configured to auto-create topics (which is the default behaviour IIRC) then it will get created for you and you don't need to pre-create it.
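As a sketch of the two missing pieces (the image tag and server name are illustrative): the Debezium-maintained debezium/postgres image ships with logical decoding plugins such as wal2json preinstalled, and adding a database.server.name to dbezium.properties supplies the serverName part of the topic name:

# Postgres with wal2json already installed and logical replication enabled (assumed image defaults)
docker run -d --name postgres -p 5432:5432 \
  -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=postgres -e POSTGRES_DB=test \
  debezium/postgres:11

# add to dbezium.properties: the logical server name, so that changes to demo.suppliers
# end up in the topic test.demo.suppliers
database.server.name=test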
I am trying to upgrade from Apache Kafka to Confluent Kafka.
As the storage of the temp folder is quite limited, I have changed log.dirs in server.properties to a custom folder:
log.dirs=<custom location>
Then I try to start the Kafka server via the Confluent CLI (version 4.0) using the command below:
bin/confluent start kafka
However, when I check the Kafka data folder, the data is still persisted under the temp folder instead of the customized one.
I have tried to start the Kafka server directly, without using the Confluent CLI:
bin/kafka-server-start etc/kafka/server.properties
and then seen that the config is picked up properly.
Is this a bug with the Confluent CLI, or is it supposed to work this way?
I am trying to upgrade from Apache Kafka to Confluent Kafka
There is no such thing as "Confluent Kafka".
You can refer to the Apache or Confluent Upgrade documentation steps for switching Kafka versions, but at the end of the day, both are Apache Kafka.
On a related note: You don't need Kafka from the Confluent site to run other parts of the Confluent Platform.
The confluent command, though, will read its own embedded config files for running on localhost only, and is not intended to integrate with external brokers / zookeepers.
Therefore, kafka-server-start is the production way to run Apache Kafka.
Confluent CLI is meant to be used during development with Confluent Platform. Therefore, it currently gathers all the data and logs under a common location in order for a developer to be able to easily inspect (with confluent log or manually) and delete (with confluent destroy or manually) such data.
You are able to change this common location by setting
export CONFLUENT_CURRENT=<top-level-logs-and-data-directory>
and find out which location is in use at any time with:
confluent current
The rest of the properties are used as set in the various .properties files for each service.
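For example, a minimal sketch of that flow (the path is illustrative):

export CONFLUENT_CURRENT=/data/confluent-dev
confluent start kafka    # data and logs for the dev stack now land under /data/confluent-dev
confluent current        # prints the directory currently in use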
I was wondering: can I use the Confluent Schema Registry to generate (and then send to Kafka) schema-less Avro records? If yes, can somebody please share some resources for it?
I am not able to find any examples on the Confluent website or Google.
I have a plain delimited file and a separate schema for it. Currently I am using an Avro GenericRecord with that schema to serialize the Avro records and send them through Kafka. This way the schema is still attached to each record, which makes it bulkier. My reasoning is that if I remove the schema while sending records through Kafka, I will get higher throughput.
The Confluent Schema Registry lets you send Avro messages serialized without the entire Avro schema in the message. I think this is what you mean by "schema less" messages.
The Confluent Schema Registry stores the Avro schemas, and only a short schema id (a few bytes) is included in each message on the wire.
The full docs, including a quickstart guide for testing the Confluent Schema Registry, are here:
http://docs.confluent.io/current/schema-registry/docs/index.html
You can register your Avro schema for the first time with the help of the command below from the command line:
curl -X POST -i -H "Content-Type: application/vnd.schemaregistry.v1+json" \
--data '{"schema": "{\"type\": \"string\"}"}' \
http://localhost:8081/subjects/topic
You can see all versions of the subject using:
curl -X GET -i http://localhost:8081/subjects/topic/versions
To see the complete Avro schema for version 1 of all the versions present in the Confluent Schema Registry, use the command below; it will show the schema in JSON format:
curl -X GET -i http://localhost:8081/subjects/topic/versions/1
Avro schema registration is the task of the Kafka producer.
Once the schema is in the Confluent Schema Registry, you just need to publish Avro generic records to the specific Kafka topic, in our case 'topic'.
Kafka consumer: use the code below to fetch the latest schema for a specific Kafka topic.
import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient
import org.apache.avro.Schema

val schemaReg = new CachedSchemaRegistryClient(kafkaAvroSchemaRegistryUrl, 100)
val schemaMeta = schemaReg.getLatestSchemaMetadata(kafkaTopic + "-value")
val schemaString = schemaMeta.getSchema
val schema = new Schema.Parser().parse(schemaString)
The above is used to get the schema, and then we can use the Confluent decoder to decode records from the Kafka topic.
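A minimal sketch of that consumer side (Scala; the broker address, group id and the GenericRecord value type are assumptions, and kafkaTopic / kafkaAvroSchemaRegistryUrl are the same values used above):

import java.util.Properties
import org.apache.avro.generic.GenericRecord
import org.apache.kafka.clients.consumer.KafkaConsumer

val kafkaTopic = "topic"
val kafkaAvroSchemaRegistryUrl = "http://localhost:8081"

val props = new Properties()
props.put("bootstrap.servers", "localhost:9092")
props.put("group.id", "avro-demo")
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
props.put("value.deserializer", "io.confluent.kafka.serializers.KafkaAvroDeserializer")
props.put("schema.registry.url", kafkaAvroSchemaRegistryUrl)

// KafkaAvroDeserializer fetches the schema by the id embedded in each message,
// so the consumer never needs the full schema attached to the record.
val consumer = new KafkaConsumer[String, GenericRecord](props)
consumer.subscribe(java.util.Collections.singletonList(kafkaTopic))
val records = consumer.poll(java.time.Duration.ofSeconds(5))
records.forEach(r => println(r.value()))
consumer.close()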
How do I delete a Kafka topic using the Kafka REST Proxy? I tried the following command, but it returns an error message:
curl -X DELETE XXX.XX.XXX.XX:9092/topics/test_topic
If it's impossible, then how do I delete the messages and update the schema of a topic?
According to the API Reference documentation, you cannot delete topics via the REST Proxy, and I agree with that, because such a destructive operation should not be available via an interface that is exposed to the outside.
The topic deletion operation can be performed on the server where the broker runs, using the command line utility. See How to delete a topic in Apache Kafka.
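For example, a rough sketch of that command line route (broker and ZooKeeper addresses are illustrative, and the broker needs delete.topic.enable=true):

# Confluent 3.x / older Kafka:
kafka-topics --zookeeper localhost:2181 --delete --topic test_topic
# Kafka 2.2 and later:
kafka-topics --bootstrap-server localhost:9092 --delete --topic test_topic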
You can update the schema for a message when you publish it using the POST /topics/(string: topic_name) REST endpoint. If the schema for the new messages is not backward compatible with the older messages in the same topic you will have to configure your Schema Registry to allow publishing of incompatible messages, otherwise you will get an error.
See the "Example Avro request" here:
http://docs.confluent.io/3.1.1/kafka-rest/docs/api.html#post--topics-(string-topic_name)
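A hedged sketch of what such a request looks like (the topic name, schema and default REST Proxy port 8082 are illustrative):

curl -X POST -H "Content-Type: application/vnd.kafka.avro.v1+json" \
  --data '{"value_schema": "{\"type\": \"record\", \"name\": \"User\", \"fields\": [{\"name\": \"name\", \"type\": \"string\"}]}", "records": [{"value": {"name": "testUser"}}]}' \
  http://localhost:8082/topics/test_topic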
To see how to configure the Schema Registry for forward, backward, or no compatibility, see the documentation here:
http://docs.confluent.io/3.1.1/schema-registry/docs/api.html#compatibility
I confirmed that it is supported from version 5.5.0 or higher (REST Proxy API v3), and my test worked normally.
https://docs.confluent.io/current/kafka-rest/api.html#topic
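A minimal sketch of that v3 call (the REST Proxy address is illustrative, and <cluster_id> comes from the first request):

# list clusters to find the cluster id
curl -s http://localhost:8082/v3/clusters
# delete the topic (REST Proxy 5.5.0+, API v3)
curl -X DELETE http://localhost:8082/v3/clusters/<cluster_id>/topics/test_topic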