Confluent Schema Registry - why is auto.register.schemas applicable at the producer level instead of the schema-registry level?

I've installed the Schema Registry and want to set auto.register.schemas to false, so that only Avro messages conforming to a registered schema can be published to the Kafka topic.
From what I understand, auto.register.schemas is a Kafka producer property, not a Schema Registry property.
Here is how I set auto.register.schemas:
Console Producer:
kafka-avro-console-producer --bootstrap-server localhost:9092 --property schema.registry.url=http://localhost:8081 --topic srtest-optionalfield --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"f2","type":"string"}]}' --property value.subject.name.strategy=io.confluent.kafka.serializers.subject.RecordNameStrategy --property auto.register.schemas=false
Java Avro Producer:
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, KafkaAvroSerializer.class);
props.put(ProducerConfig.CLIENT_ID_CONFIG, "Kafka Avro Producer");
props.put("schema.registry.url", "http://localhost:8081");
props.put(AbstractKafkaAvroSerDeConfig.AUTO_REGISTER_SCHEMAS, false);
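For illustration, a minimal sketch of how that flag plays out at send time (the topic and schema are taken from the console command above; the exact exception wrapping may vary by version): with auto.register.schemas=false the serializer only looks the schema up in the registry, so sending a record whose schema is not already registered fails instead of registering it.
Schema schema = new Schema.Parser().parse(
    "{\"type\":\"record\",\"name\":\"myrecord\",\"fields\":[{\"name\":\"f2\",\"type\":\"string\"}]}");
GenericRecord avroRecord = new GenericData.Record(schema);
avroRecord.put("f2", "some value");

try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
    // With auto.register.schemas=false the serializer performs a lookup only;
    // an unregistered schema makes this send fail (e.g. "Schema not found")
    producer.send(new ProducerRecord<>("srtest-optionalfield", avroRecord)).get();
} catch (Exception e) {
    e.printStackTrace();
}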
Does this mean that if a Kafka producer passes auto.register.schemas=true, it will be able to add its schema to the Schema Registry?
That provides no safeguard, since I want to ensure that producers can only publish messages that conform to schemas already in the Schema Registry.
How do I do this?
Is there a way for me to set auto.register.schemas at the Schema Registry level?
Thanks in advance!

Related

How to add key serializer and value serializer in kafka console producer

I have the below properties set in my Spring Boot Kafka producer application.yaml:
consumer-properties:
key.deserializer: io.confluent.kafka.serializers.KafkaAvroDeserializer
value.deserializer: io.confluent.kafka.serializers.KafkaAvroDeserializer
producer-properties:
key.serializer: io.confluent.kafka.serializers.KafkaAvroSerializer
value.serializer: io.confluent.kafka.serializers.KafkaAvroSerializer
I have to produce a message from the Kafka console producer, e.g.
kafka-console-producer --bootstrap-server confluent-cp-kafka:9092 --topic TSTTOPIC --producer-property key.serializer: io.confluent.kafka.serializers.KafkaAvroSerializer value.serializer: io.confluent.kafka.serializers.KafkaAvroSerializer
but it's not working, and when I produce a message from the console producer I get an error in the consumer log as below
You cannot use colons on the CLI.
If you want to use your property file, then pass --producer.config with the producer.properties file
Otherwise, you can use kafka-avro-console-producer along with --producer-property key.serializer=io.confluent.kafka.serializers.KafkaAvroSerializer
As for the Avro serializers, you appear to be missing key.schema or value.schema plus schema.registry.url, which are properties read only by kafka-avro-console-producer. That would explain why your Avro consumer is unable to read the data (it was sent as plaintext).
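For illustration, the kafka-avro-console-producer route might look like this (the registry URL and the trivial string value schema are placeholders for this sketch):
kafka-avro-console-producer --bootstrap-server confluent-cp-kafka:9092 --topic TSTTOPIC \
  --property schema.registry.url=http://localhost:8081 \
  --property value.schema='{"type":"string"}'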

How to send key, value messages with flume to a kafka producer

With the console producer you add the properties --property "parse.key=true" --property "key.separator=:" to produce key-value data into Kafka, but how do I do this with Flume? I tried adding
a1.sinks.k1.producer.parse.key=true
a1.sinks.k1.producer.key.separator=:
to the .conf file, but to no avail; Kafka treated the key like a string.
Those are console-producer CLI arguments, not Kafka ProducerConfig properties (which are what get passed through to Flume).
The key will always be a string, but you pass it via the headers of the Flume record:
https://github.com/apache/flume/blob/trunk/flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java#L193
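For illustration, a sketch of setting the key from Flume client code (assuming, per the linked KafkaSink source, that the sink reads the record key from an event header named "key"):
import org.apache.flume.Event;
import org.apache.flume.event.EventBuilder;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// The "key" header of the Flume event becomes the Kafka record key
Map<String, String> headers = new HashMap<>();
headers.put("key", "myRecordKey");
Event event = EventBuilder.withBody("the message body", StandardCharsets.UTF_8, headers);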

Configure Apache Kafka sink jdbc connector

I want to send the data from the topic to a PostgreSQL database, so I followed this guide and configured the properties file like this:
name=transaction-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
topics=transactions
connection.url=jdbc:postgresql://localhost:5432/db
connection.user=db-user
connection.password=
auto.create=true
insert.mode=insert
table.name.format=transaction
pk.mode=none
I start the connector with
./bin/connect-standalone etc/schema-registry/connect-avro-standalone.properties etc/kafka-connect-jdbc/sink-quickstart-postgresql.properties
The sink-connector is created but does not start due to this error:
Caused by: org.apache.kafka.common.errors.SerializationException: Error deserializing Avro message for id -1
Caused by: org.apache.kafka.common.errors.SerializationException: Unknown magic byte!
The schema is in Avro format and registered, and I can send (produce) messages to the topic and read (consume) from it. But I can't seem to send it to the database.
This is my ./etc/schema-registry/connect-avro-standalone.properties
key.converter=io.confluent.connect.avro.AvroConverter
key.converter.schema.registry.url=http://localhost:8081
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://localhost:8081
This is the producer feeding the topic, using the Java API:
properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, KafkaAvroSerializer.class);
properties.put(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, "http://localhost:8081");
try (KafkaProducer<String, Transaction> producer = new KafkaProducer<>(properties)) {
Transaction transaction = new Transaction();
transaction.setFoo("foo");
transaction.setBar("bar");
UUID uuid = UUID.randomUUID();
final ProducerRecord<String, Transaction> record = new ProducerRecord<>(TOPIC, uuid.toString(), transaction);
producer.send(record);
}
I'm verifying data is properly serialized and deserialized using
./bin/kafka-avro-console-consumer --bootstrap-server localhost:9092 \
--property schema.registry.url=http://localhost:8081 \
--topic transactions \
--from-beginning --max-messages 1
The database is up and running.
This is not correct:
"The unknown magic byte can be due to an id-field not part of the schema."
What that error means is that the message on the topic was not serialised using the Schema Registry Avro serialiser.
How are you putting data on the topic?
Maybe all the messages have the problem, maybe only some, but by default this will halt the Kafka Connect task.
You can set
"errors.tolerance":"all",
to get it to ignore messages that it can't deserialise. But if none of them are correctly Avro-serialised this won't help; you need to serialise them correctly, or choose a different converter (e.g. if they're actually JSON, use the JsonConverter).
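The JSON form above is how it would look when configuring a connector through the Connect REST API; in the standalone .properties file used here the equivalent would be (the dead letter queue topic name is just an example):
errors.tolerance=all
# optionally keep the bad records instead of silently skipping them
errors.deadletterqueue.topic.name=dlq-transactions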
These references should help you more:
https://www.confluent.io/blog/kafka-connect-deep-dive-converters-serialization-explained
https://www.confluent.io/blog/kafka-connect-deep-dive-error-handling-dead-letter-queues
http://rmoff.dev/ksldn19-kafka-connect
Edit:
If you are serialising the key with StringSerializer then you need to use this in your Connect config:
key.converter=org.apache.kafka.connect.storage.StringConverter
You can set it at the worker level (a global property that applies to all connectors run on that worker), or just for this connector (i.e. put it in the connector properties itself, where it will override the worker settings).
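Putting that together, a sketch of the corrected connect-avro-standalone.properties for this producer (everything else in the file quoted above stays the same):
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://localhost:8081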

Kafka Consumer API not subscribing using Java client

Kafka: 0.10.1.0 (Client & Server)
Java client.
Zookeeper: 3.4.6
Setup: The producer publishes messages. Messages sent to the topic are counted using ./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9093 --topic TEST.TOPIC --time -1
Issue: the consumer receives nothing when polling after subscribe(), but if you manually assign() partitions, it works. There is a separate thread on the same question but no answer. It may be a UUID issue, but we need more details, as we are in an evaluation phase and details would help.
Consumer Settings:
props.put("bootstrap.servers", servers);
props.put("enable.auto.commit", ENABLE_AUTO_COMMIT);
props.put("auto.commit.interval.ms", AUTO_COMMIT_INTERVAL_MS);
props.put("session.timeout.ms", SESSION_TIMEOUT_MS);
props.put("group.id", CONSUMER_GROUP_ID);
props.put("key.deserializer", STRING_DESRIALIZER);
props.put("value.deserializer", STRING_DESRIALIZER);
props.put("auto.offset.reset", "earliest");
The issue was with the version of Kafka.
After switching to 0.10.2.1 (server and client), subscribe() worked flawlessly.
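For reference, a minimal subscribe-and-poll loop of the kind that failed on 0.10.1.0 and works on 0.10.2.1 (topic name taken from the question; poll(long) is the 0.10.x API):
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Collections.singletonList("TEST.TOPIC"));
while (true) {
    // poll() drives the consumer group protocol; partition assignment happens here
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records) {
        System.out.printf("offset=%d key=%s value=%s%n",
                record.offset(), record.key(), record.value());
    }
}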

Kafka: dynamically query configurations

Is there a way to access the configuration values in server.properties without direct access to that file itself?
I thought that:
kafka-configs.sh --describe --entity-type topics --zookeeper localhost:2181
might give me what I want, but I did not see the values set in server.properties. Just the following (I created the 'ddos' topic myself with kafka-topics.sh):
Configs for topics:ddos are
Configs for topics:__consumer_offsets are segment.bytes=104857600,cleanup.policy=compact
I was thinking I'd also see globally configured options, like this from the default configuration I have:
log.retention.hours=168
Thanks in advance.
Since Kafka 0.11, you can use the AdminClient describeConfigs() API to retrieve broker configurations.
For example, skeleton code to retrieve the configuration of broker 0:
Properties adminProps = new Properties();
adminProps.load(new FileInputStream("admin.properties"));

try (AdminClient admin = AdminClient.create(adminProps)) {
    // The BROKER resource type plus broker id "0" selects that broker's configuration
    ConfigResource cr = new ConfigResource(ConfigResource.Type.BROKER, "0");
    DescribeConfigsResult dcr = admin.describeConfigs(Collections.singleton(cr));
    // all() returns a future of Map<ConfigResource, Config>
    System.out.println(dcr.all().get());
}
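If you'd rather stay on the command line, newer Kafka versions (1.1 onwards, if I recall correctly) let kafka-configs.sh talk to the brokers directly instead of ZooKeeper:
kafka-configs.sh --bootstrap-server localhost:9092 --entity-type brokers --entity-name 0 --describe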