Kafka schema registry not compatible in the same topic - apache-kafka

I'm using the Kafka schema registry for producing/consuming Kafka messages. For example, I have two fields, both of string type, with a pseudo-schema as follows:
{"name": "test1", "type": "string"}
{"name": "test2", "type": "string"}
But after producing and consuming for a while, I needed to modify the schema to change the second field to long type, and then it threw the following exception:
Schema being registered is incompatible with an earlier schema; error code: 409
I'm confused: if the schema registry cannot handle this kind of schema change, then why should I use the schema registry at all, or for that matter, why use Avro?

Changing a field's type (or renaming a field) is not allowed in the default BACKWARD compatibility mode, because the new schema could no longer read data written with the earlier one. As a workaround you can change the compatibility rules for the schema registry.
According to the docs:
The schema registry server can enforce certain compatibility rules
when new schemas are registered in a subject. Currently, we support
the following compatibility rules.
Backward compatibility (default): A new schema is backward compatible
if it can be used to read the data written in all previous schemas.
Backward compatibility is useful for loading data into systems like
Hadoop since one can always query data of all versions using the
latest schema.
Forward compatibility: A new schema is forward
compatible if all previous schemas can read data written in this
schema. Forward compatibility is useful for consumer applications that
can only deal with data in a particular version that may not always be
the latest version.
Full compatibility: A new schema is fully
compatible if it’s both backward and forward compatible.
No compatibility: A new schema can be any schema as long as it’s a valid
Avro.
Setting compatibility to NONE should do the trick.
# Update compatibility requirements globally
$ curl -X PUT -H "Content-Type: application/vnd.schemaregistry.v1+json" \
--data '{"compatibility": "NONE"}' \
http://localhost:8081/config
And the response should be
{"compatibility":"NONE"}
That said, I generally discourage setting compatibility to NONE, especially globally, unless absolutely necessary.
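If you only need to relax the check for one subject rather than the whole registry, there is also a per-subject config endpoint; the subject name below is just an example (with the default TopicNameStrategy it is the topic name plus -value):
# Relax compatibility for a single subject only
$ curl -X PUT -H "Content-Type: application/vnd.schemaregistry.v1+json" \
--data '{"compatibility": "NONE"}' \
http://localhost:8081/config/test-value
Once the new schema is registered, you can set it back to BACKWARD with the same call.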

If you only need the new schema and don't need the previous versions in the schema registry, you can delete the older schema versions as shown below.
I've tested this with confluent-kafka and it worked for me.
Deletes all schema versions registered under the subject "Kafka-value"
curl -X DELETE http://localhost:8081/subjects/Kafka-value
Deletes version 1 of the schema registered under subject "Kafka-value"
curl -X DELETE http://localhost:8081/subjects/Kafka-value/versions/1
Deletes the most recently registered schema under subject "Kafka-value"
curl -X DELETE http://localhost:8081/subjects/Kafka-value/versions/latest
Ref: https://docs.confluent.io/platform/current/schema-registry/schema-deletion-guidelines.html
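After deleting, you can check what is still registered under the subject; again using the example subject "Kafka-value":
Lists the schema versions that remain registered under the subject "Kafka-value"
curl -X GET http://localhost:8081/subjects/Kafka-value/versions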

See https://docs.confluent.io/current/avro.html
You might need to add a "default": null to the changed field (for a null default to be valid Avro, the field's type has to be a union that includes null).
You can also delete the existing schema and register the updated one.

You can simply add a default value like this (note that the type has to be a union including null for a null default to be valid Avro):
{"name": "test3", "type": ["null", "string"], "default": null}

Related

How to add changes for NO ENCRYPT DB2 option to db2RestoreStruct

I am trying to restore an encrypted DB to a non-encrypted DB. I made changes by setting piDbEncOpts to SQL_ENCRYPT_DB_NO, but the restore still fails. Is there DB2 sample code where I can check how to set the "NO Encrypt" option in DB2? I am adding the code snippet below.
db2RestoreStruct->piDbEncOpts->encryptDb = SQL_ENCRYPT_DB_NO;
The 'C' API named db2Restore will restore an encrypted image to an unencrypted database, when used correctly.
You can use a modified version of IBM's sample files (dbrestore.sqc and related files) to see how to do it.
Depending on your 'C' compiler version and settings you might get a lot of warnings from IBM's code, because IBM does not appear to maintain the sample code as the years pass. However, you do not need to run IBM's sample code; you can study it to understand how to fix your own C code.
If installed, the samples component must match your Db2-server version+fixpack, and you must use the C include files that come with your Db2-server version+fixpack to get the relevant definitions.
The modifications to IBM's sample code include:
When calling the db2Restore API, ensure its first argument has a value compatible with your server's Db2 version and fixpack, so the required functionality is available. If you specify the wrong version number for the first argument (for example, a version of Db2 that did not support this functionality), the API will fail. For example, on my Db2-LUW v11.1.4.6, I used the predefined db2Version1113, like this:
db2Restore(db2Version1113, &restoreStruct, &sqlca);
When setting the restore iOptions field, enable the DB2RESTORE_NOENCRYPT flag. For example, starting from IBM's sample, include the additional flag: restoreStruct.iOptions = DB2RESTORE_OFFLINE | DB2RESTORE_DB | DB2RESTORE_NODATALINK | DB2RESTORE_NOROLLFWD | DB2RESTORE_NOENCRYPT;
Ensure the restoredDbAlias differs from the encrypted-backup alias name.
I tested with Db2 v11.1.4.6 (db2Version1113 in the API) with gcc 9.3.
I also tested with Db2 v11.5 (db2Version11500 in the API) with gcc 9.3.
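If you are unsure which db2VersionNNNN constant matches your server, a quick way to check the installed version and fixpack is the db2level command, run on the Db2 server as the instance owner:
$ db2level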

Subject does not have subject-level compatibility configured

We use Kafka, Kafka connect and Schema-registry in our stack. Version is 2.8.1(Confluent 6.2.1).
We use Kafka connect's configs(key.converter and value.converter) with value: io.confluent.connect.avro.AvroConverter.
It registers a new schema for topics automatically. But there's an issue: AvroConverter doesn't set subject-level compatibility for a new schema,
and this error appears when we try to get the config for the subject via the REST API /config: Subject 'schema-value' does not have subject-level compatibility configured
If we specify the request parameter defaultToGlobal, the global compatibility is returned. But that doesn't work for us because we cannot add it to the request: we are using a third-party UI, AKHQ.
How can I specify subject-level compatibility when registering a new schema via AvroConverter?
Last I checked, the only properties that can be provided to any of the Avro serializer configs that affect the Registry HTTP client are the url, whether to auto-register, and whether to use the latest schema version.
There's no property (or even method call) that sets either the subject-level or the global config during schema registration.
You're welcome to check out the source code to verify this.
"But it doesn't work for us because we cannot specify it in the request. We are using 3rd party UI: AKHQ"
That doesn't sound like a Connect problem. Create a PR for the AKHQ project to fix the request.
As of 2021-10-26, I am using the akhq 0.18.0 jar with confluent-6.2.0, and the schema registry in AKHQ works fine.
Note: with confluent-6.2.1 I see exactly the same error, so you may want to switch back to 6.2.0 to give it a try.
P.S.: I am using all of this only for my local dev env (VirtualBox, Ubuntu).
@OneCricketeer is correct.
Unfortunately, there is no way to specify subject-level compatibility in AvroConverter.
I see only two solutions:
Override AvroConverter to add a property and the functionality to send an additional request to the /config/{subject} API after registering the schema.
Contribute to AKHQ to support the defaultToGlobal parameter. But in this case we would also need to backport the schema-registry RestClient. GitHub issue
The second solution is preferable until the converter lets the user specify the compatibility level in its settings. Without such a setting in the native AvroConverter, we would have to use a custom converter for every client that writes a schema, which takes a lot of effort.
To me it seems strange that the client cannot set the compatibility at the moment of registering the schema and has to use a separate request for it.
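Until then, a workaround that needs no code changes is to set the subject-level compatibility yourself after the converter has registered the schema. A sketch against the registry's REST API, using the subject name from the error above:
# Set subject-level compatibility explicitly for the subject the converter created
curl -X PUT -H "Content-Type: application/vnd.schemaregistry.v1+json" \
--data '{"compatibility": "BACKWARD"}' \
http://localhost:8081/config/schema-value
# Or read the effective config, falling back to the global one
curl "http://localhost:8081/config/schema-value?defaultToGlobal=true"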

Kafka mongo db source connector not working

Hi, in my POC I am using both the sink and the source MongoDB connectors.
The sink connector works fine, but the source connector does not push data into the resulting topic. The objective is to push the full documents of all changes (insert and update) in a collection called 'request'.
Below is the code.
curl -X PUT http://localhost:8083/connectors/source-mongodb-request/config -H "Content-Type: application/json" -d '{
"tasks.max":1,
"connector.class":"com.mongodb.kafka.connect.MongoSourceConnector",
"key.converter":"org.apache.kafka.connect.storage.StringConverter",
"value.converter":"org.apache.kafka.connect.storage.StringConverter",
"connection.uri":"mongodb://localhost:27017",
"pipeline":"[]",
"database":"proj",
"publish.full.document.only":"true",
"collection":"request",
"topic.prefix": ""
}'
No messages are getting pushed to the proj.request topic. The topic gets created once I insert a record into the collection 'request'.
It would be great to get help on this, as it's a make-or-break task for the POC.
Things work fine with the connectors on Confluent Cloud, but it's the on-premise setup that I need to get working.
Make sure you have a valid pipeline, with the stages included in your connector configuration, such as this one (note that the change stream field to match on is operationType, and the quotes inside the pipeline string have to be escaped):
"pipeline": "[{\"$match\": {\"operationType\": {\"$in\": [\"insert\", \"update\", \"replace\"]}}}]",
Refer : https://docs.mongodb.com/manual/reference/operator/aggregation-pipeline/
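If the topic still stays empty, it is also worth checking that the source task is actually running rather than failed; the Kafka Connect REST API exposes the connector and task state (connector name taken from the question):
curl http://localhost:8083/connectors/source-mongodb-request/status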

Storing Avro schema in schema registry

I am using Confluent's JDBC connector to send data into Kafka in the Avro format. I need to store this schema in the schema registry, but I'm not sure what format it accepts. I've read the documentation here, but it doesn't mention much.
I have tried this (taking the Avro output and pasting it in - for one int and one string field):
curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" --data '{"type":"struct","fields":[{"type":"int64","optional":true,"field":"id"},{"type":"string","optional":true,"field":"serial"}],"optional":false,"name":"test"}' http://localhost:8081/subjects/view/versions
but I get the error: {"error_code":422,"message":"Unrecognized field: type"}
The schema that you give as a JSON should start with a 'schema' key. The actual schema that you provide will be the value of the key schema.
So your request should look like this:
curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" --data '{"schema" : "{\"type\":\"string\",\"fields\":[{\"type\":\"int64\",\"optional\":true,\"field\":\"id\"},{\"type\":\"string\",\"optional\":true,\"field\":\"serial\"}],\"optional\":false,\"name\":\"test\"}"}' http://localhost:8081/subjects/view/versions
I've made two other changes to the command:
I've escaped each double quote within the value of the schema key.
I've changed the struct data structure to string. I'm not sure why it isn't taking complex structures though.
Check out how they've modeled the schema here, for the first POST request described in the documentation.
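With a correct payload the registry replies with the id it assigned to the schema, for example:
{"id":1}
(the actual number depends on what is already registered in your registry).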
First, do you need to store the schema in advance? If you use the JDBC connector with the Avro converter (which is part of the schema registry package), the JDBC connector will figure out the schema of the table from the database and register it for you. You will need to specify the converter in your KafkaConnect config file. You can use this as an example: https://github.com/confluentinc/schema-registry/blob/master/config/connect-avro-standalone.properties
If you really want to register the schema yourself, there's some chance the issue is with the shell command - escaping JSON in shell is tricky. I installed Advanced Rest Client in Chrome and use that to work with the REST APIs of both schema registry and KafkaConnect.
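For reference, the converter lines in that example file look roughly like this (the registry URL is whatever your deployment uses):
key.converter=io.confluent.connect.avro.AvroConverter
key.converter.schema.registry.url=http://localhost:8081
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://localhost:8081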

Getting null.txt file when using Cygnus HDFS sink

I'm using Cygnus 0.5 with the default configuration for the HDFS sink. In order to make it run, I have deactivated the "ds" interceptor (otherwise I get an error at start time that prevents Cygnus from starting, related to not finding the matching table file).
Cygnus seems to work, but the file in which the entity information is stored in HDFS gets a weird name: "null.txt". How can I fix this?
First of all, do not deactivate the DestinationExtractor interceptor. This is the piece of code that infers the destination where the context data notified by Orion is going to be persisted. Note that the destination may refer to an HDFS file name, a MySQL table name or a CKAN resource name, depending on the sinks you have configured. Once inferred, the destination is added to the internal Flume event as a header called destination so that the sinks know where to persist the data. Thus, if the interceptor is deactivated, that header is not found by the sinks and a null name is used as the destination name.
Regarding the "matching table file not found" problem you experienced (and which led you to deactivate the interceptor), it was due to the Cygnus configuration template having a bad default value for the cygnusagent.sources.http-source.interceptors.de.matching_table parameter. This has been fixed in Cygnus 0.5.1.
A workaround until Cygnus 0.5.1 is released:
Do not deactivate the DestinationExtractor (as @frb says in his answer).
Create an empty matching table file and point the matching_table configuration at it, i.e.:
touch /tmp/dummy_table.conf
then set, in the Cygnus configuration file:
cygnusagent.sources.http-source.interceptors.de.matching_table = /tmp/dummy_table.conf