How to use the Confluent Schema Registry and Avro serializer in a Spring application with JSON data input - apache-kafka

Currently, I have used the Spring Cloud Stream GitHub example, but I do not know how to feed in my existing JSON data instead of manually constructing the typed objects. I can infer the Avro schema from the JSON data using a tool. However, the problem is that I do not want to use POJOs generated from the Avro schema; I want to use the existing JSON data directly. I am also confused about the application/json part when I am using curl -X POST. Maybe there is a way to feed the data in the HTTP request (by adding annotations in the send-message part)? Also, please explain @RequestMapping, @EnableBinding, and @StreamListener, and when to use them.

To start off, you'd have to define a producer using the KafkaAvroSerializer rather than a StringSerializer or a JSON one; see the sketch below.
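One way to picture that is a minimal sketch with a plain Kafka producer; the broker address, Schema Registry URL, and topic name are placeholders, and in Spring Cloud Stream the same serializer would be set through the Kafka binder's producer configuration instead.

import java.util.Properties;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

class AvroProducerSketch {
    static Producer<String, GenericRecord> buildProducer() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // placeholder broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "io.confluent.kafka.serializers.KafkaAvroSerializer");          // Avro + Schema Registry
        props.put("schema.registry.url", "http://localhost:8081");              // placeholder registry URL
        return new KafkaProducer<>(props);
    }

    static void send(Producer<String, GenericRecord> producer, GenericRecord record) {
        producer.send(new ProducerRecord<>("payments", record));                // placeholder topic
    }
}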
from the existing json data
You'd use a JSON library like Jackson or Gson to parse that JSON into a POJO, for example:
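A tiny sketch of that, assuming a hypothetical Payment POJO whose fields match the JSON:

import com.fasterxml.jackson.databind.ObjectMapper;

// Hypothetical POJO matching the JSON fields; use your own class here.
class Payment {
    public String id;
    public double amount;
}

class JsonToPojoSketch {
    public static void main(String[] args) throws Exception {
        String json = "{\"id\":\"p-1\",\"amount\":10.5}";
        ObjectMapper mapper = new ObjectMapper();
        Payment payment = mapper.readValue(json, Payment.class);  // JSON -> POJO
        System.out.println(payment.id);
    }
}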
the problem is I do not want to use POJOs that is inferred from class
A POJO is a class that you define yourself; it is not inferred.
using avro schema instead I want to use the existing json data.
JSON and Avro are different formats. You'd have to use some tool to translate between them, or manually parse the JSON yourself and create an Avro record, as in the sketch below.
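A minimal sketch of the manual route, reusing the hypothetical Payment example (in practice you would load the schema that is registered in the Schema Registry rather than the inline string):

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;

class JsonToAvroSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical Avro schema for the Payment example.
        Schema schema = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"Payment\",\"fields\":["
              + "{\"name\":\"id\",\"type\":\"string\"},"
              + "{\"name\":\"amount\",\"type\":\"double\"}]}");

        // Parse the existing JSON and copy its fields into a GenericRecord.
        JsonNode json = new ObjectMapper().readTree("{\"id\":\"p-1\",\"amount\":10.5}");
        GenericRecord record = new GenericData.Record(schema);
        record.put("id", json.get("id").asText());
        record.put("amount", json.get("amount").asDouble());
        // 'record' can now be sent with the KafkaAvroSerializer shown earlier.
    }
}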
I am also confused about the application/json part, when I am using curl -X POST, maybe is there a way to feed the data in http request(add annotations in the send message part
Yes, headers define extra metadata in the request
curl -H 'Content-Type: application/json'
@StreamListener, when to use them
When you're consuming events, not sending them. A rough sketch of both the sending and consuming side is below.
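To show where the three annotations fit together, here is a rough sketch using the classic annotation-based Spring Cloud Stream model; the channel binding, the /payments path, and the Payment type are assumptions, and newer Spring Cloud Stream releases favour the functional programming model instead.

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.cloud.stream.annotation.StreamListener;
import org.springframework.cloud.stream.messaging.Processor;
import org.springframework.messaging.support.MessageBuilder;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@EnableBinding(Processor.class)   // binds the input/output channels to Kafka topics
@RestController
@RequestMapping("/payments")      // maps HTTP requests to this controller
public class PaymentEndpoint {

    @Autowired
    private Processor processor;

    // Receives the JSON posted with curl (Content-Type: application/json)
    // and forwards it to the output channel, where the configured
    // serializer takes over.
    @PostMapping
    public void publish(@RequestBody Payment payment) {
        processor.output().send(MessageBuilder.withPayload(payment).build());
    }

    // Invoked for every record arriving on the input channel.
    @StreamListener(Processor.INPUT)
    public void consume(Payment payment) {
        System.out.println("Received: " + payment);
    }
}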

Related

Why AVRO for Kafka

Java can serialize anything when sending it over the network, so why did they create a brand new framework (Avro) for Kafka instead of just serializing regular JSON?
This article explains very well why serialization is required and where Avro comes into play. It also explains how Avro differs from textual serialization formats like JSON, XML, CSV, etc.
https://devtechfactory.com/blogs/kafka-producer-publish-message-with-avro-schema

Conversion of JSON to Java Pojo using MapStruct

How can I convert an incoming JSON object structure to a Java POJO using MapStruct?
The structure of the incoming JSON response object might differ depending on the configuration.
Thanks.
For the moment, MapStruct doesn't have this functionality. I know that they are working on mapping from an object to JSON; maybe they are also working on the opposite conversion.

Kafka Source Connector - How to pass the schema from String (json)

I've developed a custom source connector for an external REST service.
I get JSONs, convert them to org.apache.kafka.connect.data.Struct with manually defined schema (SchemaBuilder) and wrap all this to SourceRecord.
All of this is for one entity only, but there are a dozen of them.
My new goal is to make this connector universal and parametrize the schema. The idea is to get the schema as String (json) from configs or external files and pass it to SourceRecord, but it only accepts Schema objects.
Are there any simple/good ways to convert a String/JSON schema to a Schema object, or even pass the String schema directly?
There is a JSON-to-Avro converter; however, if you are already building a Struct/Schema combination, then you shouldn't need to do anything, as the Converter classes in Kafka Connect handle the conversion for you (a sketch follows the link below).
https://www.confluent.io/blog/kafka-connect-deep-dive-converters-serialization-explained/
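For illustration, a minimal sketch of the Struct/Schema combination mentioned above; the entity name, fields, topic, and partition/offset maps are placeholders, and the connector's configured value converter (e.g. Confluent's AvroConverter) then serializes the Struct without any extra code:

import java.util.Collections;
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaBuilder;
import org.apache.kafka.connect.data.Struct;
import org.apache.kafka.connect.source.SourceRecord;

class StructSchemaSketch {
    static SourceRecord buildRecord() {
        // Manually defined Connect schema for one entity.
        Schema valueSchema = SchemaBuilder.struct().name("Entity")
                .field("id", Schema.STRING_SCHEMA)
                .field("amount", Schema.FLOAT64_SCHEMA)
                .build();

        // Fill a Struct from the parsed JSON of the REST response.
        Struct value = new Struct(valueSchema)
                .put("id", "e-1")
                .put("amount", 10.5);

        // The configured Converter handles serialization of schema + value.
        return new SourceRecord(
                Collections.singletonMap("source", "rest"),   // source partition
                Collections.singletonMap("offset", 0L),       // source offset
                "entities",                                    // topic name
                valueSchema,
                value);
    }
}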

google-cloud-datastore java client: Is there a way to infer schema and/or retrieve results as Json?

I am working on a Datastore data source for Apache Spark based on the Spark DataSource V2 API. I was able to implement it with a hard-coded single entity but couldn't generalize it. Either I need to infer the entity schema and translate the entity record into a Spark Row, or read the entity record as JSON and let the user translate it into a Scala product (the Datastore Java client is REST based, so the payload is pulled as JSON). I can see "entity.properties" as JSON key-values from within the IntelliJ debugger, which includes everything I need (column name, value, type, etc.), but I can't use entity.properties due to access restrictions. I'd appreciate any ideas.
Fixed by switching to the low-level API: https://github.com/GoogleCloudPlatform/google-cloud-datastore
Full source for the spark-datastore-connector: https://github.com/sgireddy/spark-datastore-connector

Kafka avro serialization with schema evolution

I am trying to build a Kafka pipeline which will read JSON input data into a Kafka topic.
I am using Avro serialization with the Schema Registry, as my schema changes on a regular basis.
As of now, GenericRecord is used to parse the schema.
But I recently came to know that avro-tools are available to read a schema and generate Java classes which can be used to create producer code.
I am confused about how to choose between these two options.
Can you please suggest which one is better, given that my schema changes frequently?
avro-tools are available to read schema and generate java classes which can be used to create Producer Code
They generate SpecificRecord Avro classes, not producer code. But regarding the question: both will work.
The way I see it
GenericRecord - Think of it as a HashMap<String, Object>. As a consumer, you need to know which fields to get. If, as a producer or schema creator, you are not able to ship your classes as a library to your consumers, this is essentially the best you can get. I believe you'll always be able to get the latest data, though: all possible fields can be accessed by a get("fieldname") call. See example here.
SpecificRecord (what avro-tools generates) - It is just a generated class with getter methods and builder objects / setter methods. Any consumer will be able to import your producer classes as a dependency, deserialize the message, and immediately know what fields are available. You are not guaranteed to get the latest schema here - you will be "downgraded" and limited to whatever schema was used to generate those classes. A rough sketch contrasting the two styles follows below.
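A rough sketch contrasting the two, assuming a hypothetical Payment schema; the SpecificRecord part is shown commented out because the Payment class only exists after avro-tools or the Maven plugin generates it:

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;

class RecordStyles {
    static void compare(Schema paymentSchema) {
        // GenericRecord: field access by name, like a typed map; the consumer
        // has to know (or look up) the field names.
        GenericRecord generic = new GenericData.Record(paymentSchema);
        generic.put("id", "p-1");
        Object id = generic.get("id");

        // SpecificRecord: a class generated from the schema, with typed
        // accessors and a builder.
        // Payment specific = Payment.newBuilder()
        //         .setId("p-1")
        //         .setAmount(10.5)
        //         .build();
        // String sameId = specific.getId();
    }
}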
I generally use the avro-maven-plugin to create the classes, just as in this example.
You could also use Avro reflection (ReflectData) to build an Avro schema from a Java class rather than the other way around. Annotations can be used on fields to set @Union or @AvroDefault settings, for example as sketched below.
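A small sketch of that reflection route, assuming a hypothetical User class; @Nullable stands in here for a simple union with null, and @Union works similarly when you need a union over several record classes:

import org.apache.avro.Schema;
import org.apache.avro.reflect.AvroDefault;
import org.apache.avro.reflect.Nullable;
import org.apache.avro.reflect.ReflectData;

// Hypothetical class whose fields drive the generated schema.
class User {
    String name;

    @Nullable                   // becomes a union with null
    String nickname;

    @AvroDefault("0")           // Avro default value, written as JSON
    int loginCount;
}

class ReflectSchemaSketch {
    public static void main(String[] args) {
        // Build an Avro schema from the Java class instead of generating the class from a schema.
        Schema schema = ReflectData.get().getSchema(User.class);
        System.out.println(schema.toString(true));
    }
}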
Further Reading about using the Confluent Schema Registry