Conversion of JSON to Avro failed: Failed to convert JSON to Avro: Unknown union branch - apache-kafka

I am trying to send a JSON message to a Kafka topic through the Kafka REST service (kafka-rest), which should serialize the JSON as an Avro object. Kafka-rest rejects the message with the following error:
Conversion of JSON to Avro failed: Failed to convert JSON to Avro: Unknown union branch postId
I suspect that there is an issue with the Avro schema I am using as it is a nested record type with nullable fields.
Avro schema:
{
"type": "record",
"name": "ExportRequest",
"namespace": "com.example.avro.model",
"fields": [
{
"name": "context",
"type": {
"type": "map",
"values": {
"type": "string",
"avro.java.string": "String"
},
"avro.java.string": "String"
}
},
{
"name": "exportInfo",
"type": {
"type": "record",
"name": "ExportInfo",
"fields": [
{
"name": "exportId",
"type": {
"type": "string",
"avro.java.string": "String"
}
},
{
"name": "exportType",
"type": {
"type": "string",
"avro.java.string": "String"
}
},
{
"name": "exportQuery",
"type": {
"type": "record",
"name": "ExportQuery",
"fields": [
{
"name": "postExport",
"type": [
"null",
{
"type": "record",
"name": "PostExport",
"fields": [
{
"name": "postId",
"type": {
"type": "string",
"avro.java.string": "String"
}
},
{
"name": "isCommentIncluded",
"type": "boolean"
}
]
}
],
"default": null
},
{
"name": "feedExport",
"type": [
"null",
{
"type": "record",
"name": "FeedExport",
"fields": [
{
"name": "accounts",
"type": {
"type": "array",
"items": {
"type": "string",
"avro.java.string": "String"
}
}
},
{
"name": "recordTypes",
"type": {
"type": "array",
"items": {
"type": "string",
"avro.java.string": "String"
}
}
},
{
"name": "actions",
"type": {
"type": "array",
"items": {
"type": "string",
"avro.java.string": "String"
}
}
},
{
"name": "contentTypes",
"type": {
"type": "array",
"items": {
"type": "string",
"avro.java.string": "String"
}
}
},
{
"name": "startDate",
"type": "long"
},
{
"name": "endDate",
"type": "long"
},
{
"name": "advancedSearch",
"type": [
"null",
{
"type": "record",
"name": "AdvancedSearchExport",
"fields": [
{
"name": "allOfTheWords",
"type": {
"type": "array",
"items": {
"type": "string",
"avro.java.string": "String"
}
}
},
{
"name": "anyOfTheWords",
"type": {
"type": "array",
"items": {
"type": "string",
"avro.java.string": "String"
}
}
},
{
"name": "noneOfTheWords",
"type": {
"type": "array",
"items": {
"type": "string",
"avro.java.string": "String"
}
}
},
{
"name": "hashtags",
"type": {
"type": "array",
"items": {
"type": "string",
"avro.java.string": "String"
}
}
},
{
"name": "keyword",
"type": {
"type": "string",
"avro.java.string": "String"
}
},
{
"name": "exactPhrase",
"type": {
"type": "string",
"avro.java.string": "String"
}
}
]
}
],
"default": null
}
]
}
],
"default": null
}
]
}
}
]
}
}
]
}
JSON message:
{
"context": {
"user_id": "1",
"group_id": "1",
"organization_id": "1"
},
"exportInfo": {
"exportId": "93874dd7-35d7-4f1f-8cf8-051c606d920b",
"exportType": "json",
"exportQuery": {
"postExport": {
"postId": "dd",
"isCommentIncluded": false
},
"feedExport": {
"accounts": [
"1677143852565319"
],
"recordTypes": [],
"actions": [],
"contentTypes": [],
"startDate": 0,
"endDate": 0,
"advancedSearch": {
"allOfTheWords": [
"string"
],
"anyOfTheWords": [
"string"
],
"noneOfTheWords": [
"string"
],
"hashtags": [
"string"
],
"keyword": "string",
"exactPhrase": "string"
}
}
}
}
}
I would appreciate it if someone could help me to understand what the issue is.

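Your Avro schema is fine and your JSON is well-formed.
The problem is that the JSON doesn't conform to Avro's JSON encoding spec: a non-null value of a union type must be wrapped in an object whose single key names the branch (for a record, its full name including the namespace).
Converted accordingly, your JSON looks like this: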
{
"context": {
"user_id": "1",
"group_id": "1",
"organization_id": "1"
},
"exportInfo": {
"exportId": "93874dd7-35d7-4f1f-8cf8-051c606d920b",
"exportType": "json",
"exportQuery": {
"postExport": {
"com.example.avro.model.PostExport": {
"postId": "dd",
"isCommentIncluded": false
}
},
"feedExport": {
"com.example.avro.model.FeedExport": {
"accounts": [
"1677143852565319"
],
"recordTypes": [],
"actions": [],
"contentTypes": [],
"startDate": 0,
"endDate": 0,
"advancedSearch": {
"com.example.avro.model.AdvancedSearchExport": {
"allOfTheWords": [
"string"
],
"anyOfTheWords": [
"string"
],
"noneOfTheWords": [
"string"
],
"hashtags": [
"string"
],
"keyword": "string",
"exactPhrase": "string"
}
}
}
}
}
}
}
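The wrapping shown above can also be applied mechanically. Below is a minimal sketch in plain Python (standard library only) that walks a schema and a plain JSON value together and wraps every non-null union value in its branch name. The helper names `branch_name` and `wrap_unions` are hypothetical. The sketch assumes that each union has exactly one non-null branch (as in the schema in the question), that named types are defined inline, and that all non-nullable fields are present.

```python
import json

def branch_name(schema, namespace):
    """Return the key that Avro's JSON encoding uses for a union branch."""
    if isinstance(schema, str):
        return schema
    kind = schema["type"]
    if kind in ("record", "enum", "fixed"):
        name = schema["name"]
        ns = schema.get("namespace", namespace)
        # Named types are keyed by their full name (namespace + name).
        return name if "." in name or not ns else f"{ns}.{name}"
    return kind  # "array", "map", or an annotated primitive such as "string"

def wrap_unions(schema, value, namespace=None):
    """Convert plain JSON into Avro's JSON encoding (hypothetical helper)."""
    if isinstance(schema, list):  # a union
        if value is None:
            return None
        # Assumption: exactly one non-null branch, so no branch matching is needed.
        branch = next(s for s in schema if s != "null")
        return {branch_name(branch, namespace): wrap_unions(branch, value, namespace)}
    if isinstance(schema, str):
        return value
    kind = schema["type"]
    if kind == "record":
        ns = schema.get("namespace", namespace)  # nested records inherit the namespace
        return {f["name"]: wrap_unions(f["type"], value.get(f["name"]), ns)
                for f in schema["fields"]}
    if kind == "array":
        return [wrap_unions(schema["items"], v, namespace) for v in value]
    if kind == "map":
        return {k: wrap_unions(schema["values"], v, namespace) for k, v in value.items()}
    return value

# A trimmed-down version of the schema from the question.
schema = {
    "type": "record", "name": "ExportQuery", "namespace": "com.example.avro.model",
    "fields": [{
        "name": "postExport",
        "type": ["null", {
            "type": "record", "name": "PostExport",
            "fields": [{"name": "postId", "type": "string"},
                       {"name": "isCommentIncluded", "type": "boolean"}],
        }],
        "default": None,
    }],
}
plain = {"postExport": {"postId": "dd", "isCommentIncluded": False}}
print(json.dumps(wrap_unions(schema, plain), indent=2))
```

For unions with several non-null branches, the helper would need to decide which branch a value belongs to, which this sketch deliberately leaves out.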


How is the value NULL?

I am running a query that returns user stories that haven't been changed (changedate) in the last day.
I'm following this article to build the logic app, as the intention is similar.
For some reason, despite the query returning a valid response (at least 1 user story result), the foreach expression is throwing this error:
ExpressionEvaluationFailed. The execution of template action 'For_each' failed: the result of the evaluation of 'foreach' expression '@body('Parse_JSON')?['body']?['value']' is of type 'Null'. The result must be a valid array.
How is it NULL when clearly there is a user story returned?
Get query results:
OUTPUTS:
[
{
"System.Id": 12345,
"System.WorkItemType": "User Story",
"System.State": "New",
"System.Title": "Experiment"
}
]
Parse JSON:
Inputs:
Content:
{
"value": [
{
"System.Id": 12345,
"System.WorkItemType": "User Story",
"System.State": "New",
"System.Title": "Experiment"
}
],
"#odata.nextLink": null
}
Schema
{
"type": "object",
"properties": {
"body": {
"type": "object",
"properties": {
"value": {
"type": "array",
"items": {
"type": "object",
"properties": {
"System.AssignedTo": {
"type": "string"
},
"System.Id": {
"type": "integer"
},
"System.State": {
"type": "string"
},
"System.Tags": {
"type": "string"
},
"System.Title": {
"type": "string"
},
"System.WorkItemType": {
"type": "string"
}
},
"required": [
"System.Id",
"System.WorkItemType",
"System.State",
"System.AssignedTo",
"System.Title"
]
}
},
"#odata.nextLink": {}
}
},
"headers": {
"type": "object",
"properties": {
"Cache-Control": {
"type": "string"
},
"Content-Length": {
"type": "string"
},
"Content-Type": {
"type": "string"
},
"Date": {
"type": "string"
},
"Expires": {
"type": "string"
},
"Pragma": {
"type": "string"
},
"Set-Cookie": {
"type": "string"
},
"Strict-Transport-Security": {
"type": "string"
},
"Timing-Allow-Origin": {
"type": "string"
},
"Transfer-Encoding": {
"type": "string"
},
"Vary": {
"type": "string"
},
"X-Content-Type-Options": {
"type": "string"
},
"X-Frame-Options": {
"type": "string"
},
"x-ms-apihub-cached-response": {
"type": "string"
},
"x-ms-apihub-obo": {
"type": "string"
},
"x-ms-request-id": {
"type": "string"
}
}
},
"statusCode": {
"type": "integer"
}
}
}
Outputs:
{
"value": [
{
"System.Id": 12345,
"System.WorkItemType": "User Story",
"System.State": "New",
"System.Title": "Experiment"
}
],
"#odata.nextLink": null
}
Using the Value from the Get Query Results directly works.
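Comparing the expression in the error message with the Parse JSON output above suggests a possible reason: the expression looks up `body` first and then `value`, while the output shown has `value` at the top level. The following minimal Python sketch imitates that lookup. The helper `safe_get` is hypothetical and only mimics the `?[...]` operator, which yields null instead of failing when a property is missing; the data is copied from the output in the question.

```python
def safe_get(obj, *keys):
    """Imitate chained ?['key'] lookups: return None as soon as a key is missing."""
    for key in keys:
        if not isinstance(obj, dict) or key not in obj:
            return None
        obj = obj[key]
    return obj

# Parse JSON output as shown in the question (only the "value" property).
parse_json_output = {
    "value": [
        {
            "System.Id": 12345,
            "System.WorkItemType": "User Story",
            "System.State": "New",
            "System.Title": "Experiment",
        }
    ]
}

with_body = safe_get(parse_json_output, "body", "value")   # path used by the For each
without_body = safe_get(parse_json_output, "value")        # path that matches the output

print(with_body)
print(len(without_body))
```

If this matches the actual run, the For each input would need to reference `value` directly instead of going through `body`. This is an inference from the outputs shown, not a confirmed fix.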

AWS-API gateway -- jsonschema child object should validate when parent object exists

I need to create a JSON Schema for the following JSON input. The properties under Vehicle (Manufacturer, Model, etc.) should be required only when the Vehicle object exists.
{
"Manufacturer": "",
"Characteristics": {
"Starts": "new",
"vehicle": {
"Manufacturer": "hello",
"Model": "hh",
"Opening": "",
"Quantity": "",
"Principle": "",
"Type": ""
}
}
}
I tried the following JSON Schema. It works when the Vehicle object is absent, but if Vehicle is renamed to something else (for example, Vehicle1), no error is reported. Please guide me on how to fix this.
{
"$schema": "http://json-schema.org/draft-07/schema",
"type": "object",
"properties": {
"Manufacturer": {
"type": [
"string",
"null"
]
},
"Characteristics": {
"type": "object",
"properties": {
"Starts": {
"type": [
"string",
"null"
]
},
"Vehicle": {
"$ref": "#/definitions/Vehicle"
}
},
"required": [
"Starts", "Vehcle"
]
}
},
"required": [
"Manufacturer"
],
"definitions": {
"Vehicle": {
"type": "object",
"properties": {
"Manufacturer": {
"type": [
"string",
"null"
]
},
"Model": {
"type": [
"string",
"null"
]
},
"Opening": {
"type": [
"string",
"null"
]
},
"PanelQuantity": {
"type": [
"string",
"null"
]
},
"Principle": {
"type": [
"string",
"null"
]
},
"Type": {
"type": [
"string",
"null"
]
}
},
"required": ["Manufacturer", "Model", "Opening", "Quantity", "Principle", "Type"]
}
}
}
Thanks,
Bhaskar
Sounds like you want to add "additionalProperties": false next to the relevant "properties" keyword (here, under Characteristics). It will generate an error if any properties are present that aren't defined under properties.
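To illustrate the effect, here is a minimal Python sketch that mimics only two JSON Schema keywords, `required` and `additionalProperties: false`. It is not a real validator, and the helper `check_object` is hypothetical. The schema fragment is a simplified stand-in for the Characteristics object from the question, with Vehicle left optional.

```python
def check_object(instance, schema):
    """Check only `required` and `additionalProperties: false` (hypothetical helper)."""
    errors = []
    for name in schema.get("required", []):
        if name not in instance:
            errors.append(f"missing required property: {name}")
    if schema.get("additionalProperties") is False:
        allowed = schema.get("properties", {})
        for name in instance:
            if name not in allowed:
                errors.append(f"unexpected property: {name}")
    return errors

# Simplified stand-in for the "Characteristics" part of the schema.
characteristics_schema = {
    "type": "object",
    "properties": {"Starts": {}, "Vehicle": {}},
    "required": ["Starts"],
    "additionalProperties": False,
}

ok = check_object({"Starts": "new", "Vehicle": {}}, characteristics_schema)
renamed = check_object({"Starts": "new", "Vehicle1": {}}, characteristics_schema)
absent = check_object({"Starts": "new"}, characteristics_schema)

print(ok)
print(renamed)
print(absent)
```

With Vehicle left out of `required`, the object stays optional, while the `required` list inside the Vehicle definition still applies whenever the object is present.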

Getting error on null and empty string while copying a csv file from blob container to Azure SQL DB

I tried every combination of data types for my data, but each time my Data Factory pipeline gives me this error:
{
"errorCode": "2200",
"message": "ErrorCode=UserErrorColumnNameNotAllowNull,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Empty or Null string found in Column Name 2. Please make sure column name not null and try again.,Source=Microsoft.DataTransfer.Common,'",
"failureType": "UserError",
"target": "xxx",
"details": []
}
My Copy data source code is something like this:
{
"name": "xxx",
"description": "uuu",
"type": "Copy",
"dependsOn": [],
"policy": {
"timeout": "7.00:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"userProperties": [],
"typeProperties": {
"source": {
"type": "DelimitedTextSource",
"storeSettings": {
"type": "AzureBlobStorageReadSettings",
"recursive": true,
"wildcardFileName": "*"
},
"formatSettings": {
"type": "DelimitedTextReadSettings"
}
},
"sink": {
"type": "AzureSqlSink"
},
"enableStaging": false,
"translator": {
"type": "TabularTranslator",
"mappings": [
{
"source": {
"name": "populationId",
"type": "Guid"
},
"sink": {
"name": "PopulationID",
"type": "String"
}
},
{
"source": {
"name": "inputTime",
"type": "DateTime"
},
"sink": {
"name": "inputTime",
"type": "DateTime"
}
},
{
"source": {
"name": "inputCount",
"type": "Decimal"
},
"sink": {
"name": "inputCount",
"type": "Decimal"
}
},
{
"source": {
"name": "inputBiomass",
"type": "Decimal"
},
"sink": {
"name": "inputBiomass",
"type": "Decimal"
}
},
{
"source": {
"name": "inputNumber",
"type": "Decimal"
},
"sink": {
"name": "inputNumber",
"type": "Decimal"
}
},
{
"source": {
"name": "utcOffset",
"type": "String"
},
"sink": {
"name": "utcOffset",
"type": "Int32"
}
},
{
"source": {
"name": "fishGroupName",
"type": "String"
},
"sink": {
"name": "fishgroupname",
"type": "String"
}
},
{
"source": {
"name": "yearClass",
"type": "String"
},
"sink": {
"name": "yearclass",
"type": "String"
}
}
]
}
},
"inputs": [
{
"referenceName": "DelimitedTextFTDimensions",
"type": "DatasetReference"
}
],
"outputs": [
{
"referenceName": "AzureSqlTable1",
"type": "DatasetReference"
}
]
}
Can anyone please help me understand the issue? Some blogs suggest using treatnullasempty, but I am not allowed to modify the JSON. Is there another way to handle this?
I suggest using a Data Flow Derived Column transformation, which lets you build an expression that replaces null values in a column.
For example:
In the Derived Column, if Column_2 is null, return 'dd':
iifNull(Column_2,'dd')
Mapping the column
Reference: Data transformation expressions in mapping data flow
Hope this helps.
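For readers who want to see the intent of `iifNull(Column_2,'dd')` outside of Data Flow, here is a minimal Python sketch of the same replacement applied to CSV rows. The helpers `iif_null` and `fill_nulls` and the sample data are made up for illustration, and treating an empty CSV cell as null is an assumption of this sketch, not something Data Factory is claimed to do.

```python
import csv
import io

def iif_null(value, default):
    """Return `value` unless it is None, mirroring the intent of iifNull."""
    return default if value is None else value

def fill_nulls(rows, column, default):
    """Replace missing values in `column`; empty CSV cells count as null here."""
    for row in rows:
        cell = row.get(column)
        row[column] = iif_null(cell if cell != "" else None, default)
    return rows

# Made-up sample file with one empty cell in Column_2.
sample = "Column_1,Column_2\na,\nb,x\n"
rows = fill_nulls(list(csv.DictReader(io.StringIO(sample))), "Column_2", "dd")
print(rows)
```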
Fixed it. It was an easy fix: one of the columns in my destination table was marked as NOT NULL. I changed it to allow NULL and it worked.

How to create a stream in KSQL from a topic with a decimal type column

I want to create a stream from a Kafka topic that monitors a MySQL table. The MySQL table has columns of type decimal(16,4), and when I create the stream with this command:
create stream test with (KAFKA_TOPIC='dbServer.Kafka.DailyUdr',VALUE_FORMAT='AVRO');
the stream is created and runs, but the columns of type decimal(16,4) don't appear in the resulting stream.
Source topic value schema:
{
"type": "record",
"name": "Envelope",
"namespace": "dbServer.Kafka.DailyUdr",
"fields": [
{
"name": "before",
"type": [
"null",
{
"type": "record",
"name": "Value",
"fields": [
{
"name": "UserId",
"type": "int"
},
{
"name": "NationalCode",
"type": "string"
},
{
"name": "TotalInputOcted",
"type": "int"
},
{
"name": "TotalOutputOcted",
"type": "int"
},
{
"name": "Date",
"type": "string"
},
{
"name": "Service",
"type": "string"
},
{
"name": "decimalCol",
"type": [
"null",
{
"type": "bytes",
"scale": 4,
"precision": 16,
"connect.version": 1,
"connect.parameters": {
"scale": "4",
"connect.decimal.precision": "16"
},
"connect.name": "org.apache.kafka.connect.data.Decimal",
"logicalType": "decimal"
}
],
"default": null
}
],
"connect.name": "dbServer.Kafka.DailyUdr.Value"
}
],
"default": null
},
{
"name": "after",
"type": [
"null",
"Value"
],
"default": null
},
{
"name": "source",
"type": {
"type": "record",
"name": "Source",
"namespace": "io.debezium.connector.mysql",
"fields": [
{
"name": "version",
"type": [
"null",
"string"
],
"default": null
},
{
"name": "connector",
"type": [
"null",
"string"
],
"default": null
},
{
"name": "name",
"type": "string"
},
{
"name": "server_id",
"type": "long"
},
{
"name": "ts_sec",
"type": "long"
},
{
"name": "gtid",
"type": [
"null",
"string"
],
"default": null
},
{
"name": "file",
"type": "string"
},
{
"name": "pos",
"type": "long"
},
{
"name": "row",
"type": "int"
},
{
"name": "snapshot",
"type": [
{
"type": "boolean",
"connect.default": false
},
"null"
],
"default": false
},
{
"name": "thread",
"type": [
"null",
"long"
],
"default": null
},
{
"name": "db",
"type": [
"null",
"string"
],
"default": null
},
{
"name": "table",
"type": [
"null",
"string"
],
"default": null
},
{
"name": "query",
"type": [
"null",
"string"
],
"default": null
}
],
"connect.name": "io.debezium.connector.mysql.Source"
}
},
{
"name": "op",
"type": "string"
},
{
"name": "ts_ms",
"type": [
"null",
"long"
],
"default": null
}
],
"connect.name": "dbServer.Kafka.DailyUdr.Envelope"
}
My problem is with the decimalCol column.
KSQL does not yet support the DECIMAL data type.
There is an issue here that you can track and upvote if you think it would be useful.
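For context on what the column contains: in the schema above, decimalCol is an Avro `bytes` field with `logicalType` `decimal`, precision 16 and scale 4. Per the Avro specification, such a value holds the two's-complement big-endian representation of the unscaled integer. A consumer that reads the raw bytes itself could decode them as in this minimal Python sketch. The function names and the sample bytes are made up for illustration.

```python
from decimal import Decimal

def decode_avro_decimal(raw: bytes, scale: int) -> Decimal:
    """Decode an Avro decimal: big-endian two's-complement unscaled integer."""
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    return Decimal(unscaled).scaleb(-scale)

def encode_avro_decimal(value: Decimal, scale: int) -> bytes:
    """Inverse of decode_avro_decimal, used here only to build sample input."""
    unscaled = int(value.scaleb(scale))
    length = (unscaled.bit_length() + 8) // 8  # leaves room for the sign bit
    return unscaled.to_bytes(length, byteorder="big", signed=True)

# Scale 4 matches the decimal(16,4) column from the question.
raw = encode_avro_decimal(Decimal("123.4567"), 4)
print(raw.hex())
print(decode_avro_decimal(raw, 4))
```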

avro.io.AvroTypeException: The datum [object] is not an example of the schema

I have been struggling with this issue for quite some time. I am working with AvroProducer (Confluent Kafka) and getting an error related to the schema I defined.
Here is the complete stacktrace of the issue I am getting:
raise AvroTypeException(self.writer_schema, datum)
avro.io.AvroTypeException: The datum {'totalDifficulty': 2726165051, 'stateRoot': '0xf09bd6730b3ae7f5728836564837d7f776a8f7333628c8b84cb57d7c6d48ebba', 'sha3Uncles': '0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347', 'size': 538, 'logs': [], 'gasLimit': 8000000, 'mixHash': '0x410b2b19519be16496727c93515f399072ffecf06defe4913d00eb4d10bb7351', 'logsBloom': '0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000', 'nonce': '0x18dc6c0d30839c91', 'proofOfAuthorityData': '0xd883010817846765746888676f312e31302e34856c696e7578', 'number': 5414, 'timestamp': 1552577641, 'difficulty': 589091, 'gasUsed': 0, 'miner': '0x48FA5EBc2f0D82B5D52faAe624Fa2426998ab492', 'hash': '0x71259991acb407a85befa8b3c5df26a94a11a6c08f92f3e3b7c9c0e8e1f5916d', 'transactionsRoot': '0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421', 'receiptsRoot': '0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421', 'transactions': [], 'parentHash': '0x9f0c25eeab86fc144296cb034c94857beed331936016d60c0986a35ac07d9c68', 'uncles': []} is not an example of the schema {
"type": "record",
"name": "value",
"namespace": "exporter.value.opsnetBlock",
"fields": [
{
"type": "int",
"name": "difficulty"
},
{
"type": "string",
"name": "proofOfAuthorityData"
},
{
"type": "int",
"name": "gasLimit"
},
{
"type": "int",
"name": "gasUsed"
},
{
"type": "string",
"name": "hash"
},
{
"type": "string",
"name": "logsBloom"
},
{
"type": "int",
"name": "size"
},
{
"type": "string",
"name": "miner"
},
{
"type": "string",
"name": "mixHash"
},
{
"type": "string",
"name": "nonce"
},
{
"type": "int",
"name": "number"
},
{
"type": "string",
"name": "parentHash"
},
{
"type": "string",
"name": "receiptsRoot"
},
{
"type": "string",
"name": "sha3Uncles"
},
{
"type": "string",
"name": "stateRoot"
},
{
"type": "int",
"name": "timestamp"
},
{
"type": "int",
"name": "totalDifficulty"
},
{
"type": "string",
"name": "transactionsRoot"
},
{
"type": {
"type": "array",
"items": "string"
},
"name": "transactions"
},
{
"type": {
"type": "array",
"items": "string"
},
"name": "uncles"
},
{
"type": {
"type": "array",
"items": {
"type": "record",
"name": "Child",
"namespace": "exporter.value.opsnetBlock",
"fields": [
{
"type": "string",
"name": "address"
},
{
"type": "string",
"name": "blockHash"
},
{
"type": "int",
"name": "blockNumber"
},
{
"type": "string",
"name": "data"
},
{
"type": "int",
"name": "logIndex"
},
{
"type": "boolean",
"name": "removed"
},
{
"type": {
"type": "array",
"items": "string"
},
"name": "topics"
},
{
"type": "string",
"name": "transactionHash"
},
{
"type": "int",
"name": "transactionIndex"
}
]
}
},
"name": "logs"
}
]
}
Can anybody please tell me where I am going wrong?
Thanks in advance
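One way to narrow down an error like this is to compare each field of the datum against the primitive type the schema declares for it. The sketch below is plain Python with hypothetical helpers `fits` and `find_mismatches`. It only checks top-level `int`, `long` and `string` fields, and relies on the fact that an Avro `int` is a 32-bit signed integer while `long` is 64-bit. The schema and datum are trimmed to a few fields copied from the question.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1
LONG_MIN, LONG_MAX = -2**63, 2**63 - 1

def fits(avro_type, value):
    """Return True if `value` is acceptable for a primitive Avro type."""
    is_int = isinstance(value, int) and not isinstance(value, bool)
    if avro_type == "int":
        return is_int and INT_MIN <= value <= INT_MAX
    if avro_type == "long":
        return is_int and LONG_MIN <= value <= LONG_MAX
    if avro_type == "string":
        return isinstance(value, str)
    return True  # other types are not checked in this sketch

def find_mismatches(schema, datum):
    """List the top-level fields whose values do not fit the declared type."""
    problems = []
    for field in schema["fields"]:
        name, avro_type = field["name"], field["type"]
        if name not in datum:
            problems.append(f"{name}: missing")
        elif not fits(avro_type, datum[name]):
            problems.append(f"{name}: {datum[name]!r} is not a valid {avro_type}")
    return problems

# A few fields copied from the schema and datum in the question.
schema = {
    "type": "record",
    "name": "value",
    "fields": [
        {"type": "int", "name": "difficulty"},
        {"type": "int", "name": "gasLimit"},
        {"type": "int", "name": "totalDifficulty"},
        {"type": "string", "name": "hash"},
    ],
}
datum = {
    "difficulty": 589091,
    "gasLimit": 8000000,
    "totalDifficulty": 2726165051,
    "hash": "0x71259991acb407a85befa8b3c5df26a94a11a6c08f92f3e3b7c9c0e8e1f5916d",
}

problems = find_mismatches(schema, datum)
print(problems)
```

If a value turns out to exceed the 32-bit range, declaring that field as `long` would be the usual remedy. This is an inference from the values shown, not something confirmed in the thread.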