Avro schema question: TypeError: unhashable type: 'dict'

I need to write an Avro schema for the following data. The exposure field is an array of arrays, each holding 3 numbers.
{
"Response": {
"status": "",
"responseDetail": {
"request_id": "Z618978.R",
"exposure": [
[
372,
20000000.0,
31567227140.238808
]
[
373,
480000000.0,
96567227140.238808
]
[
374,
23300000.0,
251567627149.238808
]
],
"product": "ABC",
}
}
}
So I came up with a schema like the following:
{
"name": "Response",
"type":{
"name": "algoResponseType",
"type": "record",
"fields":
[
{"name": "status", "type": ["null","string"]},
{
"name": "responseDetail",
"type": {
"name": "responseDetailType",
"type": "record",
"fields":
[
{"name": "request_id", "type": "string"},
{
"name": "exposure",
"type": {
"type": "array",
"items":
{
"name": "single_exposure",
"type": {
"type": "array",
"items": "string"
}
}
}
},
{"name": "product", "type": ["null","string"]}
]
}
}
]
}
}
When I tried to register the schema, I got the following error: TypeError: unhashable type: 'dict', which means a dict was used where a hashable value (for example a dictionary key) is required.
Traceback (most recent call last):
File "sa_publisher_main4test.py", line 28, in <module>
schema_registry_client)
File "/usr/local/lib64/python3.6/site-packages/confluent_kafka/schema_registry/avro.py", line 175, in __init__
parsed_schema = parse_schema(schema_dict)
File "fastavro/_schema.pyx", line 71, in fastavro._schema.parse_schema
File "fastavro/_schema.pyx", line 204, in fastavro._schema._parse_schema
TypeError: unhashable type: 'dict'
Can anyone help point out what is causing the error?

There are a few issues.
First, at the very top level of your schema, you have the following:
{
"name": "Response",
"type": {...}
}
But this isn't right. The top level should be a record type with a field called Response. So it should look like this:
{
"name": "Response",
"type": "record",
"fields": [
{
"name": "Response",
"type": {...}
}
]
}
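One plausible way to see where the TypeError comes from: a parser usually looks up the value of "type" in a set or dict of known type names, and a nested dict cannot be hashed for that lookup. The sketch below is a deliberately simplified stand-in, not fastavro's actual code; KNOWN_TYPES and lookup_type are hypothetical names. The same thing happens with the "items" wrapper discussed next, because its "type" is also a dict.

```python
# Simplified illustration only: NOT fastavro's real implementation.
KNOWN_TYPES = {"record", "array", "map", "string", "float", "double"}


def lookup_type(schema):
    """Return the schema's type name if it is one of the known names."""
    schema_type = schema["type"]
    # A membership test on a set hashes the operand; a dict is unhashable.
    if schema_type in KNOWN_TYPES:
        return schema_type
    raise ValueError(f"unknown type: {schema_type!r}")


good = {"name": "Response", "type": "record", "fields": []}
bad = {
    "name": "Response",
    "type": {"name": "algoResponseType", "type": "record", "fields": []},
}

print(lookup_type(good))
try:
    lookup_type(bad)
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)
```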
The second problem is that for the array of arrays, you currently have the following:
{
"name":"exposure",
"type":{
"type":"array",
"items":{
"name":"single_exposure",
"type":{
"type":"array",
"items":"string"
}
}
}
}
But instead it should look like this:
{
"name":"exposure",
"type":{
"type":"array",
"items":{
"type":"array",
"items":"string"
}
}
}
After fixing those, the schema can be parsed, but your data contains an array of arrays of numbers while your schema declares an array of arrays of strings. Therefore either the schema needs to be changed to a numeric type such as float, or the data needs to be strings.
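If you go the numeric route, keep in mind that Avro's float is a 32-bit IEEE 754 value while double is 64-bit, so large values like the ones in this message may not survive a float round trip. A minimal sketch using only the standard library struct module to mimic the two widths (it does not involve Avro itself, and roundtrip is a hypothetical helper):

```python
import struct


def roundtrip(value, fmt):
    """Pack and unpack a number using the given struct format."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


value = 31567227140.238808  # taken from the exposure data

as_float32 = roundtrip(value, "<f")  # same width as Avro "float"
as_float64 = roundtrip(value, "<d")  # same width as Avro "double"

print(as_float32 == value)  # False: 32 bits cannot hold this value exactly
print(as_float64 == value)  # True: 64 bits round-trips a Python float
```

Small whole numbers such as 372 are unaffected, so whether "float" is good enough depends on the precision you need.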
For reference, here's an example script that works after fixing those issues:
import fastavro
s = {
"name":"Response",
"type":"record",
"fields":[
{
"name":"Response",
"type": {
"name":"algoResponseType",
"type":"record",
"fields":[
{
"name":"status",
"type":[
"null",
"string"
]
},
{
"name":"responseDetail",
"type":{
"name":"responseDetailType",
"type":"record",
"fields":[
{
"name":"request_id",
"type":"string"
},
{
"name":"exposure",
"type":{
"type":"array",
"items":{
"type":"array",
"items":"string"
}
}
},
{
"name":"product",
"type":[
"null",
"string"
]
}
]
}
}
]
}
}
]
}
data = {
"Response":{
"status":"",
"responseDetail":{
"request_id":"Z618978.R",
"exposure":[
[
"372",
"20000000.0",
"31567227140.238808"
],
[
"373",
"480000000.0",
"96567227140.238808"
],
[
"374",
"23300000.0",
"251567627149.238808"
]
],
"product":"ABC"
}
}
}
parsed_schema = fastavro.parse_schema(s)
fastavro.validate(data, parsed_schema)
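If you would rather keep "items": "string" in the schema, the numeric rows have to be converted before validation. A minimal sketch, assuming the data arrives as nested lists of numbers (stringify_rows is a hypothetical helper):

```python
def stringify_rows(rows):
    """Convert every number in a list of lists to its string form."""
    return [[str(value) for value in row] for row in rows]


exposure = [
    [372, 20000000.0, 31567227140.238808],
    [373, 480000000.0, 96567227140.238808],
]

as_strings = stringify_rows(exposure)
print(as_strings)
```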

You get the error because your schema is not valid Avro, so it is rejected before it can be registered. The top-level element has to be a record with a "Response" field.
This schema should work. I changed the array item type, since your message contains floats, not strings.
{
"type": "record",
"name": "yourMessage",
"fields": [
{
"name": "Response",
"type": {
"name": "AlgoResponseType",
"type": "record",
"fields": [
{
"name": "status",
"type": [
"null",
"string"
]
},
{
"name": "responseDetail",
"type": {
"name": "ResponseDetailType",
"type": "record",
"fields": [
{
"name": "request_id",
"type": "string"
},
{
"name": "exposure",
"type": {
"type": "array",
"items": {
"type": "array",
"items": "float"
}
}
},
{
"name": "product",
"type": [
"null",
"string"
]
}
]
}
}
]
}
}
]
}
Your message is also not valid JSON, as array elements must be separated by commas.
{
"Response": {
"status": "",
"responseDetail": {
"request_id": "Z618978.R",
"exposure": [
[
372,
20000000.0,
31567227140.238808
],
[
373,
480000000.0,
96567227140.238808
],
[
374,
23300000.0,
251567627149.238808
]
],
"product": "ABC"
}
}
}
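A quick way to catch this kind of mistake is to run the text through a JSON parser. A minimal sketch with the standard library json module, using shortened strings rather than the full message (is_valid_json is a hypothetical helper); it also shows that strict JSON rejects a trailing comma after the last member:

```python
import json

broken = '{"exposure": [[372, 20000000.0] [373, 480000000.0]]}'
fixed = '{"exposure": [[372, 20000000.0], [373, 480000000.0]]}'
trailing = '{"product": "ABC",}'


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json(broken))    # False: missing comma between the arrays
print(is_valid_json(fixed))     # True
print(is_valid_json(trailing))  # False: trailing comma
```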
As you are using fastavro, I recommend running this code to check that your message conforms to the schema.
from fastavro.validation import validate
import json
# Load the Avro schema from disk
with open('schema.avsc', 'r') as schema_file:
    schema = json.loads(schema_file.read())
message = {
"Response": {
"status": "",
"responseDetail": {
"request_id": "Z618978.R",
"exposure": [
[
372,
20000000.0,
31567227140.238808
],
[
373,
480000000.0,
96567227140.238808
],
[
374,
23300000.0,
251567627149.238808
]
],
"product": "ABC",
}
}
}
# validate() raises an exception describing the mismatch if the message does not fit
try:
    validate(message, schema)
    print('Message matches the schema')
except Exception as ex:
    print(ex)

Related

AWS-API gateway -- jsonschema child object should validate when parent object exists

I need to create a JSON Schema for the following JSON input. The properties under Vehicle (Manufacturer, Model, etc.) should be required only when the Vehicle object exists.
{
"Manufacturer": "",
"Characteristics": {
"Starts": "new",
"vehicle": {
"Manufacturer": "hello",
"Model": "hh",
"Opening": "",
"Quantity": "",
"Principle": "",
"Type": ""
}
}
}
I tried the following JSON Schema. It works when the Vehicle object is absent, but if we rename Vehicle to something else, e.g. Vehicle1, it doesn't give an error. Please guide me on how to fix this.
{
"$schema": "http://json-schema.org/draft-07/schema",
"type": "object",
"properties": {
"Manufacturer": {
"type": [
"string",
"null"
]
},
"Characteristics": {
"type": "object",
"properties": {
"Starts": {
"type": [
"string",
"null"
]
},
"Vehicle": {
"$ref": "#/definitions/Vehicle"
}
},
"required": [
"Starts", "Vehcle"
]
}
},
"required": [
"Manufacturer"
],
"definitions": {
"Vehicle": {
"type": "object",
"properties": {
"Manufacturer": {
"type": [
"string",
"null"
]
},
"Model": {
"type": [
"string",
"null"
]
},
"Opening": {
"type": [
"string",
"null"
]
},
"PanelQuantity": {
"type": [
"string",
"null"
]
},
"Principle": {
"type": [
"string",
"null"
]
},
"Type": {
"type": [
"string",
"null"
]
}
},
"required": ["Manufacturer", "Model", "Opening", "Quantity", "Principle", "Type"]
}
}
}
Thanks,
Bhaskar
Sounds like you want to add "additionalProperties": false, which will generate an error if any properties are present that aren't defined under properties.
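To illustrate what that keyword does, here is a hand-rolled sketch of the check, not a real JSON Schema validator: it reports any key of an object that is not listed under properties. unexpected_properties is a hypothetical helper, and the schema is trimmed to the relevant part of Characteristics.

```python
def unexpected_properties(instance, schema):
    """List keys of instance that the schema's "properties" does not define.

    Mimics only the effect of "additionalProperties": false.
    """
    if schema.get("additionalProperties", True) is not False:
        return []  # extra properties are allowed by default
    allowed = set(schema.get("properties", {}))
    return sorted(key for key in instance if key not in allowed)


characteristics_schema = {
    "type": "object",
    "properties": {"Starts": {}, "Vehicle": {}},
    "additionalProperties": False,
}

renamed = {"Starts": "new", "Vehicle1": {}}
correct = {"Starts": "new", "Vehicle": {}}

print(unexpected_properties(renamed, characteristics_schema))  # ['Vehicle1']
print(unexpected_properties(correct, characteristics_schema))  # []
```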

Conversion of JSON to Avro failed: Failed to convert JSON to Avro: Unknown union branch

I am trying to send a JSON message to a Kafka topic using the Kafka-rest service to serialize the JSON as an Avro object, but Kafka-rest rejects the message with the following error:
Conversion of JSON to Avro failed: Failed to convert JSON to Avro: Unknown union branch postId
I suspect that there is an issue with the Avro schema I am using as it is a nested record type with nullable fields.
Avro schema:
{
"type": "record",
"name": "ExportRequest",
"namespace": "com.example.avro.model",
"fields": [
{
"name": "context",
"type": {
"type": "map",
"values": {
"type": "string",
"avro.java.string": "String"
},
"avro.java.string": "String"
}
},
{
"name": "exportInfo",
"type": {
"type": "record",
"name": "ExportInfo",
"fields": [
{
"name": "exportId",
"type": {
"type": "string",
"avro.java.string": "String"
}
},
{
"name": "exportType",
"type": {
"type": "string",
"avro.java.string": "String"
}
},
{
"name": "exportQuery",
"type": {
"type": "record",
"name": "ExportQuery",
"fields": [
{
"name": "postExport",
"type": [
"null",
{
"type": "record",
"name": "PostExport",
"fields": [
{
"name": "postId",
"type": {
"type": "string",
"avro.java.string": "String"
}
},
{
"name": "isCommentIncluded",
"type": "boolean"
}
]
}
],
"default": null
},
{
"name": "feedExport",
"type": [
"null",
{
"type": "record",
"name": "FeedExport",
"fields": [
{
"name": "accounts",
"type": {
"type": "array",
"items": {
"type": "string",
"avro.java.string": "String"
}
}
},
{
"name": "recordTypes",
"type": {
"type": "array",
"items": {
"type": "string",
"avro.java.string": "String"
}
}
},
{
"name": "actions",
"type": {
"type": "array",
"items": {
"type": "string",
"avro.java.string": "String"
}
}
},
{
"name": "contentTypes",
"type": {
"type": "array",
"items": {
"type": "string",
"avro.java.string": "String"
}
}
},
{
"name": "startDate",
"type": "long"
},
{
"name": "endDate",
"type": "long"
},
{
"name": "advancedSearch",
"type": [
"null",
{
"type": "record",
"name": "AdvancedSearchExport",
"fields": [
{
"name": "allOfTheWords",
"type": {
"type": "array",
"items": {
"type": "string",
"avro.java.string": "String"
}
}
},
{
"name": "anyOfTheWords",
"type": {
"type": "array",
"items": {
"type": "string",
"avro.java.string": "String"
}
}
},
{
"name": "noneOfTheWords",
"type": {
"type": "array",
"items": {
"type": "string",
"avro.java.string": "String"
}
}
},
{
"name": "hashtags",
"type": {
"type": "array",
"items": {
"type": "string",
"avro.java.string": "String"
}
}
},
{
"name": "keyword",
"type": {
"type": "string",
"avro.java.string": "String"
}
},
{
"name": "exactPhrase",
"type": {
"type": "string",
"avro.java.string": "String"
}
}
]
}
],
"default": null
}
]
}
],
"default": null
}
]
}
}
]
}
}
]
}
Json message:
{
"context": {
"user_id": "1",
"group_id": "1",
"organization_id": "1"
},
"exportInfo": {
"exportId": "93874dd7-35d7-4f1f-8cf8-051c606d920b",
"exportType": "json",
"exportQuery": {
"postExport": {
"postId": "dd",
"isCommentIncluded": false
},
"feedExport": {
"accounts": [
"1677143852565319"
],
"recordTypes": [],
"actions": [],
"contentTypes": [],
"startDate": 0,
"endDate": 0,
"advancedSearch": {
"allOfTheWords": [
"string"
],
"anyOfTheWords": [
"string"
],
"noneOfTheWords": [
"string"
],
"hashtags": [
"string"
],
"keyword": "string",
"exactPhrase": "string"
}
}
}
}
}
I would appreciate it if someone could help me to understand what the issue is.
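Both your JSON and your Avro schema look fine on their own.
You are facing the issue because the JSON doesn't conform to Avro's JSON encoding spec, which requires a non-null union value to be wrapped in an object keyed by the name of its type.
So, if you convert your JSON accordingly, it will look like this: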
{
"context": {
"user_id": "1",
"group_id": "1",
"organization_id": "1"
},
"exportInfo": {
"exportId": "93874dd7-35d7-4f1f-8cf8-051c606d920b",
"exportType": "json",
"exportQuery": {
"postExport": {
"com.example.avro.model.PostExport": {
"postId": "dd",
"isCommentIncluded": false
}
},
"feedExport": {
"com.example.avro.model.FeedExport": {
"accounts": [
"1677143852565319"
],
"recordTypes": [],
"actions": [],
"contentTypes": [],
"startDate": 0,
"endDate": 0,
"advancedSearch": {
"com.example.avro.model.AdvancedSearchExport": {
"allOfTheWords": [
"string"
],
"anyOfTheWords": [
"string"
],
"noneOfTheWords": [
"string"
],
"hashtags": [
"string"
],
"keyword": "string",
"exactPhrase": "string"
}
}
}
}
}
}
}
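The wrapping can also be done programmatically. The following is a rough sketch, assuming only ["null", X] unions as in the schema above and that every field is present in the data; full_name and to_avro_json are hypothetical helpers, not part of any Avro library, and named types referenced by name are not resolved.

```python
def full_name(schema, namespace):
    """Return the namespace-qualified name of a named Avro type."""
    name = schema["name"]
    if "." in name:
        return name
    namespace = schema.get("namespace", namespace)
    return f"{namespace}.{name}" if namespace else name


def to_avro_json(value, schema, namespace=None):
    """Rewrite plain JSON data into Avro's JSON encoding for the schema."""
    if isinstance(schema, list):  # a union
        if value is None:
            return None
        branches = [branch for branch in schema if branch != "null"]
        if len(branches) != 1:
            raise ValueError("this sketch only handles [null, X] unions")
        branch = branches[0]
        if isinstance(branch, str):
            label = branch
        elif branch["type"] in ("record", "enum", "fixed"):
            label = full_name(branch, namespace)
        else:
            label = branch["type"]
        # Non-null union values become a single-key object.
        return {label: to_avro_json(value, branch, namespace)}
    if isinstance(schema, dict):
        kind = schema["type"]
        if kind == "record":
            namespace = schema.get("namespace", namespace)
            return {
                field["name"]: to_avro_json(
                    value[field["name"]], field["type"], namespace
                )
                for field in schema["fields"]
            }
        if kind == "array":
            return [to_avro_json(v, schema["items"], namespace) for v in value]
        if kind == "map":
            return {
                k: to_avro_json(v, schema["values"], namespace)
                for k, v in value.items()
            }
    return value  # primitives pass through unchanged


schema = {
    "type": "record",
    "name": "ExportQuery",
    "namespace": "com.example.avro.model",
    "fields": [
        {
            "name": "postExport",
            "type": [
                "null",
                {
                    "type": "record",
                    "name": "PostExport",
                    "fields": [
                        {"name": "postId", "type": "string"},
                        {"name": "isCommentIncluded", "type": "boolean"},
                    ],
                },
            ],
            "default": None,
        }
    ],
}
data = {"postExport": {"postId": "dd", "isCommentIncluded": False}}
converted = to_avro_json(data, schema)
print(converted)
```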

Loopback 3 get relation from embedded model

I'm using LoopBack 3 to build a backend with MongoDB.
I have 3 models: Object, Attachment and AwsS3.
Object has an embedsMany relation to Attachment.
Attachment has a many-to-one (belongsTo) relation to AwsS3.
Objects look like this in MongoDB:
[
{
"fieldA": "valueA1",
"attachments": [
{
"id": 1,
"awsS3Id": "1234"
},
{
"id": 2,
"awsS3Id": "1235"
}
]
},
{
"fieldA": "valueA2",
"attachments": [
{
"id": 4,
"awsS3Id": "1236"
},
{
"id": 5,
"awsS3Id": "1237"
}
]
}
]
AwsS3 looks like this in MongoDB:
[
{
"id": "1",
"url": "abc.com/1"
},
{
"id": "2",
"url": "abc.com/2"
}
]
The question is: how can I get Objects including their Attachments and AwsS3.url over the REST API?
I have tried the include and scope filters, but it didn't work. It looks like this is not implemented in LoopBack 3, right? Here is what I tried in the GET request:
{
"filter": {
"include": {
"relation": "Attachment",
"scope": {
"include": {
"relation": "awsS3",
}
}
}
}
}
With this request I only got the Objects with Attachments, without anything from AwsS3.
UPDATE: the relation definitions
The relation from Object to Attachment:
"Attachment": {
"type": "embedsMany",
"model": "Attachment",
"property": "attachments",
"options": {
"validate": true,
"forceId": false
}
},
The relation from Attachment to AwsS3
in attachment.json
"relations": {
"awsS3": {
"type": "belongsTo",
"model": "AwsS3",
"foreignKey": ""
}
}
in AwsS3.json
"relations": {
"attachments": {
"type": "hasMany",
"model": "Attachment",
"foreignKey": ""
}
}
Try this filter:
{ "filter": { "include": ["awsS3", "attachments"] } }

swagger with list of elements in an array

I am new to Swagger. I have a query parameter 'Geschaeftsvorfall' which can have the string value A or P, and when I hit the endpoint I expect an array[validPsd2Ids] filled with integers.
I have written the code below and I don't know how to validate it. Can someone tell me if I am going wrong somewhere?
Also, what can I do to print a List instead of an array in my response?
"parameters": {
"Geschaeftsvorfall": {
"name": "Geschaeftsvorfall",
"in": "query",
"description": "Geschaeftsvorfall",
"required": true,
"type": "string",
"enum": [
"A",
"P"
]
}
},
"definitions": {
"ValidePsd2Ids": {
"type": "array",
"items": {
"properties": {
"ValidePsd2Ids": {
"type": "integer",
example: [100000005,
100000006,
100000007,
100000008,
100000009,
100000010,
100000011,
100000012,
100000013,
100000014,
100000015,
100000016,
100000017,
100000018,
100000019,
100000020,
100000021,
100000022,
100000023,
100000024,
100000025,
100000034,
100000035,
100000036,
100000037,
100000038,
100000039,
100000048,
100000049,
100000050,
100000054,
100000055,
100000056,
100000057,
100000058,
100000117,
100000163,
100000165,
100000195,
100000196,
100000197,
100000198,
100000199,
100000201,
100000214,
100000217,
100000218]
}
}
}
}
},
"paths": {
"/payments/validaccounttypes/": {
"get": {
"tags": [
"payments"
],
"summary": "Valid PSD2 relevant accounts",
"description": "Reads the list of valid PSD2 revelant IDs.",
"consumes": [
"application/json"
],
"produces": [
"application/json"
],
"parameters": [
{
"$ref": "#/parameters/Geschaeftsvorfall"
}
],
"responses": {
"200": {
"description": "OK",
"schema": {
"type": "array",
"items": {
"properties": {
"ValidePsd2Ids": {
"type": "integer"
}
}
},
"properties": {
"ValidePsd2Ids": {
"$ref": "#/definitions/ValidePsd2Ids"
}
}
}
}
}
}
}
}
The parameter definition is correct.
The response definition is not correct. You say that the response looks like
{"ValidePsd2Ids" : [1,2,3,4,5,6,7,...]}
In OpenAPI terms, this is a type: object with a property ValidePsd2Ids that contains an array of integers. This can be described as:
"definitions": {
"ValidePsd2Ids": {
"type": "object",
"properties": {
"ValidePsd2Ids": {
"type": "array",
"items": {
"type": "integer"
},
"example": [
100000005,
100000006,
100000007
]
}
}
}
},
and the responses should be:
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/ValidePsd2Ids"
}
}
}
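As a sanity check on the shape described above, here is a small hand-written sketch (not an OpenAPI validator) that tests whether a payload is an object whose ValidePsd2Ids property is a list of integers. It assumes the property is always present; matches_valid_ids_response is a hypothetical helper. Note that a Python list serializes to a JSON array, so there is no separate "List" type on the wire.

```python
import json


def matches_valid_ids_response(payload):
    """Check that payload is an object whose ValidePsd2Ids is a list of ints."""
    if not isinstance(payload, dict):
        return False
    ids = payload.get("ValidePsd2Ids")
    if not isinstance(ids, list):
        return False
    # bool is a subclass of int in Python, so exclude it explicitly.
    return all(isinstance(i, int) and not isinstance(i, bool) for i in ids)


response = {"ValidePsd2Ids": [100000005, 100000006, 100000007]}
encoded = json.dumps(response)

print(matches_valid_ids_response(response))  # True
print(encoded)  # {"ValidePsd2Ids": [100000005, 100000006, 100000007]}
```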

How can I use CloudKit web services to query based on a reference field?

I've got two CloudKit data objects that look somewhat like this:
Parent Object:
{
"records": [
{
"recordName": "14102C0A-60F2-4457-AC1C-601BC628BF47-184-000000012D225C57",
"recordType": "ParentObject",
"fields": {
"fsYear": {
"value": "2015",
"type": "STRING"
},
"displayOrder": {
"value": 2015221153856287200,
"type": "INT64"
},
"fjpFSGuidForReference": {
"value": "14102C0A-60F2-4457-AC1C-601BC628BF47-184-000000012D225C57",
"type": "STRING"
},
"fsDateSearch": {
"value": "2015221153856287158",
"type": "STRING"
},
},
"recordChangeTag": "id4w7ivn",
"created": {
"timestamp": 1439149087571,
"userRecordName": "_0d26968032e31bbc72c213037b6cb35d",
"deviceID": "A19CD995FDA3093781096AF5D818033A241D65C1BFC3D32EC6C5D6B3B4A9AA6B"
},
"modified": {
"timestamp": 1439149087571,
"userRecordName": "_0d26968032e31bbc72c213037b6cb35d",
"deviceID": "A19CD995FDA3093781096AF5D818033A241D65C1BFC3D32EC6C5D6B3B4A9AA6B"
}
}
],
"total":
}
Child Object:
{
"records": [
{
"recordName": "2015221153856287168",
"recordType": "ChildObject",
"fields": {
"District": {
"value": "002",
"type": "STRING"
},
"ZipCode": {
"value": "12345",
"type": "STRING"
},
"InspecReference": {
"value": {
"recordName": "14102C0A-60F2-4457-AC1C-601BC628BF47-184-000000012D225C57",
"action": "NONE",
"zoneID": {
"zoneName": "_defaultZone"
}
},
"type": "REFERENCE"
},
},
"recordChangeTag": "id4w7lew",
"created": {
"timestamp": 1439149090856,
"userRecordName": "_0d26968032e31bbc72c213037b6cb35d",
"deviceID": "A19CD995FDA3093781096AF5D818033A241D65C1BFC3D32EC6C5D6B3B4A9AA6B"
},
"modified": {
"timestamp": 1439149090856,
"userRecordName": "_0d26968032e31bbc72c213037b6cb35d",
"deviceID": "A19CD995FDA3093781096AF5D818033A241D65C1BFC3D32EC6C5D6B3B4A9AA6B"
}
}
],
"total": 1
}
I'm trying to write a query to directly access the CloudKit web service and return the Child Object based on the reference of the parent object.
My test JSON looks something like this:
{"query":{"recordType":"ChildObject","filterBy":{"fieldName":"InspecReference","fieldValue":{ "value" : "14102C0A-60F2-4457-AC1C-601BC628BF47-184-000000012D225C57", "type" : "string" },"comparator":"EQUALS"}},"zoneID":{"zoneName":"_defaultZone"}}
However, I'm getting the following error from CloudKit:
{"uuid":"33db91f3-b768-4a68-9056-216ecc033e9e","serverErrorCode":"BAD_REQUEST","reason":"BadRequestException: Unexpected input"}
I'm guessing I have the Record Field Dictionary in the query wrong. However, the documentation isn't clear on what this should look like on a reference object.
You have to re-create the actual object of the reference. In this particular case, the JSON looks like this:
{
"query": {
"recordType": "ChildObject",
"filterBy": {
"fieldName": "InspecReference",
"fieldValue": {
"value": {
"recordName": "14102C0A-60F2-4457-AC1C-601BC628BF47-184-000000012D225C57",
"action": "NONE"
},
"type": "REFERENCE"
},
"comparator": "EQUALS"
}
},
"zoneID": {
"zoneName": "_defaultZone"
}
}
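If you build the request in code, the same body can be generated from the parent's record name. A minimal sketch in Python that only constructs the JSON shown above; sending it to CloudKit is left out, and reference_query is a hypothetical helper name:

```python
import json


def reference_query(record_type, field_name, parent_record_name,
                    zone_name="_defaultZone"):
    """Build a query body that filters on a REFERENCE field."""
    return {
        "query": {
            "recordType": record_type,
            "filterBy": {
                "fieldName": field_name,
                "fieldValue": {
                    # The value is the reference object itself, not a string.
                    "value": {
                        "recordName": parent_record_name,
                        "action": "NONE",
                    },
                    "type": "REFERENCE",
                },
                "comparator": "EQUALS",
            },
        },
        "zoneID": {"zoneName": zone_name},
    }


body = reference_query(
    "ChildObject",
    "InspecReference",
    "14102C0A-60F2-4457-AC1C-601BC628BF47-184-000000012D225C57",
)
print(json.dumps(body, indent=2))
```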