Generate Json schema from case classes (play framework) - scala

I am using the Play Framework to convert between a case class and JSON.
How can I extract the schema of the Json corresponding to the case class?
Edit:
If the class is case class Foo(string:Option[String], int:Option[Int])
The schema should be (approximately):
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "http://example.com/product.schema.json",
"title": "Foo",
"type": "object",
"properties": {
"string": {
"type": "string"
},
"int": {
"type": "integer"
}
},
"required": [ ]
}

Use scala-jsonschema for that and sponsor the author of this great library.
The library also supports spray-json, circe, and some other JSON libraries for Scala.

Related

JDBC sink topic with multiple structs to postgres

I am trying to sink a few topics to a Postgres database. However, the topic schema defines an array at the top level containing multiple structs. Automapping does not work, and I cannot find any reference on how to handle this. I need all the structs because they are dependent types: the second struct references the first struct as a field.
Currently it breaks when it hits the second struct, stating that statusChangeEvent (struct) has no mapping to a SQL column type. This is because it uses auto.create to make a table (probably called ProcessStatus); when it reaches the second entry, there is of course no matching column.
[
{
"type": "record",
"name": "processStatus",
"namespace": "company.some.process",
"fields": [
{
"name": "code",
"doc": "The code of the processStatus",
"type": "string"
},
{
"name": "name",
"doc": "The name of the processStatus",
"type": "string"
},
{
"name": "description",
"type": "string"
},
{
"name": "isCompleted",
"type": "boolean"
},
{
"name": "isSuccessfullyCompleted",
"type": "boolean"
}
]
},
{
"type": "record",
"name": "StatusChangeEvent",
"namespace": "company.some.process",
"fields": [
{
"name": "contNumber",
"type": "string"
},
{
"name": "processId",
"type": "string"
},
{
"name": "processVersion",
"type": "int"
},
{
"name": "extProcessId",
"type": [
"null",
"string"
],
"default": null
},
{
"name": "fromStatus",
"type": "process.status"
},
{
"name": "toStatus",
"doc": "The new status of the process",
"type": "company.some.process.processStatus"
},
{
"name": "changeDateTime",
"type": "long",
"logicalType": "timestamp-millis"
},
{
"name": "isPublic",
"type": "boolean"
}
]
}
]
I am not using ksql at the moment. Which connector settings are suited for this task? If there is a ksql alternative it would be nice to know, but the current requirement is to use the JDBC connector.
I tried using Flatten, but it does not support struct fields that have a schema, which seems kind of weird. Aren't schemas the whole selling point of Connect with Kafka? Or is it more of a constraint you have to work around?
Aren't schemas the whole selling point of Connect with Kafka?
Yes, but Postgres (or the JDBC sink in general) doesn't really support nested objects within columns. For that, you're better off with a document database, for example via the MongoDB sink connector.
Which connector settings are suited for this task?
None, really, other than transforms. You could write your own if flatten doesn't work.
You could try pre-defining your table to use JSONB for the two status columns, however, that's more of a workaround.
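As a rough illustration of what a custom flattening transform would have to do, here is a minimal sketch of the core logic in Python. It is not Kafka Connect code: a real transform would be written in Java against Connect's Transformation interface and would also have to rebuild the record schema. The function name, the "_" separator, and the sample values are assumptions made for the example.

```python
def flatten(record, prefix="", sep="_"):
    """Recursively turn nested dicts into one flat dict of column name -> value."""
    flat = {}
    for key, value in record.items():
        # Build the column name from the path of field names, e.g. toStatus_code.
        column = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Nested struct: descend and merge its fields into the same row.
            flat.update(flatten(value, column, sep))
        else:
            flat[column] = value
    return flat


# Hypothetical StatusChangeEvent payload with one nested processStatus struct.
event = {
    "contNumber": "C-1",
    "processId": "p-42",
    "toStatus": {"code": "DONE", "name": "Done", "isCompleted": True},
}
row = flatten(event)
print(row)
```

Each nested field becomes its own scalar column, which is the shape the JDBC sink can map to SQL types.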

JSON Schema - can array / list validation be combined with anyOf?

I have a json document I'm trying to validate with this form:
...
"products": [{
"prop1": "foo",
"prop2": "bar"
}, {
"prop3": "hello",
"prop4": "world"
},
...
An object may take one of several different forms. My schema looks like this:
...
"definitions": {
"products": {
"type": "array",
"items": { "$ref": "#/definitions/Product" }
},
"Product": {
"type": "object",
"oneOf": [
{ "$ref": "#/definitions/Product_Type1" },
{ "$ref": "#/definitions/Product_Type2" },
...
]
},
"Product_Type1": {
"type": "object",
"properties": {
"prop1": { "type": "string" },
"prop2": { "type": "string" }
}
},
"Product_Type2": {
"type": "object",
"properties": {
"prop3": { "type": "string" },
"prop4": { "type": "string" }
}
}
...
On top of this, certain properties of the individual product array objects may be indirected via further usage of anyOf or oneOf.
I'm running into issues in VSCode using the built-in schema validation: it reports an error for every item in the products array that doesn't match Product_Type1.
So it seems the validator latches onto the first oneOf entry it finds and won't validate against any of the other types.
I didn't find any limitations on the oneOf mechanism documented on json-schema.org, and it is not mentioned on the page that deals specifically with arrays: https://json-schema.org/understanding-json-schema/reference/array.html
Is what I'm attempting possible?
Your general approach is fine. Let's take a slightly simpler example to illustrate what's going wrong.
Given this schema
{
"oneOf": [
{ "properties": { "foo": { "type": "integer" } } },
{ "properties": { "bar": { "type": "integer" } } }
]
}
And this instance
{ "foo": 42 }
At first glance, this looks like it matches /oneOf/0 and not /oneOf/1. It actually matches both schemas, which violates the one-and-only-one constraint imposed by oneOf, so the oneOf fails.
Remember that every keyword in JSON Schema is a constraint. Anything that is not explicitly excluded by the schema is allowed. There is nothing in the /oneOf/1 schema that says a "foo" property is not allowed. Nor does it say that "foo" is required. It only says that if the instance has a property "foo", then it must be an integer.
To fix this, you will need required and maybe additionalProperties, depending on the situation. I show here how you would use additionalProperties, but I recommend you don't use it unless you need to, because it does have some problematic properties.
{
"oneOf": [
{
"properties": { "foo": { "type": "integer" } },
"required": ["foo"],
"additionalProperties": false
},
{
"properties": { "bar": { "type": "integer" } },
"required": ["bar"],
"additionalProperties": false
}
]
}
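To see the difference between the two schemas concretely, here is a small sketch of a toy validator in Python. It is not a real JSON Schema implementation: it handles only the handful of keywords used in this answer, and the helper names are made up for the example.

```python
def type_ok(value, type_name):
    """Check the few JSON Schema types this sketch knows about."""
    if type_name == "integer":
        return isinstance(value, int) and not isinstance(value, bool)
    if type_name == "string":
        return isinstance(value, str)
    if type_name == "object":
        return isinstance(value, dict)
    raise ValueError(f"type not supported by this sketch: {type_name}")


def matches(schema, instance):
    """Toy validator: type, properties, required, additionalProperties: false, oneOf."""
    if "type" in schema and not type_ok(instance, schema["type"]):
        return False
    if isinstance(instance, dict):
        props = schema.get("properties", {})
        # "properties" only constrains members that are actually present.
        for name, subschema in props.items():
            if name in instance and not matches(subschema, instance[name]):
                return False
        if any(name not in instance for name in schema.get("required", [])):
            return False
        if schema.get("additionalProperties") is False:
            if any(name not in props for name in instance):
                return False
    if "oneOf" in schema:
        # Exactly one branch must match, not "at least one".
        if sum(matches(branch, instance) for branch in schema["oneOf"]) != 1:
            return False
    return True


loose = {"oneOf": [
    {"properties": {"foo": {"type": "integer"}}},
    {"properties": {"bar": {"type": "integer"}}},
]}
strict = {"oneOf": [
    {"properties": {"foo": {"type": "integer"}},
     "required": ["foo"], "additionalProperties": False},
    {"properties": {"bar": {"type": "integer"}},
     "required": ["bar"], "additionalProperties": False},
]}

print(matches(loose, {"foo": 42}))   # both branches match, so oneOf fails
print(matches(strict, {"foo": 42}))  # only the first branch matches
```

With the loose schema the second branch accepts { "foo": 42 } because nothing in it forbids "foo"; adding required is what makes the branches mutually exclusive here.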

Why OpenAPI does not define '$ref' as allowed property?

In comparison, draft-07 defines it:
{
"type": ["object", "boolean"],
"properties": {
...
"$ref": {
"type": "string",
"format": "uri-reference"
},
}
...
}
Currently I am trying to write a validator for OpenAPI. But validation fails because the OpenAPI schema (yes, it is the schema from Google APIs) does not define $ref as an allowed property for a schema.
Is this a typo? What is the recommended way to check the $ref property?
$ref is a JSON Reference. It's not part of the schema definition; instead, it's part of the reference definition:
"reference": {
"type": "object",
"description": "A simple object to allow referencing other components in the specification, internally and externally. The Reference Object is defined by JSON Reference and follows the same structure, behavior and rules. For this specification, reference resolution is accomplished as defined by the JSON Reference specification and not by the JSON Schema specification.",
"required": [
"$ref"
],
"additionalProperties": false,
"properties": {
"$ref": {
"type": "string"
}
}
},
Other definitions where $ref is allowed then use oneOf to accept either the specific object or a reference (example):
"schemaOrReference": {
"oneOf": [
{
"$ref": "#/definitions/schema"
},
{
"$ref": "#/definitions/reference"
}
]
},
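In a hand-written validator, the "reference" branch of that oneOf could be checked along these lines. This is a minimal sketch in Python that mirrors only the reference definition quoted above (an object whose single member is a string "$ref"); the function name and the sample pointer are assumptions for the example.

```python
def is_reference(node):
    """True if node satisfies the quoted "reference" definition:
    an object, "$ref" required and a string, no additional properties."""
    return (
        isinstance(node, dict)
        and set(node) == {"$ref"}
        and isinstance(node["$ref"], str)
    )


# Hypothetical fragments of an OpenAPI document.
print(is_reference({"$ref": "#/components/schemas/Pet"}))
print(is_reference({"type": "string"}))
```

A node that fails this check would then be validated against the schema definition instead.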
By the way, there are currently two different draft OAS3 JSON Schemas in the official OpenAPI Specification repository. Feel free to try them instead, and provide your feedback in the corresponding discussions.
[WIP] Alternative OAS3 JSON Schema (link to schema)
OpenAPI v3 JSON Schema (link to schema)

Avro to Scala case class annotation with nested types

I am using Avro serialization for messages on Kafka and processing with some custom Scala code using this annotation method currently. The following is a basic schema with a nested record:
{
"type": "record",
"name": "TestMessage",
"namespace": "",
"fields": [
{"name": "message", "type": "string"},
{
"name": "metaData",
"type": {
"type": "record",
"name": "MetaData",
"fields": [
{"name": "source", "type": "string"},
{"name": "timestamp", "type": "string"}
]
}
}
]
}
And the annotation, I believe should quite simply look like:
@AvroTypeProvider("schema-common/TestMessage.avsc")
@AvroRecord
case class TestMessage()
The message itself is something like the following:
{"message":"hello 1",
"metaData":{
"source":"postman",
"timestamp":"123456789"
}
}
However when I log the TestMessage type or view the output in a Kafka consumer in the console, all I see is:
{"message":"hello 1"}
And not the subtype I added to capture MetaData. Anything I am missing? Let me know if I can provide further information - thanks!
This should now be fixed in version 0.10.3 for Scala 2.11 and version 0.4.5 for Scala 2.10.
Keep in mind that for every record type in a schema, there needs to be a case class that represents it. And for Scala 2.10, the most nested classes must be defined first. A safe definition is the following:
@AvroTypeProvider("schema-common/TestMessage.avsc")
@AvroRecord
case class MetaData()
@AvroTypeProvider("schema-common/TestMessage.avsc")
@AvroRecord
case class TestMessage()

json schema issue on required property

I need to write a JSON Schema based on the specification defined by http://json-schema.org/, but I'm struggling with validation of required (mandatory) properties. Below is the JSON Schema I have written, in which all three properties are mandatory; in my case, however, only one of them (any one) should be mandatory. How can I do this?
{
"id": "http://example.com/searchShops-schema#",
"$schema": "http://json-schema.org/draft-04/schema#",
"title": "searchShops Service",
"description": "",
"type": "object",
"properties": {
"city":{
"type": "string"
},
"address":{
"type": "string"
},
"zipCode":{
"type": "integer"
}
},
"required": ["city", "address", "zipCode"]
}
If your goal is to say "I want at least one member to exist", then use minProperties:
{
"type": "object",
"etc": "etc",
"minProperties": 1
}
Note also that you can use "dependencies" to great effect if you also want additional constraints to apply when this or that member is present.
Alternatively, use "anyOf" to require that at least one of the specific properties is present:
{
...
"anyOf": [
{ "required": ["city"] },
{ "required": ["address"] },
{ "required": ["zipCode"] }
]
}
Or use "oneOf" if exactly one property should be present.
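The three options accept different documents. Here is a small Python sketch of the rule each one expresses; it is plain Python rather than a schema validator, and the function names and sample values are made up for the example.

```python
FIELDS = ("city", "address", "zipCode")


def present_count(doc):
    """How many of the three named properties appear in the document."""
    return sum(1 for field in FIELDS if field in doc)


def any_of_required(doc):
    # anyOf of single-property "required" clauses: at least one named property.
    return present_count(doc) >= 1


def one_of_required(doc):
    # oneOf of the same clauses: exactly one named property.
    return present_count(doc) == 1


def min_properties(doc, minimum=1):
    # minProperties counts every member, not only the three named ones.
    return len(doc) >= minimum


print(any_of_required({"city": "Oslo"}))
print(one_of_required({"city": "Oslo", "zipCode": 150}))
print(min_properties({"country": "NO"}), any_of_required({"country": "NO"}))
```

The last line shows the practical difference: a document that contains only an unrelated member passes minProperties but not the anyOf variant, unless additional properties are excluded elsewhere in the schema.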