JSON Schema - can array / list validation be combined with anyOf? - visual-studio-code

I have a json document I'm trying to validate with this form:
...
"products": [{
"prop1": "foo",
"prop2": "bar"
}, {
"prop3": "hello",
"prop4": "world"
},
...
There are multiple different forms an object may take. My schema looks like this:
...
"definitions": {
"products": {
"type": "array",
"items": { "$ref": "#/definitions/Product" },
"Product": {
"type": "object",
"oneOf": [
{ "$ref": "#/definitions/Product_Type1" },
{ "$ref": "#/definitions/Product_Type2" },
...
]
},
"Product_Type1": {
"type": "object",
"properties": {
"prop1": { "type": "string" },
"prop2": { "type": "string" }
},
"Product_Type2": {
"type": "object",
"properties": {
"prop3": { "type": "string" },
"prop4": { "type": "string" }
}
...
On top of this, certain properties of the individual product array objects may be indirected via further usage of anyOf or oneOf.
I'm running into issues in VSCode using the built-in schema validation where it throws errors for every item in the products array that don't match Product_Type1.
So it seems the validator latches onto that first oneOf it found and won't validate against any of the other types.
I didn't find any limitations to the oneOf mechanism on jsonschema.org. And there is no mention of it being used in the page specifically dealing with arrays here: https://json-schema.org/understanding-json-schema/reference/array.html
Is what I'm attempting possible?

Your general approach is fine. Let's take a slightly simpler example to illustrate what's going wrong.
Given this schema
{
"oneOf": [
{ "properties": { "foo": { "type": "integer" } } },
{ "properties": { "bar": { "type": "integer" } } }
]
}
And this instance
{ "foo": 42 }
At first glance, this looks like it matches /oneOf/0 and not oneOf/1. It actually matches both schemas, which violates the one-and-only-one constraint imposed by oneOf and the oneOf fails.
Remember that every keyword in JSON Schema is a constraint. Anything that is not explicitly excluded by the schema is allowed. There is nothing in the /oneOf/1 schema that says a "foo" property is not allowed. Nor does is say that "foo" is required. It only says that if the instance has a keyword "foo", then it must be an integer.
To fix this, you will need required and maybe additionalProperties depending on the situation. I show here how you would use additionalProperties, but I recommend you don't use it unless you need to because is does have some problematic properties.
{
"oneOf": [
{
"properties": { "foo": { "type": "integer" } },
"required": ["foo"],
"additionalProperties": false
},
{
"properties": { "bar": { "type": "integer" } },
"required": ["bar"],
"additionalProperties": false
}
]
}

Related

How do I use an extended definition and not allow additional properties in a way that is compatible with multiple validators (JSON schema draft 7)?

I am creating a strict validator for a complex JSON file and want to re-use various definitions in order to keep the schema manageable and easier to update.
According to the documentation it is necessary to use allOf to extend a definition to add more properties. This is exactly what I've done, but I find that without use of additionalProperties set to false validation doesn't prevent arbitrary other properties being added.
The following massively cut-down schema demonstrates what I'm doing:
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "https://example.com/schema/2021/02/example.json",
"description": "This schema demonstrates how VSCode's JSON schema mechanism fails with allOf used to extend a definition",
"definitions": {
"valueProvider": {
"type": "object",
"properties": {
"example": {
"type": "string"
},
"alternative": {
"type": "string"
}
},
"oneOf": [
{
"required": [
"example"
]
},
{
"required": [
"alternative"
]
}
]
},
"selector": {
"type": "object",
"allOf": [
{
"$ref": "#/definitions/valueProvider"
},
{
"required": [
"operator",
"value"
],
"properties": {
"operator": {
"type": "string",
"enum": [
"IsNull",
"Equals",
"NotEquals",
"Greater",
"GreaterOrEquals",
"Less",
"LessOrEquals"
]
},
"value": {
"type": "string"
}
}
}
],
"additionalProperties": false
}
},
"properties": {
"show": {
"properties": {
"name": {
"type": "string"
},
"selector": {
"description": "This property does not function correctly in VSCode",
"allOf": [
{
"$ref": "#/definitions/selector"
},
{
"additionalProperties": false
}
]
}
},
"additionalProperties": false
}
}
}
This works a treat in IntelliJ IDEA's JSON editor (2020.3.2 ultimate edition) when editing JSON against this schema (using a schema mapping). For example, the file ex-fail.json's content of:
{
"show": {
"name": "a",
"selector": {
"example": "a",
"operator": "IsNull",
"value": "false",
"d": "a"
}
}
}
Is correctly validated, simply highlighting "d" as not allowed, thus:
However, when I use the very same schema and JSON file with VSCode (1.53.2) with vanilla configuration (except for a schema mapping) VSCode erroneously marks "example", "operator", "value" and "d" as not allowed. It looks like this in the VSCode editor:
If I remove the additionalProperties definition from the show.selector property, both IDEA and VSCode indicate that all is well, including allowing the "d" property - in doing this I can simplify that property definition to:
"selector": {
"description": "This property does not function correctly in VSCode",
"$ref": "#/definitions/selector"
}
What can I do to the schema to support both IDEA and VSCode whilst disallowing additional properties where they should not appear?
PS: The schema mapping in VSCode is simply along the lines of:
{
"json.schemas": [
{
"fileMatch": [
"*/config/ex-*.json"
],
"url": "file:///C:/my/path/to/example-schema.json"
}
]
}
You cannot do what you ask with JSON Schema draft-07 or prior.
The reason is, when $ref is used in a schema object, all other properties MUST be ignored.
An object schema with a "$ref" property MUST be interpreted as a
"$ref" reference. The value of the "$ref" property MUST be a URI
Reference. Resolved against the current URI base, it identifies the
URI of a schema to use. All other properties in a "$ref" object MUST
be ignored.
https://datatracker.ietf.org/doc/html/draft-handrews-json-schema-01#section-8.3
We changed this to not be the case for draft 2019-09.
It sounds like VSCode is merging the properties in applicators upwards to the nearest schema object (which is wrong), and IntelliJ IDEA is doing something similar but in a different way (which is also wrong).
The correct validation result for your schema and instance is VALID. See the live demo here: https://jsonschema.dev/s/C6ent
additionalProperties relies on the values of properties and patternProperties within the SAME schema object. It cannot "see through" applicators such as $ref and allOf.
For draft 2019-09, we added unevaluatedProperties, which CAN "see through" applicator keywords (although it's a little more complex than that).
Update:
After reviewing your update, sadly the same is still true.
One approach makes it sort of possible but involves some duplication, and only works when you control the schemas you are referencing.
You would need to redefine your selector property like this...
"selector": {
"description": "This property did not function correctly in VSCode",
"allOf": [
{
"$ref": "#/definitions/selector"
},
{
"properties": {
"operator": true,
"value": true,
"example": true,
"alternative": true
},
"additionalProperties": false
}
]
}
The values of a property object are schema values, and booleans are valid schemas. You don't need (or want to) deal with their validation here, only say these are the allowed ones, followed by no additionalProperties.
You'll also need to remove the additionalProperties: false from your definition of selector, as that is preventing ALL properties (which I now guess is why you saw that issue in one of the editors).
It involves some duplication, but is the only way I'm aware of that you can do this for draft-07 or previous. As I said, not a problem for draft 2019-09 or above due to new kewords.
additionalProperties is problematic because it depends on the properties and patternProperties. The result is that "additionalProperties": false effectively blocks schema composition. #Relequestual showed one alternative approach, here is another approach that is a little less verbose, but still requires duplication of property names.
draft-06 and up
{
"allOf": [{ "$ref": "#/definitions/base" }],
"properties": {
"bar": { "type": "number" }
},
"propertyNames": { "enum": ["foo", "bar"] },
"definitions": {
"base": {
"properties": {
"foo": { "type": "string" }
}
}
}
}

MongoDB Stitch GraphQL Custom Mutation Resolver returning null

GraphQL is a newer feature for MongoDB Stitch, and I know it is in beta, so thank you for your help in advance. I am excited about using GraphQL directly in Stitch so I am hoping that maybe I just overlooked something.
The documentation for the return Payload displays the use of bsonType, but when actually entering the JSON Schema for the payload type it asks for you to use "type" instead of "bsonType". It still works using "bsonType" to me which is odd as long as at least one of the properties uses "type".
Below is the function:
const mongodb = context.services.get("mongodb-atlas");
const collection = mongodb.db("<database>").collection("<collection>");
const query = { _id: BSON.ObjectId(input.id) }
const update = {
"$push": {
"notes": {
"createdBy": context.user.id,
"createdAt": new Date,
"text": input.text
}
}
};
const options = { returnNewDocument: true }
collection.findOneAndUpdate(query, update, options).then(updatedDocument => {
if(updatedDocument) {
console.log(`Successfully updated document: ${updatedDocument}.`)
} else {
console.log("No document matches the provided query.")
}
return {
_id: updatedDocument._id,
notes: updatedDocument.notes
}
})
.catch(err => console.error(`Failed to find and update document: ${err}`))
}
Here is the Input Type in the customer resolver:
"type": "object",
"title": "AddNoteToLeadInput",
"required": [
"id",
"text"
],
"properties": {
"id": {
"type": "string"
},
"text": {
"type": "string"
}
}
}
Below is the Payload Type:
{
"type": "object",
"title": "AddNoteToLeadPayload",
"properties": {
"_id": {
"type": "objectId"
},
"notes": {
"type": "array",
"items": {
"type": "object",
"properties": {
"createdAt": {
"type": "string"
},
"createdBy": {
"type": "string"
},
"text": {
"type": "string"
}
}
}
}
}
}
When entering the wrong "type" the error states:
Expected valid values are:[array boolean integer number null object string]
When entering the wrong "bsonType" the error states:
Expected valid values are:[string object array objectId boolean bool null regex date timestamp int long decimal double number binData]
I've tried every combination I can think of including changing all "bsonType" to "type". I also tried changing the _id to a string when using "type" or objectId when "bsonType". No matter what combination I try when I use the mutation it does what it is supposed to and adds the note into the lead, but the return payload always displays null. I need it to return the _id and note so that it will update the InMemoryCache in Apollo on the front end.
I noticed that you might be missing a return before your call to collection.findOneAndUpdate()
I tried this function (similar to yours) and got GraphiQL to return values (with String for all the input and payload types)
exports = function(input){
const mongodb = context.services.get("mongodb-atlas");
const collection = mongodb.db("todo").collection("dreams");
const query = { _id: input.id }
const update = {
"$push": {
"notes": {
"createdBy": context.user.id,
"createdAt": "6/10/10/10",
"text": input.text
}
}
};
const options = { returnNewDocument: true }
return collection.findOneAndUpdate(query, update, options).then(updatedDocument => {
if(updatedDocument) {
console.log(`Successfully updated document: ${updatedDocument}.`)
} else {
console.log("No document matches the provided query.")
}
return {
_id: updatedDocument._id,
notes: updatedDocument.notes
}
})
.catch(err => console.error(`Failed to find and update document: ${err}`))
}
Hi Bernard – There is an unfortunate bug in the custom resolver form UI at the moment which doesn't allow you to only use bsonType in the input/payload types – we are working on addressing this. In actually you should be able to use either type/bsonType or a mix of the two as long as they agree with your data. I think that the payload type definition you want is likely:
{
"type": "object",
"title": "AddNoteToLeadPayload",
"properties": {
"_id": {
"bsonType": "objectId"
},
"notes": {
"type": "array",
"items": {
"type": "object",
"properties": {
"createdAt": {
"bsonType": "date"
},
"createdBy": {
"type": "string"
},
"text": {
"type": "string"
}
}
}
}
}
}
If that doesn't work, it might be helpful to give us a sample of the data that you would like returned.

Ingesting multi-valued dimension from comma sep string

I have event data from Kafka with the following structure that I want to ingest in Druid
{
"event": "some_event",
"id": "1",
"parameters": {
"campaigns": "campaign1, campaign2",
"other_stuff": "important_info"
}
}
Specifically, I want to transform the dimension "campaigns" from a comma-separated string into an array / multi-valued dimension so that it can be nicely filtered and grouped by.
My ingestion so far looks as follows
{
"type": "kafka",
"dataSchema": {
"dataSource": "event-data",
"parser": {
"type": "string",
"parseSpec": {
"format": "json",
"timestampSpec": {
"column": "timestamp",
"format": "posix"
},
"flattenSpec": {
"fields": [
{
"type": "root",
"name": "parameters"
},
{
"type": "jq",
"name": "campaigns",
"expr": ".parameters.campaigns"
}
]
}
},
"dimensionSpec": {
"dimensions": [
"event",
"id",
"campaigns"
]
}
},
"metricsSpec": [
{
"type": "count",
"name": "count"
}
],
"granularitySpec": {
"type": "uniform",
...
}
},
"tuningConfig": {
"type": "kafka",
...
},
"ioConfig": {
"topic": "production-tracking",
...
}
}
Which however leads to campaigns being ingested as a string.
I could neither find a way to generate an array out of it with a jq expression in flattenSpec nor did I find something like a string split expression that may be used as a transformSpec.
Any suggestions?
Try setting useFieldDiscover: false in your ingestion spec. when this flag is set to true (which is default case) then it interprets all fields with singular values (not a map or list) and flat lists (lists of singular values) at the root level as columns.
Here is a good example and reference link to use flatten spec:
https://druid.apache.org/docs/latest/ingestion/flatten-json.html
Looks like since Druid 0.17.0, Druid expressions support typed constructors for creating arrays, so using expression string_to_array should do the trick!

elasticsearch 6.2 How to specify child and parent fields within one mapping(_doc)

Since 6.2 no longer support multiple mapping type. I have to migrate existing multi type index into _doc type single mapping. However I am not sure how to map current child properties in this single mapping.
"mappings": {
"_doc": {
"properties": {
"join_field": {
"type": "join",
"relations": {
"question": "answer"
}
},
"text": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
Now I wish to have more fields in answer type as well as in question type.But I have no clue how to do that.

json schema issue on required property

I need to write the JSON Schema based on the specification defined by http://json-schema.org/. But I'm struggling for the required/mandatory property validation. Below is the JSON schema that I have written where all the 3 properties are mandatory but In my case either one should be mandatory. How to do this?.
{
"id": "http://example.com/searchShops-schema#",
"$schema": "http://json-schema.org/draft-04/schema#",
"title": "searchShops Service",
"description": "",
"type": "object",
"properties": {
"city":{
"type": "string"
},
"address":{
"type": "string"
},
"zipCode":{
"type": "integer"
}
},
"required": ["city", "address", "zipCode"]
}
If your goal is to tell that "I want at least one member to exist" then use minProperties:
{
"type": "object",
"etc": "etc",
"minProperties": 1
}
Note also that you can use "dependencies" to great effect if you also want additional constraints to exist when this or that member is present.
{
...
"anyOf": [
{ "required": ["city"] },
{ "required": ["address"] },
{ "required": ["zipcode"] },
]
}
Or use "oneOf" if exactly one property should be present