Which analyzer to use on specific strings?

I have a document in my collection with a property name like this:
name: [{Value: "steel 0.8x1000x2000mm"}]
Now I'm trying to create a search index for it. So far it looks like this:
...
"name": {
"fields": {
"Value": [
{
"analyzer": "lucene.finnish",
"searchAnalyzer": "lucene.finnish",
"type": "string"
},
{
"dynamic": true,
"type": "document"
}
]
},
"type": "document"
},
...
This works fine except for documents like this one. The issue is that the query 0.8x1000x2000 doesn't match anything, though 0.8x1000x2000mm works fine.
I guess I'm using the wrong analyzer, but I can't figure out which one I should use. Or should I make a custom one?

Related

How do I use an extended definition and not allow additional properties in a way that is compatible with multiple validators (JSON schema draft 7)?

I am creating a strict validator for a complex JSON file and want to re-use various definitions in order to keep the schema manageable and easier to update.
According to the documentation it is necessary to use allOf to extend a definition to add more properties. This is exactly what I've done, but I find that without use of additionalProperties set to false validation doesn't prevent arbitrary other properties being added.
The following massively cut-down schema demonstrates what I'm doing:
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "https://example.com/schema/2021/02/example.json",
"description": "This schema demonstrates how VSCode's JSON schema mechanism fails with allOf used to extend a definition",
"definitions": {
"valueProvider": {
"type": "object",
"properties": {
"example": {
"type": "string"
},
"alternative": {
"type": "string"
}
},
"oneOf": [
{
"required": [
"example"
]
},
{
"required": [
"alternative"
]
}
]
},
"selector": {
"type": "object",
"allOf": [
{
"$ref": "#/definitions/valueProvider"
},
{
"required": [
"operator",
"value"
],
"properties": {
"operator": {
"type": "string",
"enum": [
"IsNull",
"Equals",
"NotEquals",
"Greater",
"GreaterOrEquals",
"Less",
"LessOrEquals"
]
},
"value": {
"type": "string"
}
}
}
],
"additionalProperties": false
}
},
"properties": {
"show": {
"properties": {
"name": {
"type": "string"
},
"selector": {
"description": "This property does not function correctly in VSCode",
"allOf": [
{
"$ref": "#/definitions/selector"
},
{
"additionalProperties": false
}
]
}
},
"additionalProperties": false
}
}
}
This works a treat in IntelliJ IDEA's JSON editor (2020.3.2 ultimate edition) when editing JSON against this schema (using a schema mapping). For example, the file ex-fail.json's content of:
{
"show": {
"name": "a",
"selector": {
"example": "a",
"operator": "IsNull",
"value": "false",
"d": "a"
}
}
}
is correctly validated, with only "d" highlighted as not allowed.
However, when I use the very same schema and JSON file with VSCode (1.53.2) with vanilla configuration (except for a schema mapping), VSCode erroneously marks "example", "operator", "value" and "d" as not allowed.
If I remove the additionalProperties definition from the show.selector property, both IDEA and VSCode indicate that all is well, including allowing the "d" property - in doing this I can simplify that property definition to:
"selector": {
"description": "This property does not function correctly in VSCode",
"$ref": "#/definitions/selector"
}
What can I do to the schema to support both IDEA and VSCode whilst disallowing additional properties where they should not appear?
PS: The schema mapping in VSCode is simply along the lines of:
{
"json.schemas": [
{
"fileMatch": [
"*/config/ex-*.json"
],
"url": "file:///C:/my/path/to/example-schema.json"
}
]
}

You cannot do what you ask with JSON Schema draft-07 or prior.
The reason is, when $ref is used in a schema object, all other properties MUST be ignored.
An object schema with a "$ref" property MUST be interpreted as a
"$ref" reference. The value of the "$ref" property MUST be a URI
Reference. Resolved against the current URI base, it identifies the
URI of a schema to use. All other properties in a "$ref" object MUST
be ignored.
https://datatracker.ietf.org/doc/html/draft-handrews-json-schema-01#section-8.3
We changed this to not be the case for draft 2019-09.
It sounds like VSCode is merging the properties in applicators upwards to the nearest schema object (which is wrong), and IntelliJ IDEA is doing something similar but in a different way (which is also wrong).
The correct validation result for your schema and instance is VALID. See the live demo here: https://jsonschema.dev/s/C6ent
additionalProperties relies on the values of properties and patternProperties within the SAME schema object. It cannot "see through" applicators such as $ref and allOf.
For draft 2019-09, we added unevaluatedProperties, which CAN "see through" applicator keywords (although it's a little more complex than that).
Update:
After reviewing your update, sadly the same is still true.
One approach makes it sort of possible but involves some duplication, and only works when you control the schemas you are referencing.
You would need to redefine your selector property like this...
"selector": {
"description": "This property did not function correctly in VSCode",
"allOf": [
{
"$ref": "#/definitions/selector"
},
{
"properties": {
"operator": true,
"value": true,
"example": true,
"alternative": true
},
"additionalProperties": false
}
]
}
The values of a property object are schema values, and booleans are valid schemas. You don't need (or want to) deal with their validation here, only say these are the allowed ones, followed by no additionalProperties.
You'll also need to remove the additionalProperties: false from your definition of selector, as that is preventing ALL properties (which I now guess is why you saw that issue in one of the editors).
It involves some duplication, but it is the only way I'm aware of that you can do this for draft-07 or earlier. As I said, it is not a problem for draft 2019-09 or above due to the new keywords.
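As a rough illustration of why additionalProperties cannot "see through" allOf or $ref, the sketch below mimics the keyword's lookup in plain JavaScript. It is not a JSON Schema validator: the helper name additionalPropertyViolations is hypothetical, and it consults only the properties and patternProperties that sit in the same schema object, as described above.

```javascript
// Hypothetical helper, not part of any JSON Schema library.
// Lists the instance keys that "additionalProperties": false would reject
// when it is evaluated inside ONE schema object.
function additionalPropertyViolations(schemaObject, instance) {
  // Only keywords in the SAME schema object are consulted; anything
  // declared behind allOf or $ref is invisible here.
  const declared = Object.keys(schemaObject.properties ?? {});
  const patterns = Object.keys(schemaObject.patternProperties ?? {}).map(
    (source) => new RegExp(source)
  );

  return Object.keys(instance).filter(
    (key) => !declared.includes(key) && !patterns.some((re) => re.test(key))
  );
}

// A schema object with additionalProperties but no sibling properties,
// like the selector definition in the question.
const bareSubschema = { additionalProperties: false };

// The second allOf entry from the update, which names the allowed keys itself.
const namedSubschema = {
  properties: { operator: true, value: true, example: true, alternative: true },
  additionalProperties: false,
};

const instance = { example: "a", operator: "IsNull", value: "false", d: "a" };

console.log(additionalPropertyViolations(bareSubschema, instance));
console.log(additionalPropertyViolations(namedSubschema, instance));
```

With no sibling properties, every key counts as additional, which matches the editor behaviour where all four keys were flagged. Once the names are repeated next to additionalProperties, only "d" is left over.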

additionalProperties is problematic because it depends on properties and patternProperties. The result is that "additionalProperties": false effectively blocks schema composition. @Relequestual showed one alternative approach; here is another that is a little less verbose but still requires duplicating the property names.
draft-06 and up
{
"allOf": [{ "$ref": "#/definitions/base" }],
"properties": {
"bar": { "type": "number" }
},
"propertyNames": { "enum": ["foo", "bar"] },
"definitions": {
"base": {
"properties": {
"foo": { "type": "string" }
}
}
}
}
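To make the effect of the propertyNames keyword above concrete, here is a minimal sketch in plain JavaScript. It only imitates the "enum" form used in the schema; the helper name disallowedPropertyNames is hypothetical and no JSON Schema library is involved.

```javascript
// Hypothetical helper: imitates "propertyNames": { "enum": [...] } only.
// Every key of the instance must appear in the allowed list, no matter
// which subschema (the base definition or the extension) declared it.
function disallowedPropertyNames(allowedNames, candidate) {
  const allowed = new Set(allowedNames);
  return Object.keys(candidate).filter((name) => !allowed.has(name));
}

// "foo" comes from the base definition, "bar" from the extending schema.
const allowedNames = ["foo", "bar"];

console.log(disallowedPropertyNames(allowedNames, { foo: "x", bar: 1 }));
console.log(disallowedPropertyNames(allowedNames, { foo: "x", baz: true }));
```

Because the check looks only at key names, it is unaffected by where the properties are declared, which is why it composes with allOf. The cost is the duplicated name list.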

How to filter OData collection where attribute does not exist?

I have an OData collection where the data looks like this:
{
"@odata.context": "http://localhost:5488/odata/$metadata#folders",
"value": [
{
"name": "samples",
"_id": "79a91bc9-9083-4442-ac8d-ad30777ac8c8",
"creationDate": "2019-08-05T04:39:00.670Z",
"modificationDate": "2019-08-05T04:39:00.670Z",
"shortid": "18xQnNv"
},
{
"name": "Population",
"folder": {
"shortid": "18xQnNv"
},
"_id": "7406269b-669c-41ce-92f3-f540792df07e",
"creationDate": "2019-08-05T04:39:00.750Z",
"modificationDate": "2019-08-05T04:39:00.750Z",
"shortid": "0ppeLV"
},
{
"name": "Invoice",
"folder": {
"shortid": "18xQnNv"
},
"_id": "525aff6a-6b10-4ad6-93ce-e9c753e8ade0",
"creationDate": "2019-08-05T04:39:00.790Z",
"modificationDate": "2019-08-05T04:39:00.790Z",
"shortid": "G3i2B3"
},
{
"name": "Default",
"_id": "58daf5aa-1f13-4ff9-be1f-8cb11a812485",
"creationDate": "2019-08-07T22:56:45.160Z",
"modificationDate": "2019-08-07T22:56:45.160Z",
"shortid": "Sm8LpmP"
}
]
}
I want to exclude the objects which have the attribute "folder". I've tried using a GET request: http://localhost:5488/odata/folders?$filter=folder eq null with no luck. Is this even possible and is there a way to filter my request like this?

You might be able to use the all lambda operator to accomplish this. The all operator always returns true on an empty collection. So if you write a condition that no existing folder attribute can ever satisfy, the filter should return only the objects whose folder attribute is empty.
This is just a theory. You'll need to test, but it would maybe look something like this on your sample.
http://localhost:5488/odata/folders?$filter=folder/all(f:f/shortid eq 'xxxxxx')
You didn't mention which version of OData you're working with, but lambda expressions are available in V4 and later. Possibly earlier, not sure.
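The "all is true on an empty collection" rule is the same vacuous-truth behaviour that JavaScript's Array.prototype.every has, so the idea can be tried out locally. The sketch below runs on a trimmed in-memory copy of the sample data. It assumes that a missing folder behaves like an empty collection, which is exactly the part of the theory that still needs testing against the real service.

```javascript
// Trimmed copy of the sample response (only the fields used here).
const folders = [
  { name: "samples", shortid: "18xQnNv" },
  { name: "Population", folder: { shortid: "18xQnNv" }, shortid: "0ppeLV" },
  { name: "Invoice", folder: { shortid: "18xQnNv" }, shortid: "G3i2B3" },
  { name: "Default", shortid: "Sm8LpmP" },
];

// Assumption: a missing "folder" is treated as an empty collection.
const asCollection = (entry) =>
  entry.folder === undefined ? [] : [entry.folder];

// Mirrors folder/all(f: f/shortid eq 'xxxxxx'). The value 'xxxxxx' is chosen
// so that no real folder matches; only empty collections pass the check.
const withoutFolder = folders.filter((entry) =>
  asCollection(entry).every((f) => f.shortid === "xxxxxx")
);

console.log(withoutFolder.map((entry) => entry.name));
```

If the server-side filter turns out not to work, the same filter applied to the parsed response is a simple client-side fallback.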

How to update the Embedded Data which is inside of another Embedded Data?

I have a document like the one below in MongoDB:
{
"_id": "test",
"tasks": [
{
"Name": "Task1",
"Parameter": [
{
"Name": "para1",
"Type": "String",
"Value": "*****"
},
{
"Name": "para2",
"Type": "String",
"Value": "*****"
}
]
},
{
"Name": "Task2",
"Parameter": [
{
"Name": "para1",
"Type": "String",
"Value": "*****"
},
{
"Name": "para2",
"Type": "String",
"Value": "*****"
}
]
}
]
}
There is an embedded data structure (Parameter) inside another embedded data structure (tasks). Now I want to update para1 in Task1's Parameter.
I have tried many ways, but I can only use the query tasks.Parameter.Name to find para1; I cannot update it. The examples in the docs use .$. to update a value in an embedded data structure, but that doesn't work in my case.
Does anyone have any ideas?

MongoDB currently only supports the positional operator once, and only for the top level array. There is a ticket SERVER-831 to change this behavior for your use case. You can follow the issue there and up vote it.
However, you might be able to change your approach to accomplish what you want to do. One way is to change your schema. Collapse the tasks name into the array so the document looks like this:
{
"_id": "test",
"tasks": [
{
"Task": 1,
"Name": "para1",
"Type": "String",
"Value": "*****"
},
{
"Task": 1,
"Name": "para2",
"Type": "String",
"Value": "*****"
},
{
"Task": 2,
"Name": "para1",
"Type": "String",
"Value": "*****"
},
{
"Task": 2,
"Name": "para2",
"Type": "String",
"Value": "*****"
}
]
}
Another approach that may work for you is to use $pull and $push. For instance, something like this replaces a parameter (this assumes that tasks.Parameter.Name is unique within each Parameter array):
db.test2.update({$and: [{"tasks.Name": "Task3"}, {"tasks.Parameter.Name":"para1"}]}, {$pull: {"tasks.$.Parameter": {"Name": "para1"}}})
db.test2.update({"tasks.Name": "Task3"}, {$push: {"tasks.$.Parameter": {"Name": "para3", Type: "String", Value: 1}}})
With this solution you need to be careful with regard to concurrency, as there will be a brief moment between the two updates where the parameter is missing from the document.
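On newer servers there is a third option: MongoDB 3.6 added the filtered positional operator $[<identifier>] together with the arrayFilters update option, which can address an element inside a nested array. The sketch below only builds the three arguments for such an update, so it runs without a database. The helper name buildNestedValueUpdate and the new value are assumptions for illustration, and the collection name in the closing comment is borrowed from the example above.

```javascript
// Hypothetical helper: builds filter, update and options for a nested-array
// update that uses the filtered positional operator (MongoDB 3.6+).
function buildNestedValueUpdate(docId, taskName, parameterName, newValue) {
  return {
    filter: { _id: docId },
    // $[t] selects elements of "tasks", $[p] elements of "Parameter".
    update: { $set: { "tasks.$[t].Parameter.$[p].Value": newValue } },
    options: {
      // Each identifier used in the path needs one matching filter.
      arrayFilters: [{ "t.Name": taskName }, { "p.Name": parameterName }],
    },
  };
}

const nestedUpdate = buildNestedValueUpdate("test", "Task1", "para1", "new value");
console.log(JSON.stringify(nestedUpdate, null, 2));

// In a connected shell the pieces would be passed along as, for example:
// db.test2.updateOne(nestedUpdate.filter, nestedUpdate.update, nestedUpdate.options)
```

Unlike the $pull and $push pair, this is a single update, so there is no window in which the parameter is missing.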

Filtering nested results an OData Query

I have an OData query returning a bunch of items. The results come back looking like this:
{
"d": {
"__metadata": {
"id": "http://dev.sp.swampland.local/_api/SP.UserProfiles.PeopleManager/GetPropertiesFor(accountName=@v)",
"uri": "http://dev.sp.swampland.local/_api/SP.UserProfiles.PeopleManager/GetPropertiesFor(accountName=@v)",
"type": "SP.UserProfiles.PersonProperties"
},
"UserProfileProperties": {
"results": [
{
"__metadata": {
"type": "SP.KeyValue"
},
"Key": "UserProfile_GUID",
"Value": "66a0c6c2-cbec-4abb-9e25-cc9e924ad390",
"ValueType": "Edm.String"
},
{
"__metadata": {
"type": "SP.KeyValue"
},
"Key": "ADGuid",
"Value": "System.Byte[]",
"ValueType": "Edm.String"
},
{
"__metadata": {
"type": "SP.KeyValue"
},
"Key": "SID",
"Value": "S-1-5-21-2355771569-1952171574-2825027748-500",
"ValueType": "Edm.String"
}
]
}
}
}
In reality, there are a lot of items (100+) coming back in the UserProfileProperties collection, but I'm only looking for the few whose Key matches certain values, and I can't figure out exactly what my filter needs to be. I've tried $filter=UserProfileProperties/Key eq 'SID' but that still gives me everything. I'm also trying to figure out how to pull back multiple items.
Ideas?

I believe the issue is that each entry in results has a Key, not UserProfileProperties itself, so UserProfileProperties/Key doesn't actually exist. Because results is an array, you must either check a certain position (e.g. results(1)) or use the OData lambda operators any or all.
Try $filter=UserProfileProperties/results/any(r: r/Key eq 'SID') if you want all the profiles where at least one of the keys is SID, or use
$filter=UserProfileProperties/results/all(r: r/Key eq 'SID') if you want the profiles where every result has a key equal to SID.
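The difference between any and all maps directly onto JavaScript's some and every, so it can be checked against a trimmed copy of the sample data. The sketch below is only an in-memory illustration. The last part, which picks several keys at once, is a client-side assumption about how the few wanted entries could be extracted after the response arrives, since any and all decide whether a profile matches and do not trim the nested collection.

```javascript
// Trimmed copy of the UserProfileProperties results from the question.
const userProfileProperties = [
  { Key: "UserProfile_GUID", Value: "66a0c6c2-cbec-4abb-9e25-cc9e924ad390" },
  { Key: "ADGuid", Value: "System.Byte[]" },
  { Key: "SID", Value: "S-1-5-21-2355771569-1952171574-2825027748-500" },
];

// any(r: r/Key eq 'SID'): true when at least one entry has the key.
const anyIsSid = userProfileProperties.some((r) => r.Key === "SID");

// all(r: r/Key eq 'SID'): true only when every entry has the key.
const allAreSid = userProfileProperties.every((r) => r.Key === "SID");

// Client-side selection of several keys at once (hypothetical key list).
const wantedKeys = new Set(["SID", "ADGuid"]);
const wantedEntries = userProfileProperties.filter((r) => wantedKeys.has(r.Key));

console.log(anyIsSid, allAreSid);
console.log(wantedEntries.map((r) => r.Key));
```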

Remove entry in an array of a MongoDB document

Say I have a document that looks something like this:
{
"_id": ObjectId("50b6a7416cb035b629000001"),
"businesses": [{
"name": "Biz1",
"id": ObjectId("50b6bc953e47dc923e000001")
}, {
"name": "Biz2",
"id": ObjectId("50b6ccebae0513bf52000001")
}, {
"name": "Biz3",
"id": ObjectId("50b6d015c58b414156000001")
}, {
"name": "Biz4",
"id": ObjectId("50b6d0c8a4cdd5e356000001")
}]
}
I want to remove
{
"name": "Biz3",
"id": ObjectId("50b6d015c58b414156000001")
}
from the array of businesses. I tried this (using business name instead of id for clarity):
db.users.update({'businesses.name':'Biz3'},{$pull:{'businesses.name':'Biz3'}})
but of course it didn't work. I know that the query part is correct because I get the document back when I do this:
db.users.find({'businesses.name' : 'Biz3'})
So the problem is with the update part.

I just ran a quick test, and this works:
db.users.update({'businesses.name':'Biz3'},{$pull:{'businesses':{'name':'Biz3'}}})
$pull takes the array field itself as its key, and the value is a condition that the elements to remove must match.
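For readers who want to see the effect without a database, the sketch below imitates that $pull in memory. It is only a stand-in: the helper name pullByName is hypothetical, it handles just the "match on name" case, and the ObjectId values are written as plain strings.

```javascript
// Hypothetical in-memory stand-in for {$pull: {businesses: {name: 'Biz3'}}}.
// Returns a copy of the document with the matching array elements removed.
function pullByName(doc, arrayField, name) {
  return {
    ...doc,
    [arrayField]: doc[arrayField].filter((element) => element.name !== name),
  };
}

const user = {
  _id: "50b6a7416cb035b629000001",
  businesses: [
    { name: "Biz1", id: "50b6bc953e47dc923e000001" },
    { name: "Biz2", id: "50b6ccebae0513bf52000001" },
    { name: "Biz3", id: "50b6d015c58b414156000001" },
    { name: "Biz4", id: "50b6d0c8a4cdd5e356000001" },
  ],
};

const afterPull = pullByName(user, "businesses", "Biz3");
console.log(afterPull.businesses.map((business) => business.name));
```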
