PostgreSQL jsonb GIN indexes: which keys and values are indexed? - postgresql

I am using PostgreSQL 13.5. From the PostgreSQL docs:
The technical difference between a jsonb_ops and a jsonb_path_ops GIN
index is that the former creates independent index items for each key
and value in the data, while the latter creates index items only for
each value in the data. Basically, each jsonb_path_ops index item
is a hash of the value and the key(s) leading to it; for example to
index {"foo": {"bar": "baz"}}
Understanding this in detail is important to me because my jdata document is large, with many keys and nested objects. Suppose the JSON data, stored as jsonb in a column named jdata, looks like this:
{
"supplier": {
"id": "3c67b6eb-3b0d-492d-8736-66df107b83b3",
"customer": {
"type": "pro",
"name": "John George",
"address": [
{
"add-id": "098ad4df-2a90-4fda-8f92-dbe8d7196732",
"addressActive": true,
"street": "abc street",
"zip": 94044,
"staying-since": "long long",
"accessibility": {
"traffic": "heavy/congested",
"bestwaytoreach": {
"weekdays": {
"bart/metro/calltrain": true,
"price": {
"off-peak-hours": "affordable",
"peak-hours": "high"
},
"journey-time": "super-fast"
}
},
"weekends": {
"byroad": {
"ok": true,
"distance": "long",
"has-tolls": {
"true": true,
"toll-price": "relatively-high"
},
"journey-speed": "fast"
}
}
}
},
{
"add-id": "ddd1d2a0-9050-4bcf-a3ad-2e608d65e468",
"addressActive": true,
"street": "xyz street",
"zip": 10001,
"staying-since": "moved recently",
"accessibility": {
"traffic": "heavy/congested",
"bestwaytoreach": {
"weekdays": {
"subway": true,
"price": {
"off-peak-hours": "affordable",
"peak-hours": "high"
},
"journey-speed": "super-fast"
}
},
"weekends": {
"byroad": {
"ok": true,
"distance": "moderate",
"tolls": {
"has-tolls": true,
"toll-price": "relatively-high"
},
"journey-time": "super-fast"
}
}
}
}
],
"firstName": "John",
"lastName": "CRAWFORD",
"emailAddresses": {
"personal": [
"johnreplies#jg.com",
"ursjohn#jg.com",
"1234#jg.com"
],
"official": [
{
"repies-in": "1 day",
"email": "jg#jg.com"
},
{
"check's regularly": true,
"repies-in": "1 Hour",
"email": "jg-watching#jg.com"
}
]
},
"cities": [
"NYC",
"LA",
"SF",
"DC"
],
"splCustFlag": null,
"isPerson": true,
"isEntity": false,
"allowEmailSolicit": "Y",
"allowPhoneSolicit": "Y",
"taxPayer": true,
"suffix": null,
"title": null,
"birthDate": "05/10/1993",
"loyaltyPrograms": null,
"phoneNumbers-summary": [
1234567890,
1234567899,
1234567898,
1234567897
],
"phoneNumbers": [
{
"description": null,
"extension": null,
"number": 1234567890,
"countryCode": null,
"type": "Business"
},
{
"description": null,
"extension": null,
"number": 1234567899,
"countryCode": null,
"type": "Home"
}
],
"data-privacy": {
"required": true,
"laws": [
"CCPA",
"GDPR"
]
}
}
}
}
Now, if I create a GIN jsonb_ops index on the jdata column, I want to clarify which keys and values will be part of the index.
For example, "staying-since" is a key nested at the path below, and it also sits inside the "address" array. It is still a key, though nested deep in the document. So will it be part of the index?
{
"supplier": {
"customer": {
"address": [
{
"staying-since": "long long" ...
And similarly, "long long" is the value of a deeply nested key. Will it also be indexed?
And if a GIN jsonb_path_ops index is created on the jdata column:
Will a hash of the "long long" value, together with the path that leads to it, also be indexed?
hash(
"supplier": {
"customer": {
"address":[{"staying-since": "long long"}]
}
}
)
Will the above also be indexed?
I am aware of the operators supported by each GIN operator class and of how to use them:
jsonb_ops ? ?& ?| @> @? @@
jsonb_path_ops @> @? @@
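As a rough mental model of the documentation passage quoted above, the following Python sketch walks a document and collects the items each operator class would be expected to index. It is only an illustration under stated assumptions: the function names `jsonb_ops_items` and `jsonb_path_ops_items` are hypothetical, SHA-1 stands in for PostgreSQL's internal hash, and the real item encoding differs. What it shows is that nesting depth does not matter: every key and every scalar value is visited, and arrays are traversed without adding anything to the key path.

```python
import hashlib
import json


def jsonb_ops_items(doc):
    """Simplified model of jsonb_ops: one item per key and one per scalar value."""
    items = set()

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                items.add(("key", key))          # keys at any depth become items
                walk(value)
        elif isinstance(node, list):
            for element in node:                 # arrays are simply traversed
                walk(element)
        else:
            items.add(("value", json.dumps(node)))

    walk(doc)
    return items


def jsonb_path_ops_items(doc):
    """Simplified model of jsonb_path_ops: one hash per value plus the keys leading to it."""
    items = set()

    def walk(node, path):
        if isinstance(node, dict):
            for key, value in node.items():
                walk(value, path + (key,))
        elif isinstance(node, list):
            for element in node:
                walk(element, path)              # assumption: array positions are not hashed
        else:
            raw = json.dumps([list(path), node]).encode()
            # SHA-1 is a stand-in; PostgreSQL uses its own internal hash function.
            items.add(hashlib.sha1(raw).hexdigest()[:8])

    walk(doc, ())
    return items


sample = {"supplier": {"customer": {"address": [{"staying-since": "long long"}]}}}
ops_items = jsonb_ops_items(sample)
path_items = jsonb_path_ops_items(sample)
```

In this model, `ops_items` contains both the key "staying-since" and the value "long long" as separate items, while `path_items` holds a single hash built from the value together with the keys supplier, customer, address and staying-since.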

Related

Rules for real time database chat when using an external database

I am trying to secure my Realtime Database. I have the following database structure:
{
"chats": {
"-NMhLlfSU-HYmjmXBzmH": {
"lastMessage": "",
"lastSender": "",
"seen": true,
"timestamp": 1674724449157
},
"members": {
"-NMhLlfSU-HYmjmXBzmH": {
"63cc6d925b51cb7a423393cc": true,
"63d240635b51cb7a423397d5": true
},
},
"users": {
"63cc6d925b51cb7a423393cc": {
"city": "Ituzaingó, Buenos Aires Province, Argentina",
"contacts": {
"63d240635b51cb7a423397d5": true
},
"name": "Joaquin varela",
"picture": "https://cdn.pixabay.com/photo/2015/10/05/22/37/blank-profile-picture-973460_1280.png"
},
"63d240635b51cb7a423397d5": {
"city": "Madrid",
"contacts": {
"63cc6d925b51cb7a423393cc": true
},
"name": "Test Test",
"picture": "https://cdn.pixabay.com/photo/2015/10/05/22/37/blank-profile-picture-973460_1280.png"
},
}
I am trying to write the security rules for it. The only problem is that my auth.uid is not the same as my user_id.
Is there any way to secure my database? Maybe by passing some user_id argument, but I don't know how.
I hope you can help me. Thanks in advance!

MongoDB Atlas Search not showing results when typing few characters

The problem I am facing is that I want to build an autocomplete search bar using the MEAN stack, like the one on this site, but when I type, for example, 'ag', it does not return the expected location, 'Aguascalientes'.
I have two different search indexes set up and a different query for each.
First Index:
{
"mappings": {
"dynamic": false,
"fields": {
"name": {
"foldDiacritics": false,
"maxGrams": 7,
"minGrams": 3,
"tokenization": "edgeGram",
"type": "autocomplete"
},
"searchName": {
"foldDiacritics": false,
"maxGrams": 7,
"minGrams": 3,
"tokenization": "edgeGram",
"type": "autocomplete"
}
}
}
}
First Query:
[
{
$search: {
index: "autocomplete2",
compound: {
must: [
{
text: {
query: search,
path: "searchName",
fuzzy: {
maxEdits: 2,
},
},
},
],
},
},
},
{
$limit: 10,
},
]
The first index and query do not return any documents at all. The second setup, shown below, does return some:
{
"mappings": {
"dynamic": false,
"fields": {
"name": {
"analyzer": "lucene.standard",
"type": "string"
},
"searchName": {
"analyzer": "lucene.standard",
"type": "string"
}
}
}
}
Query:
[
{
$search: {
index: 'default',
compound: {
must: [
{
text: {
query: search,
path: 'name',
fuzzy: {
maxEdits: 1,
},
},
},
{
text: {
query: search,
path: 'searchName',
fuzzy: {
maxEdits: 1,
},
},
},
],
},
},
},
{
$limit: 5,
},
]
The second example only returns documents if the search term is 'aguascalient'; it returns nothing for shorter search terms, unlike the site. Maybe it has something to do with the fuzzy edits, but if I set maxEdits to more than 2 I get an error.
Also, the order is not right: it returns the CITY first and the STATE second, but I need the STATE first because it is closer to the search term than the city. Let me explain: the search field for the STATE is just 'Aguascalientes', but the search field for the city is 'Aguascalientes Aguascalientes', so I don't know why it is not working properly. Maybe I should assign weights in that case, but I'm not sure that is the right approach.
My data structure:
{
"_id": "638d0ffc34ad076c6bd12cb6",
"depth": 2,
"label": "CITY",
"location_id": "V1-C-247",
"name": "Aguascalientes",
"parent": "Aguascalientes",
"fullName": "Aguascalientes, Aguascalientes",
"parentId": "V1-B-61",
"searchName": "Aguascalientes Aguascalientes",
}
{
"_id": "638d0ffc34ad076c6bd12cb6",
"depth": 1,
"label": "STATE",
"location_id": "V1-C-248",
"name": "Aguascalientes",
"parent": null,
"fullName": "Aguascalientes",
"parentId": null,
"searchName": "Aguascalientes",
}
For the first index + query setup:
First, you are indexing the name field but are not searching on it. I will remove it from the code snippets for readability, but you can add it back to your index definition if you find you need to search on it.
There are two problems with this index + query setup if you want a query for "ag" to return results. First, you have searchName defined as a field mapping of type autocomplete, but you also need to use the autocomplete operator in your query:
[
{
$search: {
index: "autocomplete2",
compound: {
must: [
{
autocomplete: {
query: search,
path: "searchName",
},
},
],
},
},
},
{
$limit: 10,
},
]
Second, in the searchName field mapping of your index definition, minGrams is set to 3 and maxGrams to 7. Based on the documentation for the autocomplete field mapping, this means your data is tokenized into character sequences of length 3 to 7, using the selected tokenization strategy. Since you selected edgeGram, the text "Aguascalientes" is tokenized starting from the left edge, producing the tokens "agu", "agua", "aguas", "aguasc", and "aguasca". Since the search term "ag" does not match any of these tokens, nothing is returned. So you must change minGrams to 2 to get the token "ag":
{
"mappings": {
"dynamic": false,
"fields": {
"searchName": {
"foldDiacritics": false,
"maxGrams": 7,
"minGrams": 2,
"tokenization": "edgeGram",
"type": "autocomplete"
}
}
}
}
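To make the effect of minGrams concrete, here is a small Python sketch of edge-gram tokenization for a single word. It is an illustration only, not Atlas Search's implementation: the helper name `edge_grams` is hypothetical, lowercasing is assumed because the tokens listed above are lowercase, and multi-word text is not handled.

```python
def edge_grams(text, min_grams, max_grams):
    """Return the left-anchored prefixes of text with lengths min_grams..max_grams."""
    lowered = text.lower()                      # assumption: tokens are lowercased
    longest = min(max_grams, len(lowered))      # a prefix cannot exceed the text length
    return [lowered[:n] for n in range(min_grams, longest + 1)]


tokens_min3 = edge_grams("Aguascalientes", 3, 7)
tokens_min2 = edge_grams("Aguascalientes", 2, 7)
```

With minGrams at 3 the shortest token is "agu", so a search for "ag" has nothing to match. With minGrams at 2 the list starts with "ag".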
Finally, if you want a document with an exact match to rank above a partial match, i.e. "Aguascalientes" should be returned before "Aguascalientes Aguascalientes", you need to implement exact matching. Here is a MongoDB blog post outlining a few options.
One option that I tried: in the index, also map the "searchName" field as a string type with a keyword analyzer. In the query, add the text operator in a should clause so that exact matches score higher than other results.
Index:
{
"mappings": {
"dynamic": false,
"fields": {
"searchName": [
{
"foldDiacritics": false,
"maxGrams": 7,
"type": "autocomplete"
},
{
"analyzer": "lucene.keyword",
"searchAnalyzer": "lucene.keyword",
"type": "string"
}
]
}
}
}
Query:
[
{
$search: {
compound: {
must: [
{
autocomplete: {
query: search,
path: "searchName"
}
}
],
should:[
{
text: {
query: search,
path: "searchName"
}
}
],
},
},
},
]

How can I remove all elements whose value is null from a jsonb?

I have a table "points" with the columns "node_id", "tags", etc.
How can I remove all elements with a null value from the jsonb column "tags"?
{
"addInfo": {
"payment": {
"payment:dkv": null,
"payment:uta": null,
},
"fueltype": {
"fuel:diesel": "yes",
"fuel:octane_91": null,
"fuel:octane_95": "yes",
"fuel:octane_98": null
},
"operating": {
"name": "Raiffeisen",
"brand": "Raiffeisen",
"operator": null,
"opening_hours": "24/7"
}
}
}
I want to get this result:
{
"addInfo": {
"payment": {},
"fueltype": {
"fuel:diesel": "yes",
"fuel:octane_95": "yes"
},
"operating": {
"name": "Raiffeisen",
"brand": "Raiffeisen",
"opening_hours": "24/7"
}
}
}
So far I have tried the example code below. It works, but it is not elegant: I call jsonb_strip_nulls twice, use replace, and convert between text and jsonb. Is there a smarter way to get the same result?
SELECT node_id,
       nullif(jsonb_strip_nulls(replace("addInfo"::text, '{}', 'null')::jsonb)::text,
              '{}')::jsonb
FROM (SELECT p.node_id,
             jsonb_strip_nulls(
                 jsonb_build_object('addInfo',
                                    jsonb_build_object('EXAMPLE....'))) "addInfo"
      FROM points p
      WHERE p.tags IS NOT NULL
        AND p.tags ->> 'amenity' = 'fuel') foo;
And how can I keep the original order of the keys:
1 operating
2 payment
3 fueltype
The functions json_strip_nulls and jsonb_strip_nulls delete all object fields that have null values from the given JSON value. Null values that are not object fields are left untouched. The best part is that these functions are recursive, so null values in nested JSON objects are removed too.
Attention: your JSON string is invalid. I removed one trailing comma (after "payment:uta": null) to make it valid.
-- The trailing comma after "payment:uta": null was removed to make the JSON valid.
select jsonb_strip_nulls(
'{
"addInfo": {
"payment": {
"payment:dkv": null,
"payment:uta": null
},
"fueltype": {
"fuel:diesel": "yes",
"fuel:octane_91": null,
"fuel:octane_95": "yes",
"fuel:octane_98": null
},
"operating": {
"name": "Raiffeisen",
"brand": "Raiffeisen",
"operator": null,
"opening_hours": "24/7"
}
}
}')
Works fine!!! Result:
{
"addInfo": {
"payment": {},
"fueltype": {
"fuel:diesel": "yes",
"fuel:octane_95": "yes"
},
"operating": {
"name": "Raiffeisen",
"brand": "Raiffeisen",
"opening_hours": "24/7"
}
}
}
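The behaviour described in this answer can be modelled in a few lines of Python. This is a sketch of the semantics only, not of how PostgreSQL implements it, and the helper name `strip_nulls` is hypothetical: object fields whose value is null are dropped at every nesting level, while nulls inside arrays are kept.

```python
def strip_nulls(value):
    """Recursively drop object fields whose value is None; keep None inside lists."""
    if isinstance(value, dict):
        return {k: strip_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [strip_nulls(v) for v in value]   # array elements are never removed
    return value


tags = {
    "addInfo": {
        "payment": {"payment:dkv": None, "payment:uta": None},
        "fueltype": {"fuel:diesel": "yes", "fuel:octane_91": None},
        "operating": {"name": "Raiffeisen", "operator": None},
    }
}
stripped = strip_nulls(tags)
```

Note that "payment" ends up as an empty object rather than disappearing, which matches the result shown above.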

How to search a MongoDB collection on a field inside a JSON map

I have the JSON documents below in MongoDB and would like to write a bson.M filter to get a specific document from the collection.
Documents in the collection:
{
"Id": "3fa85f64",
"Type": "DDD",
"Status": "PRESENT",
"List": [{
"dd": "55",
"cc": "33"
}],
"SeList": {
"comm_1": {
"seId": "comm_1",
"serName": "nmf-comm"
},
"comm_2": {
"seId": "comm_2",
"serName": "aut-comm"
}
}
}
{
"Id": "3fa8556",
"Type": "CCC",
"Status": "PRESENT",
"List": [{
"dd": "22",
"cc": "34"
}],
"SeList": {
"dnn_1": {
"seId": "dnn_1",
"serName": "dnf-comm"
},
"dnn_2": {
"seId": "dnn_2",
"serName": "dn2-comm"
}
}
}
I have written the bson.M filter below to select the first document, but it does not work because I do not know how to handle the map keys in "SeList.serName". The keys comm_1, comm_2, dnn_1, etc. could be any string.
filter := bson.M{"Type": "DDD", "Status": "PRESENT", "SeList.serName": "nmf-comm"} // does not work because the "SeList.serName" path is not correct.
I need help with selecting a document based on the example filter above.

MongoDb Query returning unwanted documents

I have a database containing documents of two structures:
{
"name": "",
"name_ar": "",
"description": "",
"bla1": {
"name": "",
"link": "",
"Logo": ""
},
"bla2": {
"name": "",
"id": ""
}
}
and
{
"name": "",
"name_ar": "",
"description": "",
"bla1": {
"name": [],
"link": "",
"Logo": ""
},
"bla2": {
"name": "",
"id": ""
}
}
I want to query my collection to get documents with "bla1.name" exactly equal to something. However using the following query:
{$and: [{'bla1.name': {'$type': 'string'}}, {"bla1.name":'something'}]}
returns all documents (even where "bla1.name" is an array) containing the name: 'something'.
What am I doing wrong?
From the MongoDB docs:
$type now works with arrays in the same way it works with other BSON types. Previous versions only matched documents where the field contained a nested array.
That means: if an array has at least one element of the given type, the document is selected.
If you want to exclude arrays, you have to extend your query. Since the equality condition already matches strings, you can drop the $type check for string:
{
  $and: [
    // no longer necessary, since the last condition already implies it
    // {
    //   "bla1.name": {
    //     "$type": "string"
    //   }
    // },
    {
      "bla1.name": {
        $not: {
          "$type": "array"
        }
      }
    },
    {
      "bla1.name": "something"
    }
  ]
}
See the official docs: https://docs.mongodb.com/manual/reference/operator/query/type/#behavior
Here is a working demo on the Mongo playground: https://mongoplayground.net/p/3ri7Bjfrae8
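The array-matching behaviour described in this answer can be modelled with a short Python sketch. It only imitates the two behaviours discussed here and is not MongoDB's query engine; all function names are hypothetical. Both the $type check and the equality check succeed when the field is an array containing a matching element, which is why the original query also returns the array documents.

```python
def matches_type_string(value):
    """Model of {$type: "string"}: true for a string or an array holding a string."""
    if isinstance(value, str):
        return True
    return isinstance(value, list) and any(isinstance(v, str) for v in value)


def matches_equals(value, target):
    """Model of equality: true for an equal value or an array containing it."""
    if value == target:
        return True
    return isinstance(value, list) and target in value


def original_query(doc, target):
    name = doc["bla1"]["name"]
    return matches_type_string(name) and matches_equals(name, target)


def fixed_query(doc, target):
    name = doc["bla1"]["name"]
    # Model of {$not: {$type: "array"}} combined with the equality condition.
    return not isinstance(name, list) and matches_equals(name, target)


string_doc = {"bla1": {"name": "something"}}
array_doc = {"bla1": {"name": ["something", "else"]}}
```

In this model the original query accepts both sample documents, while the fixed query accepts only `string_doc`.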