Postgresql doesn't use GIN index for "?" JSON operator - postgresql

For some reason the index is not used for the "?" operator.
Let's take this sample from https://schinckel.net/2014/05/25/querying-json-in-postgres/ :
CREATE TABLE json_test (
id serial primary key,
data jsonb
);
INSERT INTO json_test (data) VALUES
('{}'),
('{"a": 1}'),
('{"a": 2, "b": ["c", "d"]}'),
('{"a": 1, "b": {"c": "d", "e": true}}'),
('{"b": 2}');
And create an index.
create index json_test_index on public.json_test using gin (data jsonb_path_ops) tablespace pg_default;
Then take a look at the plan of the following query:
SELECT * FROM json_test WHERE data ? 'a';
There will be a Seq Scan, while I would expect an index scan. Could somebody please advise what's wrong here?

From the docs: "The non-default GIN operator class jsonb_path_ops supports indexing the @> operator only." It doesn't support the ? operator.
So use the default operator class for jsonb instead (called "jsonb_ops", if you wish to spell it out explicitly).
But if your table only has 5 rows, the planner probably won't use the index anyway, unless you force it with set enable_seqscan = off.
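As a minimal sketch of the advice above, assuming the json_test table and index name from the question, the index can be rebuilt with the default operator class and the plan checked with sequential scans discouraged for the current session:

```sql
-- Replace the jsonb_path_ops index with one that uses the default
-- operator class (jsonb_ops), which supports the ? operator.
DROP INDEX IF EXISTS json_test_index;
CREATE INDEX json_test_index ON json_test USING gin (data);

-- On a five-row table the planner still prefers a sequential scan,
-- so discourage it temporarily just to confirm the index is usable.
SET enable_seqscan = off;
EXPLAIN SELECT * FROM json_test WHERE data ? 'a';
RESET enable_seqscan;
```

Turning enable_seqscan off is only a diagnostic aid here; on a realistically sized table the planner should pick the index on its own when it is cheaper.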

Related

How to create index for postgresql jsonb field (array data) and text field

Please let me know how to create an index for the query below.
SELECT * FROM customers
WHERE identifiers @>
'[{"systemName": "SAP", "systemReference": "33557"}]'
AND country_code = 'IN';
identifiers is of type jsonb and its data looks like this:
[{"systemName": "ERP", "systemReference": "TEST"}, {"systemName": "FEED", "systemReference": "2733"}, {"systemName": "SAP", "systemReference": "33557"}]
country_code is of type varchar.
Either create a GIN index on identifiers ..
CREATE INDEX customers_identifiers_idx ON customers
USING GIN(identifiers);
.. or a composite index with identifiers and country_code (gin_trgm_ops requires the pg_trgm extension).
CREATE INDEX customers_country_code_identifiers_idx ON customers
USING GIN(identifiers,country_code gin_trgm_ops);
Whether the second option pays off will depend on the value distribution of country_code.
You can create a GIN index on jsonb columns in PostgreSQL. GIN has built-in operator classes that handle the jsonb operators. Learn more about GIN indexes here https://www.postgresql.org/docs/12/gin-intro.html
For varchar columns, a B-tree index is good enough.
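A short sketch of that combination, assuming the customers table from the question (the index name customers_country_code_idx is made up for illustration):

```sql
-- Plain B-tree index for the equality filter on the varchar column.
CREATE INDEX customers_country_code_idx ON customers (country_code);

-- Together with a GIN index on identifiers, the planner can combine
-- both indexes (for example with a bitmap AND) if it estimates that
-- to be cheaper than using only one of them.
EXPLAIN
SELECT * FROM customers
WHERE identifiers @> '[{"systemName": "SAP", "systemReference": "33557"}]'
  AND country_code = 'IN';
```

Which index the planner actually uses depends on the table statistics, so it is worth checking the plan on real data.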

How to Index and make WHERE clause case insensitive?

I have this table in PostgreSQL 12, with no index:
CREATE TABLE tbl
(
...
foods json NOT NULL
)
sample record:
foods:
{
"fruits": [" 2 orange ", "1 apple in chocolate", " one pint of berry"],
"meat": ["some beef", "ground beef", "chicken",...],
"veg": ["cucumber"]
}
I need to select all records that satisfy:
fruits contains orange.
AND meat contains beef or chicken.
select * from tbl where foods->> 'fruits' LIKE '%ORANGE%' and (foods->> 'meat' LIKE '%beef%' or foods->> 'meat' LIKE '%chicken%')
Is this an optimized query? (I'm from the RDBMS world)
How should I index for a faster response without overkill, and how do I make PostgreSQL case-insensitive?
This will make you unhappy.
You would need two trigram GIN indexes to speed this up:
CREATE EXTENSION pg_trgm;
CREATE INDEX ON tbl USING gin ((foods ->> 'fruits') gin_trgm_ops);
CREATE INDEX ON tbl USING gin ((foods ->> 'meat') gin_trgm_ops);
These indexes can become large and will impact data modification performance.
Then you need to rewrite your query to use ILIKE.
Finally, the query might be slower than you want, because it will use three index scans and a (potentially expensive) bitmap heap scan.
But with a data structure like that and substring matches, you cannot do better.
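For illustration, the query from the question rewritten with ILIKE might look like the sketch below; it assumes the two trigram indexes above exist and keeps the same expressions so the planner can match them:

```sql
-- Case-insensitive substring matches; the expressions (foods ->> 'fruits')
-- and (foods ->> 'meat') are the same as in the index definitions.
SELECT *
FROM tbl
WHERE foods ->> 'fruits' ILIKE '%orange%'
  AND (foods ->> 'meat' ILIKE '%beef%'
       OR foods ->> 'meat' ILIKE '%chicken%');
```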

PostgreSQL jsonb - omit multiple nested keys

The task is to remove multiple nested keys from a jsonb field.
Is there any way to shorten this expression without writing a custom function?
SELECT jsonb '{"a": {"b":1, "c": 2, "d": 3}}' #- '{a,b}' #- '{a,d}';
Suppose we need to delete more than 2 keys.
There is no way to shorten the expression. If your goal is to pass to the query a single array of keys to be deleted you can use jsonb_set() with jsonb_each():
with my_table(json_col) as (
values
(jsonb '{"a": {"b":1, "c": 2, "d": 3}}')
)
select jsonb_set(json_col, '{a}', jsonb_object_agg(key, value))
from my_table
cross join jsonb_each(json_col->'a')
where key <> all('{b, d}') -- input
group by json_col -- use PK here if exists
jsonb_set
-----------------
{"a": {"c": 2}}
(1 row)
The solution is obviously more expensive but may be handy when dealing with many keys to be deleted.
NVM, figured it out)
For this particular case, we can re-assign the property with the keys removed (flat):
SELECT jsonb_build_object('a', (jsonb '{ "b":1, "c": 2, "d": 3 }' - ARRAY['b','d']));
More general approach:
SELECT json_col || jsonb_build_object('<key>',
((json_col->'<key>') - ARRAY['key-1', 'key-2', 'key-n']));
Not very useful for deep paths, but works ok with 1-level nesting.
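To make the general approach concrete, here is a small sketch with the placeholders filled in; the table alias my_table and the extra top-level key "x" are made up for illustration:

```sql
WITH my_table(json_col) AS (
    VALUES (jsonb '{"a": {"b": 1, "c": 2, "d": 3}, "x": 10}')
)
-- Strip keys b and d from the object under "a", then overwrite "a"
-- in the original document via concatenation; other keys are untouched.
SELECT json_col || jsonb_build_object('a', (json_col -> 'a') - ARRAY['b', 'd'])
FROM my_table;
-- returns {"a": {"c": 2}, "x": 10}
```

The jsonb - text[] operator used here is available from PostgreSQL 10 onward.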

Postgres jsonb index on nested integer field

I've got the following data structure in my postgres database - a jsonb column called customer
{
"RequestId": "00000000-0000-0000-0000-000000000000",
"Customer": {
"Status": "A",
"AccountId": 14603582,
"ProfileId": 172,
"ReferralTypeId": 15
},
"Contact": {
"Telephone": "",
"Email": ""
}
}
I want to create an index on the ProfileId field, which is an integer.
I've been unable to find an example of how to create an index on a nested field.
The query I'm executing (which takes ~300s) is:
select id, customer from the_table where customer @> '{"Customer":{"ProfileId": 172}}'
Both GIN operator classes for jsonb, jsonb_ops and jsonb_path_ops, support the @> operator.
So your query should be able to use the following index
create index on the_table using gin (customer);
which uses the default jsonb_ops operator class.
According to the manual, the jsonb_path_ops operator class is faster but only supports the @> operator. So if that is the only type of condition you have (for that column), using jsonb_path_ops might be more efficient:
create index on the_table using gin (customer jsonb_path_ops);
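A different technique, not part of the answer above, is a B-tree expression index on the extracted integer. The sketch below assumes the placeholder table name the_table and a made-up index name; the query must then use the same expression instead of @>:

```sql
-- B-tree index on ProfileId, extracted as text and cast to integer.
CREATE INDEX the_table_profile_id_idx
    ON the_table (((customer -> 'Customer' ->> 'ProfileId')::int));

-- The WHERE clause repeats the indexed expression exactly.
SELECT id, customer
FROM the_table
WHERE (customer -> 'Customer' ->> 'ProfileId')::int = 172;
```

This variant only helps lookups on ProfileId, whereas the GIN indexes cover any containment condition on the column.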

Postgres jsonb query missing index?

We have the following json documents stored in our PG table (identities) in a jsonb column 'data':
{
"email": {
"main": "mainemail#email.com",
"prefix": "aliasPrefix",
"prettyEmails": ["stuff1", "stuff2"]
},
...
}
I have the following index set up on the table:
CREATE INDEX ix_identities_email_main
ON identities
USING gin
((data -> 'email->main'::text) jsonb_path_ops);
What am I missing that is preventing the following query from hitting that index?? It does a full seq scan on the table... We have tens of millions of rows, so this query is hanging for 15+ minutes...
SELECT * FROM identities WHERE data->'email'->>'main'='mainemail#email.com';
If you use the JSONB data type for your data column, then to index ALL "email" entry values you need to create the following index:
CREATE INDEX ident_data_email_gin_idx ON identities USING gin ((data -> 'email'));
Also keep in mind that for JSONB you need to use one of the operators the index supports:
The default GIN operator class for jsonb supports queries with the @>,
?, ?& and ?| operators
The following queries will hit this index:
SELECT * FROM identities
WHERE data->'email' @> '{"main": "mainemail#email.com"}'
-- OR
SELECT * FROM identities
WHERE data->'email' @> '{"prefix": "aliasPrefix"}'
If you need to search against the array elements "stuff1" or "stuff2" directly, you can add an expression index on the "prettyEmails" array itself:
CREATE INDEX ident_data_prettyemails_gin_idx ON identities USING gin ((data -> 'email' -> 'prettyEmails'));
A query hits this index only if it uses the same expression as the index definition:
SELECT * FROM identities
WHERE data->'email'->'prettyEmails' @> '["stuff1"]'
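The query in the question compares the result of ->> with =, which none of the GIN indexes above can serve. One alternative that is not from the answer is a B-tree expression index on that exact expression; the index name below is made up for illustration:

```sql
-- B-tree index on the extracted text value of email.main.
CREATE INDEX ix_identities_email_main_btree
    ON identities ((data -> 'email' ->> 'main'));

-- The original query can stay as written, because its expression
-- matches the index definition.
SELECT * FROM identities
WHERE data -> 'email' ->> 'main' = 'mainemail#email.com';
```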