Postgres jsonb index on nested integer field - postgresql

I've got the following data structure in my postgres database - a jsonb column called customer
{
"RequestId": "00000000-0000-0000-0000-000000000000",
"Customer": {
"Status": "A",
"AccountId": 14603582,
"ProfileId": 172,
"ReferralTypeId": 15
},
"Contact": {
"Telephone": "",
"Email": ""
}
}
I want to create an index on the ProfileId field, which is an integer.
I've been unable to find an example of how to create an index on a nested field.
The query I'm executing (which takes ~300s) is:
select id, customer from where customer @> '{"Customer":{"ProfileId": 172}}'

The operator classes jsonb_path_ops and jsonb_ops for GIN indexes both support the @> operator.
So your query should be able to use the following index:
create index on the_table using gin (customer);
which uses the default jsonb_ops operator class.
According to the manual, the jsonb_path_ops operator class is faster but only supports the @> operator. So if that is the only type of condition you have (for that column), using jsonb_path_ops might be more efficient:
create index on the_table using gin (customer jsonb_path_ops);
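Since the question asks for an index on the integer field itself, a B-tree expression index is another option worth trying. The following is only a sketch: the_table is the placeholder name used above, the index name is made up for illustration, and it assumes ProfileId is always a number when present.

```sql
-- Hypothetical index name; the_table is the placeholder table name from the answer.
-- The cast expression needs its own pair of parentheses inside the column list.
create index the_table_profile_id_idx
    on the_table (((customer -> 'Customer' ->> 'ProfileId')::int));

-- The query must repeat the indexed expression exactly for the planner to match it.
explain
select id, customer
from the_table
where (customer -> 'Customer' ->> 'ProfileId')::int = 172;
```

Unlike the GIN index, this index also supports range conditions such as < or between on ProfileId, but it only helps queries written against that exact expression.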

Related

How to create index for postgresql jsonb field (array data) and text field

Please let me know how to create index for below query.
SELECT * FROM customers
WHERE identifiers @>
'[{"systemName": "SAP", "systemReference": "33557"}]'
AND country_code = 'IN';
identifiers is jsonb type and data is as below.
[{"systemName": "ERP", "systemReference": "TEST"}, {"systemName": "FEED", "systemReference": "2733"}, {"systemName": "SAP", "systemReference": "33557"}]
country_code is varchar type.
Either create a GIN index on identifiers ..
CREATE INDEX customers_identifiers_idx ON customers
USING GIN(identifiers);
.. or a composite index with identifiers and country_code (gin_trgm_ops requires the pg_trgm extension).
CREATE INDEX customers_country_code_identifiers_idx ON customers
USING GIN(identifiers,country_code gin_trgm_ops);
Whether the second option helps depends on the distribution of values in country_code.
You can create a GIN index on jsonb columns in PostgreSQL. GIN has built-in operator classes that handle the jsonb operators. Learn more about GIN indexes here: https://www.postgresql.org/docs/12/gin-intro.html
For varchar columns, a btree index is good enough.
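The two-index approach described in the last answer could look like the sketch below. The index names are hypothetical, and jsonb_path_ops is chosen on the assumption that @> is the only jsonb condition used on identifiers.

```sql
-- Hypothetical index names; table and columns are taken from the question.
CREATE INDEX customers_identifiers_path_idx ON customers
    USING GIN (identifiers jsonb_path_ops);

CREATE INDEX customers_country_code_btree_idx ON customers (country_code);

-- Check which index (or combination of both) the planner picks.
EXPLAIN
SELECT * FROM customers
WHERE identifiers @> '[{"systemName": "SAP", "systemReference": "33557"}]'
  AND country_code = 'IN';
```

With two separate indexes the planner is free to use either one alone or to combine them in a bitmap scan, depending on how selective each condition is.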

Postgres - How to perform a LIKE query on a JSONB field?

I have a jsonb field called passengers, with the following structure:
note that persons is an array
{
"adults": {
"count": 2,
"persons": [
{
"age": 45,
"name": "Prof. Kyleigh Walsh II",
"birth_date": "01-01-1975"
},
{
"age": 42,
"name": "Milford Wiza",
"birth_date": "02-02-1978"
}
]
}
}
How may I perform a query against the name field of this JSONB? For example, to select all rows which match the name field Prof?
Here's my rudimentary attempt:
SELECT passengers from opportunities
WHERE 'passengers->adults' != NULL
AND 'passengers->adults->persons->name' LIKE '%Prof';
This returns 0 rows, but as you can see I have one row with the name Prof. Kyleigh Walsh II
This: 'passengers->adults->persons->name' LIKE '%Prof'; checks if the string 'passengers->adults->persons->name' ends with Prof.
Each key for the JSON operator needs to be a separate element, and the column name must not be enclosed in single quotes. So 'passengers->adults->persons->name' needs to be passengers -> 'adults' -> 'persons' -> 'name'
The -> operator returns a jsonb value, but you want a text value, so the last operator should be ->>.
Also, != NULL does not work; you need to use IS NOT NULL.
SELECT passengers
from opportunities
WHERE passengers -> 'adults' is not NULL
AND passengers -> 'adults' -> 'persons' ->> 'name' LIKE 'Prof%';
The is not null condition isn't really necessary, because that is implied with the second condition. The second condition could be simplified to:
SELECT passengers
from opportunities
WHERE passengers #>> '{adults,persons,name}' LIKE 'Prof%';
But as persons is an array, the above wouldn't work and you need to use a different approach.
With Postgres 9.6 you will need a sub-query to unnest the array elements (and thus iterate over each one).
SELECT passengers
from opportunities
WHERE exists (select *
from jsonb_array_elements(passengers -> 'adults' -> 'persons') as p(person)
where p.person ->> 'name' LIKE 'Prof%');
To match a string at the beginning with LIKE, the wildcard needs to be at the end. '%Prof' would match 'Some Prof' but not 'Prof. Kyleigh Walsh II'
With Postgres 12, you could use a SQL/JSON Path expression:
SELECT passengers
from opportunities
WHERE passengers @? '$.adults.persons[*] ? (@.name like_regex "Prof.*")'
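Note that like_regex without an anchor matches anywhere in the string. For a pure prefix match, the SQL/JSON path language also has a starts with predicate. A small self-contained sketch follows; the temporary table opportunities_demo is made up for illustration and holds a trimmed copy of the sample document.

```sql
-- Hypothetical demo table, not part of the original question.
CREATE TEMP TABLE opportunities_demo (passengers jsonb);

INSERT INTO opportunities_demo (passengers) VALUES
('{"adults": {"count": 2, "persons": [
     {"age": 45, "name": "Prof. Kyleigh Walsh II"},
     {"age": 42, "name": "Milford Wiza"}]}}');

-- Postgres 12+: keep rows where any element of persons has a name beginning with "Prof".
SELECT passengers
FROM opportunities_demo
WHERE passengers @? '$.adults.persons[*] ? (@.name starts with "Prof")';
```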

Upserting a postgres jsonb based on multiple properties in jsonb field

I am trying to upsert into a table with jsonb field based on multiple json properties in the jsonb field using below query
insert into testtable(data) values('{
"key": "Key",
"id": "350B79AD",
"value": "Custom"
}')
On conflict(data ->>'key',data ->>'id')
do update set data =data || '{"value":"Custom"}'
WHERE data ->> 'key' ='Key' and data ->> 'appid'='350B79AD'
Above query throws error as below
ERROR: syntax error at or near "->>"
LINE 8: On conflict(data ->>'key',data ->>'id')
am I missing something obvious here?
I suppose you want each combination of key and id to be unique in the table. For that you need a unique index on them:
create unique index on testtable ( (data->>'key'), (data->>'id') );
and also wrap each expression in an extra pair of parentheses in the on conflict clause:
on conflict( (data->>'key'), (data->>'id') )
and qualify the jsonb column name (data) with the table name (testtable), as testtable.data, wherever it appears on the right-hand side of do update set or in the where clause. So, convert your statement to:
insert into testtable(data) values('{
"key": "Key",
"id": "350B79AD",
"value": "Custom1"
}')
on conflict( (data->>'key'), (data->>'id') )
do update set data = testtable.data || '{"value":"Custom2"}'
where testtable.data ->> 'key' ='Key' and testtable.data ->> 'id'='350B79AD';
By the way, data ->> 'appid'='350B79AD' was changed to data ->> 'id'='350B79AD' (appid replaced by id).
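As a variation, the new values can be taken from the row that was proposed for insertion through the special excluded table, instead of repeating a literal in the update. This is a sketch on a throwaway table; testtable_demo is a hypothetical name used so the example does not touch the real table.

```sql
-- Hypothetical demo table mirroring testtable.
create temp table testtable_demo (data jsonb);

-- The unique expression index that "on conflict" will infer.
create unique index on testtable_demo ((data->>'key'), (data->>'id'));

insert into testtable_demo (data)
values ('{"key": "Key", "id": "350B79AD", "value": "Custom1"}');

-- Same key/id pair again: the conflict fires and the documents are merged,
-- with keys from the proposed row (excluded.data) overriding the stored ones.
insert into testtable_demo (data)
values ('{"key": "Key", "id": "350B79AD", "value": "Custom2"}')
on conflict ((data->>'key'), (data->>'id'))
do update set data = testtable_demo.data || excluded.data;

select data from testtable_demo;
```

Because the conflict target already identifies the row, the extra where clause on key and id is not needed in this form.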

Postgresql doesn't use GIN index for "?" JSON operator

For some reason the index is not used for the "?" operator.
Let's take this sample https://schinckel.net/2014/05/25/querying-json-in-postgres/ :
CREATE TABLE json_test (
id serial primary key,
data jsonb
);
INSERT INTO json_test (data) VALUES
('{}'),
('{"a": 1}'),
('{"a": 2, "b": ["c", "d"]}'),
('{"a": 1, "b": {"c": "d", "e": true}}'),
('{"b": 2}');
And create an index.
create index json_test_index on public.json_test using gin (data jsonb_path_ops) tablespace pg_default;
Then take a look at plan of the following query:
SELECT * FROM json_test WHERE data ? 'a';
The plan shows a Seq Scan, while I would expect an index scan. Could somebody please advise what's wrong here?
From the docs: "The non-default GIN operator class jsonb_path_ops supports indexing the @> operator only." It doesn't support the ? operator.
So use the default operator class for jsonb instead (called "jsonb_ops", if you wish to spell it out explicitly).
But if your table only has 5 rows, it probably won't use the index anyway, unless you force it by set enable_seqscan = off.
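Put together, the suggestion could be tried as follows. The index name json_test_ops_index is made up for illustration, and disabling sequential scans is meant only as a session-local test on this tiny table, not as a production setting.

```sql
-- Hypothetical index name; uses the default jsonb_ops operator class, which supports "?".
create index json_test_ops_index on public.json_test using gin (data);

-- On a 5-row table the planner will still prefer a Seq Scan,
-- so discourage it for this session only to see whether the index is usable.
set enable_seqscan = off;

explain select * from json_test where data ? 'a';

-- Restore the default planner behaviour.
reset enable_seqscan;
```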

Postgres jsonb query missing index?

We have the following json documents stored in our PG table (identities) in a jsonb column 'data':
{
"email": {
"main": "mainemail@email.com",
"prefix": "aliasPrefix",
"prettyEmails": ["stuff1", "stuff2"]
},
...
}
I have the following index set up on the table:
CREATE INDEX ix_identities_email_main
ON identities
USING gin
((data -> 'email->main'::text) jsonb_path_ops);
What am I missing that is preventing the following query from hitting that index?? It does a full seq scan on the table... We have tens of millions of rows, so this query is hanging for 15+ minutes...
SELECT * FROM identities WHERE data->'email'->>'main'='mainemail@email.com';
If you use the JSONB data type for your data column, then to index ALL "email" entry values you need to create the following index:
CREATE INDEX ident_data_email_gin_idx ON identities USING gin ((data -> 'email'));
Also keep in mind that for JSONB you need to use one of the operators the index supports:
The default GIN operator class for jsonb supports queries with the @>,
?, ?& and ?| operators
The following queries can use this index:
SELECT * FROM identities
WHERE data->'email' @> '{"main": "mainemail@email.com"}'
-- OR
SELECT * FROM identities
WHERE data->'email' @> '{"prefix": "aliasPrefix"}'
If you need to search against the array elements "stuff1" or "stuff2" directly, you can add an expression index on the "prettyEmails" array so that those queries run faster:
CREATE INDEX ident_data_prettyemails_gin_idx ON identities USING gin ((data -> 'email' -> 'prettyEmails'));
This query matches the indexed expression and can use that index:
SELECT * FROM identities
WHERE data->'email'->'prettyEmails' @> '["stuff1"]'
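The query in the question compares text with = after extracting it with ->>, which is not one of the operators a jsonb GIN index serves. If that query shape has to stay as it is, a plain B-tree expression index on the extracted text is one possible alternative. The index name below is hypothetical.

```sql
-- Hypothetical index name; indexes the text value of email.main.
CREATE INDEX ix_identities_email_main_btree
    ON identities ((data -> 'email' ->> 'main'));

-- The WHERE clause repeats the indexed expression, so an equality lookup can use it.
EXPLAIN
SELECT * FROM identities
WHERE data -> 'email' ->> 'main' = 'mainemail@email.com';
```

On a table with tens of millions of rows, building the index with CREATE INDEX CONCURRENTLY avoids blocking writes during the build, at the cost of a longer build time.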