Postgres: How to string pattern match query a json column?

I have a column of json type, and I'm wondering how to filter on it in a select, i.e.
select * from fooTable where myjson like "orld";
How would I query for a substring match like the above, say searching for "orld" under the "bar" key?
{ "foo": "hello", "bar": "world"}
I took a look at this documentation but it is quite confusing.
https://www.postgresql.org/docs/current/static/datatype-json.html

Use the ->> operator to get json attributes as text, for example:
with my_table(id, my_json) as (
    values
        (1, '{ "foo": "hello", "bar": "world"}'::json),
        (2, '{ "foo": "hello", "bar": "moon"}'::json)
)
select t.*
from my_table t
where my_json->>'bar' like '%orld';
id | my_json
----+-----------------------------------
1 | { "foo": "hello", "bar": "world"}
(1 row)
Note that you need the % wildcard in the pattern.
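Applied to the names from the question (a sketch, assuming the table is fooTable and the json column is myjson), a substring match anywhere inside the "bar" value would look like this:
select *
from fooTable
where myjson->>'bar' like '%orld%';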

How to use wildcard in the path to search jsonb values for postgres?

Using Postgres version 10.13.
This is my data table jsongraphs:
 id | jsongraph
----+--------------------------------------------------------------------------------------------------------------------------------------------------------------
  1 | { "data": {"scopes_by_id": { "121": { "id": 121, "pk": 121, "name": "Prework" } }, "commonsites_by_id": {"123": {"id": 123, "pk": 123, "name": "Somewhere over the rainbow"}}}}
  2 | { "data": {"scopes_by_id": { "156": { "id": 156, "pk": 156, "name": "ABC" } }, "commonsites_by_id": {"123": {"id": 123, "pk": 123, "name": "Somewhere over the rainbow"}}}}
I want the distinct values of scope id and site id, which should be (121, 123) and (156, 123).
So I tried:
SELECT DISTINCT
    jsongraph->'data'->'scopes_by_id'->>'pk',
    jsongraph->'data'->'commonsites_by_id'->>'pk'
FROM jsongraphs;
This won't work because the path would have to be like data->scopes_by_id->121->>pk, but I cannot know the value of 121 in between beforehand.
Is there a way to get the values I need by filling in some kind of wildcard in the path?
E.g. data->scopes_by_id->{*}->>pk or something like that?
And because this is legacy data, it's also hard to change the data itself.
As the nesting level seems to be fixed, you could do something like this:
select j.id, scopes.*, commonsites.*
from jsongraphs j
  cross join lateral (
    select jsonb_agg(j.jsongraph #> array['data','scopes_by_id', t1.scope_id, 'pk']) as scope_ids
    from jsonb_each_text(j.jsongraph #> '{data,scopes_by_id}') as t1(scope_id)
  ) scopes
  cross join lateral (
    select jsonb_agg(j.jsongraph #> array['data','commonsites_by_id', t2.site_id, 'pk']) as common_ids
    from jsonb_each_text(j.jsongraph #> '{data,commonsites_by_id}') as t2(site_id)
  ) commonsites
order by id;
The sub-queries extract all keys below the respective part (e.g. scopes_by_id) and then use the #> operator to access the path for each id inside the original JSON value. Finally, all pk values are aggregated back into a single array.
This returns the pk values from each part separately as an array, in order to handle the situation where you have a different number of "scope ids" and "commonsite ids".
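With the two sample rows above, the result should look roughly like this (one array of pk values per part):
 id | scope_ids | common_ids
----+-----------+------------
  1 | [121]     | [123]
  2 | [156]     | [123]
(2 rows)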
If you just want "the first" id from each section, you can remove the aggregation and use a LIMIT clause:
select j.id, scopes.*, commonsites.*
from jsongraphs j
  cross join lateral (
    select j.jsongraph #> array['data','scopes_by_id', t1.scope_id, 'pk'] as scope_id
    from jsonb_each_text(j.jsongraph #> '{data,scopes_by_id}') as t1(scope_id)
    limit 1
  ) scopes
  cross join lateral (
    select j.jsongraph #> array['data','commonsites_by_id', t2.site_id, 'pk'] as common_id
    from jsonb_each_text(j.jsongraph #> '{data,commonsites_by_id}') as t2(site_id)
    limit 1
  ) commonsites
order by id;
Not sure on which level you want to apply the "distinct" part for this.
In Postgres 12 or later, you could achieve the same with:
select id,
       jsonb_path_query_array(j.jsongraph, 'strict $.data.scopes_by_id.**.pk') as scopes,
       jsonb_path_query_array(j.jsongraph, 'strict $.data.commonsites_by_id.**.pk') as common
from jsongraphs j
order by id;
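Since the question asked for the distinct (scope pk, site pk) pairs, a minimal sketch of that variant (not part of the original answer, assuming the same jsongraphs table and the same fixed nesting) would be:
select distinct
       j.jsongraph #> array['data','scopes_by_id', s.scope_id, 'pk']     as scope_pk,
       j.jsongraph #> array['data','commonsites_by_id', c.site_id, 'pk'] as site_pk
from jsongraphs j
  cross join lateral jsonb_each_text(j.jsongraph #> '{data,scopes_by_id}')      as s(scope_id)
  cross join lateral jsonb_each_text(j.jsongraph #> '{data,commonsites_by_id}') as c(site_id);
For the sample rows this yields the pairs (121, 123) and (156, 123).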

Different path formats for PostgreSQL JSONB functions

I'm confused by how the path format differs between functions in the PostgreSQL JSONB documentation.
If I had a PostgreSQL table foo that looks like this:
 pk | json_obj
----+---------------------------------------------------------------------
  0 | {"values": [{"id": "a_b", "value": 5}, {"id": "c_d", "value": 6}]}
  1 | {"values": [{"id": "c_d", "value": 7}, {"id": "e_f", "value": 8}]}
Why does this query give me these results?
SELECT json_obj,                                          -- {"values": [{"id": "a_b", "value": 5}, {"id": "c_d", "value": 6}]}
       json_obj #? '$.values[*].id',                      -- true
       json_obj #> '$.values[*].id',                      -- ERROR: malformed array literal
       json_obj #> '{values, 0, id}',                     -- "a_b"
       JSONB_SET(json_obj, '$.annotations[*].id', '"hi"') -- ERROR: malformed array literal
FROM foo;
Specifically, why does #? support $.values[*].id (described on that page in another section) but JSONB_SET uses some other path format {bar,3,baz}?
Ultimately, what I would like to do and don't know how, is to remove non-alphanumeric characters (e.g. underscores in this example) in all id values represented by the path $.values[*].id.
The reason is that the operators have different data types on the right hand side.
SELECT oprname, oprright::regtype
FROM pg_operator
WHERE oprleft = 'jsonb'::regtype
AND oprname IN ('#?', '#>');
oprname | oprright
---------+----------
#> | text[]
#? | jsonpath
(2 rows)
Similarly, the second argument of jsonb_set is a text[].
Now '$.values[*].id' is a valid jsonpath, but not a valid text[] literal.
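To illustrate (a sketch, not from the original answer; jsonb_path_query_first requires Postgres 12 or later), the same element can be addressed either way, as long as each path goes to an operator or function that expects that type:
SELECT json_obj #> '{values,0,id}'                        AS via_text_array, -- text[] path
       jsonb_path_query_first(json_obj, '$.values[0].id') AS via_jsonpath    -- jsonpath
FROM foo;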
Thanks for the answers and comments about why the data types were different.
I wanted to post how I solved my problem:
Ultimately, what I would like to do and don't know how, is to remove
non-alphanumeric characters (e.g. underscores in this example) in all
id values represented by the path $.values[*].id.
WITH unnested AS (
    SELECT f.pk, JSONB_ARRAY_ELEMENTS(f.json_obj -> 'values') AS value
    FROM foo f
),
updated_values AS (
    SELECT un.pk,
           JSONB_SET(un.value, '{id}', TO_JSONB(LOWER(REGEXP_REPLACE(un.value ->> 'id', '[^a-zA-Z0-9]', '', 'g'))), FALSE) AS new_value
    FROM unnested un
    WHERE value -> 'id' IS NOT NULL -- Had some values that didn't have 'id' keys
)
UPDATE foo f2
SET json_obj = JSONB_SET(f2.json_obj, '{values}', (SELECT JSONB_AGG(uv.new_value) FROM updated_values uv WHERE uv.pk = f2.pk), FALSE)
WHERE JSONB_PATH_EXISTS(f2.json_obj, '$.values[*].id'); -- Had some values that didn't have 'id' keys
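A quick way to verify the result (a sketch, again assuming Postgres 12+ for the jsonpath function): after the update, every id under $.values[*].id should contain only lower-case alphanumeric characters.
SELECT pk, jsonb_path_query_array(json_obj, '$.values[*].id') AS ids
FROM foo;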

How to query deep jsonb in Postgres?

I have a jsonb column (called info) in Postgres which structure looks like this:
{ name: 'john', house_id: null, extra_attrs: [{ attr_id: 4, attr_value: 'a value' }, { attr_id: 5, attr_value: 'another value' }] }
It can have N extra_attrs but we know that each of them will have just two keys: the attr_id and the attr_value.
Now, what is the best way to query for info that has extra_attrs with a specific attr_id and attr_value? I have done it like this, and it works:
Given the following data structure to query for:
[{ attr_id: 4, values: ['a value', 'something else'] }, { attr_id: 5, values: ['another value'] }]
The following query works:
select * from people
where (info @> '{"extra_attrs": [{ "attr_id": 4, "attr_value": "a value" }]}'
       OR info @> '{"extra_attrs": [{ "attr_id": 4, "attr_value": "something else" }]}')
  AND info @> '{"extra_attrs": [{ "attr_id": 5, "attr_value": "another value" }]}';
I am wondering if there is a better way to do so or this is fine.
One alternate method would involve json functions and transforming data to apply the filter on:
SELECT people.info
FROM people,
     LATERAL (SELECT DISTINCT TRUE AS is_valid
              FROM jsonb_array_elements(info -> 'extra_attrs') y
              WHERE (y ->> 'attr_id', y ->> 'attr_value') IN (
                        ('4', 'a value'),
                        ('4', 'something else'),
                        ('5', 'another value')
                    )
             ) y
WHERE is_valid;
I believe this method is more convenient for dynamic filters, since the id/value pairs are added in only one place.
A similar (and perhaps slightly faster) method would use WHERE EXISTS and compare json documents like below.
SELECT people.info
FROM people
WHERE EXISTS (SELECT TRUE
              FROM jsonb_array_elements(info -> 'extra_attrs') attrs
              WHERE attrs @> ANY(ARRAY[
                        JSONB '{ "attr_id": 4, "attr_value": "a value" }',
                        JSONB '{ "attr_id": 4, "attr_value": "something else" }',
                        JSONB '{ "attr_id": 5, "attr_value": "another value" }'
                    ])
             );
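A side note not covered in the answers above (a sketch, assuming people.info is a jsonb column): the containment form from the question can be supported by a GIN index on the whole column, whereas the per-element variants expand extra_attrs row by row and cannot use such an index.
-- jsonb_path_ops GIN indexes support only the @> operator,
-- which is what the containment query in the question uses.
CREATE INDEX people_info_path_ops ON people USING gin (info jsonb_path_ops);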

Updating PostgreSQL JSONB key value by adding 1 to existing key value

I'm getting an error while updating JSON data.
CREATE TABLE testTable
AS
SELECT $${
"id": 1,
"value": 100
}$$::jsonb AS jsondata;
and I want to update value to 101 by adding 1. After visiting many websites I found this statement:
UPDATE testTable
SET jsondata = JSONB_SET(jsondata, '{value}', (jsondata->>'value')::int + 1);
but the above is giving the error "cannot convert jsonb to int",
and my expected output is:
{
"id": 1,
"value": 101
}
Look at the signature of jsonb_set (using \df jsonb_set)
Schema | Name | Result data type | Argument data types | Type
------------+-----------+------------------+----------------------------------------------------------------------------------------+--------
pg_catalog | jsonb_set | jsonb | jsonb_in jsonb, path text[], replacement jsonb, create_if_missing boolean DEFAULT true | normal
What you want is this:
UPDATE testTable
SET jsondata = jsonb_set(
        jsondata,
        ARRAY['value'],
        to_jsonb((jsondata->>'value')::int + 1)
    );
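A quick check (a sketch): after the update, the row should contain the document the question expects.
SELECT jsondata FROM testTable;
-- {"id": 1, "value": 101}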

Query rows for matching JSONB column where key ends with a name and key value is a specific value

Given the following rows with a jsonb column details, how do I write a query so that records where a key name ends with _col and has the value "B" are selected? That is, records with ids 1 and 2.
 id | details
----+------------------------------------
  1 | { "one_col": "A", "two_col": "B" }
  2 | { "three_col": "B" }
  3 | { "another": "B" }
So far I've only found ways to match based on the value, not the key.
Use the function jsonb_each_text(), which expands a json object into (key, value) pairs:
with the_data(id, details) as (
    values
        (1, '{ "one_col": "A", "two_col": "B" }'::jsonb),
        (2, '{ "three_col": "B" }'),
        (3, '{ "another": "B" }')
)
select t.*
from the_data t,
     lateral jsonb_each_text(details)
where key like '%\_col'   -- escape the underscore, which is otherwise a single-character LIKE wildcard
  and value = 'B';
id | details
----+----------------------------------
1 | {"one_col": "A", "two_col": "B"}
2 | {"three_col": "B"}
(2 rows)
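If a row could contain several matching keys, the lateral form above returns that row once per match. A sketch of an EXISTS variant (same sample data as above) that returns each row at most once:
with the_data(id, details) as (
    values
        (1, '{ "one_col": "A", "two_col": "B" }'::jsonb),
        (2, '{ "three_col": "B" }'),
        (3, '{ "another": "B" }')
)
select t.*
from the_data t
where exists (
          select 1
          from jsonb_each_text(t.details) as e(key, value)
          where e.key like '%\_col'
            and e.value = 'B'
      );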