PostgreSQL nested jsonb query for multiple key/value pairs - postgresql

I'm starting out with the JSONB data type and hoping someone can help me out.
I have a table (properties) with two columns (id as primary key and data as jsonb).
The data structure is:
{
"ProductType": "ABC",
"ProductName": "XYZ",
"attributes": [
{
"name": "Color",
"type": "STRING",
"value": "Silver"
},
{
"name": "Case",
"type": "STRING",
"value": "Shells"
},
...
]
}
I would like to get all rows where an attribute has a specific value, i.e. return all rows where Case = 'Shells' and/or Color = 'Red'.
I have tried the following, but I'm not able to apply two conditions like Case = 'Shells' and Color = 'Silver'.
I can get rows when a single attribute's name and value match the conditions, but I can't figure out how to make it work across multiple attributes.
EDIT 1:
I'm able to achieve the results using the following query:
WITH properties AS (
select *
from (
values
(1, '{"ProductType": "ABC","ProductName": "XYZ","attributes": [{"name": "Color","type": "STRING","value": "Silver"},{"name": "Case","type": "STRING","value": "Shells"}]}'::jsonb),
(2, '{"ProductType": "ABC","ProductName": "XYZ","attributes": [{"name": "Color","type": "STRING","value": "Red"},{"name": "Case","type": "STRING","value": "Shells"}]}'::jsonb)
) s(id, data)
)
select
*
from (
SELECT
id,
jsonb_object_agg(attr ->> 'name', attr -> 'value') as aggr
FROM properties m,
jsonb_array_elements(data -> 'attributes') as attr
GROUP BY id
) a
where aggr ->> 'Color' = 'Red' and aggr ->> 'Case' LIKE 'Sh%'
I could potentially have millions of these records, so I suppose my only concern now is whether this is efficient and, if not, whether there is a better way.

step-by-step demo:db<>fiddle
SELECT
id
FROM properties m,
jsonb_array_elements(data -> 'attributes') as attr
GROUP BY id
HAVING jsonb_object_agg(attr ->> 'name', attr -> 'value') @> '{"Color":"Silver", "Case":"Shells"}'::jsonb
The problem is that jsonb_array_elements() splits the related values into separate records. However, this call is necessary to fetch the values at all, so you need to reaggregate them after reading them, which makes it possible to check them in relation to each other.
This can be achieved with the jsonb_object_agg() aggregation function. The trick here is that we create an object with "name": "value" pairs. With that, we can easily check whether all required attributes are contained in the JSON object using the @> containment operator.
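For illustration, the aggregation step alone turns each row into one flat object; using the sample data from "Edit 1" below:

SELECT id, jsonb_object_agg(attr ->> 'name', attr -> 'value')
FROM properties m, jsonb_array_elements(data -> 'attributes') AS attr
GROUP BY id;

-- id 1 yields: {"Case": "Shells", "Color": "Silver"}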
Concerning "Edit 1"
demo:db<>fiddle
You can do this:
SELECT
*
FROM (
SELECT
id,
jsonb_object_agg(attr ->> 'name', attr -> 'value') as obj
FROM properties m,
jsonb_array_elements(data -> 'attributes') as attr
GROUP BY id
) s
WHERE obj ->> 'Color' = 'Silver'
AND obj ->> 'Case' LIKE 'Sh%'
Create the new JSON structure as described above for all JSONs
Filter this result afterward.
Alternatively you can use jsonb_object_agg() in the HAVING clause as often as you need it. I guess you need to check which way is more performant in your case:
SELECT
id
FROM properties m,
jsonb_array_elements(data -> 'attributes') as attr
GROUP BY id
HAVING
jsonb_object_agg(attr ->> 'name', attr -> 'value') ->> 'Color' = 'Silver'
AND
jsonb_object_agg(attr ->> 'name', attr -> 'value') ->> 'Case' LIKE 'Sh%'
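Concerning the "millions of records" worry: both variants expand and reaggregate every row before filtering, so neither can use an index on data. If the conditions are exact matches (no LIKE), a plain containment query against the original array can use a GIN index instead. A sketch, assuming exact-match conditions only:

CREATE INDEX properties_data_gin ON properties USING gin (data jsonb_path_ops);

SELECT id
FROM properties
WHERE data @> '{"attributes": [{"name": "Color", "value": "Silver"},
                               {"name": "Case", "value": "Shells"}]}';

Array containment with @> requires each listed element to be contained in some element of the stored array, which matches the "all attributes must match" requirement here.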

Related

Filter by a nested jsonb array column against the array of the same signature

I have an attendee table with a jsonb array field called eventfilters.
Given a query filter array of the same signature (form) as eventfilters, I need to select the attendees whose eventfilters field matches the query filter value.
Below is an example of both the query filter and the eventfilters field:
The eventfilters field looks like:
[
{
"field": "Org Type",
"selected": ["B2C", "Both", "Nonprofit"]
},
{
"field": "Job Role",
"selected": ["Customer Experience", "Digital Marketing"]
},
{
"field": "Industry Sector",
"selected": ["Advertising", "Construction / Development"]
}
]
The query filter can look like this:
[
{
"field": "Org Type",
"selected": ["B2C", "Nonprofit"]
},
{
"field": "Industry Sector",
"selected": ["Advertising"]
}
]
So, both the eventfilters field and the query filter are always of the same signature:
Array<{"field": text, "selected": text[]}>
Given the query filter and eventfilters from the above, the filtering logic would be the following:
Select all attendees that have the eventfilters field such that:
the selected array with the field: "Org Type" of the attendee (eventfilters) contains any of the values that are present in the selected array with the field "Org Type" of the query filter;
and
the selected array with the field: "Industry Sector" of the attendee (eventfilters) contains any of the values that are present in the selected array with the field "Industry Sector" of the query filter.
The query filter array can be of different length and have different elements, but always with the same signature (form).
What I could come up with implements the logic stated above, but with or instead of and between the query filter elements:
select distinct attendee.id,
attendee.email,
attendee.eventfilters
from attendee cross join lateral jsonb_array_elements(attendee.eventfilters) single_filter
where (
((single_filter ->> 'field')::text = 'Org Type' and (single_filter ->> 'selected')::jsonb ?| array ['B2C', 'Nonprofit'])
or ((single_filter ->> 'field')::text = 'Industry Sector' and (single_filter ->> 'selected')::jsonb ?| array ['Advertising'])
);
Basically I need to change that or in the where clause to and, but this, obviously, will not work: after unnesting, each row holds a single filter element, which can never satisfy two different field conditions at once.
The where clause is generated dynamically.
Here's an example of how I generate it now (it's JavaScript, but I hope you can grasp the idea):
function buildEventFiltersWhereSql(eventFilters) {
  return eventFilters.map((filter) => {
    // NB: values are interpolated directly into the SQL string,
    // so this is only safe with trusted or pre-escaped input
    const selectedArray = filter.selected.map((s) => `'${s}'`).join(', ');
    return `((single_filter ->> 'field')::text = '${filter.field}' and (single_filter ->> 'selected')::jsonb ?| array[${selectedArray}])`;
  }).join('\nor ');
}
The simple swap of or and and in the logic turns out to be very different in the implementation. I guess it would be easier to implement with jsonpath, but my Postgres version is 11 :(
How can I implement such filtering?
PS: create table and insert code for reproducing: https://pastebin.com/1tsHyJV0
So, I poked around a little bit and came up with the following query:
with unnested_filters as (
select distinct attendee.id,
jsonb_agg((single_filter ->> 'selected')::jsonb) filter (where (single_filter ->> 'field')::text = 'Org Type') over (partition by attendee.id) as "Org Type",
jsonb_agg((single_filter ->> 'selected')::jsonb) filter (where (single_filter ->> 'field')::text = 'Industry Sector') over (partition by attendee.id) as "Industry Sector",
attendee.email,
attendee.eventid,
attendee.eventfilters
from attendee
cross join lateral jsonb_array_elements(attendee.eventfilters) single_filter
where eventid = 1
) select * from unnested_filters where (
(unnested_filters."Org Type" #>> '{0}')::jsonb ?| array['B2C', 'Both', 'Nonprofit']
and (unnested_filters."Industry Sector" #>> '{0}')::jsonb ?| array['Advertising']
);
It's a bit weird, especially the part with jsonb_agg, but it seems to work.
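An alternative that avoids hard-coding one column per field is the classic double NOT EXISTS (relational division) pattern: keep an attendee only if no query-filter element is left unsatisfied. A sketch, with the query filter inlined as a jsonb literal (in practice it would be passed as a single parameter, so no dynamic SQL is needed):

select a.id, a.email, a.eventfilters
from attendee a
where not exists (
    -- query-filter elements the attendee does NOT satisfy
    select 1
    from jsonb_array_elements(
        '[{"field": "Org Type", "selected": ["B2C", "Nonprofit"]},
          {"field": "Industry Sector", "selected": ["Advertising"]}]'::jsonb
    ) qf
    where not exists (
        -- an eventfilters element with the same field and at least one shared value
        select 1
        from jsonb_array_elements(a.eventfilters) ef,
             jsonb_array_elements_text(qf -> 'selected') wanted
        where ef ->> 'field' = qf ->> 'field'
          and ef -> 'selected' ? wanted
    )
);

This works on Postgres 11, since it only uses jsonb functions available since 9.4.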

Update a table with a newly added column containing data from the same table's old column, but as modified (flattened) jsonb

So I've come across an issue where I have to migrate data from one column to a "clone" of itself with a different jsonb schema. I need to parse the JSON from
["keynamed": [...{"type": "type_info", "value": "value_in_here"}]] into a plain key:value object, dictionary-like: {"type_info": "value_in_here", ...}
So far I've tried subqueries and JSON functions in a subquery, plus a CASE expression to map "type" to "type_info", and then jsonb_build_object(). But this takes data from the whole table, and I need it in an update with data from each row. Is there anything simpler than doing N subqueries? The closest I've come is:
select
jsonb_object_agg(t.k, t.v):: jsonb as _json
from
(
select
jsonb_build_object(type_, _value) as _json
from
(
select
_value,
CASE _type
...
END type_
from
(
select
(datasets ->> 'type') as _type,
datasets -> 'value' as _value
from
(
select
jsonb_array_elements(
values
-> 'keynamed'
) as datasets
from
table
) s
) s
) s
) s,
jsonb_each(_json) as t(k, v);
But I have no idea how to make it row-specific and apply it in a simple update like:
UPDATE table
SET new_field = (subquery with parsed dict in json)
Any ideas/tips on how to solve this with plain PostgreSQL, without any external support?
The expected output of the table would be:
id | old_value                                                           | new_value
---+---------------------------------------------------------------------+--------------------------------------
 1 | ["keynamed": [...{"type": "type_info", "value": "value_in_here"}]]  | {"type_info": "value_in_here", ...}
According to the Postgres documentation, you can UPDATE using a join against another table (see the UPDATE docs).
Sample:
UPDATE accounts SET contact_first_name = first_name,
contact_last_name = last_name
FROM salesmen WHERE salesmen.id = accounts.sales_id;
If I understand correctly, the query below can help you, but I can't test it because I don't have sample data, so I can't tell whether it contains syntax errors.
update table t
set new_value = tmp._json
from (
select
id,
jsonb_object_agg(t.k, t.v):: jsonb as _json
from
(
select
id,
jsonb_build_object(type_, _value) as _json
from
(
select
id,
_value,
CASE _type
...
END type_
from
(
select
id,
(datasets ->> 'type') as _type,
datasets -> 'value' as _value
from
(
select
id,
jsonb_array_elements(
values
-> 'keynamed'
) as datasets
from
table
) s
) s
) s
) s,
jsonb_each(_json) as t(k, v)
group by id) tmp
where tmp.id = t.id;
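If the CASE mapping is the only complication, the same pattern can collapse considerably, because jsonb_object_agg() can key directly on the (mapped) type. A minimal sketch, assuming the table has an id primary key and the old column is named old_value, with the CASE omitted for brevity (note the question's column name values is a reserved word and would need double quotes in real SQL):

update mytable t
set new_value = s._json
from (
    select id,
           jsonb_object_agg(elem ->> 'type', elem -> 'value') as _json
    from mytable,
         jsonb_array_elements(old_value -> 'keynamed') as elem
    group by id
) s
where s.id = t.id;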

Updating JSON column in Postgres?

I've got a JSON column in Postgres, that contains data in the following structure
{
"listings": [{
"id": "KTyneMdrAhAEKyC9Aylf",
"active": true
},
{
"id": "ZcjK9M4tuwhWWdK8WcfX",
"active": false
}
]
}
I need to do a few things, all of which I am unsure how to do
Add a new object to the listings array
Remove an object from the listings array based on its id
Update an object in the listings array based on its id
I am also using Sequelize, in case there are any built-in functions to do this (I can't see anything obvious).
demo:db<>fiddle
Insert (jsonb_insert()):
UPDATE mytable
SET mydata = jsonb_insert(mydata, '{listings, 0}', '{"id":"foo", "active":true}');
Update (expand array, change value with jsonb_set(), reaggregate):
UPDATE mytable
SET mydata = jsonb_set(mydata, '{listings}', s.a)
FROM
(
SELECT
jsonb_agg(
CASE WHEN elems ->> 'id' = 'foo' THEN jsonb_set(elems, '{active}', 'false')
ELSE elems
END
) as a
FROM
mytable,
jsonb_array_elements(mydata -> 'listings') AS elems
) s;
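One caveat: as written, the subquery expands listings from every row of mytable, which is fine for a single-row demo but would mix documents across rows in a bigger table. A correlated sketch, assuming a hypothetical id primary key:

UPDATE mytable t
SET mydata = jsonb_set(t.mydata, '{listings}', s.a)
FROM
(
    SELECT
        id,
        jsonb_agg(
            CASE WHEN elems ->> 'id' = 'foo' THEN jsonb_set(elems, '{active}', 'false')
            ELSE elems
            END
        ) as a
    FROM
        mytable,
        jsonb_array_elements(mydata -> 'listings') AS elems
    GROUP BY id
) s
WHERE s.id = t.id;

The same correlation applies to the delete below.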
Delete (expand array, filter relevant elements, reaggregate):
UPDATE mytable
SET mydata = jsonb_set(mydata, '{listings}', s.a)
FROM
(
SELECT
jsonb_agg(elems) as a
FROM
mytable,
jsonb_array_elements(mydata -> 'listings') AS elems
WHERE elems ->> 'id' != 'foo'
) s;
Updating and deleting can be done in one line like the insert, but then you have to address the array element by its index instead of by a value. If you want to change an array element identified by a value, you need to read it first, which is only possible by expanding the array as above.
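For completeness, the one-line, index-based forms look like this (a sketch, assuming the element to touch sits at array index 1):

-- update a single field of the element at index 1:
UPDATE mytable
SET mydata = jsonb_set(mydata, '{listings, 1, active}', 'true');

-- remove the element at index 1 with the #- operator:
UPDATE mytable
SET mydata = mydata #- '{listings, 1}';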

Find element in array of jsonb documents by value as case insensitive

Assume I have userinfo table with person column containing the following jsonb object:
{
"skills": [
{
"name": "php"
},
{
"name": "Python"
}
]
}
In order to match the Python skill, I would write the following query:
select * from userinfo
where person -> 'skills' @> '[{"name":"Python"}]'
It works well, but if I specify '[{"name":"python"}]' in lower case, it doesn't return what I want.
How can I write a case-insensitive query here?
The Postgres version is 11.2.
You can do that when unnesting, with an EXISTS predicate:
select u.*
from userinfo u
where exists (select *
from jsonb_array_elements(u.person -> 'skills') as s(j)
-- make sure to only do this for rows that actually contain an array
where jsonb_typeof(u.person -> 'skills') = 'array'
and lower(s.j ->> 'name') = 'python');
Online example: https://rextester.com/XKVUA73952
demo:db<>fiddle
AFAIK there is no built-in JSON function for that. So you have to lower-case the JSON string (i.e. cast it into type text, lower-case it, and recast it into type jsonb). Note that this lower-cases all keys and values too, which is fine here because the comparison value is all lower case as well:
WHERE lower(person::text)::jsonb -> 'skills' @> '[{"name":"python"}]'
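Wrapped into a full statement against the question's table, that would read:

select * from userinfo
where lower(person::text)::jsonb -> 'skills' @> '[{"name":"python"}]'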
select * from (select 1 as id, 'fname' as firstname, 'sname' as surname, jsonb_array_elements('{
"skills": [{"name": "php"},{"name": "Python"}]}'::jsonb->'skills') skill) p
where p.skill->>'name' ilike 'python';
to suit the tables in the question it'd be something like
select * from (select *, jsonb_array_elements(person->'skills') skill from userinfo) u
where u.skill->>'name' ilike 'python';
Just a note: this will return multiple entries for the same userinfo row if you start looking for multiple skills. If you use the above, you'd want to group by the fields you want returned, or select distinct id, username, etc.
For example (assuming there's an id column in the userinfo table):
select distinct id from (select *, jsonb_array_elements(person->'skills') skill from userinfo) u
where u.skill->>'name' ilike 'python' or u.skill->>'name' ilike 'php';
It all depends on what you want to do.

PostgreSQL - jsonb_each

I have just started to play around with jsonb on Postgres, and examples are hard to find online as it is a relatively new concept. I am trying to use jsonb_each_text to print out a table of keys and values, but I get CSV-like records in a single column.
I have the JSON below saved as jsonb and am using it to test my queries.
{
"lookup_id": "730fca0c-2984-4d5c-8fab-2a9aa2144534",
"service_type": "XXX",
"metadata": "sampledata2",
"matrix": [
{
"payment_selection": "type",
"offer_currencies": [
{
"currency_code": "EUR",
"value": 1220.42
}
]
}
]
}
I can gain access to offer_currencies array with
SELECT element -> 'offer_currencies' -> 0
FROM test t, jsonb_array_elements(t.json -> 'matrix') AS element
WHERE element ->> 'payment_selection' = 'type'
which gives a result of "{"value": 1220.42, "currency_code": "EUR"}". So if I run the query below, I get the following (I have to change " for '):
select * from jsonb_each_text('{"value": 1220.42, "currency_code": "EUR"}')
Key             | Value
----------------+-----------
"value"         | "1220.42"
"currency_code" | "EUR"
So, using the above theory, I created this query:
SELECT jsonb_each_text(data)
FROM (SELECT element -> 'offer_currencies' -> 0 AS data
FROM test t, jsonb_array_elements(t.json -> 'matrix') AS element
WHERE element ->> 'payment_selection' = 'type') AS dummy;
But this prints CSV-like records in one column:
record
---------------------
"(value,1220.42)"
"(currency_code,EUR)"
The primary problem here is that you select the whole row as one column (PostgreSQL allows that). You can fix that with SELECT (jsonb_each_text(data)).* ....
But: don't SELECT set-returning functions; that can often lead to errors (or unexpected results). Instead, use e.g. LATERAL joins/sub-queries:
select first_currency.*
from test t
, jsonb_array_elements(t.json -> 'matrix') element
, jsonb_each_text(element -> 'offer_currencies' -> 0) first_currency
where element ->> 'payment_selection' = 'type'
Note: function calls in the FROM clause are implicit LATERAL joins (here: CROSS JOINs).
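With the sample document from the question, this returns the key/value pairs as proper rows:

key            | value
---------------+---------
value          | 1220.42
currency_code  | EUR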
WITH testa AS (
    SELECT jsonb_array_elements(t.json -> 'matrix') -> 'offer_currencies' -> 0 AS jsonbcolumn
    FROM test t
)
SELECT d.key, d.value
FROM testa
JOIN jsonb_each_text(testa.jsonbcolumn) d ON true
ORDER BY 1, 2;
testa fetches the intermediate jsonb data; the lateral join then transforms that jsonb into table format.