JSONB Data Type Modification in PostgreSQL

I have a question about modifying a jsonb value in Postgres.
Basic setup:
array=> ["1", "2", "3"]
I have a PostgreSQL table with an id column and a jsonb column named, let's say, cards.
 id | cards
----+------------------
  1 | {"1": 3, "4": 2}
That's the data in the table named test.
Question:
How do I convert the cards of id->1 FROM {"1": 3, "4": 2} TO {"1": 4, "4":2, "2": 1, "3": 1}
How I expect the changes to occur:
For each element of the array: if it already exists as a key in the cards jsonb, increment its value by 1 (changing {"1": 3} to {"1": 4}); if it does not exist as a key, insert it with a value of 1 (changing {"1":4, "4":2} to {"1":4, "4":2, "2":1, "3":1}).
All of this should happen purely in Postgres.
Partial Solution
I asked a senior for support regarding my question and I was told this:-
Roughly (names may differ): object keys to explode cards, array_elements to explode the array, left join them, do the calculation, re-aggregate the object. There may be a more direct way to do this but the above brute-force approach will work.
So I tried to follow this using the two functions jsonb_each_text() and jsonb_array_elements_text(), but got stuck halfway, because I didn't understand what they meant by left joining the two:
SELECT jsonb_each_text(tester_cards) AS each_text, jsonb_array_elements_text('[["1", 1], ["2", 1], ["3", 1]]') AS array_elements FROM tester WHERE id=1;
TL;DR:
I need an UPDATE statement that checks whether each key from an array exists in the jsonb data, and either increments its value by 1 or inserts the key into the jsonb with a value of 1.
It might look like I'm asking to be spoon-fed, but I really haven't managed to find any way to solve it, so any assistance would be highly appreciated 🙇

The key insight is that with jsonb_each and jsonb_object_agg you can round-trip a JSON object in a subquery:
SELECT id, (
SELECT jsonb_object_agg(key, value)
FROM jsonb_each(cards)
) AS result
FROM test;
Now you can JOIN these key-value pairs against the jsonb_array_elements of your array input. Your colleague was close, but not quite right: it requires a full outer join, not just a left (or right) join to get all the desired object keys for your output, unless one of your inputs is a subset of the other.
SELECT id, (
SELECT jsonb_object_agg(COALESCE(obj_key, arr_value), …)
FROM jsonb_array_elements_text('["1", "2", "3"]') AS arr(arr_value)
FULL OUTER JOIN jsonb_each(cards) AS obj(obj_key, obj_value) ON obj_key = arr_value
) AS result
FROM test;
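To see why a full outer join is needed, it may help to look at the raw joined rows before they are aggregated. This is only a sketch: it uses the question's values as literals instead of reading from the table, and the column aliases are chosen for illustration.

```sql
-- Join the array elements against the object's key/value pairs.
-- Literals stand in for the table column so the query runs on its own.
SELECT obj.key AS obj_key, obj.value AS obj_value, arr.arr_value
FROM jsonb_array_elements_text('["1", "2", "3"]') AS arr(arr_value)
FULL OUTER JOIN jsonb_each_text('{"1": 3, "4": 2}') AS obj(key, value)
  ON obj.key = arr.arr_value
ORDER BY COALESCE(obj.key, arr.arr_value);

--  obj_key | obj_value | arr_value
-- ---------+-----------+-----------
--  1       | 3         | 1          -- in both: increment
--  NULL    | NULL      | 2          -- only in the array: insert with 1
--  NULL    | NULL      | 3          -- only in the array: insert with 1
--  4       | 2         | NULL       -- only in the object: keep as is
```

A left join starting from the array would drop the "4" row, and one starting from the object would drop "2" and "3".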
Now what's left is only the actual calculation and the conversion to an UPDATE statement:
UPDATE test
SET cards = (
SELECT jsonb_object_agg(
COALESCE(key, arr_value),
COALESCE(obj_value::int, 0) + (arr_value IS NOT NULL)::int
)
FROM jsonb_array_elements_text('["1", "2", "3"]') AS arr(arr_value)
FULL OUTER JOIN jsonb_each_text(cards) AS obj(key, obj_value) ON key = arr_value
);
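As a quick check, the statement can be tried against a scratch copy of the question's data. The temporary table below is an assumption made for the demo, and a WHERE clause is added so that only the row with id 1 is touched (the statement above updates every row in the table).

```sql
-- Scratch table mirroring the question's data (assumed layout).
CREATE TEMP TABLE test (id int PRIMARY KEY, cards jsonb);
INSERT INTO test VALUES (1, '{"1": 3, "4": 2}');

UPDATE test
SET cards = (
  SELECT jsonb_object_agg(
    COALESCE(obj.key, arr.arr_value),
    -- existing count (or 0) plus 1 if the key appears in the array
    COALESCE(obj.value::int, 0) + (arr.arr_value IS NOT NULL)::int
  )
  FROM jsonb_array_elements_text('["1", "2", "3"]') AS arr(arr_value)
  FULL OUTER JOIN jsonb_each_text(test.cards) AS obj(key, value)
    ON obj.key = arr.arr_value
)
WHERE id = 1;

SELECT cards FROM test WHERE id = 1;
-- {"1": 4, "2": 1, "3": 1, "4": 2}
```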

Related

PostgreSQL: json object where keys are unique array elements and values are the count of times they appear in the array

I have an array of strings, some of which may be repeated. I am trying to build a query which returns a single json object where the keys are the distinct values in the array, and the values are the count of times each value appears in the array.
I have built the following query:
WITH items (item) as (SELECT UNNEST(ARRAY['a','b','c','a','a','a','c']))
SELECT json_object_agg(distinct_values, counts) item_counts
FROM (
SELECT
sub2.distinct_values,
count(items.item) counts
FROM (
SELECT DISTINCT items.item AS distinct_values
FROM items
) sub2
JOIN items ON items.item = sub2.distinct_values
GROUP BY sub2.distinct_values, items.item
) sub1
Which provides the result I'm looking for: { "a" : 4, "b" : 1, "c" : 2 }
However, it feels like there's probably a better / more elegant / less verbose way of achieving the same thing, so I wondered if any one could point me in the right direction.
For context, I would like to use this as part of a bigger more complex query, but I didn't want to complicate the question with irrelevant details. The array of strings is what one column of the query currently returns, and I would like to convert it into this JSON blob. If it's easier and quicker to do it in code then I can, but I wanted to see if there was an easy way to do it in postgres first.
I think a CTE and json_object_agg() is a little bit of a shortcut to get you there?
WITH counter AS (
SELECT UNNEST(ARRAY['a','b','c','a','a','a','c']) AS item, COUNT(*) AS item_count
GROUP BY 1
ORDER BY 1
)
SELECT json_object_agg(item, item_count) FROM counter
Output:
{"a":4,"b":1,"c":2}
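A variant of the same idea, sketched with unnest() in the FROM clause rather than in the SELECT list, which some readers find easier to follow. The alias names (t, item, counted) are illustrative only.

```sql
-- Count each distinct array element, then fold the rows into one JSON object.
SELECT json_object_agg(item, item_count) AS item_counts
FROM (
  SELECT item, count(*) AS item_count
  FROM unnest(ARRAY['a','b','c','a','a','a','c']) AS t(item)
  GROUP BY item
) AS counted;
```

With json_object_agg the key order follows the order in which rows reach the aggregate, which is not guaranteed without an ORDER BY. If the result is used as jsonb, jsonb_object_agg stores the keys in a normalized order instead.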

SQL query to filter where all array items in JSONB array meet condition

I made a similar post before, but deleted it as it had contextual errors.
One of the tables in my database includes a JSONB column which includes an array of JSON objects. It's not dissimilar to this example of a session table which I've mocked up below.
id | user_id | snapshot | inserted_at
---+---------+----------+------------
1 | 37 | {cart: [{product_id: 1, price_in_cents: 3000, name: "product A"}, {product_id: 2, price_in_cents: 2500, name: "product B"}]} | 2022-01-01 20:00:00.000000
2 | 24 | {cart: [{product_id: 1, price_in_cents: 3000, name: "product A"}, {product_id: 3, price_in_cents: 5500, name: "product C"}]} | 2022-01-02 20:00:00.000000
3 | 88 | {cart: [{product_id: 4, price_in_cents: 1500, name: "product D"}, {product_id: 2, price_in_cents: 2500, name: "product B"}]} | 2022-01-03 20:00:00.000000
The query I've worked with to retrieve records from this table is as follows.
SELECT sessions.*
FROM sessions
INNER JOIN LATERAL (
SELECT *
FROM jsonb_to_recordset(sessions.snapshot->'cart')
AS product(
"product_id" integer,
"name" varchar,
"price_in_cents" integer
)
) AS cart ON true;
I've been trying to update the query above to retrieve only the records in the sessions table for which ALL of the products in the cart have a price_in_cents value of greater than 2000.
To this point, I've not had any success on forming this query but I'd be grateful if anyone here can point me in the right direction.
You can use a JSON path expression:
select *
from sessions
...
where not sessions.snapshot @@ '$.cart[*].price_in_cents <= 2000'
There is no JSON path expression that would check that all array elements are greater than 2000. So this returns those rows where no element is less than or equal to 2000, because that can be expressed with a JSON path expression.
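The negated predicate can be tried on literal values. A small sketch, assuming PostgreSQL 12 or later (where jsonpath and the @@ operator were introduced); the column name cart_doc is made up for the demo.

```sql
-- @@ is true if ANY cart item has price_in_cents <= 2000,
-- so NOT ... @@ is true only when no item does.
SELECT NOT cart_doc @@ '$.cart[*].price_in_cents <= 2000' AS all_above_2000
FROM (VALUES
  ('{"cart": [{"price_in_cents": 3000}, {"price_in_cents": 2500}]}'::jsonb),
  ('{"cart": [{"price_in_cents": 1500}, {"price_in_cents": 2500}]}'::jsonb)
) AS v(cart_doc);

-- first row:  true
-- second row: false
```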
Here is one possible solution based on the idea of your original query.
Each element of the cart JSON array is joined to its parent sessions row. Now that the wanted JSON array elements are exposed as columns, all that is left is to add the WHERE clause conditions.
SELECT *
FROM (
SELECT
sess.id,
sess.user_id,
sess.inserted_at,
cart_items.cart_name,
cart_items.cart_product_id,
cart_items.cart_price_in_cents
FROM sessions sess,
LATERAL (SELECT (snapshot -> 'cart') snapshot_cart FROM sessions WHERE id = sess.id) snap_arr,
LATERAL (SELECT
(value::jsonb ->> 'name')::text cart_name,
(value::jsonb -> 'product_id')::int cart_product_id,
(value::jsonb -> 'price_in_cents')::int cart_price_in_cents
FROM JSONB_ARRAY_ELEMENTS(snap_arr.snapshot_cart)) cart_items
) session_snapshot_cart_product;
Explanation:
From the sessions table, the cart array is extracted and joined to each sessions row.
The necessary items of the cart JSON array are then unnested by the second join, using the JSONB_ARRAY_ELEMENTS(jsonb) function.
The following worked well for me and allowed me the flexibility to use different comparison operators other than just ones such as == or <=.
In one of the scenarios I needed to construct, I also needed to have my WHERE in the subquery also compare against an array of values using the IN comparison operator, which was not viable using some of the other solutions that were looked at.
Leaving this here in case others run into the same issue as I did, or if others find better solutions or want to propose suggestions to build upon this one.
SELECT *
FROM sessions
WHERE NOT EXISTS (
-- Correlated with the outer sessions row: look for any cart item matching the condition.
SELECT 1
FROM jsonb_to_recordset(sessions.snapshot->'cart')
AS product(
"product_id" integer,
"name" varchar,
"price_in_cents" integer
)
WHERE name ILIKE 'Product%'
);
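Applied to the original question (every product in the cart priced above 2000), the same NOT EXISTS pattern might look like the sketch below. The CTE only stands in for the real sessions table, with the snapshot values trimmed to the fields that matter.

```sql
-- Stand-in data for the sessions table (assumed, shortened from the question).
WITH sessions (id, snapshot) AS (
  VALUES
    (1, '{"cart": [{"product_id": 1, "price_in_cents": 3000}, {"product_id": 2, "price_in_cents": 2500}]}'::jsonb),
    (2, '{"cart": [{"product_id": 1, "price_in_cents": 3000}, {"product_id": 3, "price_in_cents": 5500}]}'::jsonb),
    (3, '{"cart": [{"product_id": 4, "price_in_cents": 1500}, {"product_id": 2, "price_in_cents": 2500}]}'::jsonb)
)
SELECT s.id
FROM sessions AS s
WHERE NOT EXISTS (
  -- any cart item that breaks the "greater than 2000" rule disqualifies the row
  SELECT 1
  FROM jsonb_array_elements(s.snapshot -> 'cart') AS item(elem)
  WHERE (item.elem ->> 'price_in_cents')::int <= 2000
)
ORDER BY s.id;
-- returns ids 1 and 2
```

Note that a session with an empty or missing cart also passes this filter, since there is no item that violates the condition.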

postgres remove specific element from jsonb array

I am using postgres 10
I have a JsonArray in a jsonb column named boards.
I have a GIN index on the jsonb column.
The column values look like this:
[{"id": "7beacefa-9ac8-4fc6-9ee6-8ff6ab1a097f"},
{"id": "1bc91c1c-b023-4338-bc68-026d86b0a140"}]
In every row, I want to delete the element
{"id": "7beacefa-9ac8-4fc6-9ee6-8ff6ab1a097f"} from the array if it exists (i.e. update the column).
I saw that it is possible to delete an element by position with the #- operator (e.g. #-'{1}'), and I know you can get the position of an element using WITH ORDINALITY, but I can't manage to combine the two.
How can I update the JSON array?
One option would be using an update statement containing a query selecting all the sub-elements except {"id": "7beacefa-9ac8-4fc6-9ee6-8ff6ab1a097f"} by using an inequality, and then applying jsonb_agg() function to aggregate those sub-elements :
UPDATE user_boards
SET boards = (SELECT jsonb_agg(j.elm)
FROM user_boards u
CROSS JOIN jsonb_array_elements(boards) j(elm)
WHERE j.elm->>'id' != '7beacefa-9ac8-4fc6-9ee6-8ff6ab1a097f'
AND u.ID = user_boards.ID
GROUP BY ID)
where ID is an assumed identity(unique) column of the table.
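A possible refinement, sketched under the assumption of a scratch table shaped like the one in the question: COALESCE guards against the array becoming NULL when its only element is removed, WITH ORDINALITY keeps the original element order, and the containment test in the WHERE clause restricts the update to rows that actually hold the element (a condition a GIN index on the column can support).

```sql
-- Scratch table (assumed layout; id is a hypothetical key column).
CREATE TEMP TABLE user_boards (id int PRIMARY KEY, boards jsonb);
INSERT INTO user_boards VALUES
  (1, '[{"id": "7beacefa-9ac8-4fc6-9ee6-8ff6ab1a097f"}, {"id": "1bc91c1c-b023-4338-bc68-026d86b0a140"}]'),
  (2, '[{"id": "7beacefa-9ac8-4fc6-9ee6-8ff6ab1a097f"}]');

UPDATE user_boards
SET boards = COALESCE(
  (SELECT jsonb_agg(j.elm ORDER BY j.ord)
   FROM jsonb_array_elements(user_boards.boards) WITH ORDINALITY AS j(elm, ord)
   -- IS DISTINCT FROM also keeps elements that have no "id" key at all
   WHERE j.elm ->> 'id' IS DISTINCT FROM '7beacefa-9ac8-4fc6-9ee6-8ff6ab1a097f'),
  '[]'::jsonb  -- jsonb_agg over zero rows yields NULL
)
WHERE boards @> '[{"id": "7beacefa-9ac8-4fc6-9ee6-8ff6ab1a097f"}]';

SELECT id, boards FROM user_boards ORDER BY id;
-- 1 | [{"id": "1bc91c1c-b023-4338-bc68-026d86b0a140"}]
-- 2 | []
```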

Select greatest number from a json list of variable length with PostgreSQL

I have a column (let's call it jsn) in my database with json object (actually stored as plain text for reasons). This json object looks like this:
{"a":
{"b":[{"num":123, ...},
{"num":456, ...},
...,
{"num":789, ...}],
...
},
...
}
I'm interested in the biggest "num" inside that list of objects "b" inside the object "a".
If the list is of known length, I can do it like this:
SELECT
GREATEST((jsn::json->'a'->'b'->>0)::int,
(jsn::json->'a'->'b'->>1)::int,
... ,
(jsn::json->'a'->'b'->>N)::int)
FROM table
Note that I'm new to PostgreSQL (and database querying in general!) so that may be a rubbish way to do it. In any case it works. What I can't figure out is how to make this work when the list, 'b', is of arbitrary and unknown length.
In case it is relevant, I am using PostgreSQL 10 hosted on AWS RDS and running queries using pgAdmin 4.
You need to unnest the array then you can apply a max() on the result:
select max((n.x ->> 'num')::int)
from the_table t
cross join jsonb_array_elements(t.jsn::jsonb -> 'a' -> 'b') as n(x);
You probably want to add a GROUP BY, so that you can tell which row each max value came from. Assuming your table has a column id that is unique:
select id, max((n.x ->> 'num')::int)
from the_table t
cross join jsonb_array_elements(t.jsn::jsonb -> 'a' -> 'b') as n(x)
group by id;
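A self-contained way to try this, with a CTE standing in for the real table. The sample values are made up, and jsn is kept as text to mirror the question.

```sql
-- Stand-in for the_table; jsn is stored as plain text, as in the question.
WITH the_table (id, jsn) AS (
  VALUES
    (1, '{"a": {"b": [{"num": 123}, {"num": 456}, {"num": 789}]}}'::text),
    (2, '{"a": {"b": [{"num": 5}]}}'::text)
)
SELECT t.id, max((n.x ->> 'num')::int) AS max_num
FROM the_table AS t
CROSS JOIN jsonb_array_elements(t.jsn::jsonb -> 'a' -> 'b') AS n(x)
GROUP BY t.id
ORDER BY t.id;

--  id | max_num
-- ----+---------
--   1 |     789
--   2 |       5
```

The ->> operator returns text, which can be cast to int on PostgreSQL 10. A direct cast from a jsonb value to int is only available from PostgreSQL 11 on.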

PostgreSQL: Find and delete duplicated jsonb data, excluding a key/value pair when comparing

I have been searching all over to find a way to do this.
I am trying to clean up a table with a lot of duplicated jsonb fields.
There are some examples out there, but as a little twist, I need to exclude one key/value pair in the jsonb field, to get the result I need.
Example jsonb
{
"main": {
"orders": {
"order_id": "1",
"customer_id": "1",
"updated_at": "11/23/2017 17:47:13"
}
}
}
Compared to:
{
"main": {
"orders": {
"order_id": "1",
"customer_id": "1",
"updated_at": "11/23/2017 17:49:53"
}
}
}
If I can exclude the "updated_at" key when comparing, the query should find it a duplicate and this, and possibly other, duplicated entries should be deleted, keeping only one, the first "original" one.
I have found this query, to try and find the duplicates. But it doesn't take my situation into account. Maybe someone can help structuring this to meet the requirements.
SELECT t1.jsonb_field
FROM customers t1
INNER JOIN (SELECT jsonb_field, COUNT(*) AS CountOf
FROM customers
GROUP BY jsonb_field
HAVING COUNT(*)>1
) t2 ON t1.jsonb_field=t2.jsonb_field
WHERE
t1.customer_id = 1
Thanks in advance :-)
If updated_at is always at the same path, you can remove it with the #- operator before comparing:
SELECT t1.jsonb_field
FROM customers t1
INNER JOIN (SELECT jsonb_field, COUNT(*) AS CountOf
FROM customers
GROUP BY jsonb_field
HAVING COUNT(*)>1
) t2 ON
t1.jsonb_field #-'{main,orders,updated_at}'
=
t2.jsonb_field #-'{main,orders,updated_at}'
WHERE
t1.customer_id = 1
See https://www.postgresql.org/docs/9.5/static/functions-json.html
additional operators
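For the deletion itself, one possible approach is to rank rows within each group of "equal apart from updated_at" values and delete everything after the first. This is a sketch only: it assumes PostgreSQL 9.5 or later for #-, and it assumes the table has a unique id column that reflects insertion order, which the question does not show.

```sql
-- Scratch table (assumed layout; the id column is hypothetical).
CREATE TEMP TABLE customers (id serial PRIMARY KEY, jsonb_field jsonb);
INSERT INTO customers (jsonb_field) VALUES
  ('{"main": {"orders": {"order_id": "1", "customer_id": "1", "updated_at": "11/23/2017 17:47:13"}}}'),
  ('{"main": {"orders": {"order_id": "1", "customer_id": "1", "updated_at": "11/23/2017 17:49:53"}}}'),
  ('{"main": {"orders": {"order_id": "2", "customer_id": "1", "updated_at": "11/23/2017 17:50:00"}}}');

DELETE FROM customers AS c
USING (
  SELECT id,
         row_number() OVER (
           -- rows are duplicates if they match once updated_at is stripped
           PARTITION BY jsonb_field #- '{main,orders,updated_at}'
           ORDER BY id
         ) AS rn
  FROM customers
) AS ranked
WHERE c.id = ranked.id
  AND ranked.rn > 1;  -- keep the first row of each group

SELECT id FROM customers ORDER BY id;
-- ids 1 and 3 remain
```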
EDIT
If you don't have #-, you can cast to text and do a regex replace:
regexp_replace(t1.jsonb_field::text, '"updated_at": "[^"]*?"','')::jsonb
=
regexp_replace(t2.jsonb_field::text, '"updated_at": "[^"]*?"','')::jsonb
I think you don't even need to cast it back to jsonb, but do it to be safe.
Mind that the regex matches ANY "updated_at" field (by key) in the JSON. It should not match data, because it would not match an escaped closing quote \", nor find the colon after it.
Note that the regex should actually be '"updated_at": "[^"]*?",?'
But on SQL Fiddle that fails (it may depend on the Postgres build; check with your version, because as far as regexes go, this is correct).
If the comma is not removed, the cast to jsonb fails.
You can try '"updated_at": "[^"]*?",'
without the ?: that will remove the comma, but fail if updated_at was the last key in the object.
Worst case, nest the two:
regexp_replace(
regexp_replace(t1.jsonb_field::text, '"updated_at": "[^"]*?",',''),
'"updated_at": "[^"]*?"','')::jsonb
For PostgreSQL 9.4:
SQL Fiddle only has 9.3 and 9.6.
9.3 is missing json_object_agg, but the Postgres docs say it is available in 9.4, so this should work.
It will only work if all records have objects under the important keys:
main->orders
If main->orders is a JSON array or a scalar, this may give an error.
The same goes for {"main": [1,2]} => error.
Each json_each returns a table with one row for each key in the JSON object.
json_object_agg aggregates them back into a JSON object.
The CASE expression picks out the one key on each level that needs to be handled.
At the deepest nesting level, the WHERE clause filters out the updated_at row.
On SQL Fiddle, set the query separator to '//'.
If you use the psql client, replace the // with ;
create or replace function foo(x json)
returns jsonb
language sql
as $$
select json_object_agg(key,
case key when 'main' then
(select json_object_agg(t2.key,
case t2.key when 'orders' then
(select json_object_agg(t3.key, t3.value)
from json_each(t2.value) as t3
WHERE t3.key <> 'updated_at'
)
else t2.value
end)
from json_each(t1.value) as t2
)
else t1.value
end)::jsonb
from json_each(x) as t1
$$ //
select foo(x)
from
(select '{ "main":{"orders":{"order_id": "1", "customer_id": "1", "updated_at": "11/23/2017 17:49:53" }}}'::json as x) as t1
x (the argument) may need to be jsonb, if that is your datatype