Merging an array of JSON objects to one JSON column in Postgres - postgresql

I have two tables, products and products_ext, that can essentially be reduced
to this basic form:
CREATE TABLE "products" (
"product_id" TEXT PRIMARY KEY
);
CREATE TABLE "products_ext" (
"product_id" TEXT NOT NULL,
"key" TEXT NOT NULL,
"value" JSON,
PRIMARY KEY ("product_id", "key"),
FOREIGN KEY ("product_id") REFERENCES "products"("product_id")
ON DELETE CASCADE
ON UPDATE CASCADE
);
Let us assume mock data
INSERT INTO "products" ("product_id") VALUES
('test1'),
('test2'),
('test3');
INSERT INTO "products_ext" (product_id, "key", "value") VALUES
('test1', 'foo', '"Foo"'),
('test1', 'bar', '"Bar"'),
('test2', 'foo', '"Foo"');
I can use a query
SELECT
"P"."product_id",
ARRAY(
SELECT
json_build_object(
"E"."key",
"E"."value"
)
FROM "products_ext" AS "E"
WHERE "E"."product_id" = "P"."product_id"
)
FROM
"products" AS "P";
which yields
product_id | array
------------+-----------------------------------------------
test1 | {"{\"foo\" : \"Foo\"}","{\"bar\" : \"Bar\"}"}
test2 | {"{\"foo\" : \"Foo\"}"}
but I cannot get it to yield a merged JSON object. Is there an easy way in Postgres 10
to merge an array of multiple JSON objects into one JSON object, yielding the following?
product_id | json
------------+----------------------------------------
test1 | {\"foo\" : \"Foo\", \"bar\" : \"Bar\"}
test2 | {\"foo\" : \"Foo\"}
test3 | {}
The primary key pair ("product_id", "key") already ensures that there are no
key collisions. There may be rows in products that do not have any data in products_ext, and in those cases an empty JSON object should be returned.

Use json_object_agg():
SELECT
p.product_id AS product_id,
json_object_agg(e.key, e.value)
FROM
products AS p
JOIN
products_ext AS e ON p.product_id = e.product_id
GROUP BY p.product_id;
Edit for products without any products_ext rows:
SELECT
p.product_id AS product_id,
COALESCE(
json_object_agg(e.key, e.value) FILTER (WHERE e.key IS NOT NULL),
'{}'
)
FROM
products AS p
LEFT JOIN
products_ext AS e ON p.product_id = e.product_id
GROUP BY p.product_id;
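For comparison, the same result can probably also be had while keeping the correlated-subquery shape of the query in the question. This is only a sketch against the tables shown there; the column alias merged is an assumption made for illustration.

```sql
-- json_object_agg() returns NULL when the subquery finds no rows,
-- so COALESCE substitutes an empty JSON object for products
-- that have no rows in products_ext.
SELECT
    p.product_id,
    COALESCE(
        (SELECT json_object_agg(e.key, e.value)
         FROM products_ext AS e
         WHERE e.product_id = p.product_id),
        '{}'
    ) AS merged
FROM products AS p;
```

With this shape no GROUP BY or FILTER clause is needed, because the aggregate runs once per product row.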

Related

Postgres returning records when query result should be empty

Suppose the following,
CREATE SCHEMA IF NOT EXISTS my_schema;
CREATE TABLE IF NOT EXISTS my_schema.user (
id serial PRIMARY KEY,
chat_ids BIGINT[] NOT NULL
);
CREATE TABLE IF NOT EXISTS my_schema.chat (
id serial PRIMARY KEY,
chat_id_value BIGINT UNIQUE NOT NULL
);
INSERT INTO my_schema.chat VALUES
(1, 12321);
INSERT INTO my_schema.user VALUES
(1, '{12321}');
When I query for a user record with a nonexistent chat, I still receive a result:
SELECT u.id,
(
SELECT TO_JSON(COALESCE(ARRAY_AGG(c.*) FILTER (WHERE c IS NOT NULL), '{}'))
FROM my_schema.chat as c
WHERE c.chat_id_value = ANY (ARRAY[ 1234 ]::int[])
) AS chat_ids
FROM my_schema.user as u
Clearly, there is no my_schema.chat record with chat_id_value = 1234.
I've tried adding,
. . .
FROM my_schema.user as u
WHERE chat_ids != '{}'
But this still yields the same result:
[
{
"id": 1,
"chat_ids": []
}
]
I've tried WHERE ARRAY_LENGTH(chat_ids, 1) != 0, WHERE CARDINALITY(chat_ids) != 0, none return the expected result.
Oddly enough, WHERE ARRAY_LENGTH(chat_ids, 1) != 1 works, implying the length of chat_ids is 1 when it's actually 0? Very confusing.
What am I doing wrong here? The expected result should be [].
If the subselect on my_schema.chat returns no result, you will get NULL, which coalesce will turn into {}. Moreover, the inner query is not correlated to the outer query, so you will get the same result for each row in my_schema."user". You should use an inner join:
SELECT u.id,
TO_JSON(COALESCE(ARRAY_AGG(c.*) FILTER (WHERE c IS NOT NULL), '{}'))
FROM my_schema.user as u
JOIN my_schema.chat as c
ON c.chat_id_value = ANY (u.chat_ids)
GROUP BY u.id;
I don't think that your data model is good. You should avoid arrays and use a junction table instead. It will make for better performance and simpler queries.
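A junction table along those lines might look like the following sketch. The table name my_schema.user_chat and its column names are assumptions made for illustration; they do not appear in the question, and the chat_ids array column would no longer be needed in this model.

```sql
-- Hypothetical junction table: one row per (user, chat) membership.
CREATE TABLE IF NOT EXISTS my_schema.user_chat (
    user_id integer NOT NULL REFERENCES my_schema."user" (id),
    chat_id integer NOT NULL REFERENCES my_schema.chat (id),
    PRIMARY KEY (user_id, chat_id)
);

-- Users together with their chats matching a given chat_id_value.
-- Because these are inner joins, a user without a matching chat
-- produces no row at all.
SELECT u.id,
       to_json(array_agg(c)) AS chats
FROM my_schema."user" AS u
JOIN my_schema.user_chat AS uc ON uc.user_id = u.id
JOIN my_schema.chat AS c ON c.id = uc.chat_id
WHERE c.chat_id_value = 1234
GROUP BY u.id;
```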
You can do it as follows :
WITH cte as (
SELECT TO_JSON(ARRAY_AGG(c.*) FILTER (WHERE c IS NOT NULL)) as to_json
FROM my_schema.chat as c
inner join my_schema.user u on c.chat_id_value = ANY (u.chat_ids)
WHERE c.chat_id_value = ANY (ARRAY[ 12321]::int[])
)
select *
from cte where to_json is not null;
This returns no rows at all when nothing matches.

PostgreSQL find by value in array in jsonb data

How can I get the records from a table where an array in the value column contains a given value?
The column can hold any JSON data type (arrays, objects, strings, etc.) as well as null, and the arrays can contain any serializable data type.
id|value |
--+------------+
1|null |
2|[0.05, 0.11]|
You can use a JSON path expression:
select *
from the_table
where value @@ '$[*] == 0.11'
If the column contains an object rather than an array, you can use
select *
from the_table
where value @@ '$.* == 0.11'
This assumes value is defined as jsonb (which it should be). If it's not, you have to cast it: value::jsonb
Some samples:
-- sample 1
with sample_data as (
select 1 as "id", null::jsonb as "value"
union all
select 2 as "id", '[0.05, 0.11]'::jsonb as "value"
)
select a2.pval::float4 from sample_data as a1
cross join jsonb_array_elements(a1."value") as a2(pval)
--Return:
0.05
0.11
-- sample 2
with sample_data as (
select 1 as "id", null::jsonb as "value"
union all
select 2 as "id", '[0.05, 0.11]'::jsonb as "value"
)
select a2.pval::float4 from sample_data as a1
cross join jsonb_array_elements(a1."value") as a2(pval)
where a2.pval::float4 > 0.1
--Return:
0.11
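Since the column may also hold objects, strings and other non-array values, on which jsonb_array_elements() raises an error, a containment check guarded by jsonb_typeof() may be a safer variant. This is a sketch that assumes the table is called the_table, as in the answer above, and that value is of type jsonb.

```sql
-- jsonb_typeof() restricts the match to arrays (NULL rows are skipped);
-- @> then checks whether the array contains the number 0.11
-- as a top-level element.
SELECT *
FROM the_table
WHERE jsonb_typeof(value) = 'array'
  AND value @> '0.11'::jsonb;
```

The @> operator does not need JSON path support, so this form should also work on servers that lack the jsonpath type.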

Aggregate function for corresponding row

I have the following table with a combined primary key of id and ts to implement historization:
create table "author" (
"id" bigint not null,
"ts" timestamp not null default now(),
"login" text unique not null,
primary key ("id", "ts")
);
Now I am interested only in the latest login value. Therefore I group by id:
select "id", max("ts"), "login" from "author" group by "id";
But this throws an error: login should be used in an aggregate function.
id and max("ts") uniquely identify a row because the tuple (id, ts) is the primary key. I need the login which matches the row identified by id and max("ts").
I can write a sub-select to find the login:
select ao."id", max(ao."ts"),
(select ai.login from "author" ai
where ao."id" = ai."id" and max(ao."ts") = ai.ts)
from "author" ao
group by "id";
This works but it is quite noisy and not very clever, because it searches the whole table although searching the group would be sufficient.
Does an aggregate function exist, which avoids the sub-select and gives me the remaining login, which belongs to id and max("ts")?
You have to identify the correct key to get the value you like from the table.
The correct key is:
select "id", max("ts") from "author" group by "id";
And using this to get the login you want:
select a1."id", a1.ts, a1.login
from "author" a1
inner join (select "id", max("ts") maxts from "author" group by "id") a2
ON a1.id = a2.id AND a1.ts = a2.maxts;
Alternatively using window functions:
SELECT "id", "ts", login
FROM (
select "id", "ts", CASE WHEN "ts" = max("ts") OVER (PARTITION BY "id") THEN 1 ELSE 0 END as isMax, "login" from "author"
) dt
WHERE isMax = 1
There's a few other ways to skin this cat, but that's basically the gist.
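One of those other ways, sketched here against the author table from the question, is the Postgres-specific DISTINCT ON clause. It keeps the first row of each "id" group in the order given by ORDER BY.

```sql
-- For each "id", rows are sorted newest first and only the first row is kept,
-- so "login" comes from the row with the latest "ts".
SELECT DISTINCT ON ("id")
       "id", "ts", "login"
FROM "author"
ORDER BY "id", "ts" DESC;
```

DISTINCT ON is not standard SQL, so the join and window-function variants above remain the portable choices.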

Json array manipulation with postgres

I am very new to postgres. I am trying to manipulate data present in two tables and insert the result into a new table.
The first table looks like:
create table table1
(
column1 json,
column2 json
)
Data in the first table is
column1:
{"source" : ["s1", "s2"], "channels":["c1", "c2"]}
column2:
{"c1" : ["k1", "k2"], "c2":["k3", "k4"]}
The second table looks very similar
create table table2
(
column1 json,
column2 json
)
Data in the second table is
column1:
{"source" : ["s2", "s3"], "channels":["c2", "c3"]}
column2:
{"c2" : ["k1", "k2", "k5"], "c3":["k6", "k7", "k8"]}
I want to combine the data of both tables (1 and 2) into one JSON with unique array values and put it into a third table.
The third table has the same structure as that of table 1 and table 2 and should have data as below
Data in the third table should be :
column1:
{"source" : ["s1", "s2", "s3"], "channels":["c1", "c2", "c3"]}
column2:
{"c1":["k1", "k2"], "c2":["k1", "k2", "k3", "k4", "k5"], "c3":["k6", "k7", "k8"]}
I tried many different types of query structure but somehow I am unable to accomplish the above task. One among them is,
SELECT array_cat(ARRAY(SELECT json_extract_path_text(a.column1, 'source')), ARRAY(SELECT json_extract_path_text(b.column1, 'source'))) AS txt_arr FROM table1 a, table2 b;
Please don't mind the above query; it's not even halfway correct.
Since I am new to postgres, I would really appreciate any sort of help.
Thank you
If you must...
select (
select json_object_agg(key, vals)
from (
select key, json_agg(value) vals
from (
select j.key, v.value
from table1
left join lateral json_each(column1) j on (true)
left join lateral json_array_elements_text(j.value) v on (true)
union
select j.key, v.value
from table2
left join lateral json_each(column1) j on (true)
left join lateral json_array_elements_text(j.value) v on (true)
) t1
group by key
) t1
) column1,
(
select json_object_agg(key, vals)
from (
select key, json_agg(value) vals
from (
select j.key, v.value
from table1
left join lateral json_each(column2) j on (true)
left join lateral json_array_elements_text(j.value) v on (true)
union
select j.key, v.value
from table2
left join lateral json_each(column2) j on (true)
left join lateral json_array_elements_text(j.value) v on (true)
) t1
group by key
) t1
) column2
I don't think it can be simplified, due to the heterogeneity of the data.
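Note that json_agg() without an ORDER BY clause gives no guarantee about element order. If the arrays should come out sorted, as in the expected data, one possible variant for column1 alone is sketched below. It swaps the LEFT JOIN LATERAL of the query above for CROSS JOIN LATERAL, which drops keys whose arrays are empty, so treat it as an illustration rather than a drop-in replacement.

```sql
-- Merge column1 of both tables: for each key, the array values are
-- de-duplicated (DISTINCT) and sorted (ORDER BY) before aggregation.
SELECT json_object_agg(key, vals) AS column1
FROM (
    SELECT j.key,
           json_agg(DISTINCT v.value ORDER BY v.value) AS vals
    FROM (
        SELECT column1 FROM table1
        UNION ALL
        SELECT column1 FROM table2
    ) AS src
    CROSS JOIN LATERAL json_each(src.column1) AS j
    CROSS JOIN LATERAL json_array_elements_text(j.value) AS v
    GROUP BY j.key
) AS merged;
```

The same pattern would apply to column2 by replacing the column name throughout.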

How to merge JSONB field in a tree structure?

I have a table in Postgres which stores a tree structure. Each node has a jsonb field: params_diff:
CREATE TABLE tree (id INT, parent_id INT, params_diff JSONB);
INSERT INTO tree VALUES
(1, NULL, '{ "some_key": "some value" }'::jsonb)
, (2, 1, '{ "some_key": "other value", "other_key": "smth" }'::jsonb)
, (3, 2, '{ "other_key": "smth else" }'::jsonb);
The thing I need is to select a node by id with additional generated params field which contains the result of merging all params_diff from the whole parents chain:
SELECT tree.*, /* some magic here */ AS params FROM tree WHERE id = 3;
id | parent_id | params_diff | params
----+-----------+----------------------------+-------------------------------------------------------
3 | 2 | {"other_key": "smth else"} | {"some_key": "other value", "other_key": "smth else"}
Generally, a recursive CTE can do the job. Example:
Use table alias in another query to traverse a tree
We just need a bit more magic to decompose, process and re-assemble the JSON result. I am assuming from your example that you want each key only once, with the first value in the search path (bottom-up):
WITH RECURSIVE cte AS (
SELECT id, parent_id, params_diff, 1 AS lvl
FROM tree
WHERE id = 3
UNION ALL
SELECT t.id, t.parent_id, t.params_diff, c.lvl + 1
FROM cte c
JOIN tree t ON t.id = c.parent_id
)
SELECT id, parent_id, params_diff
, (SELECT json_object(array_agg(key ORDER BY lvl)
, array_agg(value ORDER BY lvl))::jsonb
FROM (
SELECT key, value
FROM (
SELECT DISTINCT ON (key)
p.key, p.value, c.lvl
FROM cte c, jsonb_each_text(c.params_diff) p
ORDER BY p.key, c.lvl
) sub1
ORDER BY lvl
) sub2
) AS params
FROM cte
WHERE id = 3;
How?
Walk the tree with a classic recursive CTE.
Create a derived table with all keys and values with jsonb_each_text() in a LATERAL JOIN, remember the level in the search path (lvl).
Use DISTINCT ON to get the "first" (lowest lvl) value for each key. Details:
Select first row in each GROUP BY group?
Sort and aggregate resulting keys and values and feed the arrays to json_object() to build the final params value.
SQL Fiddle (only as far as pg 9.3 can go with json instead of jsonb).
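On Postgres 9.5 or later, where the jsonb concatenation operator || is available, the merge can probably be done inside the recursion itself. The following is a sketch of that alternative, not the method described above; it assumes a shallow (top-level) merge is enough and that the tree contains no cycles.

```sql
WITH RECURSIVE cte AS (
    -- Start at the requested node; params begins as its own params_diff.
    SELECT id, parent_id, params_diff AS params
    FROM tree
    WHERE id = 3
    UNION ALL
    -- Step up to the parent. With ||, the right operand wins on duplicate
    -- keys, so values collected closer to the start node take precedence.
    SELECT c.id, t.parent_id, t.params_diff || c.params
    FROM cte c
    JOIN tree t ON t.id = c.parent_id
)
-- The row whose parent_id is NULL has reached the root and holds the full merge.
SELECT t.*, c.params
FROM cte c
JOIN tree t ON t.id = c.id
WHERE c.parent_id IS NULL;
```

For the sample data, the merge proceeds from node 3 through node 2 to node 1, so "other_key" keeps the value from node 3 and "some_key" the value from node 2.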