How to handle differening parent json keys in postgresql jsonb - postgresql

I have json records ingested in jsonb format that have varying parent keys i want to access- most of the parent keys refer to a document schema
SELECT id, COALESCE(data->'TEXPORT'->'FORM_SECTION'->'F03_2014',
data->'TEXPORT'->'FORM_SECTION'->'F02_2014',
data->'TEXPORT'->'FORM_SECTION'->'NOTICE_UUID',
data->'TEXPORT'->'FORM_SECTION'->'F01_2014',
data->'TEXPORT'->'FORM_SECTION'->'F14_2014',
data->'TEXPORT'->'FORM_SECTION'->'F21_2014',
data->'TEXPORT'->'FORM_SECTION'->'F15_2014')->'OBJECT'->'SHORT_DESCR'->'P' from json_table
How can i make this cleaner and how do i do multiple coalesces? Ie. sometimes the SHORT_DESCR key is called something else also

You can write your own helper function:
CREATE FUNCTION first_property(value jsonb, VARIADIC keys text[]) RETURNS jsonb AS $$
SELECT value -> key
FROM UNNEST(keys) WITH ORDINALITY AS _(key, i)
WHERE value ? key
ORDER BY i
LIMIT 1;
$$ LANGUAGE SQL;
(Online demo)
With that, you can shorten your query to
SELECT
id,
first_property(
data->'TEXPORT'->'FORM_SECTION',
'F03_2014', 'F02_2014', 'NOTICE_UUID', 'F01_2014', 'F14_2014', 'F21_2014', 'F15_2014'
)->'OBJECT'->'SHORT_DESCR'->'P'
FROM json_table
and you can call it multiple times, like
SELECT
id,
first_property(
first_property(
data->'TEXPORT'->'FORM_SECTION',
'F03_2014', 'F02_2014', 'NOTICE_UUID', 'F01_2014', 'F14_2014', 'F21_2014', 'F15_2014'
)->'OBJECT',
'SHORT_DESCR', 'SDCR', 'DESC'
)->'P'
FROM json_table

Related

PostgreSQL -- JOIN UNNEST output with CTE INSERT ID -- INSERT many to many

In a PostgreSQL function, is it possible to join the result of UNNEST, which is an integer array from function input, with an ID returned from a CTE INSERT?
I have PostgreSQL tables like:
CREATE TABLE public.message (
id SERIAL PRIMARY KEY,
content TEXT
);
CREATE TABLE public.message_tag (
id SERIAL PRIMARY KEY,
message_id INTEGER NOT NULL CONSTRAINT message_tag_message_id_fkey REFERENCES public.message(id) ON DELETE CASCADE,
tag_id INTEGER NOT NULL CONSTRAINT message_tag_tag_id_fkey REFERENCES public.tag(id) ON DELETE CASCADE
);
I want to create a PostgreSQL function which takes input of content and an array of tag_id. This is for graphile. I want to do it all in one function, so I get a mutation.
Here's what I got so far. I don't know how to join an UNNEST across an id returned from a CTE.
CREATE FUNCTION public.create_message(content text, tags Int[])
RETURNS public.message
AS $$
-- insert to get primary key of message, for many to many message_id
WITH moved_rows AS (
INSERT INTO public.message (content)
RETURNING *;
)
-- many to many relation
INSERT INTO public.message_tag
SELECT moved_rows.id as message_id, tagInput.tag_id FROM moved_rows, UNNEST(tags) as tagInput;
RETURNING *
$$ LANGUAGE sql VOLATILE STRICT;
You're not that far from your goal:
the semicolon placement in the CTE is wrong
the first INSERT statement lacks a SELECT or VALUES clause to specify what should be inserted
the INSERT into tag_message should specify the columns in which to insert (especially if you have that unnecessary serial id)
you specified a relation alias for the UNNEST call already, but none for the column tag_id
your function was RETURNING a set of message_tag rows but was specified to return a single message row
To fix these:
CREATE FUNCTION public.create_message(content text, tags Int[])
RETURNS public.message
AS $$
-- insert to get primary key of message, for many to many message_id
WITH moved_rows AS (
INSERT INTO public.message (content)
VALUES ($1)
RETURNING *
),
-- many to many relation
_ AS (
INSERT INTO public.message_tag (message_id, tag_id)
SELECT moved_rows.id, tagInput.tag_id
FROM moved_rows, UNNEST($2) as tagInput(tag_id)
)
TABLE moved_rows;
$$ LANGUAGE sql VOLATILE STRICT;
(Online demo)

Update hstore values with other hstore values

I have a summary table that is updated with new data on a regulary basis. One of the columns is of type hstore. When I update with new data I want to add the value of a key to the existing value of the key if the key exists, otherwise I want to add the pair to the hstore.
Existing data:
id sum keyvalue
--------------------------------------
1 2 "key1"=>"1","key2"=>"1"
New data:
id sum keyvalue
--------------------------------------------------
1 3 "key1"=>"1","key2"=>"1","key3"=>"1"
Wanted result:
id sum keyvalue
--------------------------------------------------
1 5 "key1"=>"2","key2"=>"2","key3"=>"1"
I want to do this in a on conflict part of an insert.
The sum part was easy. But I have not found how to concatenate the hstore in this way.
There is nothing built int. You have to write a function that accepts to hstore values and merges them in the way you want.
create function merge_and_increment(p_one hstore, p_two hstore)
returns hstore
as
$$
select hstore_agg(hstore(k,v))
from (
select k, sum(v::int)::text as v
from (
select *
from each(p_one) as t1(k,v)
union all
select *
from each(p_two) as t2(k,v)
) x
group by k
) s
$$
language sql;
The hstore_agg() function isn't built-in as well, but it's easy to define it:
create aggregate hstore_agg(hstore)
(
sfunc = hs_concat(hstore, hstore),
stype = hstore
);
So the result of this:
select merge_and_increment(hstore('"key1"=>"1","key2"=>"1"'), hstore('"key1"=>"1","key2"=>"1","key3"=>"1"'))
is:
merge_and_increment
-------------------------------------
"key1"=>"2", "key2"=>"2", "key3"=>"1"
Note that the function will fail miserably if there are values that can't be converted to an integer.
With an insert statement you can use it like this:
insert into the_table (id, sum, data)
values (....)
on conflict (id) do update
set sum = the_table.sum + excluded.sum,
data = merge_and_increment(the_table.data, excluded.data);
demo:db<>fiddle
CREATE OR REPLACE FUNCTION sum_hstore(_old hstore, _new hstore) RETURNS hstore
AS $$
DECLARE
_out hstore;
BEGIN
SELECT
hstore(array_agg(key), array_agg(value::text))
FROM (
SELECT
key,
SUM(value::int) AS value
FROM (
SELECT * FROM each('"key1"=>"1","key2"=>"1"'::hstore)
UNION ALL
SELECT * FROM each('"key1"=>"1","key2"=>"1","key3"=>"1"')
) s
GROUP BY key
) s
INTO _out;
RETURN _out;
END;
$$
LANGUAGE plpgsql;
each() expands the key/value pairs into one row per pair with columns key and value
convert type text into type int and group/sum the values
Aggregate into a new hstore value using the hstore(array, array) function. The array elements are the values of the key column and the values of the value column.
You can do such an update:
UPDATE mytable
SET keyvalue = sum_hstore(keyvalue, '"key1"=>"1","key2"=>"1","key3"=>"1"')
WHERE id = 1;

Avoid putting PostgreSQL function result into one field

The end result of what I am after is a query that calls a function and that function returns a set of records that are in their own separate fields. I can do this but the results of the function are all in one field.
ie: http://i.stack.imgur.com/ETLCL.png and the results I am after are: http://i.stack.imgur.com/wqRQ9.png
Here's the code to create the table
CREATE TABLE tbl_1_hm
(
tbl_1_hm_id bigserial NOT NULL,
tbl_1_hm_f1 VARCHAR (250),
tbl_1_hm_f2 INTEGER,
CONSTRAINT tbl_1_hm PRIMARY KEY (tbl_1_hm_id)
)
-- do that for a few times to get some data
INSERT INTO tbl_1_hm (tbl_1_hm_f1, tbl_1_hm_f2)
VALUES ('hello', 1);
CREATE OR REPLACE FUNCTION proc_1_hm(id BIGINT)
RETURNS TABLE(tbl_1_hm_f1 VARCHAR (250), tbl_1_hm_f2 int AS $$
SELECT tbl_1_hm_f1, tbl_1_hm_f2
FROM tbl_1_hm
WHERE tbl_1_hm_id = id
$$ LANGUAGE SQL;
--And here is the current query I am running for my results:
SELECT t1.tbl_1_hm_id, proc_1_hm(t1.tbl_1_hm_id) AS t3
FROM tbl_1_hm AS t1
Thanks for having a read. Please if you want to haggle about the semantics of what I am doing by hitting the same table twice or my naming convention --> this is a simplified test.
When a function returns a set of records, you should treat it as a table source:
SELECT t1.tbl_1_hm_id, t3.*
FROM tbl_1_hm AS t1, proc_1_hm(t1.tbl_1_hm_id) AS t3;
Note that functions are implicitly using a LATERAL join (scroll down to sub-sections 4 and 5) so you can use fields from tables listed previously without having to specify an explicit JOIN condition.

Merging Concatenating JSON(B) columns in query

Using Postgres 9.4, I am looking for a way to merge two (or more) json or jsonb columns in a query. Consider the following table as an example:
id | json1 | json2
----------------------------------------
1 | {'a':'b'} | {'c':'d'}
2 | {'a1':'b2'} | {'f':{'g' : 'h'}}
Is it possible to have the query return the following:
id | json
----------------------------------------
1 | {'a':'b', 'c':'d'}
2 | {'a1':'b2', 'f':{'g' : 'h'}}
Unfortunately, I can't define a function as described here. Is this possible with a "traditional" query?
In Postgres 9.5+ you can merge JSONB like this:
select json1 || json2;
Or, if it's JSON, coerce to JSONB if necessary:
select json1::jsonb || json2::jsonb;
Or:
select COALESCE(json1::jsonb||json2::jsonb, json1::jsonb, json2::jsonb);
(Otherwise, any null value in json1 or json2 returns an empty row)
For example:
select data || '{"foo":"bar"}'::jsonb from photos limit 1;
?column?
----------------------------------------------------------------------
{"foo": "bar", "preview_url": "https://unsplash.it/500/720/123"}
Kudos to #MattZukowski for pointing this out in a comment.
Here is the complete list of build-in functions that can be used to create json objects in PostgreSQL. http://www.postgresql.org/docs/9.4/static/functions-json.html
row_to_json and json_object doest not allow you to define your own keys, so it can't be used here
json_build_object expect you to know by advance how many keys and values our object will have, that's the case in your example, but should not be the case in the real world
json_object looks like a good tool to tackle this problem but it forces us to cast our values to text so we can't use this one either
Well... ok, wo we can't use any classic functions.
Let's take a look at some aggregate functions and hope for the best... http://www.postgresql.org/docs/9.4/static/functions-aggregate.html
json_object_agg Is the only aggregate function that build objects, that's our only chance to tackle this problem. The trick here is to find the correct way to feed the json_object_agg function.
Here is my test table and data
CREATE TABLE test (
id SERIAL PRIMARY KEY,
json1 JSONB,
json2 JSONB
);
INSERT INTO test (json1, json2) VALUES
('{"a":"b", "c":"d"}', '{"e":"f"}'),
('{"a1":"b2"}', '{"f":{"g" : "h"}}');
And after some trials and errors with json_object here is a query you can use to merge json1 and json2 in PostgreSQL 9.4
WITH all_json_key_value AS (
SELECT id, t1.key, t1.value FROM test, jsonb_each(json1) as t1
UNION
SELECT id, t1.key, t1.value FROM test, jsonb_each(json2) as t1
)
SELECT id, json_object_agg(key, value)
FROM all_json_key_value
GROUP BY id
For PostgreSQL 9.5+, look at Zubin's answer.
This function would merge nested json objects
create or replace function jsonb_merge(CurrentData jsonb,newData jsonb)
returns jsonb
language sql
immutable
as $jsonb_merge_func$
select case jsonb_typeof(CurrentData)
when 'object' then case jsonb_typeof(newData)
when 'object' then (
select jsonb_object_agg(k, case
when e2.v is null then e1.v
when e1.v is null then e2.v
when e1.v = e2.v then e1.v
else jsonb_merge(e1.v, e2.v)
end)
from jsonb_each(CurrentData) e1(k, v)
full join jsonb_each(newData) e2(k, v) using (k)
)
else newData
end
when 'array' then CurrentData || newData
else newData
end
$jsonb_merge_func$;
Looks like nobody proposed this kind of solution yet, so here's my take, using custom aggregate functions:
create or replace aggregate jsonb_merge_agg(jsonb)
(
sfunc = jsonb_concat,
stype = jsonb,
initcond = '{}'
);
create or replace function jsonb_concat(a jsonb, b jsonb) returns jsonb
as 'select $1 || $2'
language sql
immutable
parallel safe
;
Note: this is using || which replaces existing values at same path instead of deeply merging them.
Now jsonb_merge_agg is accessible like so:
select jsonb_merge_agg(some_col) from some_table group by something;
Also you can tranform json into text, concatenate, replace and convert back to json. Using the same data from Clément you can do:
SELECT replace(
(json1::text || json2::text),
'}{',
', ')::json
FROM test
You could also concatenate all json1 into single json with:
SELECT regexp_replace(
array_agg((json1))::text,
'}"(,)"{|\\| |^{"|"}$',
'\1',
'g'
)::json
FROM test
This is a very old solution, since 9.4 you should use json_object_agg and simple || concatenate operator. Keeping here just for reference.
However this question is answered already some time ago; the fact that when json1 and json2 contain the same key; the key appears twice in the document, does not seem to be best practice.
Therefore u can use this jsonb_merge function with PostgreSQL 9.5:
CREATE OR REPLACE FUNCTION jsonb_merge(jsonb1 JSONB, jsonb2 JSONB)
RETURNS JSONB AS $$
DECLARE
result JSONB;
v RECORD;
BEGIN
result = (
SELECT json_object_agg(KEY,value)
FROM
(SELECT jsonb_object_keys(jsonb1) AS KEY,
1::int AS jsb,
jsonb1 -> jsonb_object_keys(jsonb1) AS value
UNION SELECT jsonb_object_keys(jsonb2) AS KEY,
2::int AS jsb,
jsonb2 -> jsonb_object_keys(jsonb2) AS value ) AS t1
);
RETURN result;
END;
$$ LANGUAGE plpgsql;
The following query returns the concatenated jsonb columns, where the keys in json2 are dominant over the keys in json1:
select id, jsonb_merge(json1, json2) from test
FYI, if someone's using jsonb in >= 9.5 and they only care about top-level elements being merged without duplicate keys, then it's as easy as using the || operator:
select '{"a1": "b2"}'::jsonb || '{"f":{"g" : "h"}}'::jsonb;
?column?
-----------------------------
{"a1": "b2", "f": {"g": "h"}}
(1 row)
Try this, if anyone having an issue for merging two JSON object
select table.attributes::jsonb || json_build_object('foo',1,'bar',2)::jsonb FROM table where table.x='y';
CREATE OR REPLACE FUNCTION jsonb_merge(pCurrentData jsonb, pMergeData jsonb, pExcludeKeys text[])
RETURNS jsonb IMMUTABLE LANGUAGE sql
AS $$
SELECT json_object_agg(key,value)::jsonb
FROM (
WITH to_merge AS (
SELECT * FROM jsonb_each(pMergeData)
)
SELECT *
FROM jsonb_each(pCurrentData)
WHERE key NOT IN (SELECT key FROM to_merge)
AND ( pExcludeKeys ISNULL OR key <> ALL(pExcludeKeys))
UNION ALL
SELECT * FROM to_merge
) t;
$$;
SELECT jsonb_merge('{"a": 1, "b": 9, "c": 3, "e":5}'::jsonb, '{"b": 2, "d": 4}'::jsonb, '{"c","e"}'::text[]) as jsonb
works well as an alternative to || when recursive deep merge is required (found here) :
create or replace function jsonb_merge_recurse(orig jsonb, delta jsonb)
returns jsonb language sql as $$
select
jsonb_object_agg(
coalesce(keyOrig, keyDelta),
case
when valOrig isnull then valDelta
when valDelta isnull then valOrig
when (jsonb_typeof(valOrig) <> 'object' or jsonb_typeof(valDelta) <> 'object') then valDelta
else jsonb_merge_recurse(valOrig, valDelta)
end
)
from jsonb_each(orig) e1(keyOrig, valOrig)
full join jsonb_each(delta) e2(keyDelta, valDelta) on keyOrig = keyDelta
$$;

Ordering a query on a field in a return record

I've got a query that calls a function in its select clause. The function returns a record type. In the calling query, I want to order by one of the fields in the returned record and if possible I'd also like to return the fields of the record as fields of the calling query. To make this clear, here's a simplified version of the code:
CREATE OR REPLACE FUNCTION getStatus(lastContact timestamptz, lastAlTime timestamptz, lastGps timestamptz, out status varchar, out toelichting varchar, out colorLevel integer)
RETURNS record AS
$BODY$
BEGIN
status := 'controle_status_ok';
toelichting := '';
colorLevel := 3;
END
$BODY$
LANGUAGE 'plpgsql' VOLATILE
COST 100;
ALTER FUNCTION DMI_Controle_getStatus(timestamptz, timestamptz, timestamptz, out varchar, out varchar, out integer) OWNER TO xyz;
Using this function, I want to have a query like this one:
SELECT
id,
name,
getStatus(tabel3.lastcontact, tabel4.lastchanged, tabel5.lastfound) as status
FROM
tabel1
left join tabel2 on ...
left join tabel3 on ...
left join tabel4 on ...
left join tabel5 on ...
ORDER BY
status
Postgres comes with the following error:
ERROR: could not identify an ordering operator for type record
HINT: Use an explicit ordering operator or modify the query.
The question: how should I order by the value of colorLevel that's been returned by getStatus?
Additional question: can I return the three fields of the getStatus function at fields of the query that calls the getStatus function?
Use
ORDER BY (status).colorlevel
to reference a column of your record type.
As an aside: I used lower case(colorlevel instead of colorLevel) because identifiers are cast to lower case if not double-quoted anyway, and using mixed case identifiers is generally a bad idea in PostgreSQL.
As to your additional question, similar syntax requirement. I also use a subquery to optimize the query:
SELECT id
, name
, (x.status).status
, (x.status).toelichting
, (x.status).colorLevel
FROM tabel
, (SELECT getStatus(now(), now(), now()) as status) x
ORDER BY (x.status).colorlevel
Read about accessing composite types in the manual.
Answer after additional input
To use columns from your tables, put it all in the a subquery. I am trying to avoid to call the function multiple times, because that may be expensive.
SELECT
id,
name,
(status).status,
(status).toelichting,
(status).colorLevel
FROM (
SELECT
id,
name,
getStatus(tabel3.lastcontact, tabel4.lastchanged, tabel5.lastfound) as status
FROM
tabel1
left join tabel2 on ...
left join tabel3 on ...
left join tabel4 on ...
left join tabel5 on ...
) x
ORDER BY
(status).colorlevel