How to return different format of records from a single PL/pgSQL function? - postgresql

I am a frontend developer but I started to write backend stuff. I have spent quite some amount of time trying to figure out how to solve this. I really need some help.
Here are the simplified definitions and relations of two tables:
Relationship between tables
CREATE TABLE IF NOT EXISTS items (
item_id uuid NOT NULL DEFAULT gen_random_uuid() ,
parent_id uuid DEFAULT NULL ,
parent_table parent_tables NOT NULL
);
CREATE TABLE IF NOT EXISTS collections (
collection_id uuid NOT NULL DEFAULT gen_random_uuid() ,
parent_id uuid DEFAULT NULL
);
Our product is an online document collaboration tool, page can have nested pages.
I have a piece of PostgreSQL code for getting all of its ancestor records for given item_ids.
WITH RECURSIVE ancestors AS (
SELECT *
FROM items
WHERE item_id in ( ${itemIds} )
UNION
SELECT i.*
FROM items i
INNER JOIN ancestors a ON a.parent_id = i.item_id
)
SELECT * FROM ancestors
It works fine for nesting regular pages, But if I am going to support nesting collection pages, which means some items' parent_id might refer to "collection" table's collection_id, this code will not work anymore. According to my limited experience, I don't think pure SQL code can solve it. I think writing a PL/pgSQL function might be a solution, but I need to get all ancestor records to given itemIds, which means returning a mix of items and collections records.
So how to return different format of records from a single PL/pgSQL function? I did some research but haven't found any example.

You can make it work by returning a superset as row: comprised of item and collection. One of both will be NULL for each result row.
WITH RECURSIVE ancestors AS (
SELECT 0 AS lvl, i.parent_id, i.parent_table, i AS _item, NULL::collections AS _coll
FROM items i
WHERE item_id IN ( ${itemIds} )
UNION ALL -- !
SELECT lvl + 1, COALESCE(i.parent_id, c.parent_id), COALESCE(i.parent_table, 'i'), i, c
FROM ancestors a
LEFT JOIN items i ON a.parent_table = 'i' AND i.item_id = a.parent_id
LEFT JOIN collections c ON a.parent_table = 'c' AND c.collection_id = a.parent_id
WHERE a.parent_id IS NOT NULL
)
SELECT lvl, _item, _coll
FROM ancestors
-- ORDER BY ?
db<>fiddle here
UNION ALL, not UNION.
Assuming a collection's parent is always an item, while an item can go either way.
We need LEFT JOIN on both potential parent tables to stay in the race.
I added an optional lvl to keep track of the level of hierarchy.
About decomposing row types:
Combine postgres function with query
Record returned from function has columns concatenated

Related

PGSQL - How to efficiently flatten key/value table [duplicate]

Does any one know how to create crosstab queries in PostgreSQL?
For example I have the following table:
Section Status Count
A Active 1
A Inactive 2
B Active 4
B Inactive 5
I would like the query to return the following crosstab:
Section Active Inactive
A 1 2
B 4 5
Is this possible?
Install the additional module tablefunc once per database, which provides the function crosstab(). Since Postgres 9.1 you can use CREATE EXTENSION for that:
CREATE EXTENSION IF NOT EXISTS tablefunc;
Improved test case
CREATE TABLE tbl (
section text
, status text
, ct integer -- "count" is a reserved word in standard SQL
);
INSERT INTO tbl VALUES
('A', 'Active', 1), ('A', 'Inactive', 2)
, ('B', 'Active', 4), ('B', 'Inactive', 5)
, ('C', 'Inactive', 7); -- ('C', 'Active') is missing
Simple form - not fit for missing attributes
crosstab(text) with 1 input parameter:
SELECT *
FROM crosstab(
'SELECT section, status, ct
FROM tbl
ORDER BY 1,2' -- needs to be "ORDER BY 1,2" here
) AS ct ("Section" text, "Active" int, "Inactive" int);
Returns:
Section | Active | Inactive
---------+--------+----------
A | 1 | 2
B | 4 | 5
C | 7 | -- !!
No need for casting and renaming.
Note the incorrect result for C: the value 7 is filled in for the first column. Sometimes, this behavior is desirable, but not for this use case.
The simple form is also limited to exactly three columns in the provided input query: row_name, category, value. There is no room for extra columns like in the 2-parameter alternative below.
Safe form
crosstab(text, text) with 2 input parameters:
SELECT *
FROM crosstab(
'SELECT section, status, ct
FROM tbl
ORDER BY 1,2' -- could also just be "ORDER BY 1" here
, $$VALUES ('Active'::text), ('Inactive')$$
) AS ct ("Section" text, "Active" int, "Inactive" int);
Returns:
Section | Active | Inactive
---------+--------+----------
A | 1 | 2
B | 4 | 5
C | | 7 -- !!
Note the correct result for C.
The second parameter can be any query that returns one row per attribute matching the order of the column definition at the end. Often you will want to query distinct attributes from the underlying table like this:
'SELECT DISTINCT attribute FROM tbl ORDER BY 1'
That's in the manual.
Since you have to spell out all columns in a column definition list anyway (except for pre-defined crosstabN() variants), it is typically more efficient to provide a short list in a VALUES expression like demonstrated:
$$VALUES ('Active'::text), ('Inactive')$$)
Or (not in the manual):
$$SELECT unnest('{Active,Inactive}'::text[])$$ -- short syntax for long lists
I used dollar quoting to make quoting easier.
You can even output columns with different data types with crosstab(text, text) - as long as the text representation of the value column is valid input for the target type. This way you might have attributes of different kind and output text, date, numeric etc. for respective attributes. There is a code example at the end of the chapter crosstab(text, text) in the manual.
db<>fiddle here
Effect of excess input rows
Excess input rows are handled differently - duplicate rows for the same ("row_name", "category") combination - (section, status) in the above example.
The 1-parameter form fills in available value columns from left to right. Excess values are discarded.
Earlier input rows win.
The 2-parameter form assigns each input value to its dedicated column, overwriting any previous assignment.
Later input rows win.
Typically, you don't have duplicates to begin with. But if you do, carefully adjust the sort order to your requirements - and document what's happening.
Or get fast arbitrary results if you don't care. Just be aware of the effect.
Advanced examples
Pivot on Multiple Columns using Tablefunc - also demonstrating mentioned "extra columns"
Dynamic alternative to pivot with CASE and GROUP BY
\crosstabview in psql
Postgres 9.6 added this meta-command to its default interactive terminal psql. You can run the query you would use as first crosstab() parameter and feed it to \crosstabview (immediately or in the next step). Like:
db=> SELECT section, status, ct FROM tbl \crosstabview
Similar result as above, but it's a representation feature on the client side exclusively. Input rows are treated slightly differently, hence ORDER BY is not required. Details for \crosstabview in the manual. There are more code examples at the bottom of that page.
Related answer on dba.SE by Daniel Vérité (the author of the psql feature):
How do I generate a pivoted CROSS JOIN where the resulting table definition is unknown?
SELECT section,
SUM(CASE status WHEN 'Active' THEN count ELSE 0 END) AS active, --here you pivot each status value as a separate column explicitly
SUM(CASE status WHEN 'Inactive' THEN count ELSE 0 END) AS inactive --here you pivot each status value as a separate column explicitly
FROM t
GROUP BY section
You can use the crosstab() function of the additional module tablefunc - which you have to install once per database. Since PostgreSQL 9.1 you can use CREATE EXTENSION for that:
CREATE EXTENSION tablefunc;
In your case, I believe it would look something like this:
CREATE TABLE t (Section CHAR(1), Status VARCHAR(10), Count integer);
INSERT INTO t VALUES ('A', 'Active', 1);
INSERT INTO t VALUES ('A', 'Inactive', 2);
INSERT INTO t VALUES ('B', 'Active', 4);
INSERT INTO t VALUES ('B', 'Inactive', 5);
SELECT row_name AS Section,
category_1::integer AS Active,
category_2::integer AS Inactive
FROM crosstab('select section::text, status, count::text from t',2)
AS ct (row_name text, category_1 text, category_2 text);
DB Fiddle here:
Everything works: https://dbfiddle.uk/iKCW9Uhh
Without CREATE EXTENSION tablefunc; you get this error: https://dbfiddle.uk/j8W1CMvI
ERROR: function crosstab(unknown, integer) does not exist
LINE 4: FROM crosstab('select section::text, status, count::text fro...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
Solution with JSON aggregation:
CREATE TEMP TABLE t (
section text
, status text
, ct integer -- don't use "count" as column name.
);
INSERT INTO t VALUES
('A', 'Active', 1), ('A', 'Inactive', 2)
, ('B', 'Active', 4), ('B', 'Inactive', 5)
, ('C', 'Inactive', 7);
SELECT section,
(obj ->> 'Active')::int AS active,
(obj ->> 'Inactive')::int AS inactive
FROM (SELECT section, json_object_agg(status,ct) AS obj
FROM t
GROUP BY section
)X
Sorry this isn't complete because I can't test it here, but it may get you off in the right direction. I'm translating from something I use that makes a similar query:
select mt.section, mt1.count as Active, mt2.count as Inactive
from mytable mt
left join (select section, count from mytable where status='Active')mt1
on mt.section = mt1.section
left join (select section, count from mytable where status='Inactive')mt2
on mt.section = mt2.section
group by mt.section,
mt1.count,
mt2.count
order by mt.section asc;
The code I'm working from is:
select m.typeID, m1.highBid, m2.lowAsk, m1.highBid - m2.lowAsk as diff, 100*(m1.highBid - m2.lowAsk)/m2.lowAsk as diffPercent
from mktTrades m
left join (select typeID,MAX(price) as highBid from mktTrades where bid=1 group by typeID)m1
on m.typeID = m1.typeID
left join (select typeID,MIN(price) as lowAsk from mktTrades where bid=0 group by typeID)m2
on m1.typeID = m2.typeID
group by m.typeID,
m1.highBid,
m2.lowAsk
order by diffPercent desc;
which will return a typeID, the highest price bid and the lowest price asked and the difference between the two (a positive difference would mean something could be bought for less than it can be sold).
There's a different dynamic method that I've devised, one that employs a dynamic rec. type (a temp table, built via an anonymous procedure) & JSON. This may be useful for an end-user who can't install the tablefunc/crosstab extension, but can still create temp tables or run anon. proc's.
The example assumes all the xtab columns are the same type (INTEGER), but the # of columns is data-driven & variadic. That said, JSON aggregate functions do allow for mixed data types, so there's potential for innovation via the use of embedded composite (mixed) types.
The real meat of it can be reduced down to one step if you want to statically define the rec. type inside the JSON recordset function (via nested SELECTs that emit a composite type).
dbfiddle.uk
https://dbfiddle.uk/N1EzugHk
Crosstab function is available under the tablefunc extension. You'll have to create this extension one time for the database.
CREATE EXTENSION tablefunc;
You can use the below code to create pivot table using cross tab:
create table test_Crosstab( section text,
status text,
count numeric)
insert into test_Crosstab values ( 'A','Active',1)
,( 'A','Inactive',2)
,( 'B','Active',4)
,( 'B','Inactive',5)
select * from crosstab(
'select section
,status
,count
from test_crosstab'
)as ctab ("Section" text,"Active" numeric,"Inactive" numeric)

How to identify and list Circular Reference elements in Postgres

I have an employee , manager hierarchy which could end up being circular.
Ex:
28397468N>88518119N>87606705N>28397468N
Create Table emp_manager ( Emp_id varchar(30), Manager_id varchar(30));
Insert into emp_manager values ('28397468N','88518119N');
Insert into emp_manager values ('88518119N','87606705N');
Insert into emp_manager values ('87606705N','28397468N');
My requirement is:
When my proc is called and there are circular hierarchies in the emp_manager table, we should return an error listing the employees in the hierarchy.
The below link contains some useful info:
https://mccalljt.io/blog/2017/01/postgres-circular-references/
I have modified it as below:
select * from (
WITH RECURSIVE circular_managers(Emp_id, Manager_id, depth, path, cycle) AS (
SELECT u.Emp_id, u.Manager_id, 1,
ARRAY[u.Emp_id],
false
FROM emp_manager u
UNION ALL
SELECT u.Emp_id, u.Manager_id, cm.depth + 1,
(path || u.Emp_id)::character varying(32)[],
u.Emp_id = ANY(path)
FROM emp_manager u, circular_managers cm
WHERE u.Emp_id = cm.Manager_id AND NOT cycle
)
select
distinct (path) d
FROM circular_managers
WHERE cycle
AND path[1] = path[array_upper(path, 1)]) cm
BUT, the problem is, it is returning all combinations of the hierarchy:
{28397468N,88518119N,87606705N,28397468N}
{87606705N,28397468N,88518119N,87606705N}
{88518119N,87606705N,28397468N,88518119N}
I need a simple answer like this:
28397468N>88518119N>87606705N>28397468N
even this will do:
28397468N>88518119N>87606705N
Please help!
So all references:
{28397468N,88518119N,87606705N,28397468N}
{87606705N,28397468N,88518119N,87606705N}
{88518119N,87606705N,28397468N,88518119N}
are correct but just start from different element.
I need a simple answer like this: 28397468N>88518119N>87606705N>28397468N
So what's needed is a filter for the same circle refs.
Let's do that in a way:
sort distinct items in arrays
aggregate them back - so for all references it will be '{28397468N,87606705N,88518119N}'
use produced value for DISTINCT FIRST_VALUE
WITH D (circle_ref ) AS (
VALUES
('{28397468N,88518119N,87606705N,28397468N}'::text[]),
('{87606705N,28397468N,88518119N,87606705N}'::text[]),
('{88518119N,87606705N,28397468N,88518119N}'::text[])
), ordered AS (
SELECT
D.circle_ref,
(SELECT ARRAY_AGG(DISTINCT el ORDER BY el) FROM UNNEST(D.circle_ref) AS el ) AS ordered_circle
FROM
D
)
SELECT DISTINCT
FIRST_VALUE (circle_ref) OVER (PARTITION BY ordered_circle ORDER BY circle_ref) AS circle_ref
FROM
ordered;
circle_ref
{28397468N,88518119N,87606705N,28397468N}
DB Fiddle: https://www.db-fiddle.com/f/6ytb2v11s8T95PPLoTZZed/0
To prevent circular references, you can use a closure table and a trigger - as explained in https://stackoverflow.com/a/38701519/5962802
The closure table will also allow you to easily get all subordinates for a given supervisor (no matter how deep in the hierarchy) - or all direct bosses of a given employee (up to the root).
Before using the rebuild_tree stored procedure you will have to remove all circular references from the hierarchy.

How can I restrict a result to only include rows where one specific field is unique with UNION Select statement in BigQuery?

I have the following code. I try to stitch the two tables together, but restrict it to only add duplicate Opportunity_ID once, and then from the second table (OpportunitiesUpdates).
SELECT
Opportunity.Account_Name,
Opportunity.Opportunity_Name,
Opportunity.Opportunity_Owner,
Opportunity.Opportunity_ID
FROM
Opportunity
UNION DISTINCT
SELECT
OpportunityUpdates.Account_Name,
OpportunityUpdates.Opportunity_Name,
OpportunityUpdates.Opportunity_Owner,
OpportunityUpdates.Opportunity_ID
FROM
OpportunityUpdates
WHERE OpportunityUpdates.Opportunity_ID <> Opportunity.Opportunity_ID
This code consolidates all records from both tables (by Opportunity_ID) and gives priority to the OpportunityUpdates table based on Opportunity_ID.
It assumes that the same Opportunity_ID could be in either table ("duplicates"), but that within each table an Opportunity_ID is unique. It also assumes that Opportunity_ID is not nullable (never null).
SELECT DISTINCT
IF(ou.Opportunity_ID IS NOT NULL, ou.Account_Name, o.Account_Name) Account_Name,
IF(ou.Opportunity_ID IS NOT NULL, ou.Opportunity_Name, o.Opportunity_Name) Opportunity_Name,
IF(ou.Opportunity_ID IS NOT NULL, ou.Opportunity_Owner, o.Opportunity_Owner) Opportunity_Owner,
COALESCE(ou.Opportunity_ID, o.Opportunity_ID) Opportunity_ID
FROM OpportunityUpdates ou
FULL OUTER JOIN
Opportunity o
ON o.Opportunity_ID = ou.Opportunity_ID

Building a jsonb object without using subquery in postgresql 12

I have a table called users that has a jsonb column called 'history'. This is an array of objects, one of the elements is called uid which is the id of the person visiting the page as follows:
[ {"ip":"...","uid":2} , {"ip":"...","uid":4} , ... ]
I'm running a query that appends the jsonb object with the field uname to make understanding who 'uid' is a bit easier which will produce:
[ {"ip":"...","uid":2,"uname":"bob"} , {"ip":"...","uid":4,"uname":"dave"} , ... ]
I'm currently doing this using the following query (say, where uid=2):
SELECT json_agg(history2||jsonb_build_object('uname',uname::text)) FROM
(SELECT jsonb_array_elements(history) AS history2 FROM users WHERE uid=2) AS table1
LEFT JOIN users AS table2 ON history2->>'uid'=table2.uid
I'm using the subquery to return a table of json objects that's then joined to the user table again to get the username.
My question is: Is there a way of doing this without having the subquery? I've read that lateral joins could be used but all my attempts at this don't seem to work.
Thanks in advance.
You can move jsonb_array_elements into the FROM clause with an outer join:
SELECT jsonb_agg(h.item||jsonb_build_object('uname', u.uname))
FROM users u
LEFT JOIN jsonb_array_elements(u.history) as h(item) on h.item ->> 'uid' = u.uid::text
WHERE u.uid = 2

Fetch rows from postgres table which contains a specific id in jsonb[] column

I have a details table with adeet column defined as jsonb[]
a sample value stored in adeet column is as below image
Sample data stored in DB :
I want to return the rows which satisfies id=26088 i.e row 1 and 3
I have tried array operations and json operations but it does'nt work as required. Any pointers
Obviously the type of the column adeet is not of type JSON/JSONB, but maybe VARCHAR and we should fix the format so as to convert into a JSONB type. I used replace() and r/ltrim() funcitons for this conversion, and preferred to derive an array in order to use jsonb_array_elements() function :
WITH t(jobid,adeet) AS
(
SELECT jobid, replace(replace(replace(adeet,'\',''),'"{','{'),'}"','}')
FROM tab
), t2 AS
(
SELECT jobid, ('['||rtrim(ltrim(adeet,'{'), '}')||']')::jsonb as adeet
FROM t
)
SELECT t.*
FROM t2 t
CROSS JOIN jsonb_array_elements(adeet) j
WHERE (j.value ->> 'id')::int = 26088
Demo
You want to combine JSONB's <# operator with the generic-array ANY construct.
select * from foobar where '{"id":26088}' <# ANY (adeet);