How would you read a csv in a stored procedure such that the csv needs data extraction? - postgresql

The csv has urls of images in the format -
www.domain.com/table_id/x_y_height_width.jpg
We want to extract table_id, x, y, height and width from these urls in a stored procedure and then use these parameters in multiple sql queries.
How can we do that?

regexp_split_to_array and split_part functions
create or replace function split_url (
_url text, out table_id int, out x int, out y int, out height int, out width int
) as $$
select
a[2]::int,
split_part(a[3], '_', 1)::int,
split_part(a[3], '_', 2)::int,
split_part(a[3], '_', 3)::int,
split_part(split_part(a[3], '_', 4), '.', 1)::int
from (values
(regexp_split_to_array(_url, '/'))
) rsa(a);
$$ language sql immutable;
select *
from split_url('www.domain.com/234/34_12_400_300.jpg');
table_id | x | y | height | width
----------+----+----+--------+-------
234 | 34 | 12 | 400 | 300
To use the function with other tables do lateral:
with t (url) as ( values
('www.domain.com/234/34_12_400_300.jpg'),
('www.examplo.com/984/12_90_250_360.jpg')
)
select *
from
t
cross join lateral
split_url(url)
;
url | table_id | x | y | height | width
---------------------------------------+----------+----+----+--------+-------
www.domain.com/234/34_12_400_300.jpg | 234 | 34 | 12 | 400 | 300
www.examplo.com/984/12_90_250_360.jpg | 984 | 12 | 90 | 250 | 360

Related

Oracle round function with scientific notation

Dpes anyone know how to get this to work in Oracle? The value is what is in the table and I need to round this to 2 digits:
Select round(3.00345678908765E+60, 2) from dual
You need to shift the rounding by -60 decimal places (so from +2 to -58):
Select round( 3.00345678908765E+60, -58) from dual
Outputs:
| ROUND(3.00345678908765E+60,-58) |
| ------------------------------------------------------------: |
| 3000000000000000000000000000000000000000000000000000000000000 |
If you want to round to 3 significant figures:
SELECT ROUND( value, 2 - FLOOR( LOG( 10, value ) ) ) FROM table_name;
Which, for the sample data:
CREATE TABLE table_name ( value ) AS
SELECT 3.00345678908765E+60 FROM DUAL UNION ALL
SELECT 3.00555678908765E+60 FROM DUAL UNION ALL
SELECT 3.05345678908765E+58 FROM DUAL;
Outputs:
| ROUND(VALUE,2-FLOOR(LOG(10,VALUE))) |
| ------------------------------------------------------------: |
| 3000000000000000000000000000000000000000000000000000000000000 |
| 3010000000000000000000000000000000000000000000000000000000000 |
| 30500000000000000000000000000000000000000000000000000000000 |
Or you can use TO_CHAR to convert the value to a string rounded in scientific notation:
SELECT TO_CHAR( value, 'FM0.00EEEE' ) AS as_string,
TO_NUMBER(TO_CHAR( value, 'FM0.00EEEE' )) AS as_number
FROM table_name;
Outputs:
AS_STRING | AS_NUMBER
:-------- | ------------------------------------------------------------:
3.00E+60 | 3000000000000000000000000000000000000000000000000000000000000
3.01E+60 | 3010000000000000000000000000000000000000000000000000000000000
3.05E+58 | 30500000000000000000000000000000000000000000000000000000000
db<>fiddle here

CTE RECURSIVE optimization, how to?

I need to optimize the performance of a commom WITH RECURSIVE query... We can limit the depth of the tree and decompose in many updates, and can also change representation (use array)... I try some options but perhaps there are a "classic optimization solution" that I'm not realizing.
All details
There are a t_up table, to be updated, with a composit primary key (pk1,pk2), one attribute attr and an array of references to primary keys... And a unnested representation t_scan, with the references; like this:
pk1 | pk2 | attr | ref_pk1 | ref_pk2
n | 123 | 1 | |
n | 456 | 2 | |
r | 123 | 1 | w | 123
w | 123 | 5 | n | 456
r | 456 | 2 | n | 123
r | 123 | 1 | n | 111
n | 111 | 4 | |
... | ...| ... | ... | ...
There are no loops.
UPDATE t_up SET x = pairs
FROM (
WITH RECURSIVE tree as (
SELECT pk1, pk2, attr, ref_pk1, ref_pk2,
array[array[0,0]]::bigint[] as all_refs
FROM t_scan
UNION ALL
SELECT c.pk1, c.pk2, c.attr, c.ref_pk1, c.ref_pk2
,p.all_refs || array[c.attr,c.pk2]
FROM t_scan c JOIN tree p
ON c.ref_pk1=p.pk1 AND c.ref_pk2=p.pk2 AND c.pk2!=p.pk2
AND array_length(p.all_refs,1)<5 -- 5 or 6 avoiding endless loops
)
SELECT pk1, pk2, array_agg_cat(all_refs) as pairs
FROM (
SELECT distinct pk1, pk2, all_refs
FROM tree
WHERE array_length(all_refs,1)>1 -- ignores initial array[0,0].
) t
GROUP BY 1,2
ORDER BY 1,2
) rec
WHERE rec.pk1=t_up.pk1 AND rec.pk2=t_up.pk2
;
To test:
CREATE TABLE t_scan(
pk1 char,pk2 bigint, attr bigint,
ref_pk1 char, ref_pk2 bigint
);
INSERT INTO t_scan VALUES
('n',123, 1 ,NULL,NULL),
('n',456, 2 ,NULL,NULL),
('r',123, 1 ,'w' ,123),
('w',123, 5 ,'n' ,456),
('r',456, 2 ,'n' ,123),
('r',123, 1 ,'n' ,111),
('n',111, 4 ,NULL,NULL);
Running only rec you will obtain:
pk1 | pk2 | pairs
-----+-----+-----------------
r | 123 | {{0,0},{1,123}}
r | 456 | {{0,0},{2,456}}
w | 123 | {{0,0},{5,123}}
But, unfortunately, to appreciate the "Big Data performance problem", you need to see it in a real database... I am preparing a public Github that run with OpenStreetMap Big Data.

Returning null individual values with postgres tablefunc crosstab()

I am trying to incorporate the null values within the returned lists, such that:
batch_id |test_name |test_value
-----------------------------------
10 | pH | 4.7
10 | Temp | 154
11 | pH | 4.8
11 | Temp | 152
12 | pH | 4.5
13 | Temp | 155
14 | pH | 4.9
14 | Temp | 152
15 | Temp | 149
16 | pH | 4.7
16 | Temp | 150
would return:
batch_id | pH |Temp
---------------------------------------
10 | 4.7 | 154
11 | 4.8 | 152
12 | 4.5 | <null>
13 | <null> | 155
14 | 4.9 | 152
15 | <null> | 149
16 | 4.7 | 150
However, it currently returns this:
batch_id | pH |Temp
---------------------------------------
10 | 4.7 | 154
11 | 4.8 | 152
12 | 4.5 | <null>
13 | 155 | <null>
14 | 4.9 | 152
15 | 149 | <null>
16 | 4.7 | 150
This is an extension of a prior question -
Can the categories in the postgres tablefunc crosstab() function be integers? - which led to this current query:
SELECT *
FROM crosstab('SELECT lab_tests_results.batch_id, lab_tests.test_name, lab_tests_results.test_result::FLOAT
FROM lab_tests_results, lab_tests
WHERE lab_tests.id=lab_tests_results.lab_test AND (lab_tests.test_name LIKE ''Test Name 1'' OR lab_tests.test_name LIKE ''Test Name 2'')
ORDER BY 1,2'
) AS final_result(batch_id VARCHAR, test_name_1 FLOAT, test_name_2 FLOAT);
I also know that I am not the first to ask this question generally, but I have yet to find a solution that works for these circumstances. For example, this one - How to include null values in `tablefunc` query in postgresql? - assumes the same Batch IDs each time. I do not want to specify the Batch IDs, but rather all that are available.
This leads into the other set of solutions I've found out there, which address a null list result from specified categories. Since I'm just taking what's already there, however, this isn't an issue. It's the null individual values causing the problem and resulting in a pivot table with values shifted to the left.
Any suggestions are much appreciated!
Edit: With Klin's help, got it sorted out. Something to note is that the VALUES section must match the actual lab_tests.test_name values you're after, such that:
SELECT *
FROM crosstab(
$$
SELECT lab_tests_results.batch_id, lab_tests.test_name, lab_tests_results.test_result::FLOAT
FROM lab_tests_results, lab_tests
WHERE lab_tests.id = lab_tests_results.lab_test
AND (
lab_tests_results.lab_test = 1
OR lab_tests_results.lab_test = 2
OR lab_tests_results.lab_test = 3
OR lab_tests_results.lab_test = 4
OR lab_tests_results.lab_test = 5
OR lab_tests_results.lab_test = 50 )
ORDER BY 1 DESC, 2
$$,
$$
VALUES('Mash pH'),
('Sparge pH'),
('Final Lauter pH'),
('Wort pH'),
('Wort FAN'),
('Original Gravity'),
('Mash Temperature')
$$
) AS final_result(batch_id VARCHAR,
ph_mash FLOAT,
ph_sparge FLOAT,
ph_final_lauter FLOAT,
ph_wort FLOAT,
FAN_wort FLOAT,
original_gravity FLOAT,
mash_temperature FLOAT)
Thanks for the help!
Use the second form of the function:
crosstab(text source_sql, text category_sql) - Produces a “pivot table” with the value columns specified by a second query.
E.g.:
SELECT *
FROM crosstab(
$$
SELECT lab_tests_results.batch_id, lab_tests.test_name, lab_tests_results.test_result::FLOAT
FROM lab_tests_results, lab_tests
WHERE lab_tests.id=lab_tests_results.lab_test
AND (
lab_tests.test_name LIKE 'Test Name 1'
OR lab_tests.test_name LIKE 'Test Name 2')
ORDER BY 1,2
$$,
$$
VALUES('pH'), ('Temp')
$$
) AS final_result(batch_id VARCHAR, "pH" FLOAT, "Temp" FLOAT);

Find Parent Recursively using Query

I am using postgresql. I have the table as like below
parent_id child_id
----------------------
101 102
103 104
104 105
105 106
I want to write a sql query which will give the final parent of input.
i.e suppose i pass 106 as input then , its output will be 103.
(106 --> 105 --> 104 --> 103)
Here's a complete example. First the DDL:
test=> CREATE TABLE node (
test(> id SERIAL,
test(> label TEXT NOT NULL, -- name of the node
test(> parent_id INT,
test(> PRIMARY KEY(id)
test(> );
NOTICE: CREATE TABLE will create implicit sequence "node_id_seq" for serial column "node.id"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "node_pkey" for table "node"
CREATE TABLE
...and some data...
test=> INSERT INTO node (label, parent_id) VALUES ('n1',NULL),('n2',1),('n3',2),('n4',3);
INSERT 0 4
test=> INSERT INTO node (label) VALUES ('garbage1'),('garbage2'), ('garbage3');
INSERT 0 3
test=> INSERT INTO node (label,parent_id) VALUES ('garbage4',6);
INSERT 0 1
test=> SELECT * FROM node;
id | label | parent_id
----+----------+-----------
1 | n1 |
2 | n2 | 1
3 | n3 | 2
4 | n4 | 3
5 | garbage1 |
6 | garbage2 |
7 | garbage3 |
8 | garbage4 | 6
(8 rows)
This performs a recursive query on every id in node:
test=> WITH RECURSIVE nodes_cte(id, label, parent_id, depth, path) AS (
SELECT tn.id, tn.label, tn.parent_id, 1::INT AS depth, tn.id::TEXT AS path
FROM node AS tn
WHERE tn.parent_id IS NULL
UNION ALL
SELECT c.id, c.label, c.parent_id, p.depth + 1 AS depth,
(p.path || '->' || c.id::TEXT)
FROM nodes_cte AS p, node AS c
WHERE c.parent_id = p.id
)
SELECT * FROM nodes_cte AS n ORDER BY n.id ASC;
id | label | parent_id | depth | path
----+----------+-----------+-------+------------
1 | n1 | | 1 | 1
2 | n2 | 1 | 2 | 1->2
3 | n3 | 2 | 3 | 1->2->3
4 | n4 | 3 | 4 | 1->2->3->4
5 | garbage1 | | 1 | 5
6 | garbage2 | | 1 | 6
7 | garbage3 | | 1 | 7
8 | garbage4 | 6 | 2 | 6->8
(8 rows)
This gets all of the descendents WHERE node.id = 1:
test=> WITH RECURSIVE nodes_cte(id, label, parent_id, depth, path) AS (
SELECT tn.id, tn.label, tn.parent_id, 1::INT AS depth, tn.id::TEXT AS path FROM node AS tn WHERE tn.id = 1
UNION ALL
SELECT c.id, c.label, c.parent_id, p.depth + 1 AS depth, (p.path || '->' || c.id::TEXT) FROM nodes_cte AS p, node AS c WHERE c.parent_id = p.id
)
SELECT * FROM nodes_cte AS n;
id | label | parent_id | depth | path
----+-------+-----------+-------+------------
1 | n1 | | 1 | 1
2 | n2 | 1 | 2 | 1->2
3 | n3 | 2 | 3 | 1->2->3
4 | n4 | 3 | 4 | 1->2->3->4
(4 rows)
The following will get the path of the node with id 4:
test=> WITH RECURSIVE nodes_cte(id, label, parent_id, depth, path) AS (
SELECT tn.id, tn.label, tn.parent_id, 1::INT AS depth, tn.id::TEXT AS path
FROM node AS tn
WHERE tn.parent_id IS NULL
UNION ALL
SELECT c.id, c.label, c.parent_id, p.depth + 1 AS depth,
(p.path || '->' || c.id::TEXT)
FROM nodes_cte AS p, node AS c
WHERE c.parent_id = p.id
)
SELECT * FROM nodes_cte AS n WHERE n.id = 4;
id | label | parent_id | depth | path
----+-------+-----------+-------+------------
4 | n4 | 3 | 4 | 1->2->3->4
(1 row)
And let's assume you want to limit your search to descendants with a depth less than three (note that depth hasn't been incremented yet):
test=> WITH RECURSIVE nodes_cte(id, label, parent_id, depth, path) AS (
SELECT tn.id, tn.label, tn.parent_id, 1::INT AS depth, tn.id::TEXT AS path
FROM node AS tn WHERE tn.id = 1
UNION ALL
SELECT c.id, c.label, c.parent_id, p.depth + 1 AS depth,
(p.path || '->' || c.id::TEXT)
FROM nodes_cte AS p, node AS c
WHERE c.parent_id = p.id AND p.depth < 2
)
SELECT * FROM nodes_cte AS n;
id | label | parent_id | depth | path
----+-------+-----------+-------+------
1 | n1 | | 1 | 1
2 | n2 | 1 | 2 | 1->2
(2 rows)
I'd recommend using an ARRAY data type instead of a string for demonstrating the "path", but the arrow is more illustrative of the parent<=>child relationship.
Use WITH RECURSIVE to create a Common Table Expression (CTE). For the non-recursive term, get the rows in which the child is immediately below the parent:
SELECT
c.child_id,
c.parent_id
FROM
mytable c
LEFT JOIN
mytable p ON c.parent_id = p.child_id
WHERE
p.child_id IS NULL
child_id | parent_id
----------+-----------
102 | 101
104 | 103
For the recursive term, you want the children of these children.
WITH RECURSIVE tree(child, root) AS (
SELECT
c.child_id,
c.parent_id
FROM
mytable c
LEFT JOIN
mytable p ON c.parent_id = p.child_id
WHERE
p.child_id IS NULL
UNION
SELECT
child_id,
root
FROM
tree
INNER JOIN
mytable on tree.child = mytable.parent_id
)
SELECT * FROM tree;
child | root
-------+------
102 | 101
104 | 103
105 | 103
106 | 103
You can filter the children when querying the CTE:
WITH RECURSIVE tree(child, root) AS (...) SELECT root FROM tree WHERE child = 106;
root
------
103

Equivalent to unpivot() in PostgreSQL

Is there a unpivot equivalent function in PostgreSQL?
Create an example table:
CREATE TEMP TABLE foo (id int, a text, b text, c text);
INSERT INTO foo VALUES (1, 'ant', 'cat', 'chimp'), (2, 'grape', 'mint', 'basil');
You can 'unpivot' or 'uncrosstab' using UNION ALL:
SELECT id,
'a' AS colname,
a AS thing
FROM foo
UNION ALL
SELECT id,
'b' AS colname,
b AS thing
FROM foo
UNION ALL
SELECT id,
'c' AS colname,
c AS thing
FROM foo
ORDER BY id;
This runs 3 different subqueries on foo, one for each column we want to unpivot, and returns, in one table, every record from each of the subqueries.
But that will scan the table N times, where N is the number of columns you want to unpivot. This is inefficient, and a big problem when, for example, you're working with a very large table that takes a long time to scan.
Instead, use:
SELECT id,
unnest(array['a', 'b', 'c']) AS colname,
unnest(array[a, b, c]) AS thing
FROM foo
ORDER BY id;
This is easier to write, and it will only scan the table once.
array[a, b, c] returns an array object, with the values of a, b, and c as it's elements.
unnest(array[a, b, c]) breaks the results into one row for each of the array's elements.
You could use VALUES() and JOIN LATERAL to unpivot the columns.
Sample data:
CREATE TABLE test(id int, a INT, b INT, c INT);
INSERT INTO test(id,a,b,c) VALUES (1,11,12,13),(2,21,22,23),(3,31,32,33);
Query:
SELECT t.id, s.col_name, s.col_value
FROM test t
JOIN LATERAL(VALUES('a',t.a),('b',t.b),('c',t.c)) s(col_name, col_value) ON TRUE;
DBFiddle Demo
Using this approach it is possible to unpivot multiple groups of columns at once.
EDIT
Using Zack's suggestion:
SELECT t.id, col_name, col_value
FROM test t
CROSS JOIN LATERAL (VALUES('a', t.a),('b', t.b),('c',t.c)) s(col_name, col_value);
<=>
SELECT t.id, col_name, col_value
FROM test t
,LATERAL (VALUES('a', t.a),('b', t.b),('c',t.c)) s(col_name, col_value);
db<>fiddle demo
Great article by Thomas Kellerer found here
Unpivot with Postgres
Sometimes it’s necessary to normalize de-normalized tables - the opposite of a “crosstab” or “pivot” operation. Postgres does not support an UNPIVOT operator like Oracle or SQL Server, but simulating it, is very simple.
Take the following table that stores aggregated values per quarter:
create table customer_turnover
(
customer_id integer,
q1 integer,
q2 integer,
q3 integer,
q4 integer
);
And the following sample data:
customer_id | q1 | q2 | q3 | q4
------------+-----+-----+-----+----
1 | 100 | 210 | 203 | 304
2 | 150 | 118 | 422 | 257
3 | 220 | 311 | 271 | 269
But we want the quarters to be rows (as they should be in a normalized data model).
In Oracle or SQL Server this could be achieved with the UNPIVOT operator, but that is not available in Postgres. However Postgres’ ability to use the VALUES clause like a table makes this actually quite easy:
select c.customer_id, t.*
from customer_turnover c
cross join lateral (
values
(c.q1, 'Q1'),
(c.q2, 'Q2'),
(c.q3, 'Q3'),
(c.q4, 'Q4')
) as t(turnover, quarter)
order by customer_id, quarter;
will return the following result:
customer_id | turnover | quarter
------------+----------+--------
1 | 100 | Q1
1 | 210 | Q2
1 | 203 | Q3
1 | 304 | Q4
2 | 150 | Q1
2 | 118 | Q2
2 | 422 | Q3
2 | 257 | Q4
3 | 220 | Q1
3 | 311 | Q2
3 | 271 | Q3
3 | 269 | Q4
The equivalent query with the standard UNPIVOT operator would be:
select customer_id, turnover, quarter
from customer_turnover c
UNPIVOT (turnover for quarter in (q1 as 'Q1',
q2 as 'Q2',
q3 as 'Q3',
q4 as 'Q4'))
order by customer_id, quarter;
FYI for those of us looking for how to unpivot in RedShift.
The long form solution given by Stew appears to be the only way to accomplish this.
For those who cannot see it there, here is the text pasted below:
We do not have built-in functions that will do pivot or unpivot. However,
you can always write SQL to do that.
create table sales (regionid integer, q1 integer, q2 integer, q3 integer, q4 integer);
insert into sales values (1,10,12,14,16), (2,20,22,24,26);
select * from sales order by regionid;
regionid | q1 | q2 | q3 | q4
----------+----+----+----+----
1 | 10 | 12 | 14 | 16
2 | 20 | 22 | 24 | 26
(2 rows)
pivot query
create table sales_pivoted (regionid, quarter, sales)
as
select regionid, 'Q1', q1 from sales
UNION ALL
select regionid, 'Q2', q2 from sales
UNION ALL
select regionid, 'Q3', q3 from sales
UNION ALL
select regionid, 'Q4', q4 from sales
;
select * from sales_pivoted order by regionid, quarter;
regionid | quarter | sales
----------+---------+-------
1 | Q1 | 10
1 | Q2 | 12
1 | Q3 | 14
1 | Q4 | 16
2 | Q1 | 20
2 | Q2 | 22
2 | Q3 | 24
2 | Q4 | 26
(8 rows)
unpivot query
select regionid, sum(Q1) as Q1, sum(Q2) as Q2, sum(Q3) as Q3, sum(Q4) as Q4
from
(select regionid,
case quarter when 'Q1' then sales else 0 end as Q1,
case quarter when 'Q2' then sales else 0 end as Q2,
case quarter when 'Q3' then sales else 0 end as Q3,
case quarter when 'Q4' then sales else 0 end as Q4
from sales_pivoted)
group by regionid
order by regionid;
regionid | q1 | q2 | q3 | q4
----------+----+----+----+----
1 | 10 | 12 | 14 | 16
2 | 20 | 22 | 24 | 26
(2 rows)
Hope this helps, Neil
Pulling slightly modified content from the link in the comment from #a_horse_with_no_name into an answer because it works:
Installing Hstore
If you don't have hstore installed and are running PostgreSQL 9.1+, you can use the handy
CREATE EXTENSION hstore;
For lower versions, look for the hstore.sql file in share/contrib and run in your database.
Assuming that your source (e.g., wide data) table has one 'id' column, named id_field, and any number of 'value' columns, all of the same type, the following will create an unpivoted view of that table.
CREATE VIEW vw_unpivot AS
SELECT id_field, (h).key AS column_name, (h).value AS column_value
FROM (
SELECT id_field, each(hstore(foo) - 'id_field'::text) AS h
FROM zcta5 as foo
) AS unpiv ;
This works with any number of 'value' columns. All of the resulting values will be text, unless you cast, e.g., (h).value::numeric.
Just use JSON:
with data (id, name) as (
values (1, 'a'), (2, 'b')
)
select t.*
from data, lateral jsonb_each_text(to_jsonb(data)) with ordinality as t
order by data.id, t.ordinality;
This yields
|key |value|ordinality|
|----|-----|----------|
|id |1 |1 |
|name|a |2 |
|id |2 |1 |
|name|b |2 |
dbfiddle
I wrote a horrible unpivot function for PostgreSQL. It's rather slow but it at least returns results like you'd expect an unpivot operation to.
https://cgsrv1.arrc.csiro.au/blog/2010/05/14/unpivotuncrosstab-in-postgresql/
Hopefully you can find it useful..
Depending on what you want to do... something like this can be helpful.
with wide_table as (
select 1 a, 2 b, 3 c
union all
select 4 a, 5 b, 6 c
)
select unnest(array[a,b,c]) from wide_table
You can use FROM UNNEST() array handling to UnPivot a dataset, tandem with a correlated subquery (works w/ PG 9.4).
FROM UNNEST() is more powerful & flexible than the typical method of using FROM (VALUES .... ) to unpivot datasets. This is b/c FROM UNNEST() is variadic (with n-ary arity). By using a correlated subquery the need for the lateral ORDINAL clause is eliminated, & Postgres keeps the resulting parallel columnar sets in the proper ordinal sequence.
This is, BTW, FAST -- in practical use spawning 8 million rows in < 15 seconds on a 24-core system.
WITH _students AS ( /** CTE **/
SELECT * FROM
( SELECT 'jane'::TEXT ,'doe'::TEXT , 1::INT
UNION
SELECT 'john'::TEXT ,'doe'::TEXT , 2::INT
UNION
SELECT 'jerry'::TEXT ,'roe'::TEXT , 3::INT
UNION
SELECT 'jodi'::TEXT ,'roe'::TEXT , 4::INT
) s ( fn, ln, id )
) /** end WITH **/
SELECT s.id
, ax.fanm -- field labels, now expanded to two rows
, ax.anm -- field data, now expanded to two rows
, ax.someval -- manually incl. data
, ax.rankednum -- manually assigned ranks
,ax.genser -- auto-generate ranks
FROM _students s
,UNNEST /** MULTI-UNNEST() BLOCK **/
(
( SELECT ARRAY[ fn, ln ]::text[] AS anm -- expanded into two rows by outer UNNEST()
/** CORRELATED SUBQUERY **/
FROM _students s2 WHERE s2.id = s.id -- outer relation
)
,( /** ordinal relationship preserved in variadic UNNEST() **/
SELECT ARRAY[ 'first name', 'last name' ]::text[] -- exp. into 2 rows
AS fanm
)
,( SELECT ARRAY[ 'z','x','y'] -- only 3 rows gen'd, but ordinal rela. kept
AS someval
)
,( SELECT ARRAY[ 1,2,3,4,5 ] -- 5 rows gen'd, ordinal rela. kept.
AS rankednum
)
,( SELECT ARRAY( /** you may go wild ... **/
SELECT generate_series(1, 15, 3 )
AS genser
)
)
) ax ( anm, fanm, someval, rankednum , genser )
;
RESULT SET:
+--------+----------------+-----------+----------+---------+-------
| id | fanm | anm | someval |rankednum| [ etc. ]
+--------+----------------+-----------+----------+---------+-------
| 2 | first name | john | z | 1 | .
| 2 | last name | doe | y | 2 | .
| 2 | [null] | [null] | x | 3 | .
| 2 | [null] | [null] | [null] | 4 | .
| 2 | [null] | [null] | [null] | 5 | .
| 1 | first name | jane | z | 1 | .
| 1 | last name | doe | y | 2 | .
| 1 | | | x | 3 | .
| 1 | | | | 4 | .
| 1 | | | | 5 | .
| 4 | first name | jodi | z | 1 | .
| 4 | last name | roe | y | 2 | .
| 4 | | | x | 3 | .
| 4 | | | | 4 | .
| 4 | | | | 5 | .
| 3 | first name | jerry | z | 1 | .
| 3 | last name | roe | y | 2 | .
| 3 | | | x | 3 | .
| 3 | | | | 4 | .
| 3 | | | | 5 | .
+--------+----------------+-----------+----------+---------+ ----
Here's a way that combines the hstore and CROSS JOIN approaches from other answers.
It's a modified version of my answer to a similar question, which is itself based on the method at https://blog.sql-workbench.eu/post/dynamic-unpivot/ and another answer to that question.
-- Example wide data with a column for each year...
WITH example_wide_data("id", "2001", "2002", "2003", "2004") AS (
VALUES
(1, 4, 5, 6, 7),
(2, 8, 9, 10, 11)
)
-- that is tided to have "year" and "value" columns
SELECT
id,
r.key AS year,
r.value AS value
FROM
example_wide_data w
CROSS JOIN
each(hstore(w.*)) AS r(key, value)
WHERE
-- This chooses columns that look like years
-- In other cases you might need a different condition
r.key ~ '^[0-9]{4}$';
It has a few benefits over other solutions:
By using hstore and not jsonb, it hopefully minimises issues with type conversions (although hstore does convert everything to text)
The columns don't need to be hard coded or known in advance. Here, columns are chosen by a regex on the name, but you could use any SQL logic based on the name, or even the value.
It doesn't require PL/pgSQL - it's all SQL