I have the following table structure (I've removed some columns and created a stub) to support versioning and reduce duplication of data. Imagine an article review process where each step is stored in the database (article_meta). Whenever the article itself changes, that data is stored in the database too (article).
The versioning is done by a reference to the predecessor (pre_meta_id).
WITH
t_article_meta (id, pre_meta_id, user_id, state) as (
values (1, NULL, 101, 'submitted')
union all values (2, 1, 7, 'inreview')
union all values (3, 2, 7, 'rejected')
union all values (4, 3, 101, 'submitted')
union all values (5, NULL, 202, 'submitted')
union all values (6, 5, 7, 'inreview')
union all values (7, 6, 7, 'accepted')
union all values (8, 4, 7, 'inreview')
union all values (9, 8, 7, 'accepted')
),
t_article (id, meta_id, content) as (
values (1, 1, 'Hello wordl')
union all values (2, 4, 'Hello world')
union all values (3, 5, 'Lorem ipsum doloret')
)
SELECT ...;
Now I want to create a view that somehow combines meta data and article data even if there is no direct reference (only indirect via predecessor).
id | pre_meta_id | user_id | state | content (left join) | content (I want to have)
---|-------------|---------|-----------|---------------------|-------------------------
1 | NULL | 101 | submitted | Hello wordl | Hello wordl
2 | 1 | 7 | inreview | NULL | Hello wordl
3 | 2 | 7 | rejected | NULL | Hello wordl
4 | 3 | 101 | submitted | Hello world | Hello world
5 | NULL | 202 | submitted | Lorem ipsum doloret | Lorem ipsum doloret
6 | 5 | 7 | inreview | NULL | Lorem ipsum doloret
7 | 6 | 7 | accepted | NULL | Lorem ipsum doloret
8 | 4 | 7 | inreview | NULL | Hello world
9 | 8 | 7 | accepted | NULL | Hello world
How can I realize something like that in DB2 in a performant way? My first idea, a join on a function (to find the predecessor that has a related article), sounds really expensive to me.
This SQL would do the job:
SELECT m.id, pre_meta_id, user_id, state, content,
       last_value(content, 'IGNORE NULLS') over (order by m.id) as last_value
FROM article_meta m
LEFT JOIN article a
    ON m.id = a.meta_id
ORDER BY m.id
It is the regular join to combine the two tables, plus an additional column (named differently from your expected result to show the difference).
You might want to rename that column and remove content to get an exact match to your expected result, as shown in the sketch below.
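A minimal sketch of that adjusted query (column names taken from the stub above):
SELECT m.id, pre_meta_id, user_id, state,
       last_value(content, 'IGNORE NULLS') over (order by m.id) as content
FROM article_meta m
LEFT JOIN article a
    ON m.id = a.meta_id
ORDER BY m.id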
For the adjusted requirements the SQL gets more complex, as we have to define a recursive query to get the title/content for all the children - it will look like this:
with temp (id, pre_meta_id, user_id, state, level, parent, root) as (
    -- anchor member: every meta row that has an article directly attached
    select m.id, m.pre_meta_id, m.user_id, m.state, 1 as level, m.pre_meta_id as parent, m.id as root
    from article_meta m, article a
    where m.id = a.meta_id
    union all
    -- recursive member: walk forward to successors without an article of
    -- their own, carrying along root (the nearest meta id with an article)
    select m.id, m.pre_meta_id, m.user_id, m.state, level + 1 as level, t.id as parent, t.root
    from temp t, article_meta m
    where m.pre_meta_id = t.id
    and m.id not in (select meta_id from article)
    and level < 10
)
select *
from temp t
left join article a
on t.root = a.meta_id
order by 1
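Since the goal is a view, the recursive query can be wrapped directly; a minimal sketch, assuming the view name v_article_content (made up here):
CREATE VIEW v_article_content AS  -- view name is illustrative
with temp (id, pre_meta_id, user_id, state, level, parent, root) as (
    select m.id, m.pre_meta_id, m.user_id, m.state, 1 as level, m.pre_meta_id as parent, m.id as root
    from article_meta m, article a
    where m.id = a.meta_id
    union all
    select m.id, m.pre_meta_id, m.user_id, m.state, level + 1 as level, t.id as parent, t.root
    from temp t, article_meta m
    where m.pre_meta_id = t.id
    and m.id not in (select meta_id from article)
    and level < 10
)
select t.id, t.pre_meta_id, t.user_id, t.state, a.content
from temp t
left join article a on t.root = a.meta_id;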
I am trying to calculate an aggregate function for a field for a subset of rows in a table. The problem is that I'd like to find the mean of every combination of rows taken k at a time --- so for all the rows, I'd like to find (say) the mean of every combination of 10 rows. So:
id | count
----|------
1 | 5
2 | 3
3 | 6
...
30 | 16
should give me
mean of ids 1..10; ids 1, 3..11; ids 1, 4..12, and so on. I know this will yield a lot of rows.
There are SO answers for finding combinations from arrays. I could do this programmatically by taking 30 ids 10 at a time and then SELECTing them. Is there a way to do this with PARTITION BY, TABLESAMPLE, or another function (something like python's itertools.combinations())? (TABLESAMPLE by itself won't guarantee which subset of rows I am selecting as far as I can tell.)
The method described in the cited answer is static. A more convenient solution may be to use recursion.
Example data:
drop table if exists my_table;
create table my_table(id int primary key, number int);
insert into my_table values
(1, 5),
(2, 3),
(3, 6),
(4, 9),
(5, 2);
Query which finds 2 element subsets in 5 element set (k-combination with k = 2):
with recursive recur as (
    select
        id,
        array[id] as combination,
        array[number] as numbers,
        number as sum
    from my_table
    union all
    select
        t.id,
        combination || t.id,
        numbers || t.number,
        sum + t.number
    from my_table t
    join recur r on r.id < t.id
    and cardinality(combination) < 2 -- param k
)
select combination, numbers, sum/2.0 as average -- param k
from recur
where cardinality(combination) = 2; -- param k
combination | numbers | average
-------------+---------+--------------------
{1,2} | {5,3} | 4.0000000000000000
{1,3} | {5,6} | 5.5000000000000000
{1,4} | {5,9} | 7.0000000000000000
{1,5} | {5,2} | 3.5000000000000000
{2,3} | {3,6} | 4.5000000000000000
{2,4} | {3,9} | 6.0000000000000000
{2,5} | {3,2} | 2.5000000000000000
{3,4} | {6,9} | 7.5000000000000000
{3,5} | {6,2} | 4.0000000000000000
{4,5} | {9,2} | 5.5000000000000000
(10 rows)
The same query for k = 3 gives:
combination | numbers | average
-------------+---------+--------------------
{1,2,3} | {5,3,6} | 4.6666666666666667
{1,2,4} | {5,3,9} | 5.6666666666666667
{1,2,5} | {5,3,2} | 3.3333333333333333
{1,3,4} | {5,6,9} | 6.6666666666666667
{1,3,5} | {5,6,2} | 4.3333333333333333
{1,4,5} | {5,9,2} | 5.3333333333333333
{2,3,4} | {3,6,9} | 6.0000000000000000
{2,3,5} | {3,6,2} | 3.6666666666666667
{2,4,5} | {3,9,2} | 4.6666666666666667
{3,4,5} | {6,9,2} | 5.6666666666666667
(10 rows)
Of course, you can remove numbers from the query if you do not need them.
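For example, a trimmed version for k = 3, with numbers dropped (only the three commented parameters change):
with recursive recur as (
    select
        id,
        array[id] as combination,
        number as sum
    from my_table
    union all
    select
        t.id,
        combination || t.id,
        sum + t.number
    from my_table t
    join recur r on r.id < t.id
    and cardinality(combination) < 3 -- param k
)
select combination, sum/3.0 as average -- param k
from recur
where cardinality(combination) = 3; -- param k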
Here's my input data:
CREATE TEMP TABLE test AS SELECT * FROM (VALUES
(1, 12),
(2, 7),
(3, 8),
(4, 8),
(5, 7)
) AS rows (position, value);
I want to, in a single query (no subqueries or CTEs), assign a unique number for each distinct value. However, I also want those numbers to ascend according to the associated position -- i.e., a distinct value's number should be assigned according to its lowest position.
Assumptions:
each row will always have a unique position
value is not guaranteed unique per row
the number of a distinct value is only for ordinal purposes, e.g. it doesn't matter whether distinct_values goes 1-2-3 or 3-8-14
The desired output is:
position | value | distinct_value
----------+-------+----------------
1 | 12 | 1
2 | 7 | 2
3 | 8 | 3
4 | 8 | 3
5 | 7 | 2
I can get close using DENSE_RANK to number distinct values:
SELECT
position,
value,
DENSE_RANK() OVER (ORDER BY value) AS distinct_value
FROM test ORDER BY position;
The result obviously ignores position:
position | value | distinct_value
----------+-------+----------------
1 | 12 | 3
2 | 7 | 1
3 | 8 | 2
4 | 8 | 2
5 | 7 | 1
Is there a better window function for this?
Number each distinct value by its first (minimum) position, then rank those positions:
with
t(x,y) as (values
(1, 12),
(2, 7),
(3, 8),
(4, 8),
(5, 7)),
pos(i,y) as (select min(x), y from t group by y),
ind(i,y) as (select row_number() over(order by i), y from pos)
select * from ind join t using(y) order by x;
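If you prefer a single statement over the chained CTEs, the same idea folds into one derived table; a sketch against the original test table (the row_number() window runs after the GROUP BY, so it ranks each value by its minimum position):
SELECT t.position, t.value, ind.i AS distinct_value
FROM test t
JOIN (
    -- one row per distinct value, ranked by its earliest position
    SELECT value, row_number() OVER (ORDER BY min(position)) AS i
    FROM test
    GROUP BY value
) ind USING (value)
ORDER BY t.position;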
Consider the following table structure:
CREATE TABLE residences (id int, price int, categories jsonb);
INSERT INTO residences VALUES
(1, 3, '["monkeys", "hamsters", "foxes"]'),
(2, 5, '["monkeys", "hamsters", "foxes", "foxes"]'),
(3, 7, '[]'),
(4, 11, '["turtles"]');
SELECT * FROM residences;
id | price | categories
----+-------+-------------------------------------------
1 | 3 | ["monkeys", "hamsters", "foxes"]
2 | 5 | ["monkeys", "hamsters", "foxes", "foxes"]
3 | 7 | []
4 | 11 | ["turtles"]
Now I would like to know how many residences there are for each category, as well as their sum of prices. The only way I found to do this was using a sub-query:
SELECT category, SUM(price), COUNT(*) AS residences_no
FROM
residences a,
(
SELECT DISTINCT(jsonb_array_elements(categories)) AS category
FROM residences
) b
WHERE a.categories #> category
GROUP BY category
ORDER BY category;
category | sum | residences_no
------------+-----+---------------
"foxes" | 8 | 2
"hamsters" | 8 | 2
"monkeys" | 8 | 2
"turtles" | 11 | 1
Using jsonb_array_elements without a subquery would return three residences for foxes because of the duplicate entry in the second row. Also, that residence's price would be counted twice, inflating the sum by 5.
Is there any way to do this without using the sub-query, or any better way to accomplish this result?
EDIT
Initially I did not mention the price column.
Counting distinct (id, category) pairs takes care of the duplicated array entry:
select category, count(distinct (id, category))
from residences, jsonb_array_elements(categories) category
group by category
order by category;
category | count
------------+-------
"foxes" | 2
"hamsters" | 2
"monkeys" | 2
"turtles" | 1
(4 rows)
To aggregate another column you have to use a derived table (in this example all prices have been set to 10; with the question's original prices the same query returns the sums shown above):
select category, count(*), sum(price) total
from (
select distinct id, category, price
from residences, jsonb_array_elements(categories) category
) s
group by category
order by category;
category | count | total
------------+-------+-------
"foxes" | 2 | 20
"hamsters" | 2 | 20
"monkeys" | 2 | 20
"turtles" | 1 | 10
(4 rows)
For a few days now I've been trying to solve this problem.
I have the tables group_user and group_name.
What I want to do is select a user's groups, then the description of each group (from group_name), and 10 other users from each group.
The first two are no problem. The problem is that I can't find a way to limit the users.
I can select a user's groups and the other users in those groups; I just don't know how to limit them.
Using:
SELECT a.g_id,b.group,b.userid
FROM group_user AS a
RIGHT JOIN
(SELECT g_id as group, u_id as userid FROM group_user) AS b ON a.g_id=b.group
WHERE u_id=112
It shows me my user's groups and the users in those groups. But when I try to apply a limit in the subquery, it limits the whole result, not each particular group.
I tried selecting users using IN with my user's groups, without luck.
I was thinking maybe GROUP BY and HAVING would help, but I can't see how to use them here.
So my question is: how can I limit a subquery result in MySQL when the subquery is built on the result of the outer query?
I think I'm overloaded and maybe I'm not seeing something.
UPDATE to show what I really wanna accomplish here's another piece of code.
SELECT g_id FROM group_user WHERE user_id = 112
So I get all groups that the user is in; say each row of that select is a variable extra_group, so the second query would be
SELECT u_id FROM group_user WHERE group_id = extra_group LIMIT 10
I need to do the same as above in one query.
Another UPDATE, after Mike's post.
I should add that a user can be in more than one group. So I think the real problem is that I have no clue how to select those groups and, in the same query, select 10 users for each selected group, so the result could be:
g_id | u_id
-----|-----
1    | 2
1    | 3
1    | 4
3    | 3
3    | 8
where g_id is one of the user's groups from this query:
SELECT g_id FROM group_user WHERE user_id = 112
Create sample tables and add data:
CREATE TABLE `group_user` (
`u_id` int(11) DEFAULT NULL,
`g_id` int(11) DEFAULT NULL,
`apply_date` date DEFAULT NULL
);
CREATE TABLE `group_name` (
`g_id` int(11) DEFAULT NULL,
`g_name` varchar(255) DEFAULT NULL
);
INSERT INTO `group_name` VALUES
(1, 'Group 1'), (2, 'Group 2'), (3, 'Group 3'), (4, 'Group 4'), (5, 'Group 5');
INSERT INTO `group_user` VALUES
(1, 1, '2010-12-01'), (1, 2, '2010-12-01'), (1, 3, '2010-12-01'), (1, 4, '2010-12-01'), (1, 5, '2010-12-01'),
(2, 1, '2010-12-02'), (2, 2, '2010-12-02'),
(3, 1, '2010-12-03'), (3, 2, '2010-12-03'), (3, 3, '2010-12-03'), (3, 4, '2010-12-03'),
(4, 1, '2010-12-04'), (4, 2, '2010-12-04'),
(5, 1, '2010-12-05'), (5, 2, '2010-12-05'),
(6, 1, '2010-12-06'), (6, 2, '2010-12-06'),
(7, 1, '2010-12-07'), (7, 2, '2010-12-07'), (7, 3, '2010-12-07'), (7, 4, '2010-12-07'), (7, 5, '2010-12-07'),
(8, 1, '2010-12-08'), (8, 2, '2010-12-08'),
(9, 1, '2010-12-09'), (9, 2, '2010-12-09'), (9, 3, '2010-12-09'), (9, 4, '2010-12-09'), (9, 5, '2010-12-09');
Select the groups of which user u_id == 1 is a member. Then for each group select a maximum of 4 members (excluding user u_id == 1), ordered by descending apply_date:
SELECT u3.g_id, g.g_name, u3.u_id, u3.apply_date
FROM (
    SELECT
        u1.g_id,
        u1.u_id,
        u1.apply_date,
        -- restart the per-group counter whenever the group id changes
        IF( @prev_gid <> u1.g_id, @user_index := 1, @user_index := @user_index + 1 ) AS user_index,
        @prev_gid := u1.g_id AS prev_gid
    FROM group_user AS u1
    -- initialise the user variables
    JOIN (SELECT @prev_gid := 0, @user_index := NULL) AS vars
    -- restrict to groups that user 1 belongs to, excluding user 1 itself
    JOIN group_user AS u2
        ON u2.g_id = u1.g_id
        AND u2.u_id = 1
        AND u1.u_id <> 1
    ORDER BY u1.g_id, u1.apply_date DESC, u1.u_id
) AS u3
JOIN group_name AS g ON g.g_id = u3.g_id
WHERE u3.user_index <= 4
ORDER BY u3.g_id, u3.apply_date DESC, u3.u_id;
+------+---------+------+------------+
| g_id | g_name | u_id | apply_date |
+------+---------+------+------------+
| 1 | Group 1 | 5 | 2010-12-05 |
| 1 | Group 1 | 4 | 2010-12-04 |
| 1 | Group 1 | 3 | 2010-12-03 |
| 1 | Group 1 | 2 | 2010-12-02 |
| 2 | Group 2 | 5 | 2010-12-05 |
| 2 | Group 2 | 4 | 2010-12-04 |
| 2 | Group 2 | 3 | 2010-12-03 |
| 2 | Group 2 | 2 | 2010-12-02 |
| 3 | Group 3 | 9 | 2010-12-09 |
| 3 | Group 3 | 7 | 2010-12-07 |
| 3 | Group 3 | 3 | 2010-12-03 |
| 4 | Group 4 | 9 | 2010-12-09 |
| 4 | Group 4 | 7 | 2010-12-07 |
| 4 | Group 4 | 3 | 2010-12-03 |
| 5 | Group 5 | 9 | 2010-12-09 |
| 5 | Group 5 | 7 | 2010-12-07 |
+------+---------+------+------------+
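The user-variable trick above targets pre-8.0 MySQL. On MySQL 8.0+, ROW_NUMBER() makes the per-group limit much simpler; a minimal sketch against the same tables:
SELECT ranked.g_id, g.g_name, ranked.u_id, ranked.apply_date
FROM (
    -- rank the members of each group, newest application first
    SELECT u1.g_id, u1.u_id, u1.apply_date,
           ROW_NUMBER() OVER (PARTITION BY u1.g_id
                              ORDER BY u1.apply_date DESC, u1.u_id) AS user_index
    FROM group_user AS u1
    JOIN group_user AS u2
      ON u2.g_id = u1.g_id
     AND u2.u_id = 1
     AND u1.u_id <> 1
) AS ranked
JOIN group_name AS g ON g.g_id = ranked.g_id
WHERE ranked.user_index <= 4
ORDER BY ranked.g_id, ranked.apply_date DESC, ranked.u_id;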
Is there an unpivot equivalent function in PostgreSQL?
Create an example table:
CREATE TEMP TABLE foo (id int, a text, b text, c text);
INSERT INTO foo VALUES (1, 'ant', 'cat', 'chimp'), (2, 'grape', 'mint', 'basil');
You can 'unpivot' or 'uncrosstab' using UNION ALL:
SELECT id,
'a' AS colname,
a AS thing
FROM foo
UNION ALL
SELECT id,
'b' AS colname,
b AS thing
FROM foo
UNION ALL
SELECT id,
'c' AS colname,
c AS thing
FROM foo
ORDER BY id;
This runs 3 different subqueries on foo, one for each column we want to unpivot, and returns, in one table, every record from each of the subqueries.
But that will scan the table N times, where N is the number of columns you want to unpivot. This is inefficient, and a big problem when, for example, you're working with a very large table that takes a long time to scan.
Instead, use:
SELECT id,
unnest(array['a', 'b', 'c']) AS colname,
unnest(array[a, b, c]) AS thing
FROM foo
ORDER BY id;
This is easier to write, and it will only scan the table once.
array[a, b, c] returns an array object, with the values of a, b, and c as its elements.
unnest(array[a, b, c]) breaks the results into one row for each of the array's elements.
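For the sample foo table above, both versions should return the same six rows, along the lines of:
 id | colname | thing
----+---------+-------
  1 | a       | ant
  1 | b       | cat
  1 | c       | chimp
  2 | a       | grape
  2 | b       | mint
  2 | c       | basil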
You could use VALUES() and JOIN LATERAL to unpivot the columns.
Sample data:
CREATE TABLE test(id int, a INT, b INT, c INT);
INSERT INTO test(id,a,b,c) VALUES (1,11,12,13),(2,21,22,23),(3,31,32,33);
Query:
SELECT t.id, s.col_name, s.col_value
FROM test t
JOIN LATERAL(VALUES('a',t.a),('b',t.b),('c',t.c)) s(col_name, col_value) ON TRUE;
DBFiddle Demo
Using this approach it is possible to unpivot multiple groups of columns at once.
EDIT
Using Zack's suggestion:
SELECT t.id, col_name, col_value
FROM test t
CROSS JOIN LATERAL (VALUES('a', t.a),('b', t.b),('c',t.c)) s(col_name, col_value);
which is equivalent to:
SELECT t.id, col_name, col_value
FROM test t
,LATERAL (VALUES('a', t.a),('b', t.b),('c',t.c)) s(col_name, col_value);
db<>fiddle demo
Great article by Thomas Kellerer found here
Unpivot with Postgres
Sometimes it’s necessary to normalize de-normalized tables - the opposite of a “crosstab” or “pivot” operation. Postgres does not support an UNPIVOT operator like Oracle or SQL Server, but simulating it is very simple.
Take the following table that stores aggregated values per quarter:
create table customer_turnover
(
customer_id integer,
q1 integer,
q2 integer,
q3 integer,
q4 integer
);
And the following sample data:
customer_id | q1 | q2 | q3 | q4
------------+-----+-----+-----+----
1 | 100 | 210 | 203 | 304
2 | 150 | 118 | 422 | 257
3 | 220 | 311 | 271 | 269
But we want the quarters to be rows (as they should be in a normalized data model).
In Oracle or SQL Server this could be achieved with the UNPIVOT operator, but that is not available in Postgres. However Postgres’ ability to use the VALUES clause like a table makes this actually quite easy:
select c.customer_id, t.*
from customer_turnover c
cross join lateral (
values
(c.q1, 'Q1'),
(c.q2, 'Q2'),
(c.q3, 'Q3'),
(c.q4, 'Q4')
) as t(turnover, quarter)
order by customer_id, quarter;
will return the following result:
customer_id | turnover | quarter
------------+----------+--------
1 | 100 | Q1
1 | 210 | Q2
1 | 203 | Q3
1 | 304 | Q4
2 | 150 | Q1
2 | 118 | Q2
2 | 422 | Q3
2 | 257 | Q4
3 | 220 | Q1
3 | 311 | Q2
3 | 271 | Q3
3 | 269 | Q4
The equivalent query with the standard UNPIVOT operator would be:
select customer_id, turnover, quarter
from customer_turnover c
UNPIVOT (turnover for quarter in (q1 as 'Q1',
q2 as 'Q2',
q3 as 'Q3',
q4 as 'Q4'))
order by customer_id, quarter;
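One behavioural difference worth noting (my addition, not from the article): UNPIVOT in Oracle and SQL Server drops NULL values by default, while the VALUES approach keeps them. Add a filter if you want matching behaviour:
select c.customer_id, t.*
from customer_turnover c
cross join lateral (
    values
        (c.q1, 'Q1'),
        (c.q2, 'Q2'),
        (c.q3, 'Q3'),
        (c.q4, 'Q4')
) as t(turnover, quarter)
where t.turnover is not null  -- mimics UNPIVOT's default NULL handling
order by customer_id, quarter;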
FYI for those of us looking for how to unpivot in Redshift.
The long form solution given by Stew appears to be the only way to accomplish this.
For those who cannot see it there, here is the text pasted below:
We do not have built-in functions that will do pivot or unpivot. However,
you can always write SQL to do that.
create table sales (regionid integer, q1 integer, q2 integer, q3 integer, q4 integer);
insert into sales values (1,10,12,14,16), (2,20,22,24,26);
select * from sales order by regionid;
regionid | q1 | q2 | q3 | q4
----------+----+----+----+----
1 | 10 | 12 | 14 | 16
2 | 20 | 22 | 24 | 26
(2 rows)
pivot query
create table sales_pivoted (regionid, quarter, sales)
as
select regionid, 'Q1', q1 from sales
UNION ALL
select regionid, 'Q2', q2 from sales
UNION ALL
select regionid, 'Q3', q3 from sales
UNION ALL
select regionid, 'Q4', q4 from sales
;
select * from sales_pivoted order by regionid, quarter;
regionid | quarter | sales
----------+---------+-------
1 | Q1 | 10
1 | Q2 | 12
1 | Q3 | 14
1 | Q4 | 16
2 | Q1 | 20
2 | Q2 | 22
2 | Q3 | 24
2 | Q4 | 26
(8 rows)
unpivot query
select regionid, sum(Q1) as Q1, sum(Q2) as Q2, sum(Q3) as Q3, sum(Q4) as Q4
from
(select regionid,
case quarter when 'Q1' then sales else 0 end as Q1,
case quarter when 'Q2' then sales else 0 end as Q2,
case quarter when 'Q3' then sales else 0 end as Q3,
case quarter when 'Q4' then sales else 0 end as Q4
from sales_pivoted)
group by regionid
order by regionid;
regionid | q1 | q2 | q3 | q4
----------+----+----+----+----
1 | 10 | 12 | 14 | 16
2 | 20 | 22 | 24 | 26
(2 rows)
Hope this helps, Neil
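Since that post was written, Redshift has gained a native UNPIVOT; if your cluster's version supports it, the long form above reduces to something like this sketch (syntax hedged, check your Redshift release):
select regionid, quarter, sales
from sales unpivot (sales for quarter in (q1, q2, q3, q4))  -- requires native UNPIVOT support
order by regionid, quarter;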
Pulling slightly modified content from the link in the comment from #a_horse_with_no_name into an answer because it works:
Installing Hstore
If you don't have hstore installed and are running PostgreSQL 9.1+, you can use the handy
CREATE EXTENSION hstore;
For lower versions, look for the hstore.sql file in share/contrib and run in your database.
Assuming that your source (e.g., wide data) table has one 'id' column, named id_field, and any number of 'value' columns, all of the same type, the following will create an unpivoted view of that table.
CREATE VIEW vw_unpivot AS
SELECT id_field, (h).key AS column_name, (h).value AS column_value
FROM (
SELECT id_field, each(hstore(foo) - 'id_field'::text) AS h
FROM zcta5 as foo
) AS unpiv ;
This works with any number of 'value' columns. All of the resulting values will be text, unless you cast, e.g., (h).value::numeric.
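A hypothetical usage example of that view, casting the text values back (the pop% column-name filter is made up for illustration):
SELECT id_field, column_name, column_value::numeric AS value_num
FROM vw_unpivot
WHERE column_name LIKE 'pop%';  -- 'pop%' is an illustrative filter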
Just use JSON:
with data (id, name) as (
values (1, 'a'), (2, 'b')
)
select t.*
from data, lateral jsonb_each_text(to_jsonb(data)) with ordinality as t
order by data.id, t.ordinality;
This yields
|key |value|ordinality|
|----|-----|----------|
|id |1 |1 |
|name|a |2 |
|id |2 |1 |
|name|b |2 |
dbfiddle
I wrote a horrible unpivot function for PostgreSQL. It's rather slow but it at least returns results like you'd expect an unpivot operation to.
https://cgsrv1.arrc.csiro.au/blog/2010/05/14/unpivotuncrosstab-in-postgresql/
Hopefully you can find it useful.
Depending on what you want to do... something like this can be helpful.
with wide_table as (
select 1 a, 2 b, 3 c
union all
select 4 a, 5 b, 6 c
)
select unnest(array[a,b,c]) from wide_table
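A small extension of that sketch, pairing a second unnest so you keep track of which column each value came from (the same parallel-unnest trick as the earlier answer):
with wide_table as (
    select 1 a, 2 b, 3 c
    union all
    select 4 a, 5 b, 6 c
)
select unnest(array['a','b','c']) as colname,  -- column labels
       unnest(array[a, b, c]) as thing         -- column values, in lockstep
from wide_table;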
You can use FROM UNNEST() array handling to unpivot a dataset, in tandem with a correlated subquery (works with PG 9.4).
FROM UNNEST() is more powerful and flexible than the typical method of using FROM (VALUES ....) to unpivot datasets, because FROM UNNEST() is variadic (with n-ary arity). Using a correlated subquery eliminates the need for the WITH ORDINALITY clause, and Postgres keeps the resulting parallel columnar sets in the proper ordinal sequence.
This is, by the way, fast -- in practical use, spawning 8 million rows in under 15 seconds on a 24-core system.
WITH _students AS ( /** CTE **/
SELECT * FROM
( SELECT 'jane'::TEXT ,'doe'::TEXT , 1::INT
UNION
SELECT 'john'::TEXT ,'doe'::TEXT , 2::INT
UNION
SELECT 'jerry'::TEXT ,'roe'::TEXT , 3::INT
UNION
SELECT 'jodi'::TEXT ,'roe'::TEXT , 4::INT
) s ( fn, ln, id )
) /** end WITH **/
SELECT s.id
, ax.fanm -- field labels, now expanded to two rows
, ax.anm -- field data, now expanded to two rows
, ax.someval -- manually incl. data
, ax.rankednum -- manually assigned ranks
,ax.genser -- auto-generate ranks
FROM _students s
,UNNEST /** MULTI-UNNEST() BLOCK **/
(
( SELECT ARRAY[ fn, ln ]::text[] AS anm -- expanded into two rows by outer UNNEST()
/** CORRELATED SUBQUERY **/
FROM _students s2 WHERE s2.id = s.id -- outer relation
)
,( /** ordinal relationship preserved in variadic UNNEST() **/
SELECT ARRAY[ 'first name', 'last name' ]::text[] -- exp. into 2 rows
AS fanm
)
,( SELECT ARRAY[ 'z','x','y'] -- only 3 rows gen'd, but ordinal rela. kept
AS someval
)
,( SELECT ARRAY[ 1,2,3,4,5 ] -- 5 rows gen'd, ordinal rela. kept.
AS rankednum
)
,( SELECT ARRAY( /** you may go wild ... **/
SELECT generate_series(1, 15, 3 )
AS genser
)
)
) ax ( anm, fanm, someval, rankednum , genser )
;
RESULT SET:
+----+------------+--------+---------+-----------+----------+
| id | fanm       | anm    | someval | rankednum | [ etc. ] |
+----+------------+--------+---------+-----------+----------+
| 2  | first name | john   | z       | 1         | .        |
| 2  | last name  | doe    | y       | 2         | .        |
| 2  | [null]     | [null] | x       | 3         | .        |
| 2  | [null]     | [null] | [null]  | 4         | .        |
| 2  | [null]     | [null] | [null]  | 5         | .        |
| 1  | first name | jane   | z       | 1         | .        |
| 1  | last name  | doe    | y       | 2         | .        |
| 1  | [null]     | [null] | x       | 3         | .        |
| 1  | [null]     | [null] | [null]  | 4         | .        |
| 1  | [null]     | [null] | [null]  | 5         | .        |
| 4  | first name | jodi   | z       | 1         | .        |
| 4  | last name  | roe    | y       | 2         | .        |
| 4  | [null]     | [null] | x       | 3         | .        |
| 4  | [null]     | [null] | [null]  | 4         | .        |
| 4  | [null]     | [null] | [null]  | 5         | .        |
| 3  | first name | jerry  | z       | 1         | .        |
| 3  | last name  | roe    | y       | 2         | .        |
| 3  | [null]     | [null] | x       | 3         | .        |
| 3  | [null]     | [null] | [null]  | 4         | .        |
| 3  | [null]     | [null] | [null]  | 5         | .        |
+----+------------+--------+---------+-----------+----------+
Here's a way that combines the hstore and CROSS JOIN approaches from other answers.
It's a modified version of my answer to a similar question, which is itself based on the method at https://blog.sql-workbench.eu/post/dynamic-unpivot/ and another answer to that question.
-- Example wide data with a column for each year...
WITH example_wide_data("id", "2001", "2002", "2003", "2004") AS (
VALUES
(1, 4, 5, 6, 7),
(2, 8, 9, 10, 11)
)
-- that is tidied to have "year" and "value" columns
SELECT
id,
r.key AS year,
r.value AS value
FROM
example_wide_data w
CROSS JOIN
each(hstore(w.*)) AS r(key, value)
WHERE
-- This chooses columns that look like years
-- In other cases you might need a different condition
r.key ~ '^[0-9]{4}$';
It has a few benefits over other solutions:
By using hstore and not jsonb, it hopefully minimises issues with type conversions (although hstore does convert everything to text)
The columns don't need to be hard coded or known in advance. Here, columns are chosen by a regex on the name, but you could use any SQL logic based on the name, or even the value.
It doesn't require PL/pgSQL - it's all SQL