Oracle SQL Percent Difference Same Column - oracle12c

Given the following auction data, how would you find the percent difference between a person's most recent and previous bid for a product using Oracle SQL?
The duplicate sequence (SEQ) values for persons A and B are representative of the data I am working with.
An example of your SQL would be much appreciated.
TXN_TIME | SEQ | PERSON | PRODUCT | TRANSACTION | BID |
2017-11-22 15:41:10:0 | 20 | A | 1 | BID | 12 |
2017-11-22 15:35:10:0 | 10C | A | 1 | CXLBID | NULL |
2017-11-22 15:34:25:0 | 10 | A | 1 | BID | 10 |
2017-11-22 15:35:40:0 | 6 | A | 2 | BID | 4 |
2017-11-22 15:34:50:0 | 1C | A | 2 | CXLBID | NULL |
2017-11-22 15:34:20:0 | 1 | A | 2 | BID | 5 |
2017-11-22 15:35:45:0 | 6 | B | 2 | BID | 2 |
2017-11-22 15:34:55:0 | 1C | B | 2 | CXLBID | NULL |
2017-11-22 15:34:25:0 | 1 | B | 2 | BID | 1 |

We could use the LEAD/LAG analytic functions if they are available (a LAG-based sketch appears after the demo link below). But one approach here is to use a CTE to identify just the most recent and immediately prior bid for each person, and then compare those two values.
WITH cte AS (
SELECT PERSON, BID,
ROW_NUMBER() OVER (PARTITION BY PERSON ORDER BY TXN_TIME DESC) rn
FROM yourTable
WHERE TRANSACTION = 'BID'
)
SELECT
t1.PERSON,
100*(t1.BID - t2.BID) / t2.BID AS BID_PCT_DIFF
FROM cte t1
INNER JOIN cte t2
ON t1.PERSON = t2.PERSON AND
t1.rn = 1 AND t2.rn = 2;
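With the sample data above, this should return something like (not executed here):
PERSON | BID_PCT_DIFF
-------+--------------
A      | 200
B      | 100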
This output looks correct, because person A went from a bid of 4 to 12, which is an increase of 8, or 200%, and person B went from a bid of 1 to 2, which is a 100% increase.
I created a demo below in SQL Server, because I always have difficulties getting Oracle demos to work. But my query is just ANSI SQL and should run the same on either SQL Server or Oracle.
Demo
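Since LEAD/LAG were mentioned above, here is a hedged sketch of the same per-person comparison using LAG instead of the self-join (same table and column names assumed; not tested):
SELECT PERSON, BID_PCT_DIFF
FROM (
    SELECT PERSON,
           100 * (BID - LAG(BID) OVER (PARTITION BY PERSON ORDER BY TXN_TIME))
               / LAG(BID) OVER (PARTITION BY PERSON ORDER BY TXN_TIME) AS BID_PCT_DIFF,
           ROW_NUMBER() OVER (PARTITION BY PERSON ORDER BY TXN_TIME DESC) AS rn
    FROM yourTable
    WHERE TRANSACTION = 'BID'
)
WHERE rn = 1;
The inner query compares each bid with the same person's previous bid; the outer filter keeps only the most recent row per person, which should match the result of the CTE version.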

Good thing you are using Oracle 12. This way you can use the MATCH_RECOGNIZE clause, which is perfect for your problem.
I calculate the CHANGE column in the MATCH_RECOGNIZE clause, using the LAST() function with the optional second argument, which is a logical offset within the set of rows mapped to a specific pattern variable. I format the CHANGE column in the SELECT clause - I use a favorite hack, using the "currency" symbol to attach the percent sign... you can modify the formatting any way you want, without affecting the calculation (which is hidden in the MATCH_RECOGNIZE clause).
with auction_data ( txn_time, seq, person, product, transaction, bid ) as (
select timestamp '2017-11-22 15:41:10', '20' , 'A', 1, 'BID' , 12 from dual union all
select timestamp '2017-11-22 15:35:10', '10C', 'A', 1, 'CXLBID', NULL from dual union all
select timestamp '2017-11-22 15:34:25', '10' , 'A', 1, 'BID' , 10 from dual union all
select timestamp '2017-11-22 15:35:40', '6' , 'A', 2, 'BID' , 4 from dual union all
select timestamp '2017-11-22 15:34:50', '1C' , 'A', 2, 'CXLBID', NULL from dual union all
select timestamp '2017-11-22 15:34:20', '1' , 'A', 2, 'BID' , 5 from dual union all
select timestamp '2017-11-22 15:35:45', '6' , 'B', 2, 'BID' , 2 from dual union all
select timestamp '2017-11-22 15:34:55', '1C' , 'B', 2, 'CXLBID', NULL from dual union all
select timestamp '2017-11-22 15:34:25', '1' , 'B', 2, 'BID' , 1 from dual
)
-- End of simulated inputs (for testing only, not part of the solution).
select txn_time, seq, person, product, transaction, bid,
to_char( 100 * (change - 1), '999D0L', 'nls_currency=''%''') as change
from auction_data
match_recognize(
partition by person, product
order by txn_time
measures case when classifier() = 'B' then bid / last(B.bid, 1) end as change
all rows per match
pattern ( (B|A)* )
define B as B.transaction = 'BID'
);
TXN_TIME SEQ PERSON PRODUCT TRANSACTION BID CHANGE
------------------- --- ------ ---------- ----------- ---------- ----------------
2017-11-22 15:34:25 10 A 1 BID 10
2017-11-22 15:35:10 10C A 1 CXLBID
2017-11-22 15:41:10 20 A 1 BID 12 20.0%
2017-11-22 15:34:20 1 A 2 BID 5
2017-11-22 15:34:50 1C A 2 CXLBID
2017-11-22 15:35:40 6 A 2 BID 4 -20.0%
2017-11-22 15:34:25 1 B 2 BID 1
2017-11-22 15:34:55 1C B 2 CXLBID
2017-11-22 15:35:45 6 B 2 BID 2 100.0%

Related

POSTGRESQL: Enumerate with the same number if having the same criteria

What I have
id | value
1 | foo
2 | foo
3 | bah
4 | bah
5 | bah
6 | jezz
7 | jezz
8 | jezz
9 | pas
10 | log
What I need:
Enumerate rows as in the following example
id | value | enumeration
1 | foo | 1
2 | foo | 1
3 | bah | 2
4 | bah | 2
5 | bah | 2
6 | jezz | 3
7 | jezz | 3
8 | jezz | 3
9 | pas | 4
10 | log | 5
I've tried row_number() with OVER (PARTITION BY ...), but this leads to another kind of enumeration.
Thanks for any help
You can use rank() or dense_rank() for that case:
Click: demo:db<>fiddle
SELECT
*,
dense_rank() OVER (ORDER BY value)
FROM
mytable
rank() assigns the same number to every element of a group, but it creates gaps (if there were 3 elements in the first group, the second group, starting at row 4, would get the number 4). dense_rank() avoids these gaps.
Note that this orders the table by the value column alphabetically. So the result will be: bah == 1, foo == 2, jezz == 3, log == 4, pas == 5.
If you want to keep your order, you need an additional order criterion. In your case you could use the id column to create such a column, if no other is available:
Click: demo:db<>fiddle
First, use first_value() to find the lowest id per value group:
SELECT
*,
first_value(id) OVER (PARTITION BY value ORDER BY id)
FROM
mytable
This first value (foo == 1, bah == 3, ...) can be used to keep the original order when calculating the dense_rank():
SELECT
id,
value,
dense_rank() OVER (ORDER BY first_value)
FROM (
SELECT
*,
first_value(id) OVER (PARTITION BY value ORDER BY id)
FROM
mytable
) s
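An equivalent variant (a sketch, not run) replaces first_value() with a plain min() window, since the lowest id per value group is by definition its first id:
SELECT
    id,
    value,
    dense_rank() OVER (ORDER BY min_id) AS enumeration
FROM (
    SELECT
        *,
        min(id) OVER (PARTITION BY value) AS min_id
    FROM
        mytable
) s
For the sample data this should give foo == 1, bah == 2, jezz == 3, pas == 4, log == 5, i.e. the enumeration requested in the question.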

Grouping by unique values inside a JSONB array

Consider the following table structure:
CREATE TABLE residences (id int, price int, categories jsonb);
INSERT INTO residences VALUES
(1, 3, '["monkeys", "hamsters", "foxes"]'),
(2, 5, '["monkeys", "hamsters", "foxes", "foxes"]'),
(3, 7, '[]'),
(4, 11, '["turtles"]');
SELECT * FROM residences;
id | price | categories
----+-------+-------------------------------------------
1 | 3 | ["monkeys", "hamsters", "foxes"]
2 | 5 | ["monkeys", "hamsters", "foxes", "foxes"]
3 | 7 | []
4 | 11 | ["turtles"]
Now I would like to know how many residences there are for each category, as well as their sum of prices. The only way I found to do this was using a sub-query:
SELECT category, SUM(price), COUNT(*) AS residences_no
FROM
residences a,
(
SELECT DISTINCT(jsonb_array_elements(categories)) AS category
FROM residences
) b
WHERE a.categories #> category
GROUP BY category
ORDER BY category;
category | sum | residences_no
------------+-----+---------------
"foxes" | 8 | 2
"hamsters" | 8 | 2
"monkeys" | 8 | 2
"turtles" | 11 | 1
Using jsonb_array_elements without subquery would return three residences for foxes because of the duplicate entry in the second row. Also the price of the residence would be inflated by 5.
Is there any way to do this without using the sub-query, or any better way to accomplish this result?
EDIT
Initially I did not mention the price column.
select category, count(distinct (id, category))
from residences, jsonb_array_elements(categories) category
group by category
order by category;
category | count
------------+-------
"foxes" | 2
"hamsters" | 2
"monkeys" | 2
"turtles" | 1
(4 rows)
You have to use a derived table to aggregate another column (in this example every price is set to 10):
select category, count(*), sum(price) total
from (
select distinct id, category, price
from residences, jsonb_array_elements(categories) category
) s
group by category
order by category;
category | count | total
------------+-------+-------
"foxes" | 2 | 20
"hamsters" | 2 | 20
"monkeys" | 2 | 20
"turtles" | 1 | 10
(4 rows)
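As for avoiding the extra pass over the whole table: one hedged alternative (a sketch assuming PostgreSQL 9.3+ LATERAL support; not run) de-duplicates the categories per row in a LATERAL subquery, so the repeated "foxes" entry in residence 2 is only counted once:
SELECT b.category, SUM(a.price) AS sum, COUNT(*) AS residences_no
FROM residences a
CROSS JOIN LATERAL (
    SELECT DISTINCT x.category
    FROM jsonb_array_elements(a.categories) AS x(category)
) b
GROUP BY b.category
ORDER BY b.category;
With the prices from the question (3, 5, 7, 11) this should match the expected output: 8 for foxes, hamsters and monkeys, and 11 for turtles.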

How can I uniquely map a long string identifier to a numerical value in a single query (for bandwidth reasons)?

I have a Postgresql database (technically Greenplum) with data on individuals over time. The database has three fields: user_id, monthly_date, and account_value. When I put in a query, I have to download the results from a remote server, so bandwidth is an issue. Since the user_id field is a very long string (around 50 characters), I'd like to return a numerical value that corresponds 1:1 with each value of user_id, since this will take up less space.
For example, the database might have sample data like this:
63a9364385350b13473279 Jan-2000
63a9364385350b13473279 Feb-2000
2066937e2887w206010393 Apr-2001
036686037e507d01764237 Mar-2003
036686037e507d01764237 Jun-2003
036686037e507d01764237 Jul-2003
036686037e507d01764237 Dec-2003
90829x098327549n286418 Apr-2004
90829x098327549n286418 Sep-2004
67518x834512306933u500 Nov-2000
and I'm trying to work out a query using ROW_NUMBER() and various window functions like PARTITION BY to get results like this:
1 Jan-2000
1 Feb-2000
2 Apr-2001
3 Mar-2003
3 Jun-2003
3 Jul-2003
3 Dec-2003
4 Apr-2004
4 Sep-2004
5 Nov-2000
I know these aren't actual database formats, but I'm just using them as example data. Is this possible? I don't care (although it would be nice and very neat to see) if, for example, 63a9364385350b13473279 maps to 1 in one query and 2 in the next, but in any given query, 63a9364385350b13473279 should always map to the same value regardless of date. The mapped numbers don't need to be in sequence or have any meaningful value besides being unique.
If you just need a unique number, this will do the trick:
SELECT
id,
split_part(t.d, '-', 2),
row_number() OVER all_window - row_number() OVER group_window AS a_unique_number_by_id
FROM (
VALUES
('63a9364385350b13473279','Jan-2000'),
('63a9364385350b13473279','Feb-2000'),
('2066937e2887w206010393','Apr-2001'),
('036686037e507d01764237','Mar-2003'),
('036686037e507d01764237','Jun-2003'),
('036686037e507d01764237','Jul-2003'),
('036686037e507d01764237','Dec-2003'),
('90829x098327549n286418','Apr-2004'),
('90829x098327549n286418','Sep-2004'),
('67518x834512306933u500','Nov-2000')
) as t(id, d)
WINDOW group_window AS (
PARTITION BY id
ORDER BY split_part(t.d, '-', 2)
), all_window AS (
ORDER BY split_part(t.d, '-', 2)
);
Here is the result:
id | split_part | a_unique_number_by_id
------------------------+------------+-----------------------
63a9364385350b13473279 | 2000 | 0
63a9364385350b13473279 | 2000 | 0
67518x834512306933u500 | 2000 | 2
2066937e2887w206010393 | 2001 | 3
036686037e507d01764237 | 2003 | 4
036686037e507d01764237 | 2003 | 4
036686037e507d01764237 | 2003 | 4
036686037e507d01764237 | 2003 | 4
90829x098327549n286418 | 2004 | 8
90829x098327549n286418 | 2004 | 8
(10 rows)
You should re-order it with another column to keep the original ordering.
I think you are looking for dense_rank().
create table sample_data
(userid varchar(50) not null,
monthly_date date not null)
distributed by (userid);
insert into sample_data (userid, monthly_date) values
('63a9364385350b13473279','2000-01-01'),
('63a9364385350b13473279','2000-02-01'),
('2066937e2887w206010393','2001-04-01'),
('036686037e507d01764237','2003-03-01'),
('036686037e507d01764237','2003-06-01'),
('036686037e507d01764237','2003-07-01'),
('036686037e507d01764237','2003-12-01'),
('90829x098327549n286418','2004-04-01'),
('90829x098327549n286418','2004-09-01'),
('67518x834512306933u500','2000-11-01');
select dense_rank() over(order by userid) as new_userid, userid, monthly_date
from sample_data
order by 2;
new_userid | userid | monthly_date
------------+------------------------+--------------
1 | 036686037e507d01764237 | 2003-06-01
1 | 036686037e507d01764237 | 2003-07-01
1 | 036686037e507d01764237 | 2003-12-01
1 | 036686037e507d01764237 | 2003-03-01
2 | 2066937e2887w206010393 | 2001-04-01
3 | 63a9364385350b13473279 | 2000-02-01
3 | 63a9364385350b13473279 | 2000-01-01
4 | 67518x834512306933u500 | 2000-11-01
5 | 90829x098327549n286418 | 2004-09-01
5 | 90829x098327549n286418 | 2004-04-01
(10 rows)
Try the below script
create table test_schema.source_data (id varchar(50), dt varchar(50));
insert into test_schema.source_data
values ('63a9364385350b13473279','Jan-2000'),
('63a9364385350b13473279','Feb-2000'),
('2066937e2887w206010393','Apr-2001'),
('036686037e507d01764237','Mar-2003'),
('036686037e507d01764237','Jun-2003'),
('036686037e507d01764237','Jul-2003'),
('036686037e507d01764237','Dec-2003'),
('90829x098327549n286418','Apr-2004'),
('90829x098327549n286418','Sep-2004'),
('67518x834512306933u500','Nov-2000');
create temporary table id_mapping
as
select t1.id, row_number() over(order by t1.id) rownum
from (
SELECT distinct id
FROM test_schema.source_data
) t1;
select t1.id, t1.dt, t2.rownum
from
test_schema.source_data t1
join id_mapping t2
on t1.id = t2.id;
And here is the result
id dt rownum
------------------------+------------+-----
036686037e507d01764237 Dec-2003 1
036686037e507d01764237 Jul-2003 1
036686037e507d01764237 Jun-2003 1
036686037e507d01764237 Mar-2003 1
2066937e2887w206010393 Apr-2001 2
63a9364385350b13473279 Feb-2000 3
63a9364385350b13473279 Jan-2000 3
67518x834512306933u500 Nov-2000 4
90829x098327549n286418 Sep-2004 5
90829x098327549n286418 Apr-2004 5

SQL Server recursive query

I have a table in SQL Server 2008 R2 which contains product orders. For the most part, it is one entry per product
ID | Prod | Qty
------------
1 | A | 1
4 | B | 1
7 | A | 1
8 | A | 1
9 | A | 1
12 | C | 1
15 | A | 1
16 | A | 1
21 | B | 1
I want to create a view based on the table which looks like this
ID | Prod | Qty
------------------
1 | A | 1
4 | B | 1
9 | A | 3
12 | C | 1
16 | A | 2
21 | B | 1
I've written a query using a table expression, but I am stumped on how to make it work. The SQL below does not actually work, but is a sample of what I am trying to do. I've written this query multiple different ways, but cannot figure out how to get the right results. I am using row_number to generate a sequential id. From that, I can order and compare consecutive rows to see if the next row has the same product as the previous row, since ReleaseId is sequential but not necessarily contiguous.
;with myData AS
(
SELECT
row_number() over (order by a.ReleaseId) as 'Item',
a.ReleaseId,
a.ProductId,
a.Qty
FROM OrdersReleased a
UNION ALL
SELECT
row_number() over (order by b.ReleaseId) as 'Item',
b.ReleaseId,
b.ProductId,
b.Qty
FROM OrdersReleased b
INNER JOIN myData c ON b.Item = c.Item + 1 and b.ProductId = c.ProductId
)
SELECT * from myData
Usually you drop the ID out of something like this, since it is a summary.
SELECT a.ProductId,
SUM(a.Qty) AS Qty
FROM OrdersReleased a
GROUP BY a.ProductId
ORDER BY a.ProductId
-- if you want to do a sub query you can do it as a column (if you don't have a very large dataset).
SELECT a.ProductId,
       SUM(a.Qty) AS Qty,
       (SELECT COUNT(1)
        FROM OrdersReleased b
        INNER JOIN OrdersReleased c
           ON b.ReleaseId = c.ReleaseId + 1
          AND b.ProductId = c.ProductId
        WHERE b.ProductId = a.ProductId) AS NumberBackToBack
FROM OrdersReleased a
GROUP BY a.ProductId
ORDER BY a.ProductId
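For the grouped view shown in the question (consecutive rows for the same product collapsed into one row), a common gaps-and-islands sketch along the lines the question hints at (two ROW_NUMBER values) would be, untested and using the column names from the question's query:
;WITH numbered AS
(
    SELECT ReleaseId, ProductId, Qty,
           ROW_NUMBER() OVER (ORDER BY ReleaseId)
         - ROW_NUMBER() OVER (PARTITION BY ProductId ORDER BY ReleaseId) AS grp
    FROM OrdersReleased
)
SELECT MAX(ReleaseId) AS ID, ProductId AS Prod, SUM(Qty) AS Qty
FROM numbered
GROUP BY ProductId, grp
ORDER BY ID;
The difference of the two row numbers is constant within each run of consecutive rows for the same product, so grouping by it (together with the product) sums each run; with the sample rows this should reproduce the desired output (1/A/1, 4/B/1, 9/A/3, 12/C/1, 16/A/2, 21/B/1).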

Equivalent to unpivot() in PostgreSQL

Is there a unpivot equivalent function in PostgreSQL?
Create an example table:
CREATE TEMP TABLE foo (id int, a text, b text, c text);
INSERT INTO foo VALUES (1, 'ant', 'cat', 'chimp'), (2, 'grape', 'mint', 'basil');
You can 'unpivot' or 'uncrosstab' using UNION ALL:
SELECT id,
'a' AS colname,
a AS thing
FROM foo
UNION ALL
SELECT id,
'b' AS colname,
b AS thing
FROM foo
UNION ALL
SELECT id,
'c' AS colname,
c AS thing
FROM foo
ORDER BY id;
This runs 3 different subqueries on foo, one for each column we want to unpivot, and returns, in one table, every record from each of the subqueries.
But that will scan the table N times, where N is the number of columns you want to unpivot. This is inefficient, and a big problem when, for example, you're working with a very large table that takes a long time to scan.
Instead, use:
SELECT id,
unnest(array['a', 'b', 'c']) AS colname,
unnest(array[a, b, c]) AS thing
FROM foo
ORDER BY id;
This is easier to write, and it will only scan the table once.
array[a, b, c] returns an array object, with the values of a, b, and c as its elements.
unnest(array[a, b, c]) breaks the results into one row for each of the array's elements.
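For the example table foo, the unnest version should return something like this (not run here):
id | colname | thing
---+---------+-------
 1 | a       | ant
 1 | b       | cat
 1 | c       | chimp
 2 | a       | grape
 2 | b       | mint
 2 | c       | basil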
You could use VALUES() and JOIN LATERAL to unpivot the columns.
Sample data:
CREATE TABLE test(id int, a INT, b INT, c INT);
INSERT INTO test(id,a,b,c) VALUES (1,11,12,13),(2,21,22,23),(3,31,32,33);
Query:
SELECT t.id, s.col_name, s.col_value
FROM test t
JOIN LATERAL(VALUES('a',t.a),('b',t.b),('c',t.c)) s(col_name, col_value) ON TRUE;
DBFiddle Demo
Using this approach it is possible to unpivot multiple groups of columns at once.
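For example (a sketch using a hypothetical table wide_test with two groups of columns, a1/a2 and b1/b2; not run), each row of the VALUES list can carry one value from every group:
CREATE TABLE wide_test(id int, a1 int, a2 int, b1 int, b2 int);
INSERT INTO wide_test VALUES (1, 11, 12, 101, 102);
SELECT t.id, s.col_no, s.a_value, s.b_value
FROM wide_test t
CROSS JOIN LATERAL (
    VALUES (1, t.a1, t.b1),
           (2, t.a2, t.b2)
) s(col_no, a_value, b_value);
Each source row produces two output rows, one per column position, with the a and b values unpivoted side by side.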
EDIT
Using Zack's suggestion:
SELECT t.id, col_name, col_value
FROM test t
CROSS JOIN LATERAL (VALUES('a', t.a),('b', t.b),('c',t.c)) s(col_name, col_value);
<=>
SELECT t.id, col_name, col_value
FROM test t
,LATERAL (VALUES('a', t.a),('b', t.b),('c',t.c)) s(col_name, col_value);
db<>fiddle demo
Great article by Thomas Kellerer found here
Unpivot with Postgres
Sometimes it’s necessary to normalize de-normalized tables - the opposite of a “crosstab” or “pivot” operation. Postgres does not support an UNPIVOT operator like Oracle or SQL Server, but simulating it is very simple.
Take the following table that stores aggregated values per quarter:
create table customer_turnover
(
customer_id integer,
q1 integer,
q2 integer,
q3 integer,
q4 integer
);
And the following sample data:
customer_id | q1 | q2 | q3 | q4
------------+-----+-----+-----+----
1 | 100 | 210 | 203 | 304
2 | 150 | 118 | 422 | 257
3 | 220 | 311 | 271 | 269
But we want the quarters to be rows (as they should be in a normalized data model).
In Oracle or SQL Server this could be achieved with the UNPIVOT operator, but that is not available in Postgres. However Postgres’ ability to use the VALUES clause like a table makes this actually quite easy:
select c.customer_id, t.*
from customer_turnover c
cross join lateral (
values
(c.q1, 'Q1'),
(c.q2, 'Q2'),
(c.q3, 'Q3'),
(c.q4, 'Q4')
) as t(turnover, quarter)
order by customer_id, quarter;
will return the following result:
customer_id | turnover | quarter
------------+----------+--------
1 | 100 | Q1
1 | 210 | Q2
1 | 203 | Q3
1 | 304 | Q4
2 | 150 | Q1
2 | 118 | Q2
2 | 422 | Q3
2 | 257 | Q4
3 | 220 | Q1
3 | 311 | Q2
3 | 271 | Q3
3 | 269 | Q4
The equivalent query with the standard UNPIVOT operator would be:
select customer_id, turnover, quarter
from customer_turnover c
UNPIVOT (turnover for quarter in (q1 as 'Q1',
q2 as 'Q2',
q3 as 'Q3',
q4 as 'Q4'))
order by customer_id, quarter;
FYI for those of us looking for how to unpivot in RedShift.
The long form solution given by Stew appears to be the only way to accomplish this.
For those who cannot see it there, here is the text pasted below:
We do not have built-in functions that will do pivot or unpivot. However,
you can always write SQL to do that.
create table sales (regionid integer, q1 integer, q2 integer, q3 integer, q4 integer);
insert into sales values (1,10,12,14,16), (2,20,22,24,26);
select * from sales order by regionid;
regionid | q1 | q2 | q3 | q4
----------+----+----+----+----
1 | 10 | 12 | 14 | 16
2 | 20 | 22 | 24 | 26
(2 rows)
unpivot query (wide columns to rows)
create table sales_pivoted (regionid, quarter, sales)
as
select regionid, 'Q1', q1 from sales
UNION ALL
select regionid, 'Q2', q2 from sales
UNION ALL
select regionid, 'Q3', q3 from sales
UNION ALL
select regionid, 'Q4', q4 from sales
;
select * from sales_pivoted order by regionid, quarter;
regionid | quarter | sales
----------+---------+-------
1 | Q1 | 10
1 | Q2 | 12
1 | Q3 | 14
1 | Q4 | 16
2 | Q1 | 20
2 | Q2 | 22
2 | Q3 | 24
2 | Q4 | 26
(8 rows)
pivot query (rows back to wide columns)
select regionid, sum(Q1) as Q1, sum(Q2) as Q2, sum(Q3) as Q3, sum(Q4) as Q4
from
(select regionid,
case quarter when 'Q1' then sales else 0 end as Q1,
case quarter when 'Q2' then sales else 0 end as Q2,
case quarter when 'Q3' then sales else 0 end as Q3,
case quarter when 'Q4' then sales else 0 end as Q4
from sales_pivoted)
group by regionid
order by regionid;
regionid | q1 | q2 | q3 | q4
----------+----+----+----+----
1 | 10 | 12 | 14 | 16
2 | 20 | 22 | 24 | 26
(2 rows)
Hope this helps, Neil
Pulling slightly modified content from the link in the comment from #a_horse_with_no_name into an answer because it works:
Installing Hstore
If you don't have hstore installed and are running PostgreSQL 9.1+, you can use the handy
CREATE EXTENSION hstore;
For lower versions, look for the hstore.sql file in share/contrib and run in your database.
Assuming that your source (e.g., wide data) table has one 'id' column, named id_field, and any number of 'value' columns, all of the same type, the following will create an unpivoted view of that table.
CREATE VIEW vw_unpivot AS
SELECT id_field, (h).key AS column_name, (h).value AS column_value
FROM (
SELECT id_field, each(hstore(foo) - 'id_field'::text) AS h
FROM zcta5 as foo
) AS unpiv ;
This works with any number of 'value' columns. All of the resulting values will be text, unless you cast, e.g., (h).value::numeric.
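For instance, a cast against the view defined above could look like this (a sketch assuming the value columns really hold numeric data; not run):
SELECT id_field, column_name, column_value::numeric AS numeric_value
FROM vw_unpivot;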
Just use JSON:
with data (id, name) as (
values (1, 'a'), (2, 'b')
)
select t.*
from data, lateral jsonb_each_text(to_jsonb(data)) with ordinality as t
order by data.id, t.ordinality;
This yields
|key |value|ordinality|
|----|-----|----------|
|id |1 |1 |
|name|a |2 |
|id |2 |1 |
|name|b |2 |
dbfiddle
I wrote a horrible unpivot function for PostgreSQL. It's rather slow but it at least returns results like you'd expect an unpivot operation to.
https://cgsrv1.arrc.csiro.au/blog/2010/05/14/unpivotuncrosstab-in-postgresql/
Hopefully you can find it useful..
Depending on what you want to do... something like this can be helpful.
with wide_table as (
select 1 a, 2 b, 3 c
union all
select 4 a, 5 b, 6 c
)
select unnest(array[a,b,c]) from wide_table
You can use FROM UNNEST() array handling to unpivot a dataset, in tandem with a correlated subquery (works with PG 9.4).
FROM UNNEST() is more powerful and flexible than the typical method of using FROM (VALUES .... ) to unpivot datasets, because FROM UNNEST() is variadic (it accepts any number of arrays). By using a correlated subquery, the need for the WITH ORDINALITY clause is eliminated, and Postgres keeps the resulting parallel columnar sets in the proper ordinal sequence.
This is, BTW, FAST -- in practical use spawning 8 million rows in < 15 seconds on a 24-core system.
WITH _students AS ( /** CTE **/
SELECT * FROM
( SELECT 'jane'::TEXT ,'doe'::TEXT , 1::INT
UNION
SELECT 'john'::TEXT ,'doe'::TEXT , 2::INT
UNION
SELECT 'jerry'::TEXT ,'roe'::TEXT , 3::INT
UNION
SELECT 'jodi'::TEXT ,'roe'::TEXT , 4::INT
) s ( fn, ln, id )
) /** end WITH **/
SELECT s.id
, ax.fanm -- field labels, now expanded to two rows
, ax.anm -- field data, now expanded to two rows
, ax.someval -- manually incl. data
, ax.rankednum -- manually assigned ranks
,ax.genser -- auto-generate ranks
FROM _students s
,UNNEST /** MULTI-UNNEST() BLOCK **/
(
( SELECT ARRAY[ fn, ln ]::text[] AS anm -- expanded into two rows by outer UNNEST()
/** CORRELATED SUBQUERY **/
FROM _students s2 WHERE s2.id = s.id -- outer relation
)
,( /** ordinal relationship preserved in variadic UNNEST() **/
SELECT ARRAY[ 'first name', 'last name' ]::text[] -- exp. into 2 rows
AS fanm
)
,( SELECT ARRAY[ 'z','x','y'] -- only 3 rows gen'd, but ordinal rela. kept
AS someval
)
,( SELECT ARRAY[ 1,2,3,4,5 ] -- 5 rows gen'd, ordinal rela. kept.
AS rankednum
)
,( SELECT ARRAY( /** you may go wild ... **/
SELECT generate_series(1, 15, 3 )
AS genser
)
)
) ax ( anm, fanm, someval, rankednum , genser )
;
RESULT SET:
+--------+----------------+-----------+----------+---------+-------
| id | fanm | anm | someval |rankednum| [ etc. ]
+--------+----------------+-----------+----------+---------+-------
| 2 | first name | john | z | 1 | .
| 2 | last name | doe | y | 2 | .
| 2 | [null] | [null] | x | 3 | .
| 2 | [null] | [null] | [null] | 4 | .
| 2 | [null] | [null] | [null] | 5 | .
| 1 | first name | jane | z | 1 | .
| 1 | last name | doe | y | 2 | .
| 1 | | | x | 3 | .
| 1 | | | | 4 | .
| 1 | | | | 5 | .
| 4 | first name | jodi | z | 1 | .
| 4 | last name | roe | y | 2 | .
| 4 | | | x | 3 | .
| 4 | | | | 4 | .
| 4 | | | | 5 | .
| 3 | first name | jerry | z | 1 | .
| 3 | last name | roe | y | 2 | .
| 3 | | | x | 3 | .
| 3 | | | | 4 | .
| 3 | | | | 5 | .
+--------+----------------+-----------+----------+---------+ ----
Here's a way that combines the hstore and CROSS JOIN approaches from other answers.
It's a modified version of my answer to a similar question, which is itself based on the method at https://blog.sql-workbench.eu/post/dynamic-unpivot/ and another answer to that question.
-- Example wide data with a column for each year...
WITH example_wide_data("id", "2001", "2002", "2003", "2004") AS (
VALUES
(1, 4, 5, 6, 7),
(2, 8, 9, 10, 11)
)
-- that is tidied to have "year" and "value" columns
SELECT
id,
r.key AS year,
r.value AS value
FROM
example_wide_data w
CROSS JOIN
each(hstore(w.*)) AS r(key, value)
WHERE
-- This chooses columns that look like years
-- In other cases you might need a different condition
r.key ~ '^[0-9]{4}$';
It has a few benefits over other solutions:
By using hstore and not jsonb, it hopefully minimises issues with type conversions (although hstore does convert everything to text)
The columns don't need to be hard coded or known in advance. Here, columns are chosen by a regex on the name, but you could use any SQL logic based on the name, or even the value.
It doesn't require PL/pgSQL - it's all SQL