Multiple Self-Joins to Find Transitive Subsets (cliques)

Multiple Self-Joins to Find Transitive Subsets (cliques) - tsql

In simplest terms, I have a table representing a relation. Rows in the table represent pairs in my relation. In other words, the first row indicates that the id of 1 is related to the id of 4 and that the id of 4 is related to the id of 1. Hopefully, it is not hard for you to see that my relation is symmetric, although the table shows this symmetry in a concise form.
+-----+-----+
| id1 | id2 |
+-----+-----+
| 1 | 4 |
| 3 | 1 |
| 2 | 1 |
| 2 | 3 |
| 2 | 4 |
| 5 | 1 |
+-----+-----+
EDIT
This table is meant to concisely show the following relation:
{(1,4), (4,1), (3,1), (1,3), (2,1), (1,2), (2,3), (3,2), (2,4), (4,2), (5,1), (1,5)}. This can be visualized by the following undirected graph.
CREATE TABLE Test (
id1 int not null,
id2 int not null);
INSERT INTO Test
VALUES
(1,4),
(3,1),
(2,1),
(2,3),
(2,4),
(5,1);
I would like to identify transitive subsets (cliques) in my table.
EDIT
For example, I would like to identify the transitive subset demonstrated by the fact that the id of 3 is related to the id of 1 and the id of 1 is related to the id of 2 implies that the id of 3 is related to the id of 2. (These can be seen as triangles in the undirected graph photo. Although, in the best case scenario, I would like to be able to list other complete subgraphs that are larger than triangles if they are present in the original table/graph.)
I've tried doing the following but the result set is larger than I want it to be. I hope that there is an easier way.
select t1.id1, t1.id2, t2.id1, t2.id2, t3.id1, t3.id2
from test as t1
join test as t2
on t1.id1 = t2.id1
or t1.id2 = t2.id2
or t1.id1 = t2.id2
or t1.id2 = t2.id1
join test as t3
on t2.id1 = t3.id1
or t2.id2 = t3.id2
or t2.id1 = t3.id2
or t2.id2 = t3.id1
where
not
(
t1.id1 = t2.id1
and
t1.id2 = t2.id2
)
and not
(
t2.id1 = t3.id1
and
t2.id2 = t3.id2
)
and not
(
t1.id1 = t3.id1
and
t1.id2 = t3.id2
)
and
(
(
t3.id1 = t1.id1
or
t3.id1 = t1.id2
or
t3.id1 = t2.id1
or
t3.id1 = t2.id2
)
and
(
t3.id2 = t1.id1
or
t3.id2 = t1.id2
or
t3.id2 = t2.id1
or
t3.id2 = t2.id2
)
);
Actual Output:
+-----+-----+-----+-----+-----+-----+
| id1 | id2 | id1 | id2 | id1 | id2 |
+-----+-----+-----+-----+-----+-----+
| 1 | 4 | 2 | 4 | 2 | 1 |
| 1 | 4 | 2 | 1 | 2 | 4 |
| 3 | 1 | 2 | 3 | 2 | 1 |
| 3 | 1 | 2 | 1 | 2 | 3 |
| 2 | 1 | 2 | 4 | 1 | 4 |
| 2 | 1 | 2 | 3 | 3 | 1 |
| 2 | 1 | 3 | 1 | 2 | 3 |
| 2 | 1 | 1 | 4 | 2 | 4 |
| 2 | 3 | 2 | 1 | 3 | 1 |
| 2 | 3 | 3 | 1 | 2 | 1 |
| 2 | 4 | 2 | 1 | 1 | 4 |
| 2 | 4 | 1 | 4 | 2 | 1 |
+-----+-----+-----+-----+-----+-----+
The expected result set would only have two rows. Each row would represent a transitive relation that is a subset of the original relation.
╔═════╦═════╦═════╦═════╦═════╦═════╗
║ id1 ║ id2 ║ id1 ║ id2 ║ id1 ║ id2 ║
╠═════╬═════╬═════╬═════╬═════╬═════╣
║ 1 ║ 4 ║ 2 ║ 4 ║ 2 ║ 1 ║
║ 3 ║ 1 ║ 2 ║ 1 ║ 2 ║ 3 ║
╚═════╩═════╩═════╩═════╩═════╩═════╝
EDIT
The expected output could also look like,
╔═════╦═════╦═════╗
║ id1 ║ id2 ║ id3 ║
╠═════╬═════╬═════╣
║ 1 ║ 4 ║ 2 ║
║ 3 ║ 1 ║ 2 ║
╚═════╩═════╩═════╝,
whatever is easier. I just need to display the fact that the sets
{(1,4), (4,1), (2,4), (4,2), (2,1), (1,2)}
and
{(3,1), (1,3), (2,1), (1,2), (2,3), (3,2)}
are proper subsets of the original relation and are themselves transitive relations. I am using the definition that a relation R is transitive if and only if
∀a∀b∀c((a,b)∈R ∧ (b,c)∈R → (a,c)∈R). In other words, I am trying to find all subgraphs that are also complete graphs.
I'm new to graphy theory, but it seems like my problem is similar to the clique problem where I am looking for cliques containing 3 or more vertices. I would accept as an answer solutions that return only cliques with 3 vertices. My question is similar to this one. However, the solutions presented there don't seem to use the definition of a clique that I want where every vertex is connected to every other vertex inside of the clique.
Here is an algorithm I found using Java. Hopefully, this will help with an implementation using SQL.

Times ago I needed to create clusters of data using transitive closure. The best way to do that was using SQLCLR. Here's the GitHub code (article to detailed link is there too)
https://github.com/yorek/non-scalar-uda-transitive-closure
That could be a good starting point. Can you also be more precise on what is the expected result with the input data you have in the sample?

Here's the solution. It is based on the idea that a complete graph contains all the possible combination of its subgraphs. Code is here, I'll comment it in detail over the weekend, but it case I couldn't at least you have the code right on monday. Please be aware that this is quite a brute force approach and if you need graph bigger than 30 nodes won't work. Still I think it's a nice sample of "lateral thinking". Enjoy:
/*
Create table holding graph data.
Id1 and Id2 represent the vertex of the graph.
(Id1, Id2) represent and edge.
https://stackoverflow.com/questions/56979737/multiple-self-joins-to-find-transitive-subsets-cliques/56979901#56979901
*/
DROP TABLE IF EXISTS #Graph;
CREATE TABLE #Graph (Id1 INT, Id2 INT);
INSERT INTO
#Graph
VALUES
(1,2)
,(1,3)
,(1,4)
,(2,3)
,(2,4)
,(5,1)
--,(4,3) -- Uncomment this to create a complete subgraph of 4 vertex
;
GO
/*
Create Numbers Table
*/
DROP TABLE IF EXISTS #Numbers;
SELECT TOP (100000)
ROW_NUMBER() OVER(ORDER BY A.[object_id]) AS Num
INTO
#Numbers
FROM
sys.[all_columns] a CROSS JOIN sys.[all_columns] b
ORDER BY
Num
GO
/*
Make sure Id1 is always lower then Id2.
This can be done as the graph is undirected
*/
DROP TABLE IF EXISTS #Graph2;
SELECT DISTINCT
CASE WHEN Id1<Id2 THEN Id1 ELSE Id2 END AS Id1,
CASE WHEN Id1<Id2 THEN Id2 ELSE Id1 END AS Id2
INTO
#Graph2
FROM
#Graph;
GO
/*
Turn edges into single columns
*/
DROP TABLE IF EXISTS #Graph3;
SELECT
CAST(Id1 AS VARCHAR(MAX)) + '>' + CAST(Id2 AS VARCHAR(MAX)) COLLATE Latin1_General_BIN2 AS [Edge]
INTO
#Graph3
FROM
#Graph2;
/*
Get the list of all the unique vertexes
*/
DROP TABLE IF EXISTS #Vertex;
WITH cte AS
(
SELECT Id1 AS Id FROM #Graph
UNION
SELECT Id2 AS Id FROM #Graph
)
SELECT * INTO #Vertex FROM cte;
/*
Given a complete graph with "n" vertexes,
calculate all the possibile complete cyclic subgraphs
*/
-- From https://stackoverflow.com/questions/3686062/generate-all-combinations-in-sql
-- And changed to return all combinations complete cyclic subgraphs
DROP TABLE IF EXISTS #AllCyclicVertex;
DECLARE #VertexCount INT = (SELECT COUNT(*) FROM [#Vertex]);
WITH Nums AS
(
SELECT
Num
FROM
#Numbers
WHERE
Num BETWEEN 0 AND POWER(2, #VertexCount) - 1
), BaseSet AS
(
SELECT
I = POWER(2, ROW_NUMBER() OVER (ORDER BY [Id]) - 1), *
FROM
[#Vertex]
), Combos AS
(
SELECT
CombId = N.Num,
S.Id,
K = COUNT(*) OVER (PARTITION BY N.Num)
FROM
Nums AS N
INNER JOIN
BaseSet AS S ON N.Num & S.I <> 0
)
SELECT
DENSE_RANK() OVER (ORDER BY K, [CombID]) AS CombNum,
K,
Id
INTO
#AllCyclicVertex
FROM
Combos
WHERE
K BETWEEN 3 AND #VertexCount
ORDER BY
CombNum, Id;
GO
--SELECT * FROM [#AllCyclicVertex]
/*
Calculate the edges for the calculated cyclic graphs
*/
DROP TABLE IF EXISTS #WellKnownPatterns;
CREATE TABLE #WellKnownPatterns ([Name] VARCHAR(100), [Id1] INT, [Id2] INT, [Edge] VARCHAR(100) COLLATE Latin1_General_BIN2);
INSERT INTO #WellKnownPatterns
([Name], [Id1], [Id2], [Edge])
SELECT
CAST(a.[CombNum] AS VARCHAR(100)) + '/' + CAST(a.[K] AS VARCHAR(100)),
a.Id AS Id1,
b.Id AS Id2,
CAST(a.[Id] AS VARCHAR(MAX)) + '>' + CAST(b.[Id] AS VARCHAR(MAX)) AS [Edge]
FROM
#AllCyclicVertex a
INNER JOIN
#AllCyclicVertex b ON b.id > a.id AND a.[CombNum] = b.[CombNum]
;
-- SELECT * FROM [#WellKnownPatterns]
/*
Now take from the original set only those
who are EXACT RELATIONAL DIVISION of a well-known cyclic graph
*/
WITH cte AS
(
SELECT * FROM #Graph3
),
cte2 AS
(
SELECT
COUNT(*) OVER (PARTITION BY [Name]) AS [EdgeCount],
*
FROM
#WellKnownPatterns
)
SELECT
T1.[Name]
FROM
cte2 AS T1
LEFT OUTER JOIN
cte AS S ON T1.[Edge] = S.[Edge]
GROUP BY
T1.[Name]
HAVING
COUNT(S.[Edge]) = MIN(T1.[EdgeCount])
GO
-- Test a solution
SELECT * FROM [#WellKnownPatterns] WHERE [Name] = '1/3'

Related

How to order rows with linked parts in PostgreSQL

I have a table A with columns: id, title, condition
And i have another table B with information about position for some rows from table A. Table B have columns id, next_id, prev_id
How to sort rows from A based on information from table B?
For example,
Table A
id| title
---+-----
1 | title1
2 | title2
3 | title3
4 | title4
5 | title5
Table B
id| next_id | prev_id
---+-----
2 | 1 | null
5 | 4 | 3
I want to get this result:
id| title
---+-----
2 | title2
1 | title1
3 | title3
5 | title5
4 | title4
And after apply this sort, i want to sort by condition column yet.
I've already spent a lot of time looking for a solution, and hope for your help.

You have to add weights to your data, so you can order accordingly. This example uses next_id, not sure if you need to use prev_id, you don't explain the use of it.
Anyway, here's a code example:
-- Temporal Data for the test:
CREATE TEMP TABLE table_a(id integer,tittle text);
CREATE TEMP TABLE table_b(id integer,next_id integer, prev_id integer);
INSERT INTO table_a VALUES
(1,'title1'),
(2,'title2'),
(3,'title3'),
(4,'title4'),
(5,'title5');
INSERT INTO table_b VALUES
(2,1,null),
(5,4,3);
-- QUERY:
SELECT
id,tittle,
CASE -- Adding weight
WHEN next_id IS NULL THEN (id + 0.1)
ELSE next_id
END AS orden
FROM -- Joining tables
(SELECT ta.*,tb.next_id
FROM table_a ta
LEFT JOIN table_b tb
ON ta.id=tb.id)join_a_b
ORDER BY orden
And here's the result:
id | tittle | orden
--------------------------
2 | title2 | 1
1 | title1 | 1.1
3 | title3 | 3.1
5 | title5 | 4
4 | title4 | 4.1

SQL Server recursive query·

I have a table in SQL Server 2008 R2 which contains product orders. For the most part, it is one entry per product
ID | Prod | Qty
------------
1 | A | 1
4 | B | 1
7 | A | 1
8 | A | 1
9 | A | 1
12 | C | 1
15 | A | 1
16 | A | 1
21 | B | 1
I want to create a view based on the table which looks like this
ID | Prod | Qty
------------------
1 | A | 1
4 | B | 1
9 | A | 3
12 | C | 1
16 | A | 2
21 | B | 1
I've written a query using a table expression, but I am stumped on how to make it work. The sql below does not actually work, but is a sample of what I am trying to do. I've written this query multiple different ways, but cannot figure out how to get the right results. I am using row_number to generate a sequential id. From that, I can order and compare consecutive rows to see if the next row has the same product as the previous row since ReleaseId is sequential, but not necessarily contiguous.
;with myData AS
(
SELECT
row_number() over (order by a.ReleaseId) as 'Item',
a.ReleaseId,
a.ProductId,
a.Qty
FROM OrdersReleased a
UNION ALL
SELECT
row_number() over (order by b.ReleaseId) as 'Item',
b.ReleaseId,
b.ProductId,
b.Qty
FROM OrdersReleased b
INNER JOIN myData c ON b.Item = c.Item + 1 and b.ProductId = c.ProductId
)
SELECT * from myData

Usually you drop the ID out of something like this, since it is a summary.
SELECT a.ProductId,
SUM(a.Qty) AS Qty
FROM OrdersReleased a
GROUP BY a.ProductId
ORDER BY a.ProductId
-- if you want to do sub query you can do it as a column (if you don't have a very large dataset).
SELECT a.ProductId,
SUM(a.Qty) AS Qty,
(SELECT COUNT(1)
FROM OrdersReleased b
WHERE b.ReleasedID - 1 = a.ReleasedID
AND b.ProductId = b.ProductId) as NumberBackToBack
FROM OrdersReleased a
GROUP BY a.ProductId
ORDER BY a.ProductId

Find Parent Recursively using Query

I am using postgresql. I have the table as like below
parent_id child_id
----------------------
101 102
103 104
104 105
105 106
I want to write a sql query which will give the final parent of input.
i.e suppose i pass 106 as input then , its output will be 103.
(106 --> 105 --> 104 --> 103)

Here's a complete example. First the DDL:
test=> CREATE TABLE node (
test(> id SERIAL,
test(> label TEXT NOT NULL, -- name of the node
test(> parent_id INT,
test(> PRIMARY KEY(id)
test(> );
NOTICE: CREATE TABLE will create implicit sequence "node_id_seq" for serial column "node.id"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "node_pkey" for table "node"
CREATE TABLE
...and some data...
test=> INSERT INTO node (label, parent_id) VALUES ('n1',NULL),('n2',1),('n3',2),('n4',3);
INSERT 0 4
test=> INSERT INTO node (label) VALUES ('garbage1'),('garbage2'), ('garbage3');
INSERT 0 3
test=> INSERT INTO node (label,parent_id) VALUES ('garbage4',6);
INSERT 0 1
test=> SELECT * FROM node;
id | label | parent_id
----+----------+-----------
1 | n1 |
2 | n2 | 1
3 | n3 | 2
4 | n4 | 3
5 | garbage1 |
6 | garbage2 |
7 | garbage3 |
8 | garbage4 | 6
(8 rows)
This performs a recursive query on every id in node:
test=> WITH RECURSIVE nodes_cte(id, label, parent_id, depth, path) AS (
SELECT tn.id, tn.label, tn.parent_id, 1::INT AS depth, tn.id::TEXT AS path
FROM node AS tn
WHERE tn.parent_id IS NULL
UNION ALL
SELECT c.id, c.label, c.parent_id, p.depth + 1 AS depth,
(p.path || '->' || c.id::TEXT)
FROM nodes_cte AS p, node AS c
WHERE c.parent_id = p.id
)
SELECT * FROM nodes_cte AS n ORDER BY n.id ASC;
id | label | parent_id | depth | path
----+----------+-----------+-------+------------
1 | n1 | | 1 | 1
2 | n2 | 1 | 2 | 1->2
3 | n3 | 2 | 3 | 1->2->3
4 | n4 | 3 | 4 | 1->2->3->4
5 | garbage1 | | 1 | 5
6 | garbage2 | | 1 | 6
7 | garbage3 | | 1 | 7
8 | garbage4 | 6 | 2 | 6->8
(8 rows)
This gets all of the descendents WHERE node.id = 1:
test=> WITH RECURSIVE nodes_cte(id, label, parent_id, depth, path) AS (
SELECT tn.id, tn.label, tn.parent_id, 1::INT AS depth, tn.id::TEXT AS path FROM node AS tn WHERE tn.id = 1
UNION ALL
SELECT c.id, c.label, c.parent_id, p.depth + 1 AS depth, (p.path || '->' || c.id::TEXT) FROM nodes_cte AS p, node AS c WHERE c.parent_id = p.id
)
SELECT * FROM nodes_cte AS n;
id | label | parent_id | depth | path
----+-------+-----------+-------+------------
1 | n1 | | 1 | 1
2 | n2 | 1 | 2 | 1->2
3 | n3 | 2 | 3 | 1->2->3
4 | n4 | 3 | 4 | 1->2->3->4
(4 rows)
The following will get the path of the node with id 4:
test=> WITH RECURSIVE nodes_cte(id, label, parent_id, depth, path) AS (
SELECT tn.id, tn.label, tn.parent_id, 1::INT AS depth, tn.id::TEXT AS path
FROM node AS tn
WHERE tn.parent_id IS NULL
UNION ALL
SELECT c.id, c.label, c.parent_id, p.depth + 1 AS depth,
(p.path || '->' || c.id::TEXT)
FROM nodes_cte AS p, node AS c
WHERE c.parent_id = p.id
)
SELECT * FROM nodes_cte AS n WHERE n.id = 4;
id | label | parent_id | depth | path
----+-------+-----------+-------+------------
4 | n4 | 3 | 4 | 1->2->3->4
(1 row)
And let's assume you want to limit your search to descendants with a depth less than three (note that depth hasn't been incremented yet):
test=> WITH RECURSIVE nodes_cte(id, label, parent_id, depth, path) AS (
SELECT tn.id, tn.label, tn.parent_id, 1::INT AS depth, tn.id::TEXT AS path
FROM node AS tn WHERE tn.id = 1
UNION ALL
SELECT c.id, c.label, c.parent_id, p.depth + 1 AS depth,
(p.path || '->' || c.id::TEXT)
FROM nodes_cte AS p, node AS c
WHERE c.parent_id = p.id AND p.depth < 2
)
SELECT * FROM nodes_cte AS n;
id | label | parent_id | depth | path
----+-------+-----------+-------+------
1 | n1 | | 1 | 1
2 | n2 | 1 | 2 | 1->2
(2 rows)
I'd recommend using an ARRAY data type instead of a string for demonstrating the "path", but the arrow is more illustrative of the parent<=>child relationship.

Use WITH RECURSIVE to create a Common Table Expression (CTE). For the non-recursive term, get the rows in which the child is immediately below the parent:
SELECT
c.child_id,
c.parent_id
FROM
mytable c
LEFT JOIN
mytable p ON c.parent_id = p.child_id
WHERE
p.child_id IS NULL
child_id | parent_id
----------+-----------
102 | 101
104 | 103
For the recursive term, you want the children of these children.
WITH RECURSIVE tree(child, root) AS (
SELECT
c.child_id,
c.parent_id
FROM
mytable c
LEFT JOIN
mytable p ON c.parent_id = p.child_id
WHERE
p.child_id IS NULL
UNION
SELECT
child_id,
root
FROM
tree
INNER JOIN
mytable on tree.child = mytable.parent_id
)
SELECT * FROM tree;
child | root
-------+------
102 | 101
104 | 103
105 | 103
106 | 103
You can filter the children when querying the CTE:
WITH RECURSIVE tree(child, root) AS (...) SELECT root FROM tree WHERE child = 106;
root
------
103

postgres counting one record twice if it meets certain criteria

I thought that the query below would naturally do what I explain, but apparently not...
My table looks like this:
id | name | g | partner | g2
1 | John | M | Sam | M
2 | Devon | M | Mike | M
3 | Kurt | M | Susan | F
4 | Stacy | F | Bob | M
5 | Rosa | F | Rita | F
I'm trying to get the id where either the g or g2 value equals 'M'... But, a record where both the g and g2 values are 'M' should return two lines, not 1.
So, in the above sample data, I'm trying to return:
$q = pg_query("SELECT id FROM mytable WHERE ( g = 'M' OR g2 = 'M' )");
1
1
2
2
3
4
But, it always returns:
1
2
3
4

Your query doesn't work because each row is returned only once whether it matches one or both of the conditions. To get what you want use two queries and use UNION ALL to combine the results:
SELECT id FROM mytable WHERE g = 'M'
UNION ALL
SELECT id FROM mytable WHERE g2 = 'M'
ORDER BY id
Result:
1
1
2
2
3
4

you might try a UNION along these lines:
"SELECT id FROM mytable WHERE ( g = 'M') UNION SELECT id FROM mytable WHERE ( g2 = 'M')"
Hope this helps, Martin

SELECT id FROM mytable WHERE g = 'M'
UNION
SELECT id FROM mytable WHERE g2 = 'M'

Equivalent to unpivot() in PostgreSQL

Is there a unpivot equivalent function in PostgreSQL?

Create an example table:
CREATE TEMP TABLE foo (id int, a text, b text, c text);
INSERT INTO foo VALUES (1, 'ant', 'cat', 'chimp'), (2, 'grape', 'mint', 'basil');
You can 'unpivot' or 'uncrosstab' using UNION ALL:
SELECT id,
'a' AS colname,
a AS thing
FROM foo
UNION ALL
SELECT id,
'b' AS colname,
b AS thing
FROM foo
UNION ALL
SELECT id,
'c' AS colname,
c AS thing
FROM foo
ORDER BY id;
This runs 3 different subqueries on foo, one for each column we want to unpivot, and returns, in one table, every record from each of the subqueries.
But that will scan the table N times, where N is the number of columns you want to unpivot. This is inefficient, and a big problem when, for example, you're working with a very large table that takes a long time to scan.
Instead, use:
SELECT id,
unnest(array['a', 'b', 'c']) AS colname,
unnest(array[a, b, c]) AS thing
FROM foo
ORDER BY id;
This is easier to write, and it will only scan the table once.
array[a, b, c] returns an array object, with the values of a, b, and c as it's elements.
unnest(array[a, b, c]) breaks the results into one row for each of the array's elements.

You could use VALUES() and JOIN LATERAL to unpivot the columns.
Sample data:
CREATE TABLE test(id int, a INT, b INT, c INT);
INSERT INTO test(id,a,b,c) VALUES (1,11,12,13),(2,21,22,23),(3,31,32,33);
Query:
SELECT t.id, s.col_name, s.col_value
FROM test t
JOIN LATERAL(VALUES('a',t.a),('b',t.b),('c',t.c)) s(col_name, col_value) ON TRUE;
DBFiddle Demo
Using this approach it is possible to unpivot multiple groups of columns at once.
EDIT
Using Zack's suggestion:
SELECT t.id, col_name, col_value
FROM test t
CROSS JOIN LATERAL (VALUES('a', t.a),('b', t.b),('c',t.c)) s(col_name, col_value);
<=>
SELECT t.id, col_name, col_value
FROM test t
,LATERAL (VALUES('a', t.a),('b', t.b),('c',t.c)) s(col_name, col_value);
db<>fiddle demo

Great article by Thomas Kellerer found here
Unpivot with Postgres
Sometimes it’s necessary to normalize de-normalized tables - the opposite of a “crosstab” or “pivot” operation. Postgres does not support an UNPIVOT operator like Oracle or SQL Server, but simulating it, is very simple.
Take the following table that stores aggregated values per quarter:
create table customer_turnover
(
customer_id integer,
q1 integer,
q2 integer,
q3 integer,
q4 integer
);
And the following sample data:
customer_id | q1 | q2 | q3 | q4
------------+-----+-----+-----+----
1 | 100 | 210 | 203 | 304
2 | 150 | 118 | 422 | 257
3 | 220 | 311 | 271 | 269
But we want the quarters to be rows (as they should be in a normalized data model).
In Oracle or SQL Server this could be achieved with the UNPIVOT operator, but that is not available in Postgres. However Postgres’ ability to use the VALUES clause like a table makes this actually quite easy:
select c.customer_id, t.*
from customer_turnover c
cross join lateral (
values
(c.q1, 'Q1'),
(c.q2, 'Q2'),
(c.q3, 'Q3'),
(c.q4, 'Q4')
) as t(turnover, quarter)
order by customer_id, quarter;
will return the following result:
customer_id | turnover | quarter
------------+----------+--------
1 | 100 | Q1
1 | 210 | Q2
1 | 203 | Q3
1 | 304 | Q4
2 | 150 | Q1
2 | 118 | Q2
2 | 422 | Q3
2 | 257 | Q4
3 | 220 | Q1
3 | 311 | Q2
3 | 271 | Q3
3 | 269 | Q4
The equivalent query with the standard UNPIVOT operator would be:
select customer_id, turnover, quarter
from customer_turnover c
UNPIVOT (turnover for quarter in (q1 as 'Q1',
q2 as 'Q2',
q3 as 'Q3',
q4 as 'Q4'))
order by customer_id, quarter;

FYI for those of us looking for how to unpivot in RedShift.
The long form solution given by Stew appears to be the only way to accomplish this.
For those who cannot see it there, here is the text pasted below:
We do not have built-in functions that will do pivot or unpivot. However,
you can always write SQL to do that.
create table sales (regionid integer, q1 integer, q2 integer, q3 integer, q4 integer);
insert into sales values (1,10,12,14,16), (2,20,22,24,26);
select * from sales order by regionid;
regionid | q1 | q2 | q3 | q4
----------+----+----+----+----
1 | 10 | 12 | 14 | 16
2 | 20 | 22 | 24 | 26
(2 rows)
pivot query
create table sales_pivoted (regionid, quarter, sales)
as
select regionid, 'Q1', q1 from sales
UNION ALL
select regionid, 'Q2', q2 from sales
UNION ALL
select regionid, 'Q3', q3 from sales
UNION ALL
select regionid, 'Q4', q4 from sales
;
select * from sales_pivoted order by regionid, quarter;
regionid | quarter | sales
----------+---------+-------
1 | Q1 | 10
1 | Q2 | 12
1 | Q3 | 14
1 | Q4 | 16
2 | Q1 | 20
2 | Q2 | 22
2 | Q3 | 24
2 | Q4 | 26
(8 rows)
unpivot query
select regionid, sum(Q1) as Q1, sum(Q2) as Q2, sum(Q3) as Q3, sum(Q4) as Q4
from
(select regionid,
case quarter when 'Q1' then sales else 0 end as Q1,
case quarter when 'Q2' then sales else 0 end as Q2,
case quarter when 'Q3' then sales else 0 end as Q3,
case quarter when 'Q4' then sales else 0 end as Q4
from sales_pivoted)
group by regionid
order by regionid;
regionid | q1 | q2 | q3 | q4
----------+----+----+----+----
1 | 10 | 12 | 14 | 16
2 | 20 | 22 | 24 | 26
(2 rows)
Hope this helps, Neil

Pulling slightly modified content from the link in the comment from #a_horse_with_no_name into an answer because it works:
Installing Hstore
If you don't have hstore installed and are running PostgreSQL 9.1+, you can use the handy
CREATE EXTENSION hstore;
For lower versions, look for the hstore.sql file in share/contrib and run in your database.
Assuming that your source (e.g., wide data) table has one 'id' column, named id_field, and any number of 'value' columns, all of the same type, the following will create an unpivoted view of that table.
CREATE VIEW vw_unpivot AS
SELECT id_field, (h).key AS column_name, (h).value AS column_value
FROM (
SELECT id_field, each(hstore(foo) - 'id_field'::text) AS h
FROM zcta5 as foo
) AS unpiv ;
This works with any number of 'value' columns. All of the resulting values will be text, unless you cast, e.g., (h).value::numeric.

Just use JSON:
with data (id, name) as (
values (1, 'a'), (2, 'b')
)
select t.*
from data, lateral jsonb_each_text(to_jsonb(data)) with ordinality as t
order by data.id, t.ordinality;
This yields
|key |value|ordinality|
|----|-----|----------|
|id |1 |1 |
|name|a |2 |
|id |2 |1 |
|name|b |2 |
dbfiddle

I wrote a horrible unpivot function for PostgreSQL. It's rather slow but it at least returns results like you'd expect an unpivot operation to.
https://cgsrv1.arrc.csiro.au/blog/2010/05/14/unpivotuncrosstab-in-postgresql/
Hopefully you can find it useful..

Depending on what you want to do... something like this can be helpful.
with wide_table as (
select 1 a, 2 b, 3 c
union all
select 4 a, 5 b, 6 c
)
select unnest(array[a,b,c]) from wide_table

You can use FROM UNNEST() array handling to UnPivot a dataset, tandem with a correlated subquery (works w/ PG 9.4).
FROM UNNEST() is more powerful & flexible than the typical method of using FROM (VALUES .... ) to unpivot datasets. This is b/c FROM UNNEST() is variadic (with n-ary arity). By using a correlated subquery the need for the lateral ORDINAL clause is eliminated, & Postgres keeps the resulting parallel columnar sets in the proper ordinal sequence.
This is, BTW, FAST -- in practical use spawning 8 million rows in < 15 seconds on a 24-core system.
WITH _students AS ( /** CTE **/
SELECT * FROM
( SELECT 'jane'::TEXT ,'doe'::TEXT , 1::INT
UNION
SELECT 'john'::TEXT ,'doe'::TEXT , 2::INT
UNION
SELECT 'jerry'::TEXT ,'roe'::TEXT , 3::INT
UNION
SELECT 'jodi'::TEXT ,'roe'::TEXT , 4::INT
) s ( fn, ln, id )
) /** end WITH **/
SELECT s.id
, ax.fanm -- field labels, now expanded to two rows
, ax.anm -- field data, now expanded to two rows
, ax.someval -- manually incl. data
, ax.rankednum -- manually assigned ranks
,ax.genser -- auto-generate ranks
FROM _students s
,UNNEST /** MULTI-UNNEST() BLOCK **/
(
( SELECT ARRAY[ fn, ln ]::text[] AS anm -- expanded into two rows by outer UNNEST()
/** CORRELATED SUBQUERY **/
FROM _students s2 WHERE s2.id = s.id -- outer relation
)
,( /** ordinal relationship preserved in variadic UNNEST() **/
SELECT ARRAY[ 'first name', 'last name' ]::text[] -- exp. into 2 rows
AS fanm
)
,( SELECT ARRAY[ 'z','x','y'] -- only 3 rows gen'd, but ordinal rela. kept
AS someval
)
,( SELECT ARRAY[ 1,2,3,4,5 ] -- 5 rows gen'd, ordinal rela. kept.
AS rankednum
)
,( SELECT ARRAY( /** you may go wild ... **/
SELECT generate_series(1, 15, 3 )
AS genser
)
)
) ax ( anm, fanm, someval, rankednum , genser )
;
RESULT SET:
+--------+----------------+-----------+----------+---------+-------
| id | fanm | anm | someval |rankednum| [ etc. ]
+--------+----------------+-----------+----------+---------+-------
| 2 | first name | john | z | 1 | .
| 2 | last name | doe | y | 2 | .
| 2 | [null] | [null] | x | 3 | .
| 2 | [null] | [null] | [null] | 4 | .
| 2 | [null] | [null] | [null] | 5 | .
| 1 | first name | jane | z | 1 | .
| 1 | last name | doe | y | 2 | .
| 1 | | | x | 3 | .
| 1 | | | | 4 | .
| 1 | | | | 5 | .
| 4 | first name | jodi | z | 1 | .
| 4 | last name | roe | y | 2 | .
| 4 | | | x | 3 | .
| 4 | | | | 4 | .
| 4 | | | | 5 | .
| 3 | first name | jerry | z | 1 | .
| 3 | last name | roe | y | 2 | .
| 3 | | | x | 3 | .
| 3 | | | | 4 | .
| 3 | | | | 5 | .
+--------+----------------+-----------+----------+---------+ ----

Here's a way that combines the hstore and CROSS JOIN approaches from other answers.
It's a modified version of my answer to a similar question, which is itself based on the method at https://blog.sql-workbench.eu/post/dynamic-unpivot/ and another answer to that question.
-- Example wide data with a column for each year...
WITH example_wide_data("id", "2001", "2002", "2003", "2004") AS (
VALUES
(1, 4, 5, 6, 7),
(2, 8, 9, 10, 11)
)
-- that is tided to have "year" and "value" columns
SELECT
id,
r.key AS year,
r.value AS value
FROM
example_wide_data w
CROSS JOIN
each(hstore(w.*)) AS r(key, value)
WHERE
-- This chooses columns that look like years
-- In other cases you might need a different condition
r.key ~ '^[0-9]{4}$';
It has a few benefits over other solutions:
By using hstore and not jsonb, it hopefully minimises issues with type conversions (although hstore does convert everything to text)
The columns don't need to be hard coded or known in advance. Here, columns are chosen by a regex on the name, but you could use any SQL logic based on the name, or even the value.
It doesn't require PL/pgSQL - it's all SQL

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Multiple Self-Joins to Find Transitive Subsets (cliques) - tsql

Related

How to order rows with linked parts in PostgreSQL

SQL Server recursive query·

Find Parent Recursively using Query

postgres counting one record twice if it meets certain criteria

Equivalent to unpivot() in PostgreSQL

Categories

Resources