Give "ID" to multiple columns - postgresql

Is there a native PostgreSQL function that assigns "IDs" based on the values in a column?
 column 1 | column 2 | id1 | id2
----------+----------+-----+-----
 aa       | AA       |   1 |   1
 aa       | BB       |   1 |   2
 bb       | BB       |   2 |   2
 cc       | BB       |   3 |   2
 cc       | CC       |   3 |   3
 dd       | DD       |   4 |   4
I only want the "ID" to increment when the value in the column changes. Otherwise, the "ID" should stay the same.

SELECT o.column1, o.column2
, dense_rank() OVER (ORDER BY column1) AS id1
, dense_rank() OVER (ORDER BY column2) AS id2
FROM ordi o
;
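To see that dense_rank() produces the ids from the question, here is a minimal sketch that runs the same query from Python. It assumes an in-memory SQLite database as a stand-in for PostgreSQL (window functions need SQLite 3.25 or later); the table name ordi and the column names come from the answer above, and the final ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ordi (column1 TEXT, column2 TEXT)")
conn.executemany(
    "INSERT INTO ordi VALUES (?, ?)",
    [("aa", "AA"), ("aa", "BB"), ("bb", "BB"),
     ("cc", "BB"), ("cc", "CC"), ("dd", "DD")],
)

# dense_rank() gives equal values the same number and leaves no gaps,
# so the number only increments when the column value changes.
rows = conn.execute(
    """
    SELECT column1, column2,
           dense_rank() OVER (ORDER BY column1) AS id1,
           dense_rank() OVER (ORDER BY column2) AS id2
    FROM ordi
    ORDER BY column1, column2
    """
).fetchall()

for row in rows:
    print(row)
```

Each window has its own ORDER BY, so id1 and id2 are numbered independently of each other.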


Query to assign max date among child items to parent item

I have data in two Postgres tables, as shown below.
table1
wid w.name owner
1 abc own1
2 def own2
3 ghi own3
table2
vid wid vname date
9 1 vnam1 10-7-2020
10 1 vnam1 10-8-2018
11 1 vnam2 10-9-2019
12 1 vnam2 10-8-2020
13 2 vnam3 10-10-2017
14 2 vnam3 10-08-2020
15 2 vnam4 10-10-2018
16 2 vnam4 10-10-2019
17 3 vnam5 10-06-2016
18 3 vnam5 10-07-2020
19 3 vnam6 10-08-2020
I was able to get the max date for each vname in table2 related to a w.name in table1, but I'm looking for a result like the one below, so that I can determine the max date for each w.name.
wid w.name owner vname maxdate
1 abc own1 vnam2 10-08-2020 (Max date out of 4 values of vnames)
2 def own2 vnam3 10-08-2020
3 ghi own3 vnam6 10-08-2020
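Use DISTINCT ON to achieve this. It keeps only the first row for each wid in ORDER BY order, so sorting by date descending returns the row with the latest date: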
select distinct on (t1.wid)
t1.wid, t1."w.name", t1.owner, t2.vname, t2.date
from table1 t1
join table2 t2 on t2.wid = t1.wid
order by t1.wid, t2.date desc;
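DISTINCT ON is specific to PostgreSQL. As a rough illustration of what it does here, the sketch below gets the same "latest row per wid" result with row_number(), a different technique that also works in other databases. It assumes an in-memory SQLite database (3.25 or later), reads the question's dates as day-month-year and stores them as ISO strings so that they sort correctly, and uses a hypothetical alias wname for the "w.name" column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (wid INTEGER, "w.name" TEXT, owner TEXT);
    CREATE TABLE table2 (vid INTEGER, wid INTEGER, vname TEXT, date TEXT);
    INSERT INTO table1 VALUES (1,'abc','own1'), (2,'def','own2'), (3,'ghi','own3');
    -- Dates rewritten as ISO yyyy-mm-dd (assumes the question uses dd-mm-yyyy).
    INSERT INTO table2 VALUES
        (9,  1, 'vnam1', '2020-07-10'), (10, 1, 'vnam1', '2018-08-10'),
        (11, 1, 'vnam2', '2019-09-10'), (12, 1, 'vnam2', '2020-08-10'),
        (13, 2, 'vnam3', '2017-10-10'), (14, 2, 'vnam3', '2020-08-10'),
        (15, 2, 'vnam4', '2018-10-10'), (16, 2, 'vnam4', '2019-10-10'),
        (17, 3, 'vnam5', '2016-06-10'), (18, 3, 'vnam5', '2020-07-10'),
        (19, 3, 'vnam6', '2020-08-10');
    """
)

# Number the rows of each wid from newest to oldest, then keep row 1.
rows = conn.execute(
    """
    SELECT wid, wname, owner, vname, date
    FROM (
        SELECT t1.wid, t1."w.name" AS wname, t1.owner, t2.vname, t2.date,
               row_number() OVER (PARTITION BY t1.wid
                                  ORDER BY t2.date DESC) AS rn
        FROM table1 t1
        JOIN table2 t2 ON t2.wid = t1.wid
    ) AS ranked
    WHERE rn = 1
    ORDER BY wid
    """
).fetchall()

for row in rows:
    print(row)
```

If two rows of the same wid share the latest date, both approaches pick one of them arbitrarily unless a tie-breaker is added to the ORDER BY.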

Pivot Table in SQL (using Groupby)

I have a table structured as below
Customer_ID Sequence Comment_Code Comment
1 10 0 a
1 11 1 b
1 12 1 c
1 13 1 d
2 20 0 x
2 21 1 y
3 100 0 m
3 101 1 n
3 102 1 o
1 52 0 t
1 53 1 y
1 54 1 u
The Sequence number is unique within the table.
I want the output in SQL as below
Customer_ID Sequence
1 abcd
2 xy
3 mno
1 tyu
Can someone please help me with this. I can provide more details if required.
This looks like a simple gaps/islands problem.
-- Sample Data (T-SQL table variable, so the name starts with @)
DECLARE @table TABLE
(
    Customer_ID  INT,
    [Sequence]   INT,
    Comment_Code INT,
    Comment      CHAR(1)
);
INSERT @table
(
    Customer_ID,
    [Sequence],
    Comment_Code,
    Comment
)
VALUES (1,10 ,0,'a'),(1,11 ,1,'b'),(1,12 ,1,'c'),(1,13 ,1,'d'),(2,20 ,0,'x'),(2,21 ,1,'y'),
       (3,100,0,'m'),(3,101,1,'n'),(3,102,1,'o'),(1,52 ,0,'t'),(1,53 ,1,'y'),(1,54 ,1,'u');
-- Solution
-- Within a run of consecutive Sequence values, Sequence minus its rank is constant,
-- so that difference (Grouper) identifies each island.
WITH groups AS
(
    SELECT
        t.Customer_ID,
        t.[Sequence],
        Grouper = t.[Sequence] - DENSE_RANK() OVER (ORDER BY t.[Sequence]),
        t.Comment
    FROM @table AS t
)
SELECT
    g.Customer_ID,
    [Sequence] =
    (
        SELECT g2.Comment+''
        FROM groups AS g2
        WHERE g.Customer_ID = g2.Customer_ID AND g.Grouper = g2.Grouper
        ORDER BY g2.[Sequence] -- concatenate the comments in Sequence order
        FOR XML PATH('')
    )
FROM groups AS g
GROUP BY g.Customer_ID, g.Grouper;
Returns:
Customer_ID Sequence
----------- ----------
1 abcd
1 tyu
2 xy
3 mno
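The grouping trick in this answer (Sequence minus its rank stays constant inside a run of consecutive numbers) can be checked outside the database. The following is only a sketch of the same idea in plain Python, using the sample rows from the question; the variable names are illustrative and not part of the original answer.

```python
from itertools import groupby

# (Customer_ID, Sequence, Comment_Code, Comment) rows from the question.
rows = [
    (1, 10, 0, "a"), (1, 11, 1, "b"), (1, 12, 1, "c"), (1, 13, 1, "d"),
    (2, 20, 0, "x"), (2, 21, 1, "y"),
    (3, 100, 0, "m"), (3, 101, 1, "n"), (3, 102, 1, "o"),
    (1, 52, 0, "t"), (1, 53, 1, "y"), (1, 54, 1, "u"),
]

# Sort by Sequence and subtract the 1-based position (the rank).
# Consecutive Sequence values then share the same difference.
ordered = sorted(rows, key=lambda r: r[1])
keyed = [
    (cust, seq - pos, comment)
    for pos, (cust, seq, _code, comment) in enumerate(ordered, start=1)
]

# Each (customer, difference) pair is one island; join its comments.
result = [
    (cust, "".join(comment for _, _, comment in island))
    for (cust, _grouper), island in groupby(keyed, key=lambda k: (k[0], k[1]))
]

for row in result:
    print(row)
```

Because the rows are processed in Sequence order, the islands come out in the order the question asked for, with the second island of customer 1 after customer 2.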

Refer to current row in window function

Is it possible to refer to the current row in a window partition? I want to do something like the following:
SELECT min(ABS(variable - CURRENT.variable)) over (order by criterion RANGE UNBOUNDED PRECEDING)
That is, I want to find, within the given partition, the value of variable that is closest to the current value. Is it possible to do something like that?
As an example, from:
criterion | variable
1 2
2 4
3 2
4 7
5 6
We would obtain:
null
2
0
3
1
Thanks
As far as I know, this cannot be done with window functions.
But it can be done with a self join:
SELECT a.criterion,
       a.variable,
       min(abs(a.variable - b.variable))
FROM mydata a
   LEFT JOIN mydata b
      ON (b.criterion < a.criterion)
GROUP BY a.criterion, a.variable
ORDER BY a.criterion;
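As a quick check of the self join against the example in the question, here is a minimal sketch that runs it on the five sample rows. It assumes an in-memory SQLite database in place of PostgreSQL and groups by criterion, since the example table has no id column; the alias closest is added for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mydata (criterion INTEGER, variable INTEGER)")
conn.executemany(
    "INSERT INTO mydata VALUES (?, ?)",
    [(1, 2), (2, 4), (3, 2), (4, 7), (5, 6)],
)

# Pair every row with all rows that have a smaller criterion and keep the
# smallest absolute difference; the LEFT JOIN keeps the first row with NULL.
rows = conn.execute(
    """
    SELECT a.criterion,
           a.variable,
           min(abs(a.variable - b.variable)) AS closest
    FROM mydata a
    LEFT JOIN mydata b ON b.criterion < a.criterion
    GROUP BY a.criterion, a.variable
    ORDER BY a.criterion
    """
).fetchall()

for row in rows:
    print(row)
```

The join compares each row with every earlier row, so the work grows quadratically with the table size.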
If I understand correctly:
with t (v) as (values (-5),(-2),(0),(1),(3),(10))
select v,
least(
v - lag(v) over (order by v),
lead(v) over (order by v) - v
) as closest
from t
;
v | closest
----+---------
-5 | 3
-2 | 2
0 | 1
1 | 1
3 | 2
10 | 7
Hope this helps (watch out for performance problems).
I tried this in MSSQL (the PostgreSQL version is at the bottom):
CREATE TABLE TX (CRITERION INT, VARIABILE INT);
INSERT INTO TX VALUES (1,2), (2,4),(3,2),(4,7), (5,6);
SELECT CRITERION, MIN_DELTA FROM
(
SELECT TX.CRITERION
, MIN(ABS(B.TX2_VAR - TX.VARIABILE)) OVER (PARTITION BY TX.CRITERION) AS MIN_DELTA
, RANK() OVER (PARTITION BY TX.CRITERION ORDER BY ABS(B.TX2_VAR - TX.VARIABILE) ) AS MIN_RANK
FROM TX
CROSS APPLY (SELECT TX2.CRITERION AS TX2_CRIT, TX2.VARIABILE AS TX2_VAR FROM TX TX2 WHERE TX2.CRITERION < TX.CRITERION) B
) C
WHERE MIN_RANK=1
ORDER BY CRITERION
;
Output:
CRITERION MIN_DELTA
----------- -----------
2 2
3 0
4 3
5 1
POSTGRESQL Version (tested on Rextester http://rextester.com/VMGJ87600):
CREATE TABLE TX (CRITERION INT, VARIABILE INT);
INSERT INTO TX VALUES (1,2), (2,4),(3,2),(4,7), (5,6);
SELECT * FROM TX;
SELECT CRITERION, MIN_DELTA FROM
(
SELECT TX.CRITERION
, MIN(ABS(B.TX2_VAR - TX.VARIABILE)) OVER (PARTITION BY TX.CRITERION) AS MIN_DELTA
, RANK() OVER (PARTITION BY TX.CRITERION ORDER BY ABS(B.TX2_VAR - TX.VARIABILE) ) AS MIN_RANK
FROM TX
LEFT JOIN LATERAL (SELECT TX2.CRITERION AS TX2_CRIT, TX2.VARIABILE AS TX2_VAR FROM TX TX2 WHERE TX2.CRITERION < TX.CRITERION) B ON TRUE
) C
WHERE MIN_RANK=1
ORDER BY CRITERION
;
DROP TABLE TX;
Output:
criterion variabile
1 1 2
2 2 4
3 3 2
4 4 7
5 5 6
criterion min_delta
1 1 NULL
2 2 2
3 3 0
4 4 3
5 5 1

Postgres - bind results of equal type by year - long to wide data

Please excuse my not very proper way of asking this, as I am new to Postgres.
Having the following two tables:
CREATE TABLE pub (
id int
, time timestamp
);
id time
1 1 2010-02-10 01:00:00
2 2 2011-02-10 01:00:00
3 3 2012-02-10 01:00:00
And
CREATE TABLE val (
id int
, type text
, val int
);
id type val
1 1 A 1
2 1 B 2
3 1 C 3
4 2 A 4
5 2 B 5
6 3 D 6
I would like to get the following output (for id <= 2 )
type 2010 2011
1 A 1 4
2 B 2 5
3 C 3 NULL
So type is the superset of all types present in table val.
NULL means that there is no value for type C.
Ideally the column headings are the years of the time column. Alternatively, the id itself would do.
There are at least two ways to do this.
If your table does not have many categories, you can use conditional aggregation in a CTE:
WITH x AS (
SELECT type,
sum(val) FILTER (WHERE date_part('year', time) = 2010) AS "2010",
sum(val) FILTER (WHERE date_part('year', time) = 2011) AS "2011"
FROM pub AS p JOIN val AS v ON (v.id = p.id)
GROUP BY type
)
SELECT * FROM x
WHERE "2010" is NOT NULL OR "2011" IS NOT NULL
ORDER BY type
;
If you have many categories, crosstab from the tablefunc extension is more convenient (the output columns still have to be listed explicitly):
CREATE EXTENSION IF NOT EXISTS tablefunc;
SELECT * FROM crosstab(
    $$
    SELECT type,
           date_part('year', time)::text AS time,
           sum(val) AS val
    FROM pub AS p JOIN val AS v ON (v.id = p.id)
    GROUP BY type, 2
    ORDER BY 1, 2
    $$,
    $$VALUES ('2010'::text), ('2011'), ('2012') $$
) AS ct (type text, "2010" int, "2011" int, "2012" int);
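The conditional-aggregation variant can be tried on the sample data with the sketch below. It assumes an in-memory SQLite database instead of PostgreSQL, so FILTER and date_part are replaced by the portable CASE and strftime equivalents; the column aliases y2010 and y2011 are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE pub (id INTEGER, time TEXT);
    CREATE TABLE val (id INTEGER, type TEXT, val INTEGER);
    INSERT INTO pub VALUES (1, '2010-02-10 01:00:00'),
                           (2, '2011-02-10 01:00:00'),
                           (3, '2012-02-10 01:00:00');
    INSERT INTO val VALUES (1, 'A', 1), (1, 'B', 2), (1, 'C', 3),
                           (2, 'A', 4), (2, 'B', 5), (3, 'D', 6);
    """
)

# One sum per year: the CASE yields NULL for other years, and sum() ignores
# NULLs. Types with no value in either year are dropped by the outer WHERE.
rows = conn.execute(
    """
    SELECT type, y2010, y2011
    FROM (
        SELECT v.type AS type,
               sum(CASE WHEN strftime('%Y', p.time) = '2010' THEN v.val END) AS y2010,
               sum(CASE WHEN strftime('%Y', p.time) = '2011' THEN v.val END) AS y2011
        FROM pub AS p
        JOIN val AS v ON v.id = p.id
        GROUP BY v.type
    ) AS x
    WHERE y2010 IS NOT NULL OR y2011 IS NOT NULL
    ORDER BY type
    """
).fetchall()

for row in rows:
    print(row)
```

Type D only has a value in 2012, so it does not appear in the result, which matches the output requested for id <= 2.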

Renumbering a column in postgresql based on sorted values in that column

Edit: I am using postgresql v8.3
I have a table that contains a column we can call column A.
Column A is populated, for our purposes, with arbitrary positive integers.
I want to renumber column A from 1 to N based on ordering the records of the table by column A ascending. (SELECT * FROM table ORDER BY A ASC;)
Is there a simple way to accomplish this without the need of building a postgresql function?
Example:
(Before:
A: 3,10,20,100,487,1,6)
(After:
A: 2,4,5,6,7,1,3)
Use the rank() (or dense_rank()) window functions (available since PostgreSQL 8.4):
create table aaa
( id serial not null primary key
, num integer not null
, rnk integer not null default 0
);
insert into aaa(num) values( 3) , (10) , (20) , (100) , (487) , (1) , (6)
;
UPDATE aaa
SET rnk = w.rnk
FROM (
SELECT id
, rank() OVER (order by num ASC) AS rnk
FROM aaa
) w
WHERE w.id = aaa.id;
SELECT * FROM aaa
ORDER BY id
;
Results:
CREATE TABLE
INSERT 0 7
UPDATE 7
id | num | rnk
----+-----+-----
1 | 3 | 2
2 | 10 | 4
3 | 20 | 5
4 | 100 | 6
5 | 487 | 7
6 | 1 | 1
7 | 6 | 3
(7 rows)
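If window functions are not available, you can still count, for each row, the rows whose num is less than or equal to its own (with duplicate values, tied rows all get the highest position of the tie, unlike rank()):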
UPDATE aaa
SET rnk = w.rnk
FROM ( SELECT a0.id AS id
, COUNT(*) AS rnk
FROM aaa a0
JOIN aaa a1 ON a1.num <= a0.num
GROUP BY a0.id
) w
WHERE w.id = aaa.id;
SELECT * FROM aaa
ORDER BY id
;
Or the same with a scalar subquery:
UPDATE aaa a0
SET rnk =
( SELECT COUNT(*)
FROM aaa a1
WHERE a1.num <= a0.num
)
;
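The scalar-subquery variant needs no window functions, so it is easy to try on the sample data. The sketch below assumes an in-memory SQLite database as a stand-in for PostgreSQL 8.3; the table and column names follow the answer, with INTEGER PRIMARY KEY playing the role of serial.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE aaa (
        id  INTEGER PRIMARY KEY,   -- auto-assigned 1..N, like serial
        num INTEGER NOT NULL,
        rnk INTEGER NOT NULL DEFAULT 0
    )
    """
)
conn.executemany(
    "INSERT INTO aaa (num) VALUES (?)",
    [(3,), (10,), (20,), (100,), (487,), (1,), (6,)],
)

# For every row, count the rows whose num is not larger than its own.
# With distinct values this is the row's 1-based position in sorted order.
conn.execute(
    """
    UPDATE aaa
    SET rnk = (SELECT count(*) FROM aaa AS a1 WHERE a1.num <= aaa.num)
    """
)

rows = conn.execute("SELECT id, num, rnk FROM aaa ORDER BY id").fetchall()

for row in rows:
    print(row)
```

The counting subquery runs once per row, so on large tables the window-function version is likely to be the better choice where it is available.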