Pivot and aggregate 1..n duplicate records Postgres psql

I have a table tbl_action like this:
game_id | action | action_datetime
------: | :----- | :------------------
1 | start | 2022-04-05T10:30+00
1 | attack | 2022-04-05T10:45+00
1 | defend | 2022-04-05T11:30+00
1 | attack | 2022-04-05T11:45+00
1 | defend | 2022-04-05T12:00+00
1 | stop | 2022-04-05T12:10+00
create table if not exists tblaction (game_id int, action_name text, action_time timestamptz);
insert into tblaction (game_id, action_name, action_time) values
(1,'start','2022-04-05T10:30+00'),
(1,'attack','2022-04-05T10:45+00'),
(1,'defend','2022-04-05T11:30+00'),
(1,'attack','2022-04-05T11:45+00'),
(1,'defend','2022-04-05T12:00+00'),
(1,'stop','2022-04-05T12:10+00');
I want to pivot it like this:
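game_id | start | attack1 | defend1 | ... | attackn | defendn | stop
------: | :------------------ | :------------------ | :------------------ | :-- | :------------------ | :------------------ | :------------------
1 | 2022-04-05T10:30+00 | 2022-04-05T10:45+00 | 2022-04-05T11:30+00 | ... | 2022-04-05T11:45+00 | 2022-04-05T12:00+00 | 2022-04-05T12:10+00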
My question is how to aggregate actions 1..n so that each occurrence gets its own column, instead of collapsing every occurrence of an action_name into a single value. I am using Postgres.
This MySQL sample code works when each action_name occurs at most twice, but I would like something that works for 1..n occurrences. I don't want to remove the duplicates. I am not limited to MySQL; I can use the latest version of Postgres.
SELECT
game_id,
MAX( CASE WHEN action_name = 'start' THEN action_time ELSE NULL END ) AS "start",
MIN( CASE WHEN action_name = 'attack' THEN action_time ELSE NULL END ) AS "attack",
MIN( CASE WHEN action_name = 'defend' THEN action_time ELSE NULL END ) AS "defend",
MAX( CASE WHEN action_name = 'attack' THEN action_time ELSE NULL END ) AS "attack",
MAX( CASE WHEN action_name = 'defend' THEN action_time ELSE NULL END ) AS "defend",
MAX( CASE WHEN action_name = 'stop' THEN action_time ELSE NULL END ) AS "stop"
FROM
tblaction
GROUP BY
game_id
ORDER BY
game_id ASC;
I read the Postgres tablefunc docs and tried \crosstabview in pgadmin4, but I get this error: \crosstabview: query result contains multiple data values for row "1", column "attack"
SELECT game_id, action_name, action_time FROM tblaction \crosstabview

You can use a subquery with the ROW_NUMBER window function to number the rows within each game_id, action_name pair, and then use conditional aggregation on that row number (rn):
SELECT
game_id,
MAX(CASE WHEN action_name = 'start' THEN action_time END ) AS "start",
MAX(CASE WHEN action_name = 'attack' AND rn = 1 THEN action_time END ) AS "attack1",
MAX(CASE WHEN action_name = 'defend' AND rn = 1 THEN action_time END ) AS "defend1",
MAX(CASE WHEN action_name = 'attack' AND rn = 2 THEN action_time END ) AS "attack2",
MAX(CASE WHEN action_name = 'defend' AND rn = 2 THEN action_time END ) AS "defend2",
MAX(CASE WHEN action_name = 'stop' THEN action_time END ) AS "stop"
FROM
(
SELECT *,ROW_NUMBER() OVER(PARTITION BY game_id,action_name ORDER BY action_time) rn
FROM tblaction
) t1
GROUP BY
game_id
ORDER BY
game_id ASC;
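To go from two occurrences to 1..n without hand-writing every column, the same ROW_NUMBER() numbering can stay in SQL while the column names are built on the client. The following is only a sketch under stated assumptions: Python's built-in sqlite3 module stands in for Postgres (it needs an SQLite build with window functions, 3.25 or later), and the pivot dictionary and its attack1/defend1 style keys are illustrative names, not part of the answer above.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Postgres; the window-function SQL is the same.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblaction (game_id INT, action_name TEXT, action_time TEXT)")
conn.executemany(
    "INSERT INTO tblaction VALUES (?, ?, ?)",
    [
        (1, "start", "2022-04-05T10:30+00"),
        (1, "attack", "2022-04-05T10:45+00"),
        (1, "defend", "2022-04-05T11:30+00"),
        (1, "attack", "2022-04-05T11:45+00"),
        (1, "defend", "2022-04-05T12:00+00"),
        (1, "stop", "2022-04-05T12:10+00"),
    ],
)

# Number each repeated action within its game, oldest first.
rows = conn.execute(
    """
    SELECT game_id, action_name, action_time,
           ROW_NUMBER() OVER (PARTITION BY game_id, action_name
                              ORDER BY action_time) AS rn
    FROM tblaction
    ORDER BY game_id, action_time
    """
).fetchall()

# Pivot on the client: one dict per game, with keys such as attack1, attack2, ...
pivot = {}
for game_id, name, action_time, rn in rows:
    key = name if name in ("start", "stop") else f"{name}{rn}"
    pivot.setdefault(game_id, {})[key] = action_time
```

With the sample data, pivot[1] holds the keys start, attack1, defend1, attack2, defend2, stop in time order, and a third attack would simply appear as attack3.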

I suggest that GROUP_CONCAT is simpler and more flexible here.
NB: the question allows MySQL or Postgres. See https://dbfiddle.uk/?rdbms=postgres_10&fiddle=8719aedb7ab5736caf79fe86e76d6f40 for a Postgres version that uses STRING_AGG( ~ ,',') instead of GROUP_CONCAT and is otherwise identical.
create table tblaction (
game_id int,
action_name varchar(20),
action_time varchar(25)
);
insert into tblaction
(game_id, action_name, action_time) values
(1,'start','2022-04-05T10:30+00'),
(1,'attack','2022-04-05T10:45+00'),
(1,'defend','2022-04-05T11:30+00'),
(1,'attack','2022-04-05T11:45+00'),
(1,'defend','2022-04-05T12:00+00'),
(1,'stop','2022-04-05T12:10+00');
SELECT
game_id,
GROUP_CONCAT(case when action_name='start' then action_time end) start,
GROUP_CONCAT(case when action_name='stop' then action_time end) stop,
COUNT(action_time) "number",
GROUP_CONCAT(case when action_name='attack' then action_time end) "attacks",
GROUP_CONCAT(case when action_name='defend' then action_time end) "defends"
FROM tblaction
GROUP BY
game_id;
game_id | start | stop | number | attacks | defends
------: | :------------------ | :------------------ | -----: | :-------------------------------------- | :--------------------------------------
1 | 2022-04-05T10:30+00 | 2022-04-05T12:10+00 | 6 | 2022-04-05T10:45+00,2022-04-05T11:45+00 | 2022-04-05T11:30+00,2022-04-05T12:00+00
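If the aggregated strings are consumed by a program, they can be split back into lists there. This is a rough sketch, assuming Python's built-in sqlite3 module as a stand-in database (SQLite also has GROUP_CONCAT); the variable names attacks and defends are only illustrative. GROUP_CONCAT on its own does not promise an order, so the sketch sorts explicitly; in Postgres the order can instead be fixed inside the aggregate with STRING_AGG(expr, ',' ORDER BY action_time).

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL/Postgres.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblaction (game_id INT, action_name TEXT, action_time TEXT)")
conn.executemany(
    "INSERT INTO tblaction VALUES (?, ?, ?)",
    [
        (1, "start", "2022-04-05T10:30+00"),
        (1, "attack", "2022-04-05T10:45+00"),
        (1, "defend", "2022-04-05T11:30+00"),
        (1, "attack", "2022-04-05T11:45+00"),
        (1, "defend", "2022-04-05T12:00+00"),
        (1, "stop", "2022-04-05T12:10+00"),
    ],
)

# One row per game; NULLs produced by the CASE are skipped by GROUP_CONCAT.
row = conn.execute(
    """
    SELECT game_id,
           GROUP_CONCAT(CASE WHEN action_name = 'attack' THEN action_time END) AS attacks,
           GROUP_CONCAT(CASE WHEN action_name = 'defend' THEN action_time END) AS defends
    FROM tblaction
    GROUP BY game_id
    """
).fetchone()

# Split the comma-separated strings and sort them, since the
# concatenation order is not guaranteed by the aggregate itself.
attacks = sorted(row[1].split(","))
defends = sorted(row[2].split(","))
```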

Related

How to selecting subqueries correctly

I have two queries that each give me back a single entry. How can I select both of these as one table?
query1: Select max([column3]) from [table1] => 42
query2: Select Top 1 [column1] from [table1] => 'test'
I want a resultset like this
result1 | result2
------: | :------
42 | 'test'
But how to do it correctly? Can I maybe select from nowhere somehow?

You could use ROW_NUMBER, twice:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY column3 DESC) rn1,
ROW_NUMBER() OVER (ORDER BY some_col) rn2
FROM table1
)
SELECT MAX(CASE WHEN rn1 = 1 THEN column3 END) AS result1,
MAX(CASE WHEN rn2 = 1 THEN column1 END) AS result2
FROM cte;
Note that I assume there exists a column some_col which you intend to use for choosing the column1 value.
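As for selecting "from nowhere": a different, simpler technique than the ROW_NUMBER answer above is to put each query in the select list as a scalar subquery, with no outer FROM at all. A minimal sketch, assuming Python's built-in sqlite3 module as the database and made-up sample rows; SQLite spells the row limit LIMIT 1, whereas SQL Server would use TOP 1 as in the question. The ORDER BY column1 is my assumption for making "top 1" deterministic.

```python
import sqlite3

# Assumption: in-memory SQLite with invented sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (column1 TEXT, column3 INT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [("test", 7), ("zeta", 42), ("yak", 13)],
)

# Each scalar subquery returns one value; the outer SELECT has no FROM clause.
row = conn.execute(
    """
    SELECT (SELECT MAX(column3) FROM table1) AS result1,
           (SELECT column1 FROM table1 ORDER BY column1 LIMIT 1) AS result2
    """
).fetchone()
```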

Merge consecutive duplicate records including time range

I have a very similar problem to the question asked here: Merge duplicate temporal records in database
The difference here is that I need the end date to be an actual date instead of NULL.
So given the following data:
EmployeeId StartDate EndDate Column1 Column2
1000 2009/05/01 2010/04/30 X Y
1000 2010/05/01 2011/04/30 X Y
1000 2011/05/01 2012/04/30 X X
1000 2012/05/01 2013/04/30 X Y
1000 2013/05/01 2014/04/30 X X
1000 2014/05/01 2014/06/01 X X
The desired result is:
EmployeeId StartDate EndDate Column1 Column2
1000 2009/05/01 2011/04/30 X Y
1000 2011/05/01 2012/04/30 X X
1000 2012/05/01 2013/04/30 X Y
1000 2013/05/01 2014/06/01 X X
The proposed solution in the linked thread is this:
with t1 as --tag first row with 1 in a continuous time series
(
select t1.*, case when t1.column1=t2.column1 and t1.column2=t2.column2
then 0 else 1 end as tag
from test_table t1
left join test_table t2
on t1.EmployeeId= t2.EmployeeId and dateadd(day,-1,t1.StartDate)= t2.EndDate
)
select t1.EmployeeId, t1.StartDate,
case when min(T2.StartDate) is null then null
else dateadd(day,-1,min(T2.StartDate)) end as EndDate,
t1.Column1, t1.Column2
from (select t1.* from t1 where tag=1 ) as t1 -- to get StartDate
left join (select t1.* from t1 where tag=1 ) as t2 -- to get a new EndDate
on t1.EmployeeId= t2.EmployeeId and t1.StartDate < t2.StartDate
group by t1.EmployeeId, t1.StartDate, t1.Column1, t1.Column2;
However, this does not seem to work when you need the end date instead of just NULL.
Could someone help me with this issue?

How about this?
create table test_table (EmployeeId int, StartDate date, EndDate date, Column1 char(1), Column2 char(1))
;
insert into test_table values
(1000 , '2009-05-01','2010-04-30','X','Y')
,(1000 , '2010-05-01','2011-04-30','X','Y')
,(1000 , '2011-05-01','2012-04-30','X','X')
,(1000 , '2012-05-01','2013-04-30','X','Y')
,(1000 , '2013-05-01','2014-04-30','X','X')
,(1000 , '2014-05-01','2014-06-01','X','X')
;
SELECT EmployeeId, StartDate, EndDate, Column1, Column2 FROM
(
SELECT EmployeeId, StartDate
, MAX(EndDate) OVER(PARTITION BY EmployeeId, RN) AS EndDate
, Column1
, Column2
, DIFF
FROM
(
SELECT t.*
, SUM(DIFF) OVER(PARTITION BY EmployeeId ORDER BY StartDate ) AS RN
FROM
(
SELECT t.*
, CASE WHEN
Column1 = LAG(Column1,1) OVER(PARTITION BY EmployeeId ORDER BY StartDate)
AND Column2 = LAG(Column2,1) OVER(PARTITION BY EmployeeId ORDER BY StartDate)
THEN 0 ELSE 1 END AS DIFF
FROM
test_table t
) t
) t2
) t3
WHERE DIFF = 1
;
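To see what the two inner levels of this query compute, it can help to print the DIFF flag and its running sum RN for the sample rows. This is only a sketch, assuming Python's built-in sqlite3 module (SQLite 3.25 or later for window functions) as a stand-in database, with dates stored as ISO text; the alias flagged is illustrative.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_table "
    "(EmployeeId INT, StartDate TEXT, EndDate TEXT, Column1 TEXT, Column2 TEXT)"
)
conn.executemany(
    "INSERT INTO test_table VALUES (?, ?, ?, ?, ?)",
    [
        (1000, "2009-05-01", "2010-04-30", "X", "Y"),
        (1000, "2010-05-01", "2011-04-30", "X", "Y"),
        (1000, "2011-05-01", "2012-04-30", "X", "X"),
        (1000, "2012-05-01", "2013-04-30", "X", "Y"),
        (1000, "2013-05-01", "2014-04-30", "X", "X"),
        (1000, "2014-05-01", "2014-06-01", "X", "X"),
    ],
)

# DIFF is 1 where a row differs from its predecessor (or has none);
# the running sum of DIFF gives every consecutive run its own number.
steps = conn.execute(
    """
    SELECT StartDate, DIFF,
           SUM(DIFF) OVER (PARTITION BY EmployeeId ORDER BY StartDate) AS RN
    FROM (
        SELECT t.*,
               CASE WHEN Column1 = LAG(Column1) OVER (PARTITION BY EmployeeId ORDER BY StartDate)
                     AND Column2 = LAG(Column2) OVER (PARTITION BY EmployeeId ORDER BY StartDate)
                    THEN 0 ELSE 1 END AS DIFF
        FROM test_table t
    ) AS flagged
    ORDER BY StartDate
    """
).fetchall()
```

Rows that share an RN value belong to the same run, which is why taking MAX(EndDate) per RN and keeping only the DIFF = 1 row yields one merged record per run.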

This is another solution (taken from How do I group on continuous ranges). It is simpler to code and also caters for NULL values (i.e. it treats NULL = NULL as a match, unlike the simple LAG() comparison). However, it might not be quite as efficient on large volumes of data because of the GROUP BY.
SELECT EmployeeId
, MIN(StartDate) AS StartDate
, MAX(EndDate) AS EndDate
, Column1
, Column2
FROM
(
SELECT t.*
, ROW_NUMBER() OVER(PARTITION BY EmployeeId, Column1, Column2 ORDER BY StartDate ) AS GRN
, ROW_NUMBER() OVER(PARTITION BY EmployeeId ORDER BY StartDate ) AS RN
FROM
test_table t
) t
GROUP BY
EmployeeId
, Column1
, Column2
, RN - GRN
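The RN - GRN difference stays constant within a run of identical consecutive rows, which is what makes it usable as a grouping key. A quick way to check this against the sample data is the sketch below, which assumes Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in database and ISO text dates; the alias numbered and the final ORDER BY are my additions.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_table "
    "(EmployeeId INT, StartDate TEXT, EndDate TEXT, Column1 TEXT, Column2 TEXT)"
)
conn.executemany(
    "INSERT INTO test_table VALUES (?, ?, ?, ?, ?)",
    [
        (1000, "2009-05-01", "2010-04-30", "X", "Y"),
        (1000, "2010-05-01", "2011-04-30", "X", "Y"),
        (1000, "2011-05-01", "2012-04-30", "X", "X"),
        (1000, "2012-05-01", "2013-04-30", "X", "Y"),
        (1000, "2013-05-01", "2014-04-30", "X", "X"),
        (1000, "2014-05-01", "2014-06-01", "X", "X"),
    ],
)

# RN counts rows per employee, GRN counts rows per (employee, Column1, Column2);
# their difference is the same for every row of one consecutive run.
merged = conn.execute(
    """
    SELECT EmployeeId, MIN(StartDate), MAX(EndDate), Column1, Column2
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY EmployeeId, Column1, Column2
                                  ORDER BY StartDate) AS GRN,
               ROW_NUMBER() OVER (PARTITION BY EmployeeId ORDER BY StartDate) AS RN
        FROM test_table t
    ) AS numbered
    GROUP BY EmployeeId, Column1, Column2, RN - GRN
    ORDER BY MIN(StartDate)
    """
).fetchall()
```

For the sample rows this returns the four merged records listed as the desired result in the question.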

Convert one column to multiple columns in postgres

I have a table like the one below:
I would like to change the format as below in Postgres:
I tried to use a CASE statement, but it did not give me the desired results.
Thank you in advance for the help!
EDIT
select (case when column_1='A' then column_1 else 'other' end) column_1,
(case when column_1='B' then Column_1 else 'other' end) column_2 from test_t
where id= random_value;
Each time the query returns only 2 rows and the row values in the column_1 are dynamic and not fixed.

Here we go...
CREATE TABLE test_table(column_1 text);
INSERT INTO test_table VALUES ('A'),('B');
SELECT * FROM test_table ;
column_1
---------
B
A
SELECT
max(case when column_1='A' THEN column_1 END) column_1,
max(case when column_1='B' THEN column_1 END) column_2
from test_table;
column_1 | column_2
----------+----------
A | B
In PostgreSQL you can do this easily with crosstab(), but it is still not implemented in Greenplum.

Please refer to this link. Previously answered.
stackoverflow.com/a/10625294/1870151
SELECT
unnest(array['col1', 'col2', 'col3']) AS "Columns",
unnest(array[col1::text, col2::text, col3::text]) AS "Values"
FROM tbl;

You didn't provide enough information to answer the question fully, but this is how you turn those two rows of one column into two columns in a single row.
select max(column_1) as column_1, max(column_2) as column_2
from (select case when column_1 = 'A' then column_1 else '' end as column_1,
case when column_1 = 'B' then column_1 else '' end as column_2
from table_name) t;

If the result you want to transpose always has only 2 rows, this will work regardless of the contents of those columns, as you asked:
SELECT
MAX(CASE WHEN row_number=1 THEN column_1 END) column_1,
MAX(CASE WHEN row_number=2 THEN column_1 END) column_2
FROM (SELECT column_1,
ROW_NUMBER() OVER (ORDER BY test_table.column_1)
FROM test_table) t;
column_1 | column_2
----------+----------
A | B

PIVOT help on SQL Server 2008 R2?

Input follows :
MID NAME ACTIVESTATUS DID DNAME STATUS
1A SRN ACTIVE 1 FEEVER NEW
1A SRN ACTIVE 2 MOTIONS ACTIVE
1A SRN ACTIVE 3 SUGAR INVALIDCODE
1A SRN ACTIVE 4 BP ACTIVE-PRIMARY
Expected Output would be like :
MID NAME ACTIVESTATUS FEVERSTATUS MOTIONSTAUS SUGATSTATUS BPSTATUS
1 SRN ACTIVE NEW ACTIVE INVALIDCODE ACTIVE-PRIMARY

It is actually very simple:
SELECT mid,
name,
activestatus,
MAX (CASE dname WHEN 'FEEVER' THEN status ELSE NULL END) AS feverstatus,
MAX (CASE dname WHEN 'MOTIONS' THEN status ELSE NULL END)
AS motionstatus,
MAX (CASE dname WHEN 'SUGAR' THEN status ELSE NULL END)
AS sugarstatus,
MAX (CASE dname WHEN 'BP' THEN status ELSE NULL END) AS bpstatus
FROM myTable
GROUP BY mid, name, activestatus;
You will get your expected output.

Please try this
Test Data
CREATE TABLE #Table (MID VARCHAR(2),NAME VARCHAR(3),ACTIVESTATUS VARCHAR(10),DID TINYINT,DNAME VARCHAR(7),STATUS VARCHAR(20))
INSERT INTO #Table(MID,NAME,ACTIVESTATUS,DID,DNAME,STATUS) VALUES
('1A','SRN','ACTIVE','1','FEEVER','NEW'),
('1A','SRN','ACTIVE','2','MOTIONS','ACTIVE'),
('1A','SRN','ACTIVE','3','SUGAR','INVALIDCODE'),
('1A','SRN','ACTIVE','4','BP','ACTIVE-PRIMARY')
Using Pivot
SELECT MID,
NAME,
ACTIVESTATUS,
[FEEVER] AS FEVERSTATUS,
[MOTIONS] AS MOTIONSTAUS,
[SUGAR] AS SUGATSTATUS,
[BP] AS BPSTATUS
FROM (
SELECT LEFT(MID,1) AS MID,
NAME,
ACTIVESTATUS,
DNAME,
STATUS
FROM #Table t
) a
PIVOT(
MIN(STATUS)FOR DNAME in ([FEEVER],[MOTIONS],[SUGAR],[BP])
)piv
Without using Pivot
SELECT DISTINCT
(
SELECT TOP 1 LEFT(MID, 1)
FROM #Table t
WHERE tt.MID=t.MID
) AS MID,
(
SELECT TOP 1 NAME
FROM #Table t
WHERE tt.MID=t.MID
) AS NAME,
(
SELECT TOP 1 ACTIVESTATUS
FROM #Table t
WHERE tt.MID=t.MID
) AS ACTIVESTATUS,
(
SELECT TOP 1 STATUS
FROM #Table t
WHERE tt.MID=t.MID
AND t.DID =1
) AS FEVERSTATUS,
(
SELECT TOP 1 STATUS
FROM #Table t
WHERE tt.MID=t.MID
AND t.DID =2
) AS MOTIONSTAUS,
(
SELECT TOP 1 STATUS
FROM #Table t
WHERE tt.MID=t.MID
AND t.DID =3
) AS SUGATSTATUS,
(
SELECT TOP 1 STATUS
FROM #Table t
WHERE tt.MID=t.MID
AND t.DID =4
) AS BPSTATUS
FROM #Table tt

Try this:
SELECT mid,
NAME,
activestatus,
feever AS feverstatus,
motions AS motionstatus,
sugar AS sugarstatus,
bp AS bpstatus
FROM (
SELECT mid,
NAME,
activestatus,
dname,
[status]
FROM tbl
) s
PIVOT(
MAX([status])
FOR dname IN ([FEEVER], [MOTIONS], [SUGAR], [BP])
) p
Note that this solution assumes that the only possible values for dname are the ones included in the PIVOT clause. If the values are not going to be the above 4 for every mid, then you will need to generate that list first using dynamic SQL and then do a pivot operation.
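The "generate that list first" step can be sketched from client code instead of T-SQL. The example below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in database, conditional aggregation in place of the PIVOT operator (which SQLite lacks), and a hypothetical quote_ident helper for the generated column aliases. In SQL Server itself the usual route is to build the column list with QUOTENAME and run the statement through sp_executesql.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (mid TEXT, name TEXT, activestatus TEXT, "
    "did INT, dname TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("1A", "SRN", "ACTIVE", 1, "FEEVER", "NEW"),
        ("1A", "SRN", "ACTIVE", 2, "MOTIONS", "ACTIVE"),
        ("1A", "SRN", "ACTIVE", 3, "SUGAR", "INVALIDCODE"),
        ("1A", "SRN", "ACTIVE", 4, "BP", "ACTIVE-PRIMARY"),
    ],
)

# Step 1: discover the pivot values from the data, in did order.
dnames = [r[0] for r in conn.execute(
    "SELECT dname FROM tbl GROUP BY dname ORDER BY MIN(did)"
)]


def quote_ident(name):
    """Hypothetical helper: quote a generated alias as an SQL identifier."""
    return '"' + name.replace('"', '""') + '"'


# Step 2: build one conditional aggregate per value; the values themselves
# are passed as bound parameters, only the aliases are spliced into the text.
select_list = ", ".join(
    f"MAX(CASE WHEN dname = ? THEN status END) AS {quote_ident(d + 'STATUS')}"
    for d in dnames
)
sql = (
    f"SELECT mid, name, activestatus, {select_list} "
    "FROM tbl GROUP BY mid, name, activestatus"
)

# Step 3: run the generated statement.
cur = conn.execute(sql, dnames)
columns = [c[0] for c in cur.description]
result = cur.fetchall()
```

A new dname value in the table then shows up as an extra column on the next run, with no change to the code.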

SQL Running Subtraction and Deviation

-- Brief business scenario: this table records a goods receipt.
-- The first few lines are the expected lines, each with a PurchaseOrder (PO).
-- Each expected line is then received physically, and the received quantity may differ
-- from the expected one, for example because items are damaged or missing.
-- So we keep a status for each received line (e.g. OK, Damage), and we also have to calculate
-- the short quantity from the total expected quantity of each item and the total of its received lines.
if object_id('DEV..Temp','U') is not null
drop table Temp
CREATE TABLE Temp
(
ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
Item VARCHAR(32),
PO VARCHAR(32) NULL,
ExpectedQty INT NULL,
ReceivedQty INT NULL,
[STATUS] VARCHAR(32) NULL,
BoxName VARCHAR(32) NULL
)
-- The first few lines, which carry PO data, are the expected lines;
-- the remaining lines are the received lines.
INSERT INTO TEMP (Item,PO,ExpectedQty,ReceivedQty,[STATUS],BoxName)
SELECT 'ITEM01','PO-01','30',NULL,NULL,NULL UNION ALL
SELECT 'ITEM01','PO-02','20',NULL,NULL,NULL UNION ALL
SELECT 'ITEM02','PO-01','40',NULL,NULL,NULL UNION ALL
SELECT 'ITEM03','PO-01','50',NULL,NULL,NULL UNION ALL
SELECT 'ITEM03','PO-02','30',NULL,NULL,NULL UNION ALL
SELECT 'ITEM03','PO-03','20',NULL,NULL,NULL UNION ALL
SELECT 'ITEM04','PO-01','30',NULL,NULL,NULL UNION ALL
SELECT 'ITEM01',NULL,NULL,'20','OK','box01' UNION ALL
SELECT 'ITEM01',NULL,NULL,'25','OK','box02' UNION ALL
SELECT 'ITEM01',NULL,NULL,'5','DAMAGE','box03' UNION ALL
SELECT 'ITEM02',NULL,NULL,'38','OK','box04' UNION ALL
SELECT 'ITEM02',NULL,NULL,'2','DAMAGE','box05' UNION ALL
SELECT 'ITEM03',NULL,NULL,'30','OK','box06' UNION ALL
SELECT 'ITEM03',NULL,NULL,'30','OK','box07' UNION ALL
SELECT 'ITEM03',NULL,NULL,'10','DAMAGE','box09' UNION ALL
SELECT 'ITEM04',NULL,NULL,'25','OK','box10'
-- The table below is my expected result for the data above.
-- I need to present the data in this form,
-- so I would appreciate an appropriate query for it.
-- Note: the first row is blank; it is actually my table header. :)
-- Condition: in no row may ReceivedQty, DamageQty or ShortQty
-- exceed ExpectedQty. ITEM03 is an example of this scenario.
-- The query should run on a SQL Server 2000 database.
SELECT ''as'ITEM', ''as'PO#', ''as'ExpectedQty',''as'ReceivedQty',''as'DamageQty' ,''as'ShortQty' UNION ALL
SELECT 'ITEM01','PO-01','30','30','0' ,'0' UNION ALL
SELECT 'ITEM01','PO-02','20','15','5' ,'0' UNION ALL
SELECT 'ITEM02','PO-01','40','38','2' ,'0' UNION ALL
SELECT 'ITEM03','PO-01','50','50','0' ,'0' UNION ALL
SELECT 'ITEM03','PO-02','30','20','10' ,'10' UNION ALL
SELECT 'ITEM03','PO-03','20','0','0','20' UNION ALL
SELECT 'ITEM04','PO-01','30','25','0' ,'5'

Using this solution as a starting point, I've eventually ended up with this:
SELECT
Item,
PO,
ExpectedQty,
ReceivedQty = CASE
WHEN RemainderQty >= 0 THEN ExpectedQty
WHEN RemainderQty < -ExpectedQty THEN 0
ELSE RemainderQty + ExpectedQty
END,
DamageQty = CASE
WHEN RemainderQty >=0 OR ExpectedQty < -TotalRemainderQty THEN 0
WHEN RemainderQty < -ExpectedQty AND TotalRemainderQty > 0 THEN ExpectedQty
WHEN RemainderQty < -ExpectedQty AND TotalRemainderQty < -DamagedQty THEN ExpectedQty + TotalRemainderQty
WHEN RemainderQty > -DamagedQty THEN -RemainderQty
ELSE DamagedQty
END,
ShortQty = CASE
WHEN TotalRemainderQty >= 0 THEN 0
WHEN TotalRemainderQty < -ExpectedQty THEN ExpectedQty
ELSE -TotalRemainderQty
END
FROM (
SELECT
a.Item,
a.PO,
a.ExpectedQty,
b.DamagedQty,
RemainderQty = b.ReceivedQty - a.RunningTotalQty,
TotalRemainderQty = b.ReceivedQty + b.DamagedQty - a.RunningTotalQty
FROM (
SELECT
a.Item,
a.PO,
a.ExpectedQty,
RunningTotalQty = SUM(a2.ExpectedQty)
FROM (SELECT Item, PO, ExpectedQty FROM Temp WHERE STATUS IS NULL) AS a
INNER JOIN (SELECT Item, PO, ExpectedQty FROM Temp WHERE STATUS IS NULL) AS a2
ON a.Item = a2.Item AND a.PO >= a2.PO
GROUP BY
a.Item,
a.PO,
a.ExpectedQty
) a
LEFT JOIN (
SELECT
Item,
ReceivedQty = SUM(CASE STATUS WHEN 'OK' THEN ReceivedQty ELSE 0 END),
DamagedQty = SUM(CASE STATUS WHEN 'DAMAGE' THEN ReceivedQty ELSE 0 END)
FROM Temp
GROUP BY Item
) b ON a.Item = b.Item
) s;
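The innermost building block of this query is the running total of ExpectedQty per item, computed with a self-join because SQL Server 2000 has no window functions. The sketch below isolates just that step; it assumes Python's built-in sqlite3 module as a stand-in database and a cut-down Temp table holding only the expected lines (Item, PO, ExpectedQty), which is a simplification of the question's schema.

```python
import sqlite3

# Assumption: in-memory SQLite with only the expected lines of the sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Temp (Item TEXT, PO TEXT, ExpectedQty INT)")
conn.executemany(
    "INSERT INTO Temp VALUES (?, ?, ?)",
    [
        ("ITEM01", "PO-01", 30),
        ("ITEM01", "PO-02", 20),
        ("ITEM02", "PO-01", 40),
        ("ITEM03", "PO-01", 50),
        ("ITEM03", "PO-02", 30),
        ("ITEM03", "PO-03", 20),
        ("ITEM04", "PO-01", 30),
    ],
)

# Join every line to all lines of the same item with an equal or earlier PO,
# then sum them: that sum is the running total up to and including the line.
running = conn.execute(
    """
    SELECT a.Item, a.PO, a.ExpectedQty,
           SUM(a2.ExpectedQty) AS RunningTotalQty
    FROM Temp AS a
    INNER JOIN Temp AS a2
            ON a.Item = a2.Item AND a.PO >= a2.PO
    GROUP BY a.Item, a.PO, a.ExpectedQty
    ORDER BY a.Item, a.PO
    """
).fetchall()
```

Subtracting this running total from the item's total received quantity gives the RemainderQty that the outer CASE expressions branch on.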
