SELECT, format rows with different columns into a single row that share an ID

I'm trying to format one SELECT statement so that it outputs a resultset with combined values over a few columns.
I have a resultset like this:
ID VID PID VALUE
1 x 1 a
2 y 1 A
3 y 2 B
4 x 2 b
5 y 3 C
6 x 3 c
7 x 4 d
8 y 4 D
9 x 5 e
10 y 5 E
Can I format one SELECT statement to effectively join the values with duplicate PIDs into a single row? I'm only really interested in PID and VALUE, e.g.
PID VALUE1 VALUE2
1 a A
2 b B
3 c C
4 d D
5 e E
Otherwise, should I be using actual JOINs with queries acting on the same table?
I tried to use CASE, but I can only get as far as a resultset like this:
ID VID PID VALUE1 VALUE2
1 x 1 a NULL
2 y 1 NULL A
3 y 2 NULL B
4 x 2 b NULL
5 y 3 NULL C
6 x 3 c NULL
7 x 4 d NULL
8 y 4 NULL D
9 x 5 e NULL
10 y 5 NULL E
The query I'm using looks somewhat like this.
SELECT
ID,
VID,
PID,
CASE WHEN VID = 'x' THEN VALUE END VALUE1,
CASE WHEN VID = 'y' THEN VALUE END VALUE2
FROM BIGTABLE
WHERE PID IN (1, 2, 3, 4, 5)
AND VID IN ('x', 'y')
The table holds many more PID and VID values than just 1-5 and x & y, which is why I filter on them this way from the whole table.

Do you mean like this? It's called "conditional aggregation."
with
resultset ( id, vid, pid, value ) as (
select 1, 'x', 1, 'a' from dual union all
select 2, 'y', 1, 'A' from dual union all
select 3, 'y', 2, 'B' from dual union all
select 4, 'x', 2, 'b' from dual union all
select 5, 'y', 3, 'C' from dual union all
select 6, 'x', 3, 'c' from dual union all
select 7, 'x', 4, 'd' from dual union all
select 8, 'y', 4, 'D' from dual union all
select 9, 'x', 5, 'e' from dual union all
select 10, 'y', 5, 'E' from dual
)
-- End of simulated resultset (for testing purposes only, not part of the solution).
-- SQL query begins below this line.
select pid,
min(case when vid = 'x' then value end) as value1,
min(case when vid = 'y' then value end) as value2
from resultset
-- WHERE conditions, if any are needed - as in your attempt
group by pid
order by pid
;
PID VALUE1 VALUE2
--- ------ ------
1 a A
2 b B
3 c C
4 d D
5 e E
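To address the "should I be using actual JOINs" part of the question, here is a minimal sketch that runs both variants side by side. It uses Python's built-in sqlite3 module as a stand-in database (an assumption; the original answer targets Oracle), and the table name bigtable is simply borrowed from the question. The conditional aggregation works because MIN ignores the NULLs that the CASE expressions produce for non-matching rows.

```python
import sqlite3

# In-memory stand-in for the question's table (SQLite is an assumption here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bigtable (id INTEGER, vid TEXT, pid INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO bigtable VALUES (?, ?, ?, ?)",
    [(1, "x", 1, "a"), (2, "y", 1, "A"), (3, "y", 2, "B"), (4, "x", 2, "b"),
     (5, "y", 3, "C"), (6, "x", 3, "c"), (7, "x", 4, "d"), (8, "y", 4, "D"),
     (9, "x", 5, "e"), (10, "y", 5, "E")],
)

# Conditional aggregation: one output row per pid, MIN skips the NULLs.
pivot_rows = conn.execute("""
    SELECT pid,
           MIN(CASE WHEN vid = 'x' THEN value END) AS value1,
           MIN(CASE WHEN vid = 'y' THEN value END) AS value2
    FROM bigtable
    WHERE pid IN (1, 2, 3, 4, 5) AND vid IN ('x', 'y')
    GROUP BY pid
    ORDER BY pid
""").fetchall()

# Self-join alternative: pair each 'x' row with the 'y' row of the same pid.
join_rows = conn.execute("""
    SELECT x.pid, x.value AS value1, y.value AS value2
    FROM bigtable x
    JOIN bigtable y ON y.pid = x.pid AND y.vid = 'y'
    WHERE x.vid = 'x' AND x.pid IN (1, 2, 3, 4, 5)
    ORDER BY x.pid
""").fetchall()

for row in pivot_rows:
    print(row)
```

On this sample both queries return the same five rows. They differ when a PID has only one of the two VIDs: the inner join drops that PID, while the aggregation keeps it with a NULL in the missing column.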

Related

SQL Renumbering index after group by

I have the following input table:
Seq Group GroupSequence
1 0
2 4 A
3 4 B
4 4 C
5 0
6 6 A
7 6 B
8 0
Output table is:
Line NewSeq GroupSequence
1 1
2 2 A
3 2 B
4 2 C
5 3
6 4 A
7 4 B
8 5
The rules for the input table are:
Any positive integer in the Group column indicates that the rows are grouped together. The entire field may be NULL or blank. A NULL or 0 indicates that the row is processed on its own. In the above example there are two groups and three 'single' rows.
The GroupSequence column is a single character that sorts within the group. NULL, blank, 'A', 'B', 'C' and 'D' are the only values allowed.
If Group has a positive integer, there must be an alphabetic character in GroupSequence.
I need a query that creates the output table with a new column that sequences as shown.
External apps need to iterate through this table in either Line or NewSeq order (same order, different values).
I've tried variations on GROUP BY, PARTITION BY, OVER(), etc. with no success.
Any help much appreciated.
Perhaps this will help
The only trick here is Flg which will indicate a new Group Sequence (values will be 1 or 0). Then it is a small matter to sum(Flg) via a window function.
Edit - Updated Flg method
Example
Declare @YourTable Table ([Seq] int,[Group] int,[GroupSequence] varchar(50))
Insert Into @YourTable Values
 (1,0,null)
,(2,4,'A')
,(3,4,'B')
,(4,4,'C')
,(5,0,null)
,(6,6,'A')
,(7,6,'B')
,(8,0,null)
Select Line = Row_Number() over (Order by Seq)
      ,NewSeq = Sum(Flg) over (Order By Seq)
      ,GroupSequence
 From (
        Select *
              -- Flg = 1 marks the start of a new NewSeq value.
              -- The [Group] > 0 test keeps consecutive single rows (Group 0 or NULL) apart.
              ,Flg = case when [Group] > 0 and [Group] = lag([Group],1) over (Order by Seq) then 0 else 1 end
         From @YourTable
      ) A
 Order By Line
Returns
Line NewSeq GroupSequence
1 1 NULL
2 2 A
3 2 B
4 2 C
5 3 NULL
6 4 A
7 4 B
8 5 NULL
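For anyone who wants to try the flag-plus-running-sum idea without a SQL Server instance, here is a rough sketch using Python's sqlite3 module (window functions need SQLite 3.25 or later, which is an assumption about your build). The names your_table, grp and renumber are made up for the illustration. The grp > 0 guard is also an assumption on my part: it makes sure two single rows in a row each get their own NewSeq, as the rules in the question require.

```python
import sqlite3

RENUMBER_SQL = """
    SELECT ROW_NUMBER() OVER (ORDER BY seq) AS line,
           SUM(flg) OVER (ORDER BY seq)     AS new_seq,
           group_sequence
    FROM (
        SELECT seq, group_sequence,
               -- flg = 1 starts a new sequence number; a row continues the
               -- previous one only when it shares a positive group with it
               CASE WHEN grp > 0 AND grp = LAG(grp) OVER (ORDER BY seq)
                    THEN 0 ELSE 1 END AS flg
        FROM your_table
    )
    ORDER BY line
"""

def renumber(rows):
    """Load (seq, grp, group_sequence) rows into SQLite and renumber them."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE your_table (seq INTEGER, grp INTEGER, group_sequence TEXT)")
    conn.executemany("INSERT INTO your_table VALUES (?, ?, ?)", rows)
    result = conn.execute(RENUMBER_SQL).fetchall()
    conn.close()
    return result

sample = [(1, 0, None), (2, 4, "A"), (3, 4, "B"), (4, 4, "C"),
          (5, 0, None), (6, 6, "A"), (7, 6, "B"), (8, 0, None)]

for line in renumber(sample):
    print(line)
```

Calling renumber with three consecutive single rows, e.g. [(1, 0, None), (2, 0, None), (3, None, None)], should give them NewSeq 1, 2 and 3 rather than lumping them together.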

Refer to current row in window function

Is it possible to refer to the current row in a window partition? I want to do something like the following:
SELECT min(ABS(variable - CURRENT.variable)) over (order by criterion RANGE UNBOUNDED PRECEDING)
That is, I want to find, among the rows of the given window, the variable that is closest to the current row's value. Is it possible to do something like that?
As an example, from:
criterion | variable
1 2
2 4
3 2
4 7
5 6
We would obtain:
null
2
0
3
1
Thanks
As far as I know, this cannot be done with window functions.
But it can be done with a self join:
SELECT a.criterion,
       a.variable,
       min(abs(a.variable - b.variable)) AS closest
FROM mydata a
   LEFT JOIN mydata b
      ON (b.criterion < a.criterion)
GROUP BY a.criterion, a.variable
ORDER BY a.criterion;
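A quick way to check the self-join against the sample in the question is to run it in SQLite through Python's sqlite3 module. This is only a sketch: the table name mydata comes from the answer, the rows are keyed by the question's criterion column, and SQLite stands in for whatever database you actually use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mydata (criterion INTEGER, variable INTEGER)")
conn.executemany("INSERT INTO mydata VALUES (?, ?)",
                 [(1, 2), (2, 4), (3, 2), (4, 7), (5, 6)])

# For each row a, compare against every earlier row b (b.criterion < a.criterion)
# and keep the smallest absolute difference. The LEFT JOIN keeps the first row,
# which has no predecessors, with a NULL result.
closest = conn.execute("""
    SELECT a.criterion,
           a.variable,
           MIN(ABS(a.variable - b.variable)) AS closest
    FROM mydata a
    LEFT JOIN mydata b ON b.criterion < a.criterion
    GROUP BY a.criterion, a.variable
    ORDER BY a.criterion
""").fetchall()

for row in closest:
    print(row)
```

The third column comes out as NULL, 2, 0, 3, 1, which matches the expected output in the question. Note that the join compares every row with all earlier rows, so the work grows quadratically with the table size.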
If I understand correctly:
with t (v) as (values (-5),(-2),(0),(1),(3),(10))
select v,
least(
v - lag(v) over (order by v),
lead(v) over (order by v) - v
) as closest
from t
;
v | closest
----+---------
-5 | 3
-2 | 2
0 | 1
1 | 1
3 | 2
10 | 7
I hope this helps (but watch out for performance problems).
I tried this in MSSQL (at bottom you'll find POSTGRESQL version):
CREATE TABLE TX (CRITERION INT, VARIABILE INT);
INSERT INTO TX VALUES (1,2), (2,4),(3,2),(4,7), (5,6);
SELECT CRITERION, MIN_DELTA FROM
(
SELECT TX.CRITERION
, MIN(ABS(B.TX2_VAR - TX.VARIABILE)) OVER (PARTITION BY TX.CRITERION) AS MIN_DELTA
, RANK() OVER (PARTITION BY TX.CRITERION ORDER BY ABS(B.TX2_VAR - TX.VARIABILE) ) AS MIN_RANK
FROM TX
CROSS APPLY (SELECT TX2.CRITERION AS TX2_CRIT, TX2.VARIABILE AS TX2_VAR FROM TX TX2 WHERE TX2.CRITERION < TX.CRITERION) B
) C
WHERE MIN_RANK=1
ORDER BY CRITERION
;
Output:
CRITERION MIN_DELTA
----------- -----------
2 2
3 0
4 3
5 1
POSTGRESQL Version (tested on Rextester http://rextester.com/VMGJ87600):
CREATE TABLE TX (CRITERION INT, VARIABILE INT);
INSERT INTO TX VALUES (1,2), (2,4),(3,2),(4,7), (5,6);
SELECT * FROM TX;
SELECT CRITERION, MIN_DELTA FROM
(
SELECT TX.CRITERION
, MIN(ABS(B.TX2_VAR - TX.VARIABILE)) OVER (PARTITION BY TX.CRITERION) AS MIN_DELTA
, RANK() OVER (PARTITION BY TX.CRITERION ORDER BY ABS(B.TX2_VAR - TX.VARIABILE) ) AS MIN_RANK
FROM TX
LEFT JOIN LATERAL (SELECT TX2.CRITERION AS TX2_CRIT, TX2.VARIABILE AS TX2_VAR FROM TX TX2 WHERE TX2.CRITERION < TX.CRITERION) B ON TRUE
) C
WHERE MIN_RANK=1
ORDER BY CRITERION
;
DROP TABLE TX;
Output:
criterion variabile
1 1 2
2 2 4
3 3 2
4 4 7
5 5 6
criterion min_delta
1 1 NULL
2 2 2
3 3 0
4 4 3
5 5 1

Selecting specific row from a sub query depending on lowest priority

I have a table of clients and their insurance providers. There is a column called Priority that ranges from 1 to 8. I want to select, for each client, the insurance with the lowest Priority value into my 'master table'. I already have a main query that provides fees, dates, doctors, etc., and I need a subquery that I can join to it on Client_ID. The priority doesn't always start at 1. The insurance table is the 'many' side of the relationship.
Row# Client_id Insurance_id Priority active?
1 333 A 1 Y
2 333 B 2 Y
3 333 C 1 N
4 222 D 6 Y
5 222 A 8 Y
6 444 C 4 Y
7 444 A 5 Y
8 444 B 6 Y
Answer should be
Client_id Insurance_id Priority
333 A 1
222 D 6
444 C 4
I was able to achieve the results I think you're asking for pretty easily utilizing SQL's ROW_NUMBER() function:
declare @tbl table
(
    Id int identity,
    ClientId int,
    InsuranceId char(1),
    [Priority] int,
    Active bit
)
insert into @tbl (ClientId, InsuranceId, [Priority], Active)
values (1, 'A', 1, 1),
       (1, 'A', 2, 1),
       (1, 'B', 3, 1),
       (1, 'B', 4, 1),
       (1, 'C', 1, 1),
       (1, 'C', 2, 0),
       (2, 'C', 1, 1),
       (2, 'C', 2, 1)
select Id, ClientId, InsuranceId, [Priority]
from
(
    select Id,
           ClientId,
           InsuranceId,
           [Priority],
           ROW_NUMBER() OVER (PARTITION BY ClientId, InsuranceId ORDER BY [Priority] desc) as RowNum
    from @tbl
    where Active = 1
) x
where x.RowNum = 1
Results:
(8 row(s) affected)
Id ClientId InsuranceId Priority
----------- ----------- ----------- -----------
2 1 A 2
4 1 B 4
5 1 C 1
8 2 C 2
(4 row(s) affected)
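Note that the query above partitions by ClientId and InsuranceId and sorts descending, so it returns the highest priority per client/insurance pair. For the exact table in the question, a variant that partitions by client only and sorts ascending might look like the sketch below. It runs on SQLite through Python's sqlite3 module (an assumption; window functions need SQLite 3.25+), the table name insurance is made up, and filtering on active = 'Y' is my reading of the expected output, where the inactive row 333/C is skipped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE insurance
                (client_id INTEGER, insurance_id TEXT, priority INTEGER, active TEXT)""")
conn.executemany(
    "INSERT INTO insurance VALUES (?, ?, ?, ?)",
    [(333, "A", 1, "Y"), (333, "B", 2, "Y"), (333, "C", 1, "N"),
     (222, "D", 6, "Y"), (222, "A", 8, "Y"),
     (444, "C", 4, "Y"), (444, "A", 5, "Y"), (444, "B", 6, "Y")],
)

# Number each client's active rows from lowest to highest priority and keep
# the first one. insurance_id is only a tie-breaker to make the pick stable.
lowest = conn.execute("""
    SELECT client_id, insurance_id, priority
    FROM (
        SELECT client_id, insurance_id, priority,
               ROW_NUMBER() OVER (PARTITION BY client_id
                                  ORDER BY priority, insurance_id) AS rn
        FROM insurance
        WHERE active = 'Y'
    ) ranked
    WHERE rn = 1
    ORDER BY client_id
""").fetchall()

for row in lowest:
    print(row)
```

The derived table has exactly one row per client, so it can be joined to the main query on the client id without multiplying rows.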

Postgresql: only keep unique values from integer array

Let's say I have an array of integers
1 6 6 3 3 8 4 4
It will always have the form n*(pairs of numbers) + 2 (unique numbers).
Is there an efficient way of keeping only the 2 unique values (i.e. the 2 with a single occurrence)?
Here, I would like to get 1 and 8.
So far, this is what I have:
SELECT node_id
FROM
( SELECT node_id, COUNT(*)
FROM unnest(array[1, 6, 6 , 3, 3 , 8 , 4 ,4]) AS node_id
GROUP BY node_id
) foo
ORDER BY count LIMIT 2;
You are very close, I think:
SELECT node_id
FROM (SELECT node_id, COUNT(*)
FROM unnest(array[1, 6, 6 , 3, 3 , 8 , 4 ,4]) AS node_id
GROUP BY node_id
HAVING count(*) = 1
) foo ;
You can group these back into an array, if you like, using array_agg().
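The HAVING count(*) = 1 filter can be tried outside PostgreSQL too. The sketch below uses Python's sqlite3 module with a small table in place of unnest(array[...]), since SQLite has no array type (so this is a stand-in, not the PostgreSQL query itself). The table name nodes is made up for the illustration.

```python
import sqlite3

values = [1, 6, 6, 3, 3, 8, 4, 4]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (node_id INTEGER)")
conn.executemany("INSERT INTO nodes VALUES (?)", [(v,) for v in values])

# Group equal values together and keep only the groups with a single member.
singles = [row[0] for row in conn.execute("""
    SELECT node_id
    FROM nodes
    GROUP BY node_id
    HAVING COUNT(*) = 1
    ORDER BY node_id
""")]

print(singles)
```

In PostgreSQL itself, wrapping the filtered subquery in SELECT array_agg(node_id) would turn the two remaining rows back into an array.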

Overlapping condition for case-when

I have the following query:
SELECT case
when tbl.id % 2 = 0 then 'mod-2'
when tbl.id % 3 = 0 then 'mod-3'
when tbl.id % 5 = 0 then 'mod-5'
else 'mod-x'
end as odds, tbl.id from some_xyz_table tbl;
If the table has ids 5, 6 and 7, it returns the following output (copied from pgAdmin):
"mod-5";5
"mod-2";6
"mod-x";7
But 6 is divisible by both 2 and 3, and a CASE expression stops at the first matching branch. My expected output is:
"mod-5";5
"mod-2";6
"mod-3";6 <-- this
"mod-x";7
Is there any way to modify this query to obtain such output? Any alternate solution will do for me.
You could do this with UNION queries [EDIT changed it to use UNION ALL]:
SELECT 'mod-5', id FROM tbl -- divisible by 5
WHERE id %5 = 0
UNION ALL
SELECT 'mod-2', id FROM tbl -- divisible by 2
WHERE id %2 = 0
UNION ALL
SELECT 'mod-3', id FROM tbl -- divisible by 3
WHERE id %3 = 0
UNION ALL
SELECT 'mod-x',id FROM tbl -- not divisible by 5,3 or 2
WHERE id %5 <> 0 AND id%2 <> 0 AND id % 3 <> 0
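If the list of divisors is likely to grow, one alternative to writing a UNION ALL branch per divisor is to join against a small lookup table. This is a different technique from the answer above, sketched here on SQLite through Python's sqlite3 module; the divisors table and its column d are hypothetical names, and tbl mirrors the answer's table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?)", [(5,), (6,), (7,)])

# Hypothetical helper table: one row per divisor we want to report on.
conn.execute("CREATE TABLE divisors (d INTEGER)")
conn.executemany("INSERT INTO divisors VALUES (?)", [(2,), (3,), (5,)])

# The join emits one row per (id, matching divisor), so 6 appears twice.
# Ids that match no divisor at all are added as 'mod-x'.
labelled = conn.execute("""
    SELECT 'mod-' || v.d AS odds, t.id AS id
    FROM tbl t
    JOIN divisors v ON t.id % v.d = 0
    UNION ALL
    SELECT 'mod-x' AS odds, t.id AS id
    FROM tbl t
    WHERE NOT EXISTS (SELECT 1 FROM divisors v WHERE t.id % v.d = 0)
    ORDER BY id, odds
""").fetchall()

for row in labelled:
    print(row)
```

With ids 5, 6 and 7 this yields mod-5 for 5, both mod-2 and mod-3 for 6, and mod-x for 7. The explicit ORDER BY matters because UNION ALL on its own gives no ordering guarantee.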