TSQL advanced ranking, grouping to find date spans - tsql

I need to do some advanced grouping in TSQL with data that looks like this:
PK YEARMO DATA
1 201201 AAA
1 201202 AAA
1 201203 AAA
1 201204 AAA
1 201205 (null)
1 201206 BBB
1 201207 AAA
2 201301 CCC
2 201302 CCC
2 201303 CCC
2 201304 DDD
2 201305 DDD
And then, every time DATA changes per primary key, pull up the date range for said item so that it looks something like this:
PK START_DT STOP_DT DATA
1 201201 201204 AAA
1 201205 201205 (null)
1 201206 201206 BBB
1 201207 201207 AAA
2 201301 201303 CCC
2 201304 201305 DDD
I've been playing around with ranking functions but haven't had much success. Any pointers in the right direction would be supremely awesome and appreciated.

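You can use the ROW_NUMBER() function to split your data into ranges. The difference between the two row numbers stays constant within each consecutive run of the same DATA value, so it can serve as a grouping key: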
SELECT
PK,
START_DT = MIN(YEARMO),
STOP_DT = MAX(YEARMO),
DATA
FROM (
SELECT
PK, DATA, YEARMO,
ROW_NUMBER() OVER (PARTITION BY PK ORDER BY YEARMO) -
ROW_NUMBER() OVER (PARTITION BY PK, DATA ORDER BY YEARMO) grp
FROM your_table
) A
GROUP BY PK, DATA, grp
ORDER BY MIN(YEARMO)
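To check the row-number-difference idea against the sample data from the question, here is a minimal sketch that runs an equivalent query through Python's built-in sqlite3 module. It is an illustration, not the original T-SQL: it uses SQLite syntax (`AS` aliases instead of `alias = expr`), assumes SQLite 3.25 or newer for window functions, and partitions both row numbers by PK so that rows from different keys interleaved in time cannot split a run.

```python
import sqlite3

# Sample rows from the question: (PK, YEARMO, DATA); None stands for (null).
rows = [
    (1, 201201, "AAA"), (1, 201202, "AAA"), (1, 201203, "AAA"), (1, 201204, "AAA"),
    (1, 201205, None), (1, 201206, "BBB"), (1, 201207, "AAA"),
    (2, 201301, "CCC"), (2, 201302, "CCC"), (2, 201303, "CCC"),
    (2, 201304, "DDD"), (2, 201305, "DDD"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (PK INTEGER, YEARMO INTEGER, DATA TEXT)")
conn.executemany("INSERT INTO your_table VALUES (?, ?, ?)", rows)

# grp is constant inside each consecutive run of the same DATA value per PK.
query = """
SELECT PK, MIN(YEARMO) AS START_DT, MAX(YEARMO) AS STOP_DT, DATA
FROM (
    SELECT PK, DATA, YEARMO,
           ROW_NUMBER() OVER (PARTITION BY PK ORDER BY YEARMO)
         - ROW_NUMBER() OVER (PARTITION BY PK, DATA ORDER BY YEARMO) AS grp
    FROM your_table
) AS A
GROUP BY PK, DATA, grp
ORDER BY PK, START_DT
"""
spans = conn.execute(query).fetchall()

for span in spans:
    print(span)
```

For PK 1, the AAA rows for 201201 to 201204 all get grp 0, while the AAA row for 201207 gets grp 2, which is why the two AAA runs end up in separate output rows.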

Related

Find Minimum Timestamp From 2 Users POSTGRES

This is my table_gamers:
game_id user1 user2 timestamp
1 890 123 2022-01-01
2 123 768 2022-02-09
I need to find for each user:
The first user they played.
Their first game ID.
Their MIN timestamp (timestamp from their first game).
This is what I need:
User  User They Played  Game ID  timestamp
890   123               1        2022-01-01
123   890               1        2022-01-01
768   123               2        2022-02-09
This is my query:
SELECT user1 FROM table_gamers WHERE MIN(timestamp)
UNION ALL
SELECT user1 FROM table_gamers WHERE MIN(timestamp)
How do I query each User's First Opponent? I am confused.
You can do this step by step with a few WITH clauses (CTEs):
First, list every match from both sides (user1-user2 and user2-user1).
Second, number the rows for each user, ordered by timestamp.
Third, keep only the first row per user:
with base_data as (
select game_id,user1,user2,timestamp from table_gamers
union all
select game_id,user2,user1,timestamp from table_gamers
),
base_id as (
select
row_number() over (order by base_data.timestamp) as id,
row_number() over (PARTITION by base_data.user1 order by base_data.timestamp) as id_2,
*
from base_data
)
select * from base_id
where id_2 = 1 order by timestamp
This results in:
id id_2 game_id user1 user2 timestamp
2 1 1 123 890 2022-01-01T00:00:00.000Z
1 1 1 890 123 2022-01-01T00:00:00.000Z
4 1 2 768 123 2022-02-09T00:00:00.000Z
I hope that gives you the right idea.
https://www.db-fiddle.com/f/9PrxioFeVaTmtVcYdteovj/0
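As a runnable variation on the same three steps, the sketch below trims the output to the four columns the question asks for and adds game_id as a tie-breaker in case two games share a timestamp. It runs against SQLite through Python's sqlite3 module rather than PostgreSQL (assuming SQLite 3.25 or newer for window functions); the CTE names both_sides and ranked and the column aliases user_id and opponent are made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_gamers (game_id INTEGER, user1 INTEGER, user2 INTEGER, timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO table_gamers VALUES (?, ?, ?, ?)",
    [(1, 890, 123, "2022-01-01"), (2, 123, 768, "2022-02-09")],
)

query = """
WITH both_sides AS (
    -- every game seen from user1's side, then from user2's side
    SELECT game_id, user1 AS user_id, user2 AS opponent, timestamp FROM table_gamers
    UNION ALL
    SELECT game_id, user2, user1, timestamp FROM table_gamers
),
ranked AS (
    -- rn = 1 marks each user's earliest game
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY timestamp, game_id) AS rn
    FROM both_sides
)
SELECT user_id, opponent, game_id, timestamp
FROM ranked
WHERE rn = 1
ORDER BY timestamp, user_id
"""
first_games = conn.execute(query).fetchall()

for row in first_games:
    print(row)
```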

I have Multiple logical records in one db row, how do I split them into separate rows?

I have a table that has data like:
Name Item_1 Qty_1 Price_1 Item_2 Qty_2 Price_2 ... Item_50 Qty_50 Price_50
Bob Apples 10 0.50 Pears 5 0.65 ... Lemons 12 0.25
Alice Cherries 20 1.00 NULL NULL NULL ... NULL NULL NULL
I need to process the data per-item, so the ideal form of the data would be:
Name ItemNo Item Qty Price
Bob 1 Apples 10 0.50
Bob 2 Pears 5 0.65
... ... ... ... ...
Bob 50 Lemons 12 0.25
Alice 1 Cherries 20 1.00
How can I convert between the two forms?
I have looked at the pivot command, but it seems to convert column names into data in a field, not split groups of columns into separate rows. It doesn't look like it will work for this application.
The current code looks something like:
( SELECT t1.Name, 1 AS ItemNo, t1.Item_1 AS Item, t1.Qty_1 AS Qty, t1.Price_1 AS Price FROM table t1
UNION ALL
SELECT t2.Name, 2 AS ItemNo, t2.Item_2 AS Item, t2.Qty_2 AS Qty, t2.Price_2 AS Price FROM table t2
UNION ALL
...
SELECT t50.Name, 50 AS ItemNo, t50.Item_50 AS Item, t50.Qty_50 AS Qty, t50.Price_50 AS Price FROM table t50
)
It works, but it seems hard to maintain. Is there a better way?
Hopefully the reason you want to do this is to fix your design. If not, then make fixing your design the reason.
Anyway, one method is to use a VALUES table construct to unpivot the data:
SELECT YT.Name,
V.ItemNo,
V.Item,
V.Qty,
V.Price
FROM dbo.YourTable YT
CROSS APPLY (VALUES(1,YT.Item_1, YT.Qty_1, YT.Price_1),
(2,YT.Item_2, YT.Qty_2, YT.Price_2),
(3,YT.Item_3, YT.Qty_3, YT.Price_3),
... --You get the idea
(49,YT.Item_49, YT.Qty_49, YT.Price_49),
(50,YT.Item_50, YT.Qty_50, YT.Price_50)) V(ItemNo,Item,Qty,Price)
WHERE V.Item IS NOT NULL;
WHERE V.Item IS NOT NULL;
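Since typing out 50 near-identical VALUES rows by hand is error-prone, one option is to generate the statement text with a small script and paste the result into your query. The sketch below is only an illustration: the helper names values_rows and build_unpivot_sql are made up, and it assumes the columns follow the Item_N, Qty_N, Price_N naming shown in the question.

```python
def values_rows(n, alias="YT"):
    """Return one VALUES tuple per item slot, e.g. (1, YT.Item_1, YT.Qty_1, YT.Price_1)."""
    return [
        f"({i}, {alias}.Item_{i}, {alias}.Qty_{i}, {alias}.Price_{i})"
        for i in range(1, n + 1)
    ]


def build_unpivot_sql(n, table="dbo.YourTable", alias="YT"):
    """Assemble the full CROSS APPLY (VALUES ...) statement as text."""
    rows = ",\n                    ".join(values_rows(n, alias))
    return (
        f"SELECT {alias}.Name, V.ItemNo, V.Item, V.Qty, V.Price\n"
        f"FROM {table} {alias}\n"
        f"CROSS APPLY (VALUES {rows}) V(ItemNo, Item, Qty, Price)\n"
        f"WHERE V.Item IS NOT NULL;"
    )


sql = build_unpivot_sql(50)
print(sql)
```

The generated text is plain T-SQL, so nothing dynamic runs on the server; the script only saves the typing and avoids slips in the column names.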

PostgreSQL: Count Number of Occurrences in Columns

BACKGROUND
I have three large tables (employee_info, driver_info, school_info) that I have joined together on common attributes using a series of LEFT OUTER JOIN operations. After each join, the resulting number of records increased slightly, indicating that there are duplicate IDs in the data. To try and find all of the duplicates in the IDs, I dumped the ID columns into a temp table like so:
Original Dump of ID Columns
first_name last_name employee_id driver_id school_id
Mickey Mouse 1234 abcd wxyz
Donald Duck 2423 heca qwer
Mary Poppins 1111 acbe aaaa
Wiley Cayote 1234 strf aaaa
Daffy Duck 1256 acbe pqrs
Bugs Bunny 9999 strf yxwv
Pink Panther 2222 zzzz zzaa
Michael Archangel 0000 rstu aaaa
In this overly simplified example, you will see that IDs 1234 (employee_id), strf (driver_id), and aaaa (school_id) are each duplicated at least once. I would like to add a count column for each of the ID columns, and populate them with the count for each ID used, like so:
ID Columns with Counts
first_name last_name employee_id employee_id_count driver_id driver_id_count school_id school_id_count
Mickey Mouse 1234 2 abcd 1 wxyz 1
Donald Duck 2423 1 heca 1 qwer 1
Mary Poppins 1111 1 acbe 1 aaaa 3
Wiley Cayote 1234 2 strf 2 aaaa 3
Daffy Duck 1256 1 acbe 1 pqrs 1
Bugs Bunny 9999 1 strf 2 yxwv 1
Pink Panther 2222 1 zzzz 1 zzaa 1
Michael Archangel 0000 1 rstu 1 aaaa 3
You can see that IDs 1234 and strf each have 2 in the count, and aaaa has 3. After generating this table, my goal is to pull out all records where any of the counts are greater than 1, like so:
All Records with One or More Duplicate IDs
first_name last_name employee_id employee_id_count driver_id driver_id_count school_id school_id_count
Mickey Mouse 1234 2 abcd 1 wxyz 1
Mary Poppins 1111 1 acbe 1 aaaa 3
Wiley Cayote 1234 2 strf 2 aaaa 3
Bugs Bunny 9999 1 strf 2 yxwv 1
Michael Archangel 0000 1 rstu 1 aaaa 3
Real World Perspective
In my real-world work, the JOIN'd table contains 100 columns, 15 different ID fields and over 30,000 records, and the final table came out to be 28 more than the original. This may seem like a small amount, but each of the 28 represent a broken link that we must fix.
Is there a simple way to get the counts populated like in the second table above? I have been wrestling with this for hours already, and have not been able to make this work. I tried some aggregate functions, but they cannot be used in table UPDATE operations.
The COUNT function, when used as an analytic function, can do what you want here, e.g.
WITH cte AS (
SELECT *,
COUNT(employee_id) OVER (PARTITION BY employee_id) employee_id_count,
COUNT(driver_id) OVER (PARTITION BY driver_id) driver_id_count,
COUNT(school_id) OVER (PARTITION BY school_id) school_id_count
FROM yourTable
)
SELECT *
FROM cte
WHERE
employee_id_count > 1 OR
driver_id_count > 1 OR
school_id_count > 1;
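As a quick check of the windowed COUNT approach, with the three conditions joined by OR, the sketch below loads the sample rows from the question into an in-memory SQLite database via Python's sqlite3 module. It assumes SQLite 3.25 or newer for window functions, and it stores the IDs as text so that 0000 keeps its leading zeros.

```python
import sqlite3

# Sample rows from the question's "Original Dump of ID Columns" table.
people = [
    ("Mickey", "Mouse", "1234", "abcd", "wxyz"),
    ("Donald", "Duck", "2423", "heca", "qwer"),
    ("Mary", "Poppins", "1111", "acbe", "aaaa"),
    ("Wiley", "Cayote", "1234", "strf", "aaaa"),
    ("Daffy", "Duck", "1256", "acbe", "pqrs"),
    ("Bugs", "Bunny", "9999", "strf", "yxwv"),
    ("Pink", "Panther", "2222", "zzzz", "zzaa"),
    ("Michael", "Archangel", "0000", "rstu", "aaaa"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (first_name TEXT, last_name TEXT, "
    "employee_id TEXT, driver_id TEXT, school_id TEXT)"
)
conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?, ?, ?)", people)

query = """
WITH cte AS (
    SELECT first_name,
           COUNT(employee_id) OVER (PARTITION BY employee_id) AS employee_id_count,
           COUNT(driver_id)   OVER (PARTITION BY driver_id)   AS driver_id_count,
           COUNT(school_id)   OVER (PARTITION BY school_id)   AS school_id_count
    FROM yourTable
)
SELECT first_name, employee_id_count, driver_id_count, school_id_count
FROM cte
WHERE employee_id_count > 1 OR driver_id_count > 1 OR school_id_count > 1
ORDER BY first_name
"""
duplicates = conn.execute(query).fetchall()

for row in duplicates:
    print(row)
```

Note that in the sample data as typed, driver_id acbe appears on two rows (Mary Poppins and Daffy Duck), so the query counts it as 2 and also returns Daffy Duck, even though the hand-written tables in the question show a count of 1 for it.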

Autoincrement in query

I need to create a query in which each row's value is 8% higher than the value in the previous row.
Table (let's name it money) contains one row (and two columns), and it looks like
AMOUNT ID
100.00 AAA
I just need to generate data from this table like this (with one SELECT from this table, e.g. 6 iterations):
100.00 AAA
108.00 AAA
116.64 AAA
125.97 AAA
136.04 AAA
146.93 AAA
You can do that with a common table expression.
E.g. if your source looks like this:
db2 "create table money(amount decimal(31,2), id varchar(10))"
db2 "insert into money values (100,'AAA')"
You can create the input data with the following query (I will include counter column for clarity):
db2 "with
cte(c1,c2,counter)
as
(select
amount, id, 1
from
money
union all
select
c1*1.08, c2, counter+1
from
cte
where counter < 10)
select * from cte"
C1 C2 COUNTER
--------------------------------- ---------- -----------
100.00 AAA 1
108.00 AAA 2
116.64 AAA 3
125.97 AAA 4
136.04 AAA 5
146.92 AAA 6
158.67 AAA 7
171.36 AAA 8
185.06 AAA 9
199.86 AAA 10
To populate the existing table without repeating the existing row you use e.g. an insert like this:
$ db2 "insert into money
with
cte(c1,c2,counter)
as
(select
amount*1.08, id, 1
from
money
union all
select
c1*1.08, c2, counter+1
from
cte
where counter < 10) select c1,c2 from cte"
$ db2 "select * from money"
AMOUNT ID
--------------------------------- ----------
100.00 AAA
108.00 AAA
116.64 AAA
125.97 AAA
136.04 AAA
146.93 AAA
158.68 AAA
171.38 AAA
185.09 AAA
199.90 AAA
215.89 AAA
11 record(s) selected.
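The two listings above differ in their last digits (for example 146.92 in the first and 146.93 in the second). One plausible explanation, offered here as an assumption rather than a statement about DB2 internals, is where the value gets cut back to two decimal places: after every multiplication, or only once at the end. The sketch below reproduces both behaviours with Python's decimal module; the function names are made up for this illustration.

```python
from decimal import Decimal, ROUND_DOWN

CENT = Decimal("0.01")
RATE = Decimal("1.08")


def truncate_each_step(start, steps):
    """Cut the running value to 2 decimals after every multiplication."""
    values = [start.quantize(CENT, rounding=ROUND_DOWN)]
    for _ in range(steps - 1):
        values.append((values[-1] * RATE).quantize(CENT, rounding=ROUND_DOWN))
    return values


def truncate_at_end(start, steps):
    """Compute start * 1.08**n exactly, then cut each result to 2 decimals."""
    exact = [start * RATE ** n for n in range(steps)]
    return [value.quantize(CENT, rounding=ROUND_DOWN) for value in exact]


per_step = [str(v) for v in truncate_each_step(Decimal("100"), 6)]
at_end = [str(v) for v in truncate_at_end(Decimal("100"), 6)]

print(per_step)  # last value is 146.92
print(at_end)    # last value is 146.93
```

If the exact figures matter, it is worth deciding explicitly at which point the amount should be rounded or truncated, and choosing the column types in the CTE accordingly.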

TSQL, Pivot rows into single columns

I previously had to solve something similar, using a pivot followed by a flatten.
I want to do the same thing on the example below, but it is slightly different because there are no ranks.
In my previous example, the table looked like this:
LocationID Code Rank
1 123 1
1 124 2
1 138 3
2 999 1
2 888 2
2 938 3
And I was able to use this function to properly get my rows in a single column.
-- Check if tables exist, delete if they do so that you can start fresh.
IF OBJECT_ID('tempdb.dbo.#tbl_Location_Taxonomy_Pivot_Table', 'U') IS NOT NULL
DROP TABLE #tbl_Location_Taxonomy_Pivot_Table;
IF OBJECT_ID('tbl_Location_Taxonomy_NPPES_Flattened', 'U') IS NOT NULL
DROP TABLE tbl_Location_Taxonomy_NPPES_Flattened;
-- Pivot the original table so that you have
SELECT *
INTO #tbl_Location_Taxonomy_Pivot_Table
FROM [MOAD].[dbo].[tbl_Location_Taxonomy_NPPES] tax
PIVOT (MAX(tax.tbl_lkp_Taxonomy_Seq)
FOR tax.Taxonomy_Rank in ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12],[13],[14],[15])) AS pvt
-- ORDER BY Location_ID
-- Flatten the tables.
SELECT Location_ID
,max(piv.[1]) as Tax_Seq_1
,max(piv.[2]) as Tax_Seq_2
,max(piv.[3]) as Tax_Seq_3
,max(piv.[4]) as Tax_Seq_4
,max(piv.[5]) as Tax_Seq_5
,max(piv.[6]) as Tax_Seq_6
,max(piv.[7]) as Tax_Seq_7
,max(piv.[8]) as Tax_Seq_8
,max(piv.[9]) as Tax_Seq_9
,max(piv.[10]) as Tax_Seq_10
,max(piv.[11]) as Tax_Seq_11
,max(piv.[12]) as Tax_Seq_12
,max(piv.[13]) as Tax_Seq_13
,max(piv.[14]) as Tax_Seq_14
,max(piv.[15]) as Tax_Seq_15
-- JOIN HERE
INTO tbl_Location_Taxonomy_NPPES_Flattened
FROM #tbl_Location_Taxonomy_Pivot_Table piv
GROUP BY Location_ID
So, then here is the data I would like to work with in this example.
LocationID Foreign Key
2 2
2 670
2 2902
2 5389
3 3
3 722
3 2905
3 5561
I have used PIVOT on data like this before, but the difference was that it also had a rank. Is there a way to get my foreign keys to show up in this format using a pivot?
locationID FK1 FK2 FK3 FK4
2 2 670 2902 5389
3 3 722 2905 5561
Alternatively, I also have the values in this form:
LocationID Address_Seq
2 670, 5389, 2902, 2,
3 722, 5561, 2905, 3
etc
Is there any way I can get this into the same format?
ID Col1 Col2 Col3 Col4
2 670 5389 2902 2
Adding a rank column and reversing the order should give you what you require:
SELECT locationid, [4] col1, [3] col2, [2] col3, [1] col4
FROM
(
SELECT locationid, foreignkey,rank from #Pivot_Table ----- temp table with a rank column
) x
PIVOT (MAX(x.foreignkey)
FOR x.rank in ([4],[3],[2],[1]) ) pvt
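If the source table has no rank column, one way to obtain one is ROW_NUMBER() partitioned by location. The sketch below illustrates this against SQLite through Python's sqlite3 module (assuming SQLite 3.25 or newer), so it uses MAX(CASE ...) in place of T-SQL's PIVOT. The table name location_fk and the column name ForeignKey are made up, and ordering by the foreign key value is an assumption, since the question does not say which order the columns should follow.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE location_fk (LocationID INTEGER, ForeignKey INTEGER)")
conn.executemany(
    "INSERT INTO location_fk VALUES (?, ?)",
    [(2, 2), (2, 670), (2, 2902), (2, 5389),
     (3, 3), (3, 722), (3, 2905), (3, 5561)],
)

query = """
WITH ranked AS (
    -- generate the missing rank: 1, 2, 3, ... within each location
    SELECT LocationID, ForeignKey,
           ROW_NUMBER() OVER (PARTITION BY LocationID ORDER BY ForeignKey) AS rnk
    FROM location_fk
)
SELECT LocationID,
       MAX(CASE WHEN rnk = 1 THEN ForeignKey END) AS FK1,
       MAX(CASE WHEN rnk = 2 THEN ForeignKey END) AS FK2,
       MAX(CASE WHEN rnk = 3 THEN ForeignKey END) AS FK3,
       MAX(CASE WHEN rnk = 4 THEN ForeignKey END) AS FK4
FROM ranked
GROUP BY LocationID
ORDER BY LocationID
"""
pivoted = conn.execute(query).fetchall()

for row in pivoted:
    print(row)
```

In SQL Server the same ranked CTE could feed a PIVOT on the generated rank column instead of the CASE expressions.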