calculate roll_rank using prev value for each row - kdb

I have a table where I need to fill in roll_rank for the null rows by continuing the running count, while leaving roll_rank values 0, 1 and 2 untouched.
The rows where roll_rank is in 0, 1, 2 should not be modified.
I want to calculate the cumulative roll_rank by date for the rows where roll_rank is not in 0, 1, 2.
example table:
tmp:([]date:`date$();name:`symbol$();roll_rank:`int$())
`tmp insert (2010.01.01;`sym1;1);
`tmp insert (2010.01.01;`sym2;2);
`tmp insert (2010.01.01;`sym3;0Ni);
`tmp insert (2010.01.01;`sym4;0Ni);
`tmp insert (2010.01.02;`sym1;0);
`tmp insert (2010.01.02;`sym2;1);
`tmp insert (2010.01.02;`sym3;2);
`tmp insert (2010.01.02;`sym4;0Ni);
`tmp insert (2010.01.02;`sym5;0Ni);
`tmp insert (2010.01.02;`sym6;0Ni);
`tmp insert (2010.01.03;`sym1;1);
`tmp insert (2010.01.03;`sym2;0Ni);
`tmp insert (2010.01.03;`sym3;0Ni);
`tmp insert (2010.01.03;`sym4;0Ni);
Expected output is

This might also achieve your desired result:
update sums 1^deltas roll_rank by date from tmp
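For reference, a minimal sketch of what the intermediate steps produce for the 2010.01.01 group (values taken from the sample table above):
q)rr:1 2 0N 0Ni          / roll_rank values for 2010.01.01
q)deltas rr              / 1 1 0N 0N - differences; nulls stay null
q)1^deltas rr            / 1 1 1 1   - fill the nulls with 1
q)sums 1^deltas rr       / 1 2 3 4   - running sum restores the desired ranks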

One method using a vector conditional and over:
q){update{?[null x;1+prev x;x]}roll_rank from x}/[tmp]
date name roll_rank
-------------------------
2010.01.01 sym1 1
2010.01.01 sym2 2
2010.01.01 sym3 3
2010.01.01 sym4 4
2010.01.02 sym1 0
2010.01.02 sym2 1
2010.01.02 sym3 2
2010.01.02 sym4 3
2010.01.02 sym5 4
2010.01.02 sym6 5
2010.01.03 sym1 1
2010.01.03 sym2 2
2010.01.03 sym3 3
2010.01.03 sym4 4
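For what it's worth, the over in that expression just reapplies the update until the table stops changing; each pass fills one more null from the row above it. A minimal illustration on a plain vector:
q)f:{?[null x;1+prev x;x]}
q)f 0 1 2 0N 0N 0N       / one pass fills only the first null: 0 1 2 3 0N 0N
q)f/[0 1 2 0N 0N 0N]     / iterating to convergence fills them all: 0 1 2 3 4 5
Note that the table version above does not group by date; it still works on this sample because the first row of every date already has a non-null roll_rank.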


DB2 count distinct on multiple columns

I am trying to find the count of distinct combinations of multiple columns, but it's not working in DB2:
select count(distinct col1, col2) from table
It throws a syntax error because COUNT has multiple columns.
Is there any way to achieve this?
column 1 column 2 date
1 a 2022-12-01
1 a 2022-12-01
2 a 2022-11-30
2 b 2022-11-30
1 b 2022-12-01
I want this output:
column1 column2 date count
1 a 2022-12-01 2
2 a 2022-11-30 1
2 b 2022-11-30 1
1 b 2022-12-01 1
The following query returns exactly what you want.
WITH MYTAB (column1, column2, date) AS
(
VALUES
(1, 'a', '2022-12-01')
, (1, 'a', '2022-12-01')
, (2, 'a', '2022-11-30')
, (2, 'b', '2022-11-30')
, (1, 'b', '2022-12-01')
)
SELECT
column1
, column2
, date
, COUNT (*) AS CNT
FROM MYTAB
GROUP BY
column1
, column2
, date
COLUMN1 COLUMN2 DATE       CNT
1       a       2022-12-01 2
1       b       2022-12-01 1
2       a       2022-11-30 1
2       b       2022-11-30 1
Not exactly sure of what you are looking for...
but
select count(distinct col1), count(distinct col2) from table
or
select count(distinct col1 CONCAT col2) from table
Those are how I would interpret "count distinct of multiple values" in a table.
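One caveat with the CONCAT variant: without a separator, pairs such as (12, 3) and (1, 23) both concatenate to '123' and would be counted once. A hedged sketch with a separator and explicit casts (mytable and the column names are placeholders):
-- count distinct (col1, col2) pairs; the '|' separator keeps the pairs distinct
SELECT COUNT(DISTINCT VARCHAR(col1) CONCAT '|' CONCAT VARCHAR(col2)) AS cnt
FROM mytable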

Find missing dates and print the record [PostgresSQL]

I have a table with two columns, ID and Date, containing the data below. For a given range, say from 2022-09-01 to 2022-09-10, I want to return the missing dates for each ID, along with the ID value, as shown in the expected output. How can I achieve this?
Data inside table:
ID Date
1  2022-09-01
1  2022-09-07
1  2022-09-08
1  2022-09-09
2  2022-09-01
2  2022-09-02
2  2022-09-03
2  2022-09-04
Expected Output:
ID Missing Dates
1  2022-09-02
1  2022-09-03
1  2022-09-04
1  2022-09-05
1  2022-09-06
1  2022-09-10
2  2022-09-05
2  2022-09-06
2  2022-09-07
2  2022-09-08
2  2022-09-09
2  2022-09-10
I wrote a sample query for you:
CREATE TABLE test1 (
id int4 NULL,
pdate date NULL
);
INSERT INTO test1 (id, pdate) VALUES(1, '2022-09-01');
INSERT INTO test1 (id, pdate) VALUES(1, '2022-09-07');
INSERT INTO test1 (id, pdate) VALUES(1, '2022-09-08');
INSERT INTO test1 (id, pdate) VALUES(1, '2022-09-09');
INSERT INTO test1 (id, pdate) VALUES(2, '2022-09-01');
INSERT INTO test1 (id, pdate) VALUES(2, '2022-09-02');
INSERT INTO test1 (id, pdate) VALUES(2, '2022-09-03');
INSERT INTO test1 (id, pdate) VALUES(2, '2022-09-04');
select t1.id, t1.datelist
from (
    select t.id, generate_series(t.startdate, t.enddate, '1 day')::date as datelist
    from (
        select distinct id, '2022-09-01'::date as startdate, '2022-09-10'::date as enddate
        from test1
    ) t
) t1
left join test1 t2 on t2.pdate = t1.datelist and t1.id = t2.id
where t2.pdate is null
Result:
id datelist
1 2022-09-02
1 2022-09-03
1 2022-09-04
1 2022-09-05
1 2022-09-06
1 2022-09-10
2 2022-09-05
2 2022-09-06
2 2022-09-07
2 2022-09-08
2 2022-09-09
2 2022-09-10
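For reference, generate_series is what expands each id into one row per day of the range; a minimal standalone example (the dates are arbitrary):
-- expands to 2022-09-01, 2022-09-02, 2022-09-03
SELECT generate_series('2022-09-01'::date, '2022-09-03'::date, '1 day')::date AS d;
Appending ORDER BY t1.id, t1.datelist to the query above will also guarantee the row order shown in the result.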

how to rank the column values of each group

I have a table and want to set ranks based on the maximum volume within each group, i.e. each date.
If volume is null, don't rank it; keep the rank column empty for null volumes (see lines 11 and 12 in the expected output snapshot).
Rank 1 is our front contract; once a sym has flipped it cannot be rank 1 again after the flip (see lines 9, 13 and 15 in the output snapshot).
Expected output is
To generate the sample table, use the code below.
tab:([]date:`date$();sym:`symbol$();name:`symbol$();volume:`float$();roll_rank:`int$());
`tab insert (2010.01.01;`ESH22;`ES;100.1;0Ni);
`tab insert (2010.01.01;`ESH23;`ES;500.1;0Ni);
`tab insert (2010.01.02;`ESH22;`ES;100.1;0Ni);
`tab insert (2010.01.02;`ESH23;`ES;800.1;0Ni);
`tab insert (2010.01.02;`ESH24;`ES;600.1;0Ni);
`tab insert (2010.01.02;`ESH25;`ES;550.1;0Ni);
`tab insert (2010.01.02;`ESH26;`ES;200.1;0Ni);
`tab insert (2010.01.03;`ESH23;`ES;600.1;0Ni);
`tab insert (2010.01.03;`ESH24;`ES;700.1;0Ni);
`tab insert (2010.01.03;`ESH26;`ES;0n;0Ni);
`tab insert (2010.01.03;`ESH25;`ES;500.1;0Ni);
`tab insert (2010.01.03;`ESH26;`ES;0n;0Ni);
`tab insert (2010.01.04;`ESH23;`ES;50.1;0Ni);
`tab insert (2010.01.05;`ESH23;`ES;300.1;0Ni);
`tab insert (2010.01.05;`ESH24;`ES;800.1;0Ni);
`tab insert (2010.01.05;`ESH25;`ES;100.1;0Ni);
The following will sort volume in descending order within each date, with the rank number in a separate column:
q)ungroup select volume:desc volume,ranknumber:1+til count volume by date from tab
Code output with the provided table data:
date volume ranknumber
----------------------------
2010.01.01 500.1 1
2010.01.01 100.1 2
2010.01.02 800.1 1
2010.01.02 600.1 2
2010.01.02 550.1 3
2010.01.02 200.1 4
2010.01.02 100.1 5
2010.01.03 700.1 1
2010.01.03 600.1 2
2010.01.03 500.1 3
2010.01.03 4
2010.01.03 5
2010.01.04 50.1 1
2010.01.05 800.1 1
2010.01.05 300.1 2
2010.01.05 100.1 3
I haven't thought of an elegant way of excluding the null values from the rank order yet.
Edit: You could use update on the sorted table to clear the rank where volume is null - something like this would work (where tab2 is the previous output):
q)update ranknumber:0N from tab2 where null volume
date volume ranknumber
----------------------------
2010.01.01 500.1 1
2010.01.01 100.1 2
2010.01.02 800.1 1
2010.01.02 600.1 2
2010.01.02 550.1 3
2010.01.02 200.1 4
2010.01.02 100.1 5
2010.01.03 700.1 1
2010.01.03 600.1 2
2010.01.03 500.1 3
2010.01.03
2010.01.03
2010.01.04 50.1 1
2010.01.05 800.1 1
2010.01.05 300.1 2
2010.01.05 100.1 3
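For what it's worth, a hedged sketch (separate from the answer above) that keeps all the original columns, ranks by descending volume within each date and leaves null volumes unranked - it does not attempt the front-contract flip rule from the question:
q)update roll_rank:"i"$1+rank neg volume by date from tab where not null volume
Rows filtered out by the where clause keep their original null roll_rank.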

TSQL, Pivot rows into single columns

I had to solve something similar before; below is the pivot-and-flatten query from that other solution. I want to do the same thing on the example further down, but it is slightly different because there are no ranks.
In my previous example, the table looked like this:
LocationID Code Rank
1 123 1
1 124 2
1 138 3
2 999 1
2 888 2
2 938 3
And I was able to use this query to properly get my rows into single columns.
-- Check if tables exist, delete if they do so that you can start fresh.
IF OBJECT_ID('tempdb.dbo.#tbl_Location_Taxonomy_Pivot_Table', 'U') IS NOT NULL
DROP TABLE #tbl_Location_Taxonomy_Pivot_Table;
IF OBJECT_ID('tbl_Location_Taxonomy_NPPES_Flattened', 'U') IS NOT NULL
DROP TABLE tbl_Location_Taxonomy_NPPES_Flattened;
-- Pivot the original table so that each Taxonomy_Rank becomes its own column.
SELECT *
INTO #tbl_Location_Taxonomy_Pivot_Table
FROM [MOAD].[dbo].[tbl_Location_Taxonomy_NPPES] tax
PIVOT (MAX(tax.tbl_lkp_Taxonomy_Seq)
FOR tax.Taxonomy_Rank in ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12],[13],[14],[15])) AS pvt
-- ORDER BY Location_ID
-- Flatten the tables.
SELECT Location_ID
,max(piv.[1]) as Tax_Seq_1
,max(piv.[2]) as Tax_Seq_2
,max(piv.[3]) as Tax_Seq_3
,max(piv.[4]) as Tax_Seq_4
,max(piv.[5]) as Tax_Seq_5
,max(piv.[6]) as Tax_Seq_6
,max(piv.[7]) as Tax_Seq_7
,max(piv.[8]) as Tax_Seq_8
,max(piv.[9]) as Tax_Seq_9
,max(piv.[10]) as Tax_Seq_10
,max(piv.[11]) as Tax_Seq_11
,max(piv.[12]) as Tax_Seq_12
,max(piv.[13]) as Tax_Seq_13
,max(piv.[14]) as Tax_Seq_14
,max(piv.[15]) as Tax_Seq_15
-- JOIN HERE
INTO tbl_Location_Taxonomy_NPPES_Flattened
FROM #tbl_Location_Taxonomy_Pivot_Table piv
GROUP BY Location_ID
So, then here is the data I would like to work with in this example.
LocationID Foreign Key
2 2
2 670
2 2902
2 5389
3 3
3 722
3 2905
3 5561
I have used PIVOT on data formatted like this before, but the difference was that it also had a rank. Is there a way to get my foreign keys to show up in this format using a pivot?
locationID FK1 FK2 FK3 FK4
2 2 670 2902 5389
3 3 722 2905 5561
Another way I could look at solving this: I also have the values in this form:
LocationID Address_Seq
2 670, 5389, 2902, 2,
3 722, 5561, 2905, 3
etc
Is there any way I can get this into the same format?
ID Col1 Col2 Col3 Col4
2 670 5389, 2902, 2
This, adding a rank column and reversing the order, should give you what you require:
SELECT locationid, [4] col1, [3] col2, [2] col3, [1] col4
FROM
(
SELECT locationid, foreignkey,rank from #Pivot_Table ----- temp table with a rank column
) x
PIVOT (MAX(x.foreignkey)
FOR x.rank in ([4],[3],[2],[1]) ) pvt
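If the source table does not have a rank column yet, here is a hedged sketch that derives one with ROW_NUMBER() and pivots in a single statement (#Pivot_Table with locationid and foreignkey columns is assumed from the example above, and ascending key order is an assumption):
SELECT locationid,
       [1] AS FK1, [2] AS FK2, [3] AS FK3, [4] AS FK4
FROM
(
    SELECT locationid,
           foreignkey,
           -- number the keys within each location; ascending order puts the smallest key in FK1
           ROW_NUMBER() OVER (PARTITION BY locationid ORDER BY foreignkey) AS rn
    FROM #Pivot_Table
) x
PIVOT (MAX(foreignkey) FOR rn IN ([1], [2], [3], [4])) AS pvt;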

How to generate a date to be included in UNPIVOT results without a loop?

Say I had an example like so, where I'm transposing columns into rows with UNPIVOT.
DECLARE @pvt AS TABLE (VendorID int, Emp1 int, Emp2 int, Emp3 int, Emp4 int, Emp5 int);
INSERT INTO @pvt (VendorId,Emp1,Emp2,Emp3,Emp4,Emp5) VALUES (1,4,3,5,4,4);
INSERT INTO @pvt (VendorId,Emp1,Emp2,Emp3,Emp4,Emp5) VALUES (2,4,1,5,5,5);
INSERT INTO @pvt (VendorId,Emp1,Emp2,Emp3,Emp4,Emp5) VALUES (3,4,3,5,4,4);
INSERT INTO @pvt (VendorId,Emp1,Emp2,Emp3,Emp4,Emp5) VALUES (4,4,2,5,5,4);
INSERT INTO @pvt (VendorId,Emp1,Emp2,Emp3,Emp4,Emp5) VALUES (5,5,1,5,5,5);
--Unpivot the table.
SELECT VendorID, Employee, Orders
FROM
(SELECT VendorID, Emp1, Emp2, Emp3, Emp4, Emp5
FROM @pvt) p
UNPIVOT
(Orders FOR Employee IN
(Emp1, Emp2, Emp3, Emp4, Emp5)
)AS unpvt;
GO
Which produces results like this
VendorID Employee Orders
1 Emp1 4
1 Emp2 3
1 Emp3 5
1 Emp4 4
1 Emp5 4
2 Emp1 4
2 Emp2 1
2 Emp3 5
2 Emp4 5
2 Emp5 5
3 Emp1 4
3 Emp2 3
3 Emp3 5
3 Emp4 4
3 Emp5 4
However, I want to include an incremental date that repeats in a group for each Vendor, so the results would be like this:
VendorID Employee Orders OrderDate
1 Emp1 4 01/01/2014
1 Emp2 3 02/01/2014
1 Emp3 5 03/01/2014
1 Emp4 4 04/01/2014
1 Emp5 4 05/01/2014
2 Emp1 4 ..
2 Emp2 1
2 Emp3 5
2 Emp4 5
2 Emp5 5
3 Emp1 4
3 Emp2 3
3 Emp3 5
3 Emp4 4
3 Emp5 4
The kicker is that I want to try to do this without resorting to a loop since the transposed results are going to be about 100K records. Is there a way to generate that date field like that without looping over the results?
[edit]
I think, but am not sure yet, that this post might help, using ROW_NUMBER.
You can use:
DATEADD(DAY, ROW_NUMBER() OVER (PARTITION BY VendorID ORDER BY Employee), @startdate)
Based on your example you can partition by VendorID and order by Employee, but you can change the ordering just like a regular ORDER BY.
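Putting that together with the UNPIVOT from the question, a hedged sketch - @startdate is an assumed variable set to the day before the first desired OrderDate (ROW_NUMBER() starts at 1), and DAY can be swapped for MONTH depending on how the example dates are read:
DECLARE @startdate date = '2013-12-31';

SELECT VendorID,
       Employee,
       Orders,
       -- one increment per employee within each vendor
       DATEADD(DAY,
               ROW_NUMBER() OVER (PARTITION BY VendorID ORDER BY Employee),
               @startdate) AS OrderDate
FROM
    (SELECT VendorID, Emp1, Emp2, Emp3, Emp4, Emp5 FROM @pvt) p
UNPIVOT
    (Orders FOR Employee IN (Emp1, Emp2, Emp3, Emp4, Emp5)) AS unpvt;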