KDB/Q: how to join and fill null with 0

I am joining 2 tables. How do I replace the NULLs in a column from one of the tables with 0?
My code to join them is:
newTable: table1 lj xkey `date`sym xkey table2
I am aware that 0^ helps you do this, but I don't know how to apply it here.

In future I recommend that you show examples of the 2 tables you have and the expected outcome you would like, because it is difficult to know exactly what you need, but I think this might be what you want.
First, in your code you use xkey twice, so it will throw an error. Change it to:
newTable: table1 lj `date`sym xkey table2
Then, to fill the null values with a column from another table, you could do:
q)tbl:([]date:.z.d;sym:10?`abc`xyz;data:10?8 2 0n)
q)tbl
date       sym data
-------------------
2020.12.10 xyz 8
2020.12.10 abc 8
2020.12.10 abc 8
2020.12.10 abc
2020.12.10 abc
2020.12.10 xyz 2
2020.12.10 abc 2
2020.12.10 xyz
2020.12.10 xyz
2020.12.10 abc 2
q)tbl2:([date:.z.d;sym:`abc`xyz];data2:2?100)
q)tbl2
date       sym| data2
--------------| -----
2020.12.10 abc| 23
2020.12.10 xyz| 46
q)select date,sym,data:data2^data from tbl lj `date`sym xkey tbl2 //Replace null values of data with data2.
date       sym data
-------------------
2020.12.10 xyz 8
2020.12.10 abc 8
2020.12.10 abc 8
2020.12.10 abc 23
2020.12.10 abc 23
2020.12.10 xyz 2
2020.12.10 abc 2
2020.12.10 xyz 46
2020.12.10 xyz 46
2020.12.10 abc 2
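If what you actually want is 0 in place of the nulls (as in the original question) rather than the value from data2, the same join works with a 0 fill; a minimal sketch using the example tables above:
q)update 0^data from tbl lj `date`sym xkey tbl2 /nulls in data become 0 instead of being filled from data2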

Use 0^ within an update statement, for example:
q)newTable:([]column1:(1;0Nj;2;0Nj))
q)update 0^column1 from newTable
column1
-------
1
0
2
0
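Note that update here returns a filled copy of the table; if you want to amend the table itself, you can pass it by name (a minimal sketch):
q)update 0^column1 from `newTable /amends newTable in place and returns its name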
Or functional form:
q)newTable:([]column1:(1;0Nj;2;0Nj);column2:(1;2;3;0Nj))
q)parse"update 0^column1 from newTable"
!
`newTable
()
0b
(,`column1)!,(^;0;`column1)
q)![newTable;();0b;raze{enlist[x]!enlist(^;0;x)}each `column1`column2]
column1 column2
---------------
1       1
0       2
2       3
0       0
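If the list of columns to fill varies, the same functional form can be wrapped in a small helper; fill0 is just an illustrative name, not a built-in:
q)fill0:{[t;cs]![t;();0b;cs!{(^;0;x)}each cs]}
q)fill0[newTable;`column1`column2] /same result as the functional update above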

Related

Get distinct values in Pyspark and if duplicate value then should be placed in another column

Input Table:
prod acct acctno newcinsfx
John A01  1      89
John A01  2      90
John A01  2      92
Mary A02  1      92
Mary A02  3      81
Desired output table:
prod acct newcinsfx1 newcinsfx2
John A01  89
John A01  90         92
Mary A02  92
Mary A02  81
I tried to do it with the distinct function:
df.select('prod',"acctno").distinct()
df.show()

Remove table duplicates under certain conditions

I have a table like below that shows me some pnl by instrument (code) for some shifts, maturity, etc.
Instrument 123 appears twice (two sets of shift, booknumber and insmat, but with different pnl). I would like to clean the table to keep only the first set (the first three rows).
code shift pnl booknumber insmat
123  -20%  5   1234       2021.01.29
123  -0%   7   1234       2021.01.29
123  +20%  9   1234       2021.01.29
123  -20%  4   1234       2021.01.29
123  -0%   6   1234       2021.01.29
123  +20%  8   1234       2021.01.29
456  -20%  1   1234       2021.01.29
456  -0%   2   1234       2021.01.29
456  +20%  3   1234       2021.01.29
If there were no shifts involved I would do something like this:
select first code, first pnl, first booknumber, first insmat by code from t
Would love to hear if you have a solution!
Thanks!
If the shift pattern is consistently 3 shifts, you could use
q)select from t where 0=i mod 3
code shift pnl booknumber insmat
------------------------------------
123  20    5   1234       2021.01.29
123  20    4   1234       2021.01.29
456  -20   1   1234       2021.01.29
Alternative solution with an fby
q)select from t where shift=(first;shift)fby code
code shift pnl booknumber insmat
------------------------------------
123  20    5   1234       2021.01.29
123  20    4   1234       2021.01.29
456  -20   1   1234       2021.01.29
However, this will only work if the first shift value is unique within the shift pattern.
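If the pattern is not always three shifts, one possible alternative (a sketch, assuming the goal is to keep each row that is the first occurrence of its code and shift combination) is a multi-column fby on the virtual row index:
q)select from t where i=(first;i) fby ([]code;shift)
For the example data this keeps the first three rows of instrument 123 and all three rows of 456.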

Unpivot data in PostgreSQL

I have a table in PostgreSQL with the below values,
empid hyderabad bangalore mumbai chennai
1 20 30 40 50
2 10 20 30 40
And my output should be like below
empid city nos
1 hyderabad 20
1 bangalore 30
1 mumbai 40
1 chennai 50
2 hyderabad 10
2 bangalore 20
2 mumbai 30
2 chennai 40
How can I do this unpivot in PostgreSQL?
You can use a lateral join:
select t.empid, x.city, x.nos
from the_table t
cross join lateral (
    values
        ('hyderabad', t.hyderabad),
        ('bangalore', t.bangalore),
        ('mumbai', t.mumbai),
        ('chennai', t.chennai)
) as x(city, nos)
order by t.empid, x.city;
Or this one: simpler to read, and plain standard SQL ...
WITH
  input(empid,hyderabad,bangalore,mumbai,chennai) AS (
            SELECT 1,20,30,40,50
  UNION ALL SELECT 2,10,20,30,40
  )
,
  i(i) AS (
            SELECT 1
  UNION ALL SELECT 2
  UNION ALL SELECT 3
  UNION ALL SELECT 4
  )
SELECT
  empid
, CASE i
    WHEN 1 THEN 'hyderabad'
    WHEN 2 THEN 'bangalore'
    WHEN 3 THEN 'mumbai'
    WHEN 4 THEN 'chennai'
    ELSE 'unknown'
  END AS city
, CASE i
    WHEN 1 THEN hyderabad
    WHEN 2 THEN bangalore
    WHEN 3 THEN mumbai
    WHEN 4 THEN chennai
    ELSE NULL::INT
  END AS nos
FROM input CROSS JOIN i
ORDER BY empid,i;
-- out  empid |   city    | nos
-- out -------+-----------+-----
-- out      1 | hyderabad |  20
-- out      1 | bangalore |  30
-- out      1 | mumbai    |  40
-- out      1 | chennai   |  50
-- out      2 | hyderabad |  10
-- out      2 | bangalore |  20
-- out      2 | mumbai    |  30
-- out      2 | chennai   |  40

Add unique rows for each group when similar group repeats after certain rows

Hi, can anyone please help me to get a unique group number?
I need a number that is unique for each group of rows, even when the same group value repeats after some other groups.
I have the following data:
id version product startdate enddate
123 0 2443 2010/09/01 2011/01/02
123 1 131 2011/01/03 2011/03/09
123 2 131 2011/08/10 2012/09/10
123 3 3009 2012/09/11 2014/03/31
123 4 668 2014/04/01 2014/04/30
123 5 668 2014/05/01 2016/01/01
123 6 668 2016/01/02 2017/09/08
123 7 131 2017/09/09 2017/10/10
123 8 131 2018/10/11 2019/01/01
123 9 550 2019/01/02 2099/01/01
select *,
dense_rank()over(partition by id order by id,product)
from table
Expected results:
id version product startdate enddate count
123 0 2443 2010/09/01 2011/01/02 1
123 1 131 2011/01/03 2011/03/09 2
123 2 131 2011/08/10 2012/09/10 2
123 3 3009 2012/09/11 2014/03/31 3
123 4 668 2014/04/01 2014/04/30 4
123 5 668 2014/05/01 2016/01/01 4
123 6 668 2016/01/02 2017/09/08 4
123 7 131 2017/09/09 2017/10/10 5
123 8 131 2018/10/11 2019/01/01 5
123 9 550 2019/01/02 2099/01/01 6
Try the following
SELECT
  id,version,product,startdate,enddate,
  1+SUM(v)OVER(PARTITION BY id ORDER BY version) n
FROM
(
  SELECT
    *,
    IIF(LAG(product)OVER(PARTITION BY id ORDER BY version)<>product,1,0) v
  FROM TestTable
) q

Select from table removing similar rows - PostgreSQL

There is a table with document revisions and authors. Looks like this:
doc_id rev_id rev_date editor title,content so on....
123 1 2016-01-01 03:20 Bill ......
123 2 2016-01-01 03:40 Bill
123 3 2016-01-01 03:50 Bill
123 4 2016-01-01 04:10 Bill
123 5 2016-01-01 08:40 Alice
123 6 2016-01-01 08:41 Alice
123 7 2016-01-01 09:00 Bill
123 8 2016-01-01 10:40 Cate
942 9 2016-01-01 11:10 Alice
942 10 2016-01-01 11:15 Bill
942 15 2016-01-01 11:17 Bill
I need to find the moments when the document was transferred to another editor, i.e. only the first row of every editing series.
Like so:
doc_id rev_id rev_date editor title,content so on....
123 1 2016-01-01 03:20 Bill ......
123 5 2016-01-01 08:40 Alice
123 7 2016-01-01 09:00 Bill
123 8 2016-01-01 10:40 Cate
942 9 2016-01-01 11:10 Alice
942 10 2016-01-01 11:15 Bill
If I use DISTINCT ON (doc_id, editor), it re-sorts the table and I see only one row per doc and editor, which is incorrect.
Of course I could dump everything and filter it with shell tools like awk | sort | uniq, but that is not good for big tables.
Window functions like FIRST_ROW do not help much, because I cannot partition by doc_id, editor without mixing all the series together.
How can I do this better?
Thank you.
You can use lag() to get the previous value, and then a simple comparison:
select t.*
from (select t.*,
             lag(editor) over (partition by doc_id order by rev_date) as prev_editor
      from t
     ) t
where prev_editor is null or prev_editor <> editor;