Check existence of sub-table in a table in kdb

I have a table 't' and a sub-table 'st'.
If a row of 't' matches a row of 'st' on the sub-table's columns, I want to mark it with Y in a new column named 'exists' in 't', otherwise with N.
Table:
q)t:([] id:("ab";"cd";"ef";"gh";"ij"); refid:("";"ab";"";"ef";""); typ:`BUY`SELL`BUY`SELL`BUY)
q)t
id refid typ
---------------
"ab" "" BUY
"cd" "ab" SELL
"ef" "" BUY
"gh" "ef" SELL
"ij" "" BUY
subtable:
q)st:([] id:("ab";"cd"); typ:`BUY`SELL)
q)st
id typ
---------
"ab" BUY
"cd" SELL
Desired Output:
id refid typ exists
----------------------
"ab" "" BUY Y
"cd" "ab" SELL Y
"ef" "" BUY N
"gh" "ef" SELL N
"ij" "" BUY N
I tried various things such as any, each, each-right and in, but could not get the desired result.

One option is to use left join (lj)
q) update `N^exists from t lj `id`typ xkey update exists:`Y from st
id refid typ exists
----------------------
"ab" "" BUY Y
"cd" "ab" SELL Y
"ef" "" BUY N
"gh" "ef" SELL N
"ij" "" BUY N
Here I have defined id and typ columns as keys for st table. Change the keys of table st to the columns you want to match with table t.

Rahul's answer is by far neater than what follows, but the following may be more performant, depending on your table sizes. Extract from t the columns it shares with st (index by column name, then flip the resulting dictionary back into a table), test each row for membership in st, and use the resulting boolean vector to index the list `N`Y:
q)`N`Y (flip cols[st]!t[cols[st]]) in st
`Y`Y`N`N`N
q)update exists:(`N`Y (flip cols[st]!t[cols[st]]) in st) from t
id refid typ exists
----------------------
"ab" "" BUY Y
"cd" "ab" SELL Y
"ef" "" BUY N
"gh" "ef" SELL N
"ij" "" BUY N
q)\t:100000 update exists:(`N`Y (flip cols[st]!t[cols[st]]) in st) from t
446
q)\t:100000 update `N^exists from t lj `id`typ xkey update exists:`Y from st
856
Rahul's lj method may generalize to be faster once attributes are applied.
Edited as per Rahul's observations.

Another alternative to the other answers is to extract the sub-table's columns from t (cols[st]#t) and check whether each row is in st:
update exists:(cols[st]#t)in st from t
id refid typ exists
----------------------
"ab" "" BUY 1
"cd" "ab" SELL 1
"ef" "" BUY 0
"gh" "ef" SELL 0
"ij" "" BUY 0
If you need to display the result as YN then you can make a slight modification to get the following:
update exists:`N`Y(cols[st]#t)in st from t
id refid typ exists
----------------------
"ab" "" BUY Y
"cd" "ab" SELL Y
"ef" "" BUY N
"gh" "ef" SELL N
"ij" "" BUY N
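For readers less familiar with q, the idea shared by the answers above (project each row of t onto the columns of st, test membership, map the boolean to Y/N) can be sketched in plain Python. This is only an illustration of the logic, not a kdb implementation; the names `key_cols` and `st_keys` are made up for the example.

```python
# Tables represented as lists of row dictionaries (an assumption for this sketch).
t = [
    {"id": "ab", "refid": "",   "typ": "BUY"},
    {"id": "cd", "refid": "ab", "typ": "SELL"},
    {"id": "ef", "refid": "",   "typ": "BUY"},
    {"id": "gh", "refid": "ef", "typ": "SELL"},
    {"id": "ij", "refid": "",   "typ": "BUY"},
]
st = [
    {"id": "ab", "typ": "BUY"},
    {"id": "cd", "typ": "SELL"},
]

# Columns to match on: all columns of the sub-table (like cols[st] in q).
key_cols = list(st[0].keys())

# Set of sub-table rows, each reduced to a tuple of its key values.
st_keys = {tuple(row[c] for c in key_cols) for row in st}

# Project each row of t onto the key columns and test membership (like "in st").
exists = ["Y" if tuple(row[c] for c in key_cols) in st_keys else "N" for row in t]

result = [dict(row, exists=flag) for row, flag in zip(t, exists)]
print(exists)  # ['Y', 'Y', 'N', 'N', 'N']
```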

Related

Case when then in postgresql doesn't work as expected (on null value)

I have this table called people
id | name   | lastname
---|--------|---------
1  | John   | Smith
2  | Robert | Williams
3  | Peter  | Walker
if I run the query
select CASE WHEN id is null THEN '0' ELSE id END as id
from people
where id='2'
The result is:
| id
| ----
| 2
I want to display id as 0 when it is Null in the table, but when I run
select CASE WHEN id is null THEN '0' ELSE id END as id
from people
where id='4'
the result is empty:
| id
| ----
My expected result is:
| id
| ----
| 0
A top-level SQL query that doesn't match any rows will return zero rows, which is different from returning NULL. This makes sense, because it lets you distinguish between the result for SELECT email FROM users WHERE id=4 when there is no such user and the result when there is a user but their email is null.
However, a subquery that returns no rows will evaluate to NULL the way you expected. So you can rewrite your code like this:
SELECT COALESCE( (SELECT id FROM people WHERE id = '4'), 0 );
COALESCE(x,y) is shorthand for CASE WHEN x IS NULL THEN y ELSE x END. It's helpful in cases like this where the expression for x is long and you don't want to have to write it twice.
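The difference between "zero rows" and "NULL" can be seen in a small runnable sketch. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made only so the example is self-contained); the behaviour shown here is the same in both.

```python
import sqlite3

# SQLite stands in for PostgreSQL here (an assumption for a self-contained demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, name TEXT, lastname TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "John", "Smith"), (2, "Robert", "Williams"), (3, "Peter", "Walker")],
)

# Top-level query: no row matches, so the CASE expression is never evaluated.
top_level = conn.execute(
    "SELECT CASE WHEN id IS NULL THEN 0 ELSE id END FROM people WHERE id = 4"
).fetchall()

# Scalar subquery: an empty result becomes NULL, which COALESCE replaces with 0.
wrapped = conn.execute(
    "SELECT COALESCE((SELECT id FROM people WHERE id = 4), 0)"
).fetchall()

print(top_level)  # []
print(wrapped)    # [(0,)]
```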

PostgreSQL: Merging sets of rows which text fields are contained in other sets of rows

Given the following table, I need to merge two "id"s only if they are of the same type (person or dog) and the value of every field of one "id" is contained in the values of the other "id".
id | being  | feature | values
---|--------|---------|-----------
1  | person | name    | John;Paul
1  | person | surname | Smith
2  | dog    | name    | Ringo
3  | dog    | name    | Snowy
4  | person | name    | John
4  | person | surname |
5  | person | name    | John;Ringo
5  | person | surname | Smith
In this example, the merge results should be as follows:
1 and 4 can be merged (since 4's name is present in 1's name and 4's surname is empty)
1 and 5 cannot be merged (the name field shows different values)
4 and 5 can be merged
2 and 3 (dogs) cannot be merged. They have only the field "name" and they do not share values.
2 and 3 cannot be merged with 1, 4, 5 since they have different values in "being".
id | being  | feature | values
---|--------|---------|-----------
1  | person | name    | John;Paul
1  | person | surname | Smith
2  | dog    | name    | Ringo
3  | dog    | name    | Snowy
5  | person | name    | John;Ringo
5  | person | surname | Smith
I have tried this:
UPDATE table a
SET values = (SELECT array_to_string(array_agg(distinct values),';') AS values FROM table b
WHERE a.being= b.being
AND a.feature= b.feature
AND a.id<> b.id
AND a.values LIKE '%'||a.values||'%'
)
WHERE (select count (*) FROM (SELECT DISTINCT c.being, c.id from table c where a.being=c.being) as temp) >1
;
This doesn't work well because it will merge, for example, 1 and 5. Besides, it duplicates values when merging that field.
One option is to aggregate names with surnames on "id" and "being". Once you get a single string per "id", a self join may find when a full name is completely included inside another (where the "being" is same for both "id"s), then you just select the smallest fullname, candidate for deletion:
WITH cte AS (
    SELECT id,
           being,
           STRING_AGG(values, ';') AS fullname
    FROM tab
    GROUP BY id,
             being
)
DELETE FROM tab
WHERE id IN (SELECT t2.id
             FROM cte t1
             INNER JOIN cte t2
                     ON t1.being = t2.being
                    AND t1.id > t2.id
                    AND t1.fullname LIKE CONCAT('%',t2.fullname,'%'));
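To try the aggregate-then-self-join idea on the sample data, here is a minimal sketch using Python's built-in sqlite3 module rather than PostgreSQL. The substitutions are assumptions made for the demo: `group_concat` replaces `STRING_AGG`, `||` replaces `CONCAT`, the column is renamed `vals` to avoid the reserved word, and the empty surname of id 4 is stored as NULL so that it is skipped during aggregation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id INTEGER, being TEXT, feature TEXT, vals TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?, ?)",
    [
        (1, "person", "name", "John;Paul"),
        (1, "person", "surname", "Smith"),
        (2, "dog", "name", "Ringo"),
        (3, "dog", "name", "Snowy"),
        (4, "person", "name", "John"),
        (4, "person", "surname", None),  # empty surname stored as NULL
        (5, "person", "name", "John;Ringo"),
        (5, "person", "surname", "Smith"),
    ],
)

# One string per (id, being); delete an id when its string is contained
# in the string of a higher id with the same "being".
conn.execute(
    """
    WITH cte AS (
        SELECT id, being, group_concat(vals, ';') AS fullname
        FROM tab
        GROUP BY id, being
    )
    DELETE FROM tab
    WHERE id IN (SELECT t2.id
                 FROM cte t1
                 INNER JOIN cte t2
                         ON t1.being = t2.being
                        AND t1.id > t2.id
                        AND t1.fullname LIKE '%' || t2.fullname || '%')
    """
)

remaining_ids = sorted({row[0] for row in conn.execute("SELECT id FROM tab")})
print(remaining_ids)  # [1, 2, 3, 5]
```

On this data only id 4 is removed, because its aggregated string is contained in that of id 5. Note that the comparison is a plain substring match on the concatenated string, so it is an approximation of a field-by-field containment check.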

PostgreSQL: Selecting one address from almost but not exactly duplicate rows

I have a big table that I'm trying to join another table to, however the table has entries such as:
--- Name | Address | Priority
----------------------------------------
1 | Jane Doe | 123 Baker St | 1
2 | Jane Doe | 345 Clay Dr | 2
3 | Jeff Boe | 231 Street St| 1
4 | Karen Al | 4232 Elm St | 1
5 | Karen Al | 5632 Pine Ct | 2
What I really want to select is one single address per person. The correct address I want is priority 2. However some of the addresses don't have a priority 2, so I can't join only on priority 2.
I've tried the following test query:
SELECT DISTINCT n.ID, LastName, FirstName, MAX(Address), MAX(Address2), City, State, PostalCode, n.Phone
FROM NormalTable n
JOIN Contracts cn ON n.ID = cn.ID
Which returns the table that I sketched out above, with the same person/sameID but different addresses.
Is there a way to do this in one query? I can think of maybe doing one INSERT statement into my final table where I do all the priority 2 addresses and then ANOTHER INSERT statement for IDs that aren't in the table yet, and use the priority 1 address for those. But I'd much prefer if there's a way to do this all in one go where I end up with only the address I want.
You could choose the address you need by joining to a subquery that finds the max priority per person:
select m.LastName, m.FirstName, m.Address, m.Address2, m.City, m.State, m.PostalCode, m.Phone
from my_table m
inner join (
select LastName, FirstName, max(priority) max_priority
from my_table
group by LastName, FirstName
) t on t.LastName = m.LastName
AND t.FirstName = m.FirstName
AND t.max_priority = m.priority
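A quick way to check the max-priority join against the sample rows is the sketch below. It uses Python's built-in sqlite3 module instead of PostgreSQL and a simplified, hypothetical `my_table` with a single `name` column in place of LastName/FirstName; both are assumptions made to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER, name TEXT, address TEXT, priority INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, "Jane Doe", "123 Baker St", 1),
        (2, "Jane Doe", "345 Clay Dr", 2),
        (3, "Jeff Boe", "231 Street St", 1),
        (4, "Karen Al", "4232 Elm St", 1),
        (5, "Karen Al", "5632 Pine Ct", 2),
    ],
)

# Join each row to the highest priority found for the same person,
# keeping only the row that carries that priority.
rows = conn.execute(
    """
    SELECT m.name, m.address
    FROM my_table m
    INNER JOIN (
        SELECT name, MAX(priority) AS max_priority
        FROM my_table
        GROUP BY name
    ) t ON t.name = m.name
       AND t.max_priority = m.priority
    ORDER BY m.name
    """
).fetchall()

for name, address in rows:
    print(name, "->", address)
```

People with only a priority 1 address keep that address, and everyone else gets their priority 2 address, which is the fallback behaviour the question asks for.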
I think you want something like this
SELECT DISTINCT ON (Name) Name, Address, Priority
FROM my_table
ORDER BY Name, Priority DESC
How this works is that DISTINCT ON (Name) returns only one row per name. The row returned for each name is the first one in the ORDER BY order, which is the one with the highest priority because of the Priority DESC.

Optimize a kdb query which used update more than once

We have a table:
q)t:([] id:("ab";"cd";"ef";"gh";"ij"); refid:("";"ab";"";"ef";""); typ:`BUY`SELL`BUY`SELL`BUY)
q)t
id refid typ
---------------
"ab" "" BUY
"cd" "ab" SELL
"ef" "" BUY
"gh" "ef" SELL
"ij" "" BUY
Our requirement is to add a column named 'event' to the table. It should be 'N' if the id of a BUY row matches the refid of a SELL row, or if the row is a SELL row whose refid is not empty; otherwise it should be 'Y'.
I have written the query below, which works fine but has scope for optimization.
Desired Output:
id refid typ event
---------------------
"ab" "" BUY N
"cd" "ab" SELL N
"ef" "" BUY N
"gh" "ef" SELL N
"ij" "" BUY Y
Query used:
q)update event:`N from (update event:?[([]id) in (select id:refid from t where typ=`SELL, not refid like "");`N;`Y] from t) where typ=`SELL, not refid like ""
Please help me optimize above query.
You could try something like this, which works for the data you have provided:
q)update eve:?[(typ=`BUY) &(not any(`$id)=/:`$refid);`Y;`N] from t
id refid typ eve
-------------------
"ab" "" BUY N
"cd" "ab" SELL N
"ef" "" BUY N
"gh" "ef" SELL N
"ij" "" BUY Y
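The rule itself can be written out in plain Python to make the intended logic explicit. This is only a sketch of what the asker's query computes on the sample data, not a q implementation; `sell_refids` is a name invented for the example.

```python
# Rows of the sample table as (id, refid, typ) tuples.
t = [
    ("ab", "",   "BUY"),
    ("cd", "ab", "SELL"),
    ("ef", "",   "BUY"),
    ("gh", "ef", "SELL"),
    ("ij", "",   "BUY"),
]

# Non-empty refids of SELL rows, collected once so each lookup is O(1).
sell_refids = {refid for _id, refid, typ in t if typ == "SELL" and refid}

def event(row_id, refid, typ):
    """Return 'N' for a SELL row with a refid, or for a row referenced by one."""
    if typ == "SELL" and refid:
        return "N"
    return "N" if row_id in sell_refids else "Y"

events = [event(*row) for row in t]
print(events)  # ['N', 'N', 'N', 'N', 'Y']
```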

Limit for inner Join Table

I have a scenario where I am joining three tables and getting the results.
My problem is that I have to apply a limit to the joined table.
Take the example below: I have three tables, 1) Books, 2) Customer and 3) Authors. Given a list of book ids, I need to find the books sold today with the author and customer name; however, I only need the last n customers per book, not all of them.
Books Customer Authors
--------------- ---------------------- -------------
Id Name AID Id BID Name Date AID Name
1 1 1 ABC 1 A1
2 2 1 CED 2 A2
3 3 2 DFG
How we can achieve this?
You are looking for LATERAL.
Sample:
SELECT B.Id, C.Name
FROM Books B,
LATERAL (SELECT * FROM Customer Cu WHERE Cu.BID = B.Id ORDER BY Cu.Id DESC LIMIT N) C
WHERE B.Id = ANY(ids)
AND C.Date = CURRENT_DATE
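LATERAL is not available in every database, so the sketch below illustrates the same "last n customers per book" result with a correlated subquery instead, run through Python's built-in sqlite3 module. This is a swapped-in technique for demonstration, not the LATERAL query itself. The table contents are hypothetical, loosely based on the sample in the question, and the date filter and the Authors table are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Books (Id INTEGER, AID INTEGER);
    CREATE TABLE Customer (Id INTEGER, BID INTEGER, Name TEXT);
    INSERT INTO Books VALUES (1, 1), (2, 2), (3, 2);
    INSERT INTO Customer VALUES (1, 1, 'ABC'), (2, 1, 'CED'), (3, 2, 'DFG');
    """
)

n = 1  # how many of the most recent customers to keep per book

# For each book, keep only customers whose Id is among the n highest
# customer Ids recorded for that book.
rows = conn.execute(
    """
    SELECT b.Id, c.Name
    FROM Books b
    JOIN Customer c ON c.BID = b.Id
    WHERE b.Id IN (1, 2)
      AND c.Id IN (SELECT c2.Id
                   FROM Customer c2
                   WHERE c2.BID = b.Id
                   ORDER BY c2.Id DESC
                   LIMIT ?)
    ORDER BY b.Id
    """,
    (n,),
).fetchall()

print(rows)  # [(1, 'CED'), (2, 'DFG')]
```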