Enums for tables

Enums for tables - kdb

I found no information about what the enum is over the table domain on https://code.kx.com/q/ref/enumerate/. But something interesting exists there: https://code.kx.com/q/kb/linking-columns. I tried those examples and found an enum structure that behaves in some situations like a normal enum, but has a strange behaviour in others.
q)kt:1!t:([]a:`a`b`c;b:10 20 30)
q)tt:([]k:`a`a`a`b;d:11 21 31 41)
q)show et1:`t!t[`a]?tt[`k]
`t!0 0 0 1
q)show et2:`kt$tt[`k]
`kt$`a`a`a`b
q)meta select k,d,et1,et2 from tt
c | t f a
---| ------
k | s
d | j
et1| j t
et2| s kt
q)select r1.a, r1.b, r2.a, r2.b from update r1:et1, r2:et2 from tt
a b a1 b1
----------
a 10 a 10
a 10 a 10
a 10 a 10
b 20 b 20
From this perspective et1 and et2 both have similar behaviour. But if we check other enum properties, we see differences:
q)et2[0]
`kt$`a
q)et2[0]:`a
q)
q)et1[0]
`t!0
q)et1[0]:0 / neither works this
't
[0] et1[0]:0
^
q)et1[0]:(`a`b!(`a;10)) / nor that
't
[0] et1[0]:(`a`b!(`a;10))
^
The situation seems more weird if we build enums for just a keyed tables: see a difference for a table with one key column and for two:
q)kkt:2!t:([]a:`a`b`c;b:10 20 30;c:11 22 33)
q)kt:1!0!kkt
q)show ekkt:`kkt$((`a;10);(`b;20);(`b;20))
`kkt!0 1 1
q)show ekt:`kt$(`a`b`b)
`kt$`a`b`b
The same hardcoded (with !) enum notation for kkt.
So the question: what are they? - those enums with a familiar $ and with a hardcoded ! notaions for a table? Is it possible to apply enum-extend technique (?) for them and how? And is there any documentation for them?

What you're seeing is the difference between a simple foreign key and a linked column. As mentioned in the documentation, differences include:
a foreign key is specifically designed to link to the keys of a keyed table.
A foreign key does not allow the link if there's an "unknown key" that isn't one of the keys in the keyed table
linked columns can link to any arbitrary column (if even a value doesn't appear in the other table - thus it doesn't guarantee referential integrity)
linked columns are generally used for on-disk tables
q)kt:([eid:1001 1002 1003] name:`Dent`Beeblebrox`Prefect; iq:98 42 126)
q)tdetails2:([] eid:1003 1001 1002 1001 1002 1001 777;sc:126 36 92 39 98 42 7)
q)update linker:`kt!((0!kt)`eid)?eid from `tdetails2
`tdetails2
q)select linker.name from tdetails2
name
----------
Prefect
Dent
Beeblebrox
Dent
Beeblebrox
Dent
The latter would not have been allowed for a simple foreign key.
Also I don't know why you would want to modify /edit the values of an enumeration - don't do that!

Related

Select from a table with Limit expression works, without - fails

For a table t with a custom field c which is dictionary I could use select with limit expression, but simple select failes:
q)r1: `n`m`k!111b;
q)r2: `n`m`k!000b;
q)t: ([]a:1 2; b:10 20; c:(r1; r2));
q)t
a b c
----------------
1 10 `n`m`k!111b
2 20 `n`m`k!000b
q)select[2] c[`n] from t
x
-
1
0
q)select c[`n] from t
'type
[0] select c[`n] from t
^
Is it a bug, or am I missing something?
Upd:
Why does select [2] c[`n] from t work here?

Since c is a list, it does not support key indexing which is why it has returned a type
You need to index into each element instead of trying to index the column.
q)select c[;`n] from t
x
-
1
0
A list of confirming dictionaries outside of this context is equivalent to a table, so you can index like you were
q)c:(r1;r2)
q)type c
98h
q)c[`n]
10b
I would say that the way complex columns are represented in memory makes this not possible. I suspect that any modification that creates a copy of a subset of the elements will allow column indexing as the copy will be formatted as a table.
One example here is serialising and deserialising the column (not recommended to do this). In the case of select[n] it is selecting a subset of 2 elements
q)type exec c from t
0h
q)type exec -9!-8!c from t
98h
q)exec (-9!-8!c)[`n] from t
10b

What is the meaning of `s attribute on a table?

In the Abridged Q Language Manual Arthur mentioned:
`s#table marks the table to use binary search and marks first column sorted
And if we look into 3.6 version:
N:1000000;
t1:t2:([]n:til N; m:N?`6);
t1:update `p#n from t1;
t2:`s#t2;
(meta t1)[`n]`a / `p
(meta t2)[`n]`a / `p
attr t1 / `
attr t2 / `s
\ts:10000 select count i from t1 where n in 1000?N
/ ~7000
\ts:10000 select count i from t2 where n in 1000?N
/ ~7000
we find that yes, t2 has this attribute: s.
But for some reason an attribute on the first column is not s but p. And also search times are the same. And the sizes of both tables with attributes are the same - I used objsize function described in AquaQ blogpost to ensure.
So are there any differences in 3.6+ version of q between 's#table and a table with '#p attribute on a first column?

I think the only way that the s# on the table itself would improve search times is if you were doing lookups using ? as described here: https://code.kx.com/q/ref/find/#searching-tables
q)\ts:100000 t1?t1[0]
105 800
q)\ts:100000 t2?t2[0]
86 800
q)
q)\ts:100000 t1?t1[500000]
108 800
q)\ts:100000 t2?t2[500000]
83 800
q)
q)\ts:100000 t1?t1[999999]
107 800
q)\ts:100000 t2?t2[999999]
83 800
It behaves differently for a keyed table (turns it into a step function) but I think that's beyond the scope of your original question.

Does distinct function apply unique attribute to list in q kdb?

In output of asc function applied on a list, we can see that the output consists of sorted attribute.
q)asc 1 3 11 10 4
`s#1 3 4 10 11
But in the output of distinct function applied on a list, we cannot see unique attribute applied on the list.
q)distinct 1 3 11 10 4
1 3 11 10 4
But, the time taken to search an element in distinct list and unique list(list on which unique attribute is applied) is almost same.
q)n:1000000?1000000
q)d:distinct n
q)\t:100000 d[1021]
45
q)\t:100000 d[632265]
45
q)u:`u#d
q)\t:100000 u[1021]
44
q)\t:100000 u[632265]
48
So, is the distinct function applying unique attribute internally on the list and converting it to hash table?

distinct does not apply unique attribute to list. You may see this with help of attr function
attr distinct 10?10
will return nothing `, when
attr `u#distinct 10?10
returns `u
Also, I think the experiment may be wrong: d[1021] seems to be usual 0(1) time operation of getting element by index. If you use search instead, you'll see the difference (numbers on my PC):
q)\t:10000 d?770751
85
q)\t:10000 u?770751
6

TorQ: .loader.loadallfiles and referential integrity leads to `cast error

I have a table volatilitysurface and a detail table volatilitysurface_smile as part of the detail table I define a foreign key to the master table i.e.
volatilitysurface::([date:`datetime$(); ccypair:`symbol$()] atm_convention:`symbol$(); ...);
volatilitysurface_smile::([...] volatilitysurface:`volatilitysurface$(); ...);
When I try using AquaQ's TorQ .loader.loadallfiles to load the detail table volatilitysurface_smile I need as part of the "dataprocessfunc" function to dynamically build the foreign key field i.e.
rawdatadir:hsym `$("" sv (getenv[`KDBRAWDATA]; "volatilitysurface_smile"));
.loader.loadallfiles[`headers`types`separator`tablename`dbdir`partitioncol`partitiontype`dataprocessfunc!(`x`ccypair...;"ZS...";enlist ",";`volatilitysurface_smile;target;`date;`month;{[p;t] select date,ccypair,volatilitysurface,... from update date:x,volatilitysurface:`volatilitysurface$(x,'ccypair) from t}); rawdatadir];
Note the part:
update date:x,volatilitysurface:`volatilitysurface$(x,'ccypair) from t
The cast error is pointing to the construction of the volatilitysurface key. However, this works outside .loader.loadallfiles and the tables are globally :: and fully defined before calling the .loader.loadallfiles function.
Any ideas how to deal with this use-case? If the detail table foreign key is not initialized then the insertion will fail.

The error may be due to the scoping in the update. As you are running the cast/update within the .loader namespace the tablename would need to be full scoped (`..volatilitysurface).
eg. update date:x,volatilitysurface:`..volatilitysurface$(x,'ccypair) from t
Regards,
Scott

Are you sure that all possible x & ccypair combinations are in the volatilitysurface table? The 'cast error would seem to suggest this is not the case e.g.
q)t:([a:1 2 3;b:`a`b`c] c:"ghi")
q)update t:`t$(a,'b) from ([] a:2 3 1;b:`b`c`a)
a b t
-----
2 b 1
3 c 2
1 a 0
q)update t:`t$(a,'b) from ([] a:2 3 1 5;b:`b`c`a`d)
'cast
[0] update t:`t$(a,'b) from ([] a:2 3 1 5;b:`b`c`a`d)
^
Note in the second case I have the a-b pair of (5;`d), which isn't present in the table t, and so I get the 'cast error
You can determine if there are missing keys, and which they are, like so:
q)all (exec (a,'b) from ([] a:2 3 1;b:`b`c`a)) in key t //check for presence, all present
1b
q)all (exec (a,'b) from ([] a:2 3 1 5;b:`b`c`a`d)) in key t //check for presence, not all present
0b
q)k where not (k:exec (a,'b) from ([] a:2 3 1 5;b:`b`c`a`d)) in key t //check which keys AREN'T present
5 `d
If this is the case, I guess you kind of have two options:
Make sure the volatilitysurface table is loaded correctly - assuming you have full data coverage in your files, presumably every possible key should be present in this table
If there is the possibility of possibly keys not being present in the volatilitysurface table, you could perhaps add dummy records to it before making the foreign key (which could be replaced if an actual record comes in later
The second option could perhaps work something like this:
q.test){if[count k:k where not (k:exec (a,'b) from x) in key `..t;#[`..t;;:;value[`..t](0N;`)]'[k]];update t:`t$(a,'b) from x}([] a:2 3 1;b:`b`c`a)
a b t
-----
2 b 1
3 c 2
1 a 0
q.test){if[count k:k where not (k:exec (a,'b) from x) in key `..t;#[`..t;;:;value[`..t](0N;`)]'[k]];update t:`t$(a,'b) from x}([] a:2 3 1 5 6;b:`b`c`a`d`e)
a b t
-----
2 b 1
3 c 2
1 a 0
5 d 3
6 e 4
q.test)value `..t //check table t, new dummy records added by previous call
a b| c
---| -
1 a| g
2 b| h
3 c| i
5 d|
6 e|
I've done these tests inside a namespace as this is how the dataprocess function will run in TorQ (i.e. at certain places you need to use `..t to access t in the root namespace.) The analogous version of this function for your setup (with some nicer formatting than the one-liners above) would be something like:
{
if[count k:k where not (k:exec (x,'ccypair from volatilitysurface_smile) in key `..volatilitysurface; //check for missing keys
#[`..volatilitysurface;;:;value[`..volatilitysurface](0Nz;`)]'[k]]; //index into null key of table to get dummy record and upsert to global volatilitysurface table
update volatilitysurface:`volatilitysurface$(x,'ccypair) from x //create foreign key
}

How to pass dictionary into query constraint?

If this is the dictionary of constraint:
dictName:`region`Code;
dictValue:(`NJ`NY;`EEE213);
dict:dictName!dictValue;
I would like to pass the dict to a function and depending on how many keys there are and let the query react accordingly. If there is one key region, then I would like to put it as
select from table where region in dict`region;
The same thing is for code. But if I pass two keys, I would like the query knows and pass it as:
select form table where region in dict`region,Code in dict`code;
Is there any way to do this?
I came up this code:
funcForOne:{[constraint]?[`bce;enlist(in;constraint;(`dict;enlist constraint));0b;()]};
funcForAll[]
{[dict]$[(null dict)~1;select from bce;($[(count key dict)=1;($[`region in (key dict);funcForOne[`region];funcForOne[`Code]]);select from bce where region in dict`region,rxmCode in dict`Code])]};
It works for one and two constraint. but when I called funcForAll[] it gives type error. How should I change it? i think it is from null dict~1
I tried count too. but doesn't work too well.
Update
So I did this but I have some error
tab:([]code:`B90056`B90057`B90058`B90059;region:`CA`NY`NJ`CA);
dictKey:`region`Code;dictValue:(`NJ`NY;`B90057);
dict:dictKey!dictValue;
?[tab;f dict;0b;()];
and I got 'NY error. Do you know why? Also,if I pass a null dictionary it doesn't seem working.

As I said funtional form would be the better approach but if your requirement is very limited as you said then you can consider other solution as below:
Note: Assuming all dictionary keys will be in table columns list.
q) f:{[dict] if[0=count dict;:select from t];
select from t where (#[key dict;t]) in {$[any 0<=type each value x;flip ;enlist ]x}[dict] }
Explanation:
1. convert dict to table depending on the values type. Flip if any value is a general list else enlist.
$[any 0<=type each value dict;flip ;enlist ]dict
Get subset of table t which consists only of dictionary keys as columns.
#[key dict;t]
get rows where (2) in (1)
Basically we are using below form of querying and matching:
q)t1:([]id:1 2;s:`a`b);
q)t2:([]id:1 3 ;s:`a`b);
q)select from t1 where ([]id;s) in t2

If you're just using in, you can do something like:
f:{{[x;y](in),'key[y],'(),x}[;x]enlist each value[x]}
So that:
q)d
a| 10 1
b| ,`a
q)f d
in `a 10 1
in `b ,`a
q)t
a b c
------
1 a 10
2 b 20
3 c 30
q)?[t;f d;0b;()]
a b c
------
1 a 10
Note that because of the enlist each the resulting list is enlisted so that singletons work too:
q)d:enlist[`a]!enlist 1
q)d
a| 1
q)?[t;f d;0b;()]
a b c
------
1 a 10
Update to secondary question
This still works with empty dict, i.e. ()!(). I'm passing in the dictionary variable.
In your 2nd question your dictionary is not constructed correctly (also remember q is case sensitive). Also your values need to be enlisted. Look up functional select in the reference pages on the kx site, you'll see that you need to enlist the symbol lists to differentiate them from column name declarations
`region`code!(enlist `NY`NJ;enlist `B90057)

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Enums for tables - kdb

Related

Select from a table with Limit expression works, without - fails

What is the meaning of `s attribute on a table?

Does distinct function apply unique attribute to list in q kdb?

TorQ: .loader.loadallfiles and referential integrity leads to `cast error

How to pass dictionary into query constraint?

Categories

Resources