meta does not work when using get on a splayed table - kdb

I have seen this question, but I think my issue is a bit different:
kdb splayed table meta error
I am saving a table splayed in a location with the following command :
pthToSplayed upsert .Q.en[pthtohdbroot;] table
I don't have any sym so nothing gets enumerated (.Q.en is there because in the future it might have some symbols).
All works well, but when I try to do meta select from get[pathToTable] where date = .z.d I get a ..sym error.
However, the strange part is that the first time I save the table down, meta works fine. When I exit and start the process again, the problem appears. What exactly happens here? I would appreciate links to a whitepaper or the kx website explaining where this issue comes from.
UPDATE:
Nothing weird about the meta of the table, just a vanilla meta.
`:/home/user/hdbroot/tableName/ set .Q.en[`:/home/user/hdbroot;] table
As I am updating this table I am using upsert instead of set for subsequent operations.

Your issue is that the enumeration file named sym needs to be loaded into the process memory.
If you load the folder using \l you will not have the issue:
q)\l hdbRoot
Or you can explicitly load the sym file:
q)tableName:get `:hdbRoot/tableName/
q)sym:get `:hdbRoot/sym
The reason it works before a restart is that .Q.en automatically loads sym into memory for you when you call it.
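To see why the domain variable matters, here is a minimal in-memory sketch. The values are made up for illustration and are not from the question. An enumerated vector stores only indices into the list it was enumerated against, so it can only be resolved while a variable with that name (here sym) exists in the session.

```q
/ hypothetical enumeration domain, standing in for the sym file loaded from disk
sym:`a`b`c
/ enumerate a symbol vector against sym; only indices into sym are stored
e:`sym$`b`a`c`a
type e     / 20h on V3.6 and later
key e      / name of the domain the vector depends on
value e    / resolved back to plain symbols via the in-memory sym
```

A splayed column written through .Q.en is this kind of vector, which is why a fresh session needs sym loaded before operations that resolve the column.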
If there are no symbol columns there should be no error.
Save table with no symbol column:
$q
q)`:hdbRoot/tableName/ set .Q.en[`:hdbRoot] ([] a:1 2 3)
`:hdbRoot/tableName/
q)\\
New session meta works:
$q
q)tableName:get `:hdbRoot/tableName/
q)meta tableName
c| t f a
-| -----
a| j
You can check the types of each column to confirm there are no enumerations:
q)type each flip tableName
Prior to V3.6 of kdb+, the type of an enumeration could range from 20h to 76h, with each new domain given an incremented type. From V3.6 onwards, all enumerations have type 20h.
If any column is of symbol type, even if empty, then a sym file is created:
q)`:hdbRoot/tableName/ set .Q.en[`:hdbRoot] ([] a:`long$();b:`$())
`:hdbRoot/tableName/
q)get `:hdbRoot/sym
`symbol$()

Related

Updating the sym column in a trade table

I would like to update my sym column in my trade table so that at the end of every sym there is a _1 appended onto the end of it.
I have tried update sym:sym _ "_1" from trade which gave me a par error so I then tried the fncol function from the dbmaint.q script which was
`fncol[`:path/to/hdb;`trade;`sym;,"_1]`
which also gave me an error on /, and I'm not sure why. If anyone has any idea how to fix this or could point me in the right direction, that would be great.
This is probably not as trivial as it looks on paper, because it's an on-disk table (you can't use update directly, hence the par error) and the sym column is possibly enumerated (hence why you couldn't append a string).
If the sym column is enumerated then they need to be re-enumerated after the "_1" is appended, something like:
load`:/path/to/mySymFile; /make sure sym file is loaded
fncol[`:path/to/hdb;`trade;`sym;{`:/path/to/mySymFile?`$string[x],\:"_1"}];
However personally I don't think this would be a great idea and you'd be polluting your sym file with a bunch of new symbols. Why not just append the "_1" at runtime? Does it have to be persisted?
If your sym column is actually a string column and not enumerated, then you would just need:
fncol[`:path/to/hdb;`trade;`sym;{x,\:"_1"}];
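The runtime alternative mentioned above can be sketched on an in-memory table. The trade table and the addSuffix helper below are hypothetical, and the sketch assumes the sym column holds plain symbols (or enumerated symbols whose sym file is loaded). Nothing is written back to disk.

```q
/ hypothetical in-memory stand-in for the on-disk trade table
trade:([] sym:`AAPL`MSFT; price:100.5 250.25)
/ hypothetical helper: convert symbols to strings, append "_1", cast back to symbols
addSuffix:{`$string[x],\:"_1"}
/ the suffix is applied to the query result only; trade itself is unchanged
res:update sym:addSuffix sym from trade
```

Wrapping the expression in a function also keeps the comma in ,\: out of the update phrase, where a bare comma would be read as a column separator.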

kdb - How to pass a table by reference to kdb function

Define the question
Given an empty table myt defined by
myt:([] id:`int$(); score:`int$())
It is trivial to insert one or more records into it, for example
`myt upsert `id`score!1 100
But when it comes to defining a function that inserts into a given table, it seems to need a different trick.
A first try version could be
upd:{[t] t upsert `id`score!42 314;}
upd[myt]
Apparently it does not update myt itself, only a local copy of it.
Difficulties of Possible solutions
Possible solution 1: using the global variable instead
Let myt be a global variable, the variable will then be accessed inside a function.
upd:{`myt upsert `id`score!42 314;}
upd[]
It looks like a good solution, except when many myts are required. In that situation, one has to provide many copies of the upd function, as follows:
upd0:{`myt0 upsert `id`score!42 314;}
upd1:{`myt1 upsert `id`score!42 314;}
upd2:{`myt2 upsert `id`score!42 314;}
...
So, the global variable solution is not a good solution here.
Possible solution 2: amending table outside function
One can also solve the problem by amending myt just outside the function, returning the modified result by removing the ending ;.
upd:{[t] t upsert `id`score!42 314} / return the updated table
myt:upd[myt]
It works! But after running this code millions of times, it gets slower and slower. Because this solution discards the "in-place" property of the upsert operator, the copy overhead increases as the table grows.
Pass argument by reference?
Maybe the concept of "pass-by-reference" is the solution here. Or maybe q has its own solution for this problem and I have not grasped the essential idea.
[UPDATE] Solved by adding "`" to call-by-name
As cillianreilly answers, it is enough to add a "`" in front of myt so that the table is passed into the function by name, as a symbol referring to the global variable. So the solution is straightforward.
upd:{[t] t upsert `id`score!42 314;}
upd[`myt] / it works
Your first version should achieve what you want. If you pass the table name as a symbol, it will update the global variable and return the table name. If you pass the table itself, it will return the updated table, which you can use in an assignment, as you found in possible solution 2. Note that the actual table will not have been updated by this operation.
q){[t;x]t upsert x}[myt;`id`score!42 314]
id score
--------
42 314
q)count myt
0
q){[t;x]t upsert x}[`myt;`id`score!42 314]
`myt
q)count myt
1
For possible solution 1, why would you need hundreds of myt tables? Regardless, there is no need to hardcode the table name into the function. You can just pass the table name as a symbol as demonstrated above, which will update the global for you. The official kx kdb tick example given on their github uses insert for exactly this scenario, but in practice a lot of developers use upsert. https://github.com/KxSystems/kdb-tick/blob/master/tick/r.q#L6
Hope this helps.
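As a sketch of the point about many tables, one function can update any number of globals when it is given their names. The table names myt0 and myt1 and the two-argument upd are illustrative assumptions, and the columns are declared as longs here so that the inserted values 42 314 match the schema.

```q
/ two hypothetical empty tables with the same schema
myt0:myt1:([] id:`long$(); score:`long$())
/ t is a table name (symbol), so upsert amends the global in place
upd:{[t;r] t upsert r}
/ apply the same record to each table by name
upd[;`id`score!42 314] each `myt0`myt1
count each (myt0;myt1)
```

Because the name is an ordinary argument, there is no need for a separate upd0, upd1, upd2 per table.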

KDB/Q splayed table column type changing from 0h to 77h

I have a table that has some columns of type 0h with normal strings in them. Meta shows type C.
I am saving this table to a splayed DB with
.Q.dpft[hsym `$path; dt;`sym;`t]
However when I load the splayed table later
\l path_to_my_table
the type of all 0h columns changes to 77h. How can I avoid that?
I am using 3.6
To help you answer the question and weigh the engineering tradeoffs (not because I believe introducing new code for this is the best way to go about it), you might consider how this is avoided when writing to TP logs (-11! cannot stream from anymap files).
This was a consideration in some of our test code when my team looked at this upgrade in 2019, and we wrote a generic routine to take a list of input and manually write it to a TP log format. Repurposing for your use, it would be something like:
f:get`:file
h:hopen hdel`:file
h f
hclose h

Count all tables in one instance in kdb

I would like to count all tables in the same instance.
I have not used kdb for a while and I forgot how to make this work.
This is what I got:
tablelist:tables[]
{select count i from x} each tablelist
but I got a type error
Your statement doesn't have a trailing semicolon ; at the end of the first line, which will cause an error in an IDE like qpad (assuming you are running it as written).
If you are not running from an IDE, I would check the hdb for any missing data and run some sanity checks: can you select from each table normally, and do types match across partitions? Since i is a virtual column representing the row index, non-conforming types in your other columns are probably not the cause, but investigating may yield the right answer.
One way to achieve what you're trying is (using dummy data):
q){flip select counts:count i,tab:1#x from x}each tablelist:tables[]
counts tab
-------------
5469 depth
3150 quotes
3005 trades
Here I select the count for each table, but also add on the name of the table, flip each result into a dictionary, which results in a list of dictionaries of conforming types and key names which is in fact a table, hence my result. In this way you have a nice way to track what you're actually counting.
Each select query you run is returning a table in the form:
x
-
3
It would be better to use exec as opposed to select to simply return the value of the count e.g:
q){exec count i from x} each tables[]
3 2
Your current method would be attempting to return a list of tables: e.g:
q){select count i from x} each tables[]
+(,`x)!,,3
+(,`x)!,,2
However, the type error makes me think there may be an issue with your tables as this should not error for in-memory tables.
Here's one way
count each `. tables[]
I am using 3.6 2018.05.17 and your expression worked for me. I then changed the select to an exec to return just a list of counts.
q){exec count i from x} each tables[]
Below code helps us get the count of each table along with tablename.
q)flip (`table;`msgcount)! flip {x, count value x}@'tables[]
To get only the count and not the tablename along with it.
q){count value x}@'tables[]
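The approaches above can be combined into a dictionary keyed by table name. A minimal sketch, assuming a fresh session in which these two made-up in-memory tables are the only tables in the root namespace:

```q
/ hypothetical sample tables
trades:([] sym:`a`b`c; px:1 2 3f)
quotes:([] sym:`a`b; bid:1 2f)
/ map each table name to its row count
counts:{x!count each value each x} tables[]
```

Indexing counts by a table name, for example counts`trades, then returns that table's row count.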

How to correctly enum and partition a kdb table?

I put together a few lines to partition my kdb table, which contains string columns of course and thus must be enumerated.
I wonder if this code is completely correct or if it can be simplified further. In particular, I have some doubt about the need to create a partitioned table schema given the memory table and the disk table will have exactly the same layout. Also, there might be a way to avoid creating the temporary tbl_mem and tbl_mem_enum tables:
...
tbl_mem: select ts,sym,msg_type from oms_mem lj sym_mem;
tbl_mem_enum: .Q.en[`$sym_path] tbl_mem;
delete tbl_mem from `.;
(`$db;``!((17;2;9);(17;2;9))) set ([]ts:`time$(); ticker:`symbol$(); msg_type:`symbol$());
(`$db) upsert (select ts,ticker:sym,msg_type from tbl_mem_enum)
delete tbl_mem_enum from `.;
PS: I know, I shouldn't use "_" to name variables, but then what do I use to separate words in a variable or function name? . is also a kdb function.
I think you mean that your table contains symbol columns - these are the columns that you need to enumerate (strings don't need enumeration). You can do the write and enumeration in a single step. Also if you are using the same compression algo/level on all columns then it may be easier to just use .z.zd:
.z.zd:17 2 9i;
(`$db) set .Q.en[`$sym_path] select ts, ticker:sym, msg_type from oms_mem lj sym_mem;
It's generally recommended to use camelCase instead of '_'. Some useful info here: http://www.timestored.com/kdb-guides/q-coding-standards