kdb+/q: Prevent initialised tables from returning a comma

Consider the following example:
test:([] name:`symbol$(); secondColumn:`int$());
insert[`test;(`John;1)];
myvar:exec name from test;
myvar is now:
q)myvar
,`John
So to select the actual result, I have to do:
q)myvar[0]
`John
I understand this is because of the initialisation, so is there a way to make myvar contain the actual value immediately?

Indexing with [0] or applying first is the correct way if you want an "atomic" variable rather than a one-item list.
myvar:first exec name from test;
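The difference between the two results can be checked with type. The following is a minimal sketch that assumes the test table from the question; it uses the int literal 1i so that the inserted value matches the int column on versions where a bare 1 is a long.

```q
test:([] name:`symbol$(); secondColumn:`int$())
`test insert (`John;1i)          / 1i is an int literal, matching the column type
type exec name from test         / 11h: a symbol list, displayed as ,`John
type first exec name from test   / -11h: a symbol atom, displayed as `John
```

A negative type number denotes an atom, a positive one a simple list, so the leading comma in the display only marks a list of length one.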

What you are getting in myvar is a list with a single element. Such a list can be created in several ways in KDB:
q)enlist `John
,`John
q)(),`John
,`John
A KDB table is basically a flip of a dictionary of lists.
q)`name`secondColumn!(`John`James;1 2) /Dictionary of lists
name        | John James
secondColumn| 1    2
q)test2:flip `name`secondColumn!(`John`James;1 2)
q)test2
name  secondColumn
------------------
John  1
James 2
Both of the following commands achieve the same result:
q)exec name from test2
q)test2[`name]
`John`James
When you selected the name column from test using the exec command, it returned all the elements of that column (a list with one element).
Apart from the ways explained in the accepted answer, there are a few more (slightly different) ways to get the first element returned from the table.
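The relationship between a table and its column dictionary can be checked directly. This is a small sketch; the names d and t are placeholders introduced here for illustration.

```q
d:`name`secondColumn!(`John`James;1 2)   / dictionary of equal-length lists
t:flip d                                 / the same data viewed as a table
(flip t)~d                               / 1b: flipping back recovers the dictionary
t[`name]~exec name from t                / 1b: both return the whole column
```

Since a column is always a list, a one-row table yields a one-item list, which is why first or [0] is needed to obtain an atom.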
q)exec name[0] from test
q)test[`name][0]
q)exec first name from test

Related

Should I be able to assign in nested dictionaries in KDB?

According to the docs, the assignment of `lt below should have upsert mechanics:
q)s:()!()
q)s[`MSFT]:(`state`sym)!(`init`MSFT)
q)s
    | state sym
----| ----------
MSFT| init  MSFT
q)s[`MSFT][`lt]: 3
'assign
[0] s[`MSFT][`lt]: 3
^
But instead I get an error.
What am I doing wrong?
This goes back to the same problem you had before with typed dictionaries (dictionaries whose values are all the same type so kdb tries to keep it that way) - this time it's happening twice at two depths!
If you define:
s:()!()
s[`MSFT]:(`state`sym)!(`init`MSFT)
then kdb assumes the shape of the "values" based on your first insert to the dictionary. In this case, kdb enforces that any value in that dictionary (even the one for MSFT) is a dictionary with keys state and sym. That means you can't force a new shape on it by adding a third key (at least not in the way you're attempting to).
On top of that, the sub-dictionary that you've created is itself a typed dictionary whose values are all symbols, so kdb will force its values to stay symbols (i.e. you can't suddenly make 3 a value).
The final issue here is the one Matthew pointed out: you can't assign using double brackets [][]. You can only assign with a single pair of brackets (and index at depth if necessary).
Putting all of this together:
/define s to allow generic datatype
q)s:(1#`)!enlist[::]
/also don't allow the inner dictionary to be typed
q)s[`MSFT]:(``state`sym)!(::;`init;`MSFT)
/now you can assign
q)s[`MSFT;`lt]:3
q)s
    | ::
MSFT| ``state`sym`lt!(::;`init;`MSFT;3)
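The effect of the sentinel entry can be seen on a single dictionary. The sketch below is illustrative only; the names d, r and g are placeholders, and the failed assignment is wrapped in a protected evaluation so the script keeps running.

```q
d:`state`sym!`init`MSFT        / values form a symbol vector
type value d                   / 11h
r:@[{x[`lt]:3;x};d;{x}]        / protected evaluation: the assignment signals an error, r holds its message
g:(`,key d)!(::),value d       / same entries plus a null-key sentinel holding (::)
type value g                   / 0h: a general list, so mixed value types are allowed
g[`lt]:3                       / now succeeds
```

Because the values of g are a general list rather than a symbol vector, adding a long under a new key no longer conflicts with the existing value type.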
First, a keyed table is a map from a table to a table, so what you're indexing into s with needs to itself be a table (or, for a single row, a dictionary). For example:
(enlist `) ! enlist `MSFT
Second, if you are starting with an empty keyed table, you need to enlist the key and value.
q)s: () ! ()
q)s[enlist (enlist `) ! enlist `MSFT]: enlist (`state`sym) ! `init`MSFT
q)s
    | state sym
----| ----------
MSFT| init  MSFT
When your table is no longer empty, you don't need to enlist the key and value.
q)s[(enlist `) ! enlist `GOOG]: (`state`sym) ! `init`GOOG
q)s
    | state sym
----| ----------
MSFT| init  MSFT
GOOG| init  GOOG

Given an existing table, generate code for defining an empty table with the same schema

I would like to take an existing table that I deserialized from a binary file or obtained from a remote process and generate code that will create an empty copy of the table so that I can have a human-readable representation of the schema that I can use to easily re-create the table.
For example, assume I have a trade table in memory and I want to generate code that will return an empty table of the same schema.
q)show meta trade
c    | t f a
-----| -----
time | n
sym  | s   g
price| f
size | i
stop | b
cond | c
ex   | c
I'm aware I can obtain an empty copy of trade by running 0#trade. However, I'd like to have a general function (let's say it's called getSchema) that will behave something like this:
q) getSchema trade
"trade:([]time:`timespan$(); sym:`g#`symbol$(); price:`float$(); size:`int$(); stop:`boolean$(); cond:`char$(); ex:`char$())"
I think it would be straightforward to implement this by processing the result of meta trade, but I was wondering if there was a more straightforward or publicly available implementation of this function. Thanks.
I haven't seen such a function publicly available. An example is below, but it does not cover all cases (these are left as an exercise).
getSchema: {
  / map meta type characters to type names
  typeMapping: "nsfibc"!("timespan";"symbol";"float";"int";"boolean";"char");
  c: exec c from meta x;
  t: exec t from meta x;
  statement: string[x],":([]";
  / look up each type character, then build "col: `type$()" per column
  statement,: "; " sv string[c],'": `",/:(typeMapping t),\:"$()";
  statement,: ")";
  statement
  };
//Expects table's name as symbol
getSchema`trade
What is not covered:
attributes: the attribute should go in the middle of the "; " sv string[c],'": _code_for_attribute_ `",/:(typeMapping t),\:"$()" statement.
types: typeMapping must be enriched to cover the rest of the q types.
keys: if the table is keyed, the key columns are listed inside square brackets, t: ([keyed_columns] other_columns).
foreign keys: to be fair they are seldom in use, so I would ignore them.
The following should work nicely; however, it will only render atomic types, which conforms to how tables are normally defined. This method just uses the string representation of the empty schema that you can obtain via -3!
This should handle keyed tables and attributes but not foreign keys.
getSchema:{[x]
  / Render the types with -3!
  typs:ssr[-3!value flip 0!0#get x;"\"\"";"`char$()"];
  / Append the column names to the types
  colNameType:(string[cols 0!get x],\:":"),'";" vs 1_-1_typs;
  / Ensure the correct keys are shown (amend the last key column with "]"), add in ;
  colnametyp:@[colNameType;count[keys x] - 1;,;"]"],\:";";
  / Combine
  raze string[x],":([",(-1_raze colnametyp),")"
  }
q)trade:2!([]time:`timespan$(); sym:`g#`symbol$(); price:`float$(); size:`int$(); stop:`boolean$(); cond:`char$(); ex:`char$());
q)getSchema `trade
"trade:([time:`timespan$();sym:`g#`symbol$()];price:`float$();size:`int$();stop:`boolean$();cond:`char$();ex:`char$())"

Passing a column name as an argument for KDB select query?

I would like to pass a column name into a Q function to query a loaded table.
Example:
getDistinct:{[x] select count x from raw}
getDistinct "HEADER"
This doesn't work, as the Q documentation says I cannot pass columns as arguments. Is there a way around this?
When q interprets x it treats it as a string with no reference to the column, so your query would just evaluate count "HEADER".
If you want to pass in the column as a string, you need to build the whole select statement as a string and then use value:
{value "select count ",x," from tab"} "HEADER"
However, the recommended method would be to use a functional select. Below I use parse to build the functional select equivalent using the parse tree.
/Create sample table
tab:([]inst:10?`MSFT`GOOG`AAPL;time:10?.z.p;price:10?10f)
/Generate my parse tree to get my functional form
.Q.s parse "select count i by inst from tab"
/Build this into my function
{?[`tab;();(enlist x)!enlist x;(enlist `countDistinct)!enlist (#:;`i)]} `inst
Note that you have to pass the column in as a symbol. Additionally, (#:;`i) is just the k equivalent of count i.
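As a hedged sketch of the same idea written with the q keyword count instead of #:, the two functions below take a column name as a symbol. The table contents and the names cnt, cntBy and n are assumptions made for illustration, not part of the original answer.

```q
/ Hypothetical sample table
tab:([] inst:`MSFT`GOOG`MSFT; price:1 2 3f)

/ Count of one column, with the column passed as a symbol
cnt:{[c] ?[`tab; (); 0b; (enlist c)!enlist (count;c)]}

/ Row count grouped by one column
cntBy:{[c] ?[`tab; (); (enlist c)!enlist c; (enlist `n)!enlist (count;`i)]}

cnt[`inst]   ~ select count inst from tab          / 1b
cntBy[`inst] ~ select n:count i by inst from tab   / 1b
```

Comparing each result against the equivalent qSQL statement with ~ is a quick way to confirm that a hand-written functional form does what the original query did.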
Update for multiple columns
tab:([]inst:10?`MSFT`GOOG`AAPL;time:10?.z.p;price:10?10f;cntr:10?`HK`SG`UK`US)
{?[`tab;();(x)!x;(enlist `countDistinct)!enlist (#:;`i)]} `inst`cntr
To get the functional form of a select statement, I recommend using buildSelect. Also, reduce the scope of parentheses, i.e. use enlist[`countDistinct] instead of (enlist `countDistinct).

Access locally scoped variables from within a string using parse or value (KDB / Q)

The following lines of Q code all throw an error, because when the statement "local" is parsed, the local variable is not in the correct scope.
{local:1; value "local"}[]
{[local]; value "local"}[1]
{local:1; eval parse "local"}[]
{[local]; eval parse "local"}[1]
Is there a way to reach the local variable from inside the parsed string?
Note: This is a simplification of the actual problem I'm grappling with, which is to write a function that executes a query, accepting a list of columns which it should return. I imagine the finished product looking something like this:
getData:{[requiredColumns; condition]
  value "select ",(", " sv string[requiredColumns])," from myTable where someCol=condition"
  }
The condition parameter in this query is the one that isn't recognised, and I do realise I could append its value rather than reference it inside a string, but the real query uses lots of local variables including tables etc., so it's not as easy as just pulling all the variables out of the string before calling value on it.
I'm new to KDB and Q, so if anyone has a better way to achieve the same effect I'm happy to be schooled on the proper way to achieve this outcome in Q. I would still be interested to know if the variable access thing is possible though.
In the first example, you are right that local is not within the correct scope, as value is looking for the global variable local.
One way to get around this is to use a namespace, which will define the variable globally, but can only be accessed by calling that namespace. In the modified example below I have defined local in the .ns namespace
{.ns.local:1; value ".ns.local"}[]
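An alternative that avoids creating any global, sketched here under the assumption that the string can be written as a function definition, is to let value turn the string into a lambda and pass the local variable to it as an argument. The names f and v are placeholders.

```q
/ Hypothetical example: the string defines a function of one argument,
/ and the local value is handed to it explicitly.
f:{[local] (value "{[v] v+1}") local}
f 1    / 2
```

The string itself still cannot see local, but since value returns an ordinary function, any number of locals can be supplied this way as arguments.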
For the problem you are facing with selecting, if requiredColumns is a symbol list of columns you can just use the take operator # to select them.
getData:{[requiredColumns] requiredColumns#myTable}
For more advanced queries using variables you may have to use the functional form of select. This allows you to include variables in the where and by clauses of the select statement.
The same example in functional form would be (no by clause, only select and where):
getData:{[requiredColumns;condition] requiredColumns:(), requiredColumns;
?[myTable;enlist (=;`someCol;condition);0b;requiredColumns!requiredColumns]}
The first line ensures that requiredColumns is a list even if the user enters a single column name.
value looks for a variable in the global scope, which is why you are getting an error. Outside of a string, you can use local variables directly in your function as usual.
Your function is mostly correct and just needs a slight correction to append the value of condition (shown below). However, a better approach would be to use a functional select in this case.
Using functional select:
q) t:([]id:`a`b; val:3 4)
q) gd: {?[`t;enlist (=;`val;y);0b;((),x)!(),x]}
q) gd[`id;3] / for single column
Output:
id
--
a
q) gd[`id`val;3] / for multiple columns
In case your condition column is of type symbol, then enlist your condition value like:
q) gd: {?[`t;enlist (=;`id;y);0b;((),x)!(),x]}
q) gd[`id;enlist `a]
You can use parse to get a functional form of qsql queries:
q) parse " select id,val from t where id=`a"
?
`t
,,(=;`id;,`a)
0b
`id`val!`id`val
Using string concatenation (your function):
q)getData:{[requiredColumns;condition] value "select ",(", " sv string[requiredColumns])," from t where id=", .Q.s1 condition}
q) getData[enlist `id;`a] / for single column
q) getData[`id`val;`a] / for multi columns

Kdb+/q: Looping over initialised list

Let's say I create an empty table:
test:([] name:`symbol$(); balance:`int$());
Now let's populate this table with one row:
insert[`test;(`John;1001)];
Now if I want to loop over this table as follows:
n:0;
k:0;
f:{x%100}
do[count test; k+:f[test.balance[n]]; n+:1]
This gave me an error because it tried to evaluate the function f on the empty initialisation value.
Is there any particular reason why this doesn't work?
And how can I make sure it does work?
What you're doing may work but it's far from best practice. Loops and indices are not the way to go.
What you're looking for is essentially
test:([] name:`symbol$(); balance:`int$());
insert[`test;(`John;1001)];
insert[`test;(`Jane;2002)];
q)select sum f[balance] from test
balance
-------
30.03
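For comparison, here is a small sketch of the vector form next to an explicit loop. It assumes a two-row table built directly rather than by insert, and the names total, k and n are placeholders.

```q
test:([] name:`John`Jane; balance:1001 2002i)
f:{x%100}
total:sum f test`balance                      / f applied to the whole column at once
k:0f; n:0
do[count test; k+:f test[`balance][n]; n+:1]  / explicit loop, for comparison only
k~total
```

Because f is built from atomic operations, it can be applied to the whole balance column in one call, so no index variable or accumulator is needed.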