What is the y variable in tickerplant code function? - kdb

Can someone explain this function in u.q in kdb+tick
del:{w[x]_:w[x;;0]?y};.z.pc:{del[;x]each t};
Questions:
1. What does it do ?
2. Where is y coming from
3. Any sample calling code ?

You need to look at this code in combination with tick.q. Note that functions and variables in the u.q script are stored in the .u namespace, per the line \d .u.
From tick.q -
globals used
.u.w - dictionary of tables->(handle;syms)
.u.i - msg count in log file
.u.j - total msg count (log file plus those held in buffer)
.u.t - table names
.u.L - tp log filename, e.g. `:./sym2008.09.11
.u.l - handle to tp log file
.u.d - date
You mention two functions - del and .z.pc.
.z.pc is called after a connection has been closed. See the link for information on the parameters supplied.
In this case .z.pc is defined to call the del function with params [;x] each t when a connection is closed. From tick.q we can see that t (.u.t) is a list of table names. From .z.pc definition we know that x is the handle to that connection.
So we call del with [;connection handle] each tables. Within the del function, the table will correspond to x, and the connection handle will be y (implicit paramaters).
The code inside of the del function deletes the handle(y) from the subscription list (w - or .u.w) for table x.
There's a lot of information on tick.q available online here. The FD guide linked to at the end is particularly thorough.

Related

kdb - passing in a symbol as parameter for variable name

I've been trying to pass in a symbol from a process outside KDB and am wishing to use this symbol to create a variable where we then load a specific file in case it doesn't exist.
//create some sample data and write to disk
.data.area52.varName1: ([] name:`billy`allen`terry; val:90 100 200);
.data.area52.varName2: 10 #200;
.data.area53.varName1: 9 #100;
.data.area53.varName2: ([] name:`joe`jim`bob; val:10 20 30);
.data.area53.varName3: `random;
`:area52 set .data.area52;
`:area53 set .data.area52;
I am trying to pass in the parameter for reference area52for example which will test a conditional if it exists (do nothing, else) then create the variable loading viaget `:area52
First, I wrote a conditional below which seems to check (however trhe `.areaXX isn't parametrized yet.
// fresh instance of KDB
.data.area52: $[`area52 in key `.data; .data.area52; get `:area52
And then stumbled onto the link here kdb+: use string as variable name
Which gets me most of the way there.
Is there a way to parameterize the beginning of the lambda below passing in some combination of `.area52 ? Much of this variable can be assembled / edited outside KDB and only passing in the
for an example, we can have several hundred `.areaXX that we could pass into KDB as data is changed and refreshed.
{.data.area52:()!(); #[`.data; `area52;:; (get `:area52)]} [];
To safeguard against the situation where the namespace .data doesn't exist you could use the following lambda which takes parameter area52 as an argument.
q){(` sv ``data,x) set ()!(); #[`.data; x; :; get hsym x]}`area52
`.data
Tying it back to the original problem, this lambda will create the variable if it does not exist by reading from the file of the same name:
/ .data does not exist yet
q).data
'.data
[0] .data
^
/ create variable if necessary
q){if[not x in key `.data; (` sv ``data,x)set get hsym x]}`area52
/ check .data
q).data
| ::
area52| ``varName1`varName2!(::;+`name`val!(`billy`allen`terry;90 100 200);20..

Value within a function seems to not be able to detect the local variables and also fails

I want to be able to run something like the following:
f:{[dt] syms:`sym1;eval parse"select from tbl where date = dt, sym=syms"}
f[.z.D]
Given the following :
tbl:([] date:2022.01.01 2022.01.01; Id:1000000 2000000; sym:`sym1`sym2;price:10 20;qty:3 4)
f:{[dt] syms:`sym1; ?[tbl;((=;`date;`dt);(=;`sym;`syms));0b;()]}
f1:{[dt] syms:`sym1; (?) . (tbl;((=;`date;`dt);(=;`sym;`syms));0b;())}
f2:{[dt] syms:`sym1; value (?;tbl;((=;`date;`dt);(=;`sym;`syms));0b;())}
f[.z.D] // works
f1[.z.D] // Gives Error - dt not recognized/out of scope
f2[.z.D] // Gives Error - dt not recognized/out of scope
Value within a function seems to not be able to detect the local variables and surprisingly (?) . also fails. (maybe because this in itself is a function and dt is not defined here?)
Is there any work around for this?
For context, I have a function that takes a select string/functional select, parses it, does some checks and manipulations on the functional form and returns a modified functional form.
I want users to be able to call this function from their own functions and that parameters they have defined in their function can be in the outputted functional form and that functional form can be valued some how.
I don't want users to be forced to pass more variables into my function etc.
What you need to do here is remove the backtick for dt and syms
I would also recommend using a backtick when calling your table name.
Further, you should make sure syms is enlisted if it is only one symbol.
So your function should be:
f:{[dt] syms:(),`sym1; ?[`tbl;((=;`date;dt);(=;`sym;syms));0b;()]}
If you parse your select statement you can see the correct form for functional selects:
q)parse "select from tbl where date=2022.01.01,sym=`sym1"
?
`tbl
,((=;`date;2022.01.01);(=;`sym;,`sym1)) // comma in front of `sym1 means enlist
0b
()
The backtick is not needed as this is a variable, defined in your function, it would be the same as doing:
?[`tbl;((=;`date;2022.01.01);(=;`sym;enlist `sym1));0b;()]
This should allow you to use your function correctly:
q)f[2022.01.01]
date Id sym price qty
---------------------------------
2022.01.01 1000000 sym1 10 3
For more information, see the kx documentation

How to use DISP=SHR on an fopen

With code like:
fopen("DD:LOGLIBY(L1234567)", "w");
and JCL like:
//LOGTEST EXEC PGM=LOGTEST
//LOGLIBY DD DSN=MYUSER.LOG.LIBY,DISP=SHR
I can create PDS(E) members while at the same time browsing the PDS(E) to look at existing members, as expected with DISP=SHR.
If instead I code:
fopen("//'MYUSER.LOG.LIBY(L1234567)'", "w");
The fopen fails if I am browsing the PDS(E) at the time, or the browse of the PDS(E) fails while I have the file open. In other words there is no DISP=SHR. According to the fopen() documentation, DISP=SHR is the default when using file modes of "r" etc, but not "w".
How can I provide DISP=SHR in the second example?
There are two possibilities...
For anyone not familiar with the internal structure of partitioned datasets (that is, PDS or PDS/E), these datasets are logically divided into two parts: a "directory" having pointers to all the individual members, and a "data" area containing the actual records for the individual members:
PDS: <DIRECTORY BLOCKS>
<MEMBER1>: ADDRESS OF DATA FOR MEMBER1 (xxx)
<MEMBER2>: ADDRESS OF DATA FOR MEMBER2 (yyy)
...
<DIRECTORY FREESPACE)
...
<EOF - END OF THE PDS DIRECTORY>
<DATA PORTION>
+xxx = DATA FOR MEMBER1
...
<EOF - END OF MEMBER1>
+yyy = DATA FOR MEMBER2
...
<EOF - END OF MEMBER2>
...
FREE SPACE (ALLOCATED, BUT UNUSED)
...
END OF PDS
Throughout the next few paragraphs, keep in mind that you can open either the entire PDS/PDSE, and this enables you to read/write whatever members you like, or you can allocate and open a single member, and that gets processed like any other sequential file.
First, if you actually to have a DD statement coded as you show in the question, then you may simply need to change your open from fopen(dsn,...) to fopen(dd:ddname,...). If you're running under the UNIX Shell or you do something that results in your process running in a different address space (such as fork()), then this might not work, but it may be worth a try. If you do this with the JCL you show, the challenge would be managing the PDS/E directory - you'd need to issue your own "STOW" when you create/update a new member since the JCL allocates the entire dataset, not just a single member. The sequence would be:
Open the DD for output.
Write your data.
Update the PDS or PDS/E directory with new member information (this is where the STOW function comes in - it updates the directory of the PDS/PDSE to reflect the member you created or updated).
Close the file
If you also need to read members, you'd need to issue FIND (or BLDL/POINT - which can be fseek() in C) to point to the correct member, then read the member. I'm sure it sounds like a hassle, but the advantage of this approach is that you can allocate/open the file once, and process as many individual members as you like.
A second workaround might be to dynamically allocate the file yourself, and then open it using DD:ddname syntax...if you only infrequently access the file, this is probably easier to code. The gory details of dynamic allocation are fully described here: https://www.ibm.com/support/knowledgecenter/SSLTBW_2.4.0/com.ibm.zos.v2r4.ieaa800/reqsvc.htm.
There are several ways to invoke dynamic allocation: you can write a small assembler program, you can use the z/OS UNIX Services BPXWDYN callable service, or you can use the C runtime "dynalloc()" or "svc99()" functions. The dynalloc() function is easy to use, but it only exposes a subset of what dynamic allocation can do...svc99() is more cumbersome to use, but it exposes more functionality.
However you do it, dynamic allocation takes "text units" that roughly correspond to the parameters you find on JCL DD statements. What you're describing sounds like you'd just need to pass DSN and DISP text units, and maybe DDNAME (you can either pass your own DDNAME, or let the system generate one for you).
The C runtime functions make all this easy, but be aware that there are a few oddities, such as the need to pad the parameters to their maximum length. For example, a DSN needs to be 44 characters and padded on the right with blanks - not a C-style null-terminated string.
Here's a small code snippet as an example:
#include <dynit.h>
. . .
int allocate(ddn, dsn, mem)
{
__dyn_t ip; // Parameters to dynalloc()
. . .
// Prepare the parameters to dynalloc()
dyninit(&ip); // Initialize the parameters
ip.__ddname = ddn; // 8-char blank-padded
ip.__dsname = dsn; // 44-char blank-padded
ip.__status = __DISP_SHR; // DISP=(SHR)
ip.__normdisp = __DISP_KEEP; // DISP=(...,KEEP)
ip.__misc_flags = __CLOSE; // FREE=CLOSE
if (*mem) // Optional PDS, PDS/E member
ip.__member = mem; // 8-char blank-padded
// Now we can call dynalloc()...
if (dynalloc(&ip)) // 0: Success, else error
{
// On error, the errcode/infocode explain why - values
// are detailed in z/OS Authorized Services Reference
printf("SVC99: Can't allocate %s - RC 0x%x, Info 0x%x\n",
dsn, ip.__errcode, ip.__infocode);
return FALSE;
}
// If dynalloc works, you can open the file with fopen("DD:ddname",...)
}
Don't forget that when you're done with the file, you generally need to deallocate it.
The code snippet above uses "FREE=CLOSE" - this means that when the file is closed, z/OS will automatically free the allocation...if you only open and process the dataset once, this is a convenient approach. If you need to repeatedly open and close the file, then you wouldn't use FREE=CLOSE, but instead call dynamic allocation a second time after you're done with your processing and want to free the file.
If you need to concurrently access multiple files, beware that you'll need to generate multiple unique DDNAMEs. You can either do this in your own code, or you can use the form of dynamic allocation that automatically builds and returns a usable DDNAME (of the form "SYSnnnnn").
Also, don't forget that updating a dataset under DISP=SHR can be dangerous in some situations, especially if the dataset involved can be a conventional PDS as well as a PDS/E. The big danger is that two applications open the dataset for output concurrently...both will write data into the same place, and the result will likely be a damaged PDS directory.
There are some other oddities in the UNIX Services environment, particularly if you use fork() or exec() and expect file handles to work in subprocesses since allocations are generally tied to a particular z/OS address space. Services like spawn() can let the child process run in the same address space, so this is one possibility.

Access locally scoped variables from within a string using parse or value (KDB / Q)

The following lines of Q code all throw an error, because when the statement "local" is parsed, the local variable is not in the correct scope.
{local:1; value "local"}[]
{[local]; value "local"}[1]
{local:1; eval parse "local"}[]
{[local]; eval parse "local"}[1]
Is there a way to reach the local variable from inside the parsed string?
Note: This is a simplification of the actual problem I'm grappling with, which is to write a function that executes a query, accepting a list of columns which it should return. I imagine the finished product looking something like this:
getData:{[requiredColumns, condition]
value "select ",(", " sv string[requiredColumns])," from myTable where someCol=condition"
}
The condition parameter in this query is the one that isn’t recognised and I do realise I could append it’s value rather than reference it inside a string, but the real query uses lots of local variables including tables etc, so it’s not as easy as just pulling all the variables out of the string before calling value on it.
I'm new to KDB and Q, so if anyone has a better way to achieve the same effect I'm happy to be schooled on the proper way to achieve this outcome in Q. Would still be interested to know in the variable access thing is possible though.
In the first example, you are right that local is not within the correct scope, as value is looking for the global variable local.
One way to get around this is to use a namespace, which will define the variable globally, but can only be accessed by calling that namespace. In the modified example below I have defined local in the .ns namespace
{.ns.local:1; value ".ns.local"}[]
For the problem you are facing with selecting, if requiredColumns is a symbol list of columns you can just use the take operator # to select them.
getData:{[requiredColumns] requiredColumns#myTable}
For more advanced queries using variables you may have to use functional select form, explained here. This will allow you to include variables in the where and by clause of the select statement
The same example in functional form would be (no by clause, only select and where):
getData:{[requiredColumns;condition] requiredColumns:(), requiredColumns;
?[myTable;enlist (=;`someCol;condition);0b;requiredColumns!requiredColumns]}
The first line ensures that requiredColumns is a list even if the user enters a single column name
value will look for a variable in the global scope that's why you are getting an error. You can directly use local variables like you are doing that in your function.
Your function is mostly correct, just need a slight correction to append condition(I have mentioned that below). However, a better approach would be to use functional select in this case.
Using functional select:
q) t:([]id:`a`b; val:3 4)
q) gd: {?[`t;enlist (=;`val;y);0b;((),x)!(),x]}
q) gd[`id;3] / for single column
Output:
id
-
1
q) gd[`id`val;3] / for multiple columns
In case your condition column is of type symbol, then enlist your condition value like:
q) gd: {?[`t;enlist (=;`id;y);0b;((),x)!(),x]}
q) gd[`id;enlist `a]
You can use parse to get a functional form of qsql queries:
q) parse " select id,val from t where id=`a"
?
`t
,,(=;`id;,`a)
0b
`id`val!`id`val
Using String concat(your function):
q)getData:{[requiredColumns;condition] value "select ",(", " sv string[requiredColumns])," from t where id=", .Q.s1 condition}
q) getData[enlist `id;`a] / for single column
q) getData[`id`val;`a] / for multi columns

Call graph generation from matlab src code

I am trying to create a function call graph for around 500 matlab src files. I am unable to find any tools which could help me do the same for multiple src files.
Is anyone familiar with any tools or plugins?
In case any such tools are not available, any suggestions on reading 6000 lines of matlab code
without documentation is welcome.
Let me suggest M2HTML, a tool to automatically generate HTML documentation of your MATLAB m-files. Among its feature list:
Finds dependencies between functions and generates a dependency graph (using the dot tool of GraphViz)
Automatic cross-referencing of functions and subfunctions with their definition in the source code
Check out this demo page to see an example of the output of this tool.
I recommend looking into using the depfun function to construct a call graph. See http://www.mathworks.com/help/techdoc/ref/depfun.html for more information.
In particular, I've found that calling depfun with the '-toponly' argument, then iterating over the results, is an excellent way to construct a call graph by hand. Unfortunately, I no longer have access to any of the code that I've written using this.
I take it you mean you want to see exactly how your code is running - what functions call what subfunctions, when, and how long those run for?
Take a look at the MATLAB Code Profiler. Execute your code as follows:
>> profile on -history; MyCode; profile viewer
>> p = profile('info');
p contains the function history, From that same help page I linked above:
The history data describes the sequence of functions entered and exited during execution. The profile command returns history data in the FunctionHistory field of the structure it returns. The history data is a 2-by-n array. The first row contains Boolean values, where 0 means entrance into a function and 1 means exit from a function. The second row identifies the function being entered or exited by its index in the FunctionTable field. This example [below] reads the history data and displays it in the MATLAB Command Window.
profile on -history
plot(magic(4));
p = profile('info');
for n = 1:size(p.FunctionHistory,2)
if p.FunctionHistory(1,n)==0
str = 'entering function: ';
else
str = 'exiting function: ';
end
disp([str p.FunctionTable(p.FunctionHistory(2,n)).FunctionName])
end
You don't necessarily need to display the entrance and exit calls like the above example; just looking at p.FunctionTable and p.FunctionHistory will suffice to show when code enters and exits functions.
There are already a lot of answers to this question.
However, because I liked the question, and I love to procrastinate, here is my take at answering this (It is close to the approach presented by Dang Khoa, but different enough to be posted, in my opinion):
The idea is to run the profile function, along with a digraph to represent the data.
profile on
Main % Code to be analized
p = profile('info');
Now p is a structure. In particular, it contains the field FunctionTable, which is a structure array, where each structure contains information about one of the calls during the execution of Main.m. To keep only the functions, we will have to check, for each element in FunctionTable, if it is a function, i.e. if p.FunctionTable(ii).Type is 'M-function'
In order to represent the information, let's use a MATLAB's digraph object:
N = numel(p.FunctionTable);
G = digraph;
G = addnode(G,N);
nlabels = {};
for ii = 1:N
Children = p.FunctionTable(ii).Children;
if ~isempty(Children)
for jj = 1:numel(Children)
G = addedge(G,ii,Children(jj).Index);
end
end
end
Count = 1;
for ii=1:N
if ~strcmp(p.FunctionTable(ii).Type,'M-function') % Keep only the functions
G = rmnode(G,Count);
else
Nchars = min(length(p.FunctionTable(ii).FunctionName),10);
nlabels{Count} = p.FunctionTable(ii).FunctionName(1:Nchars);
Count = Count + 1;
end
end
plot(G,'NodeLabel',nlabels,'layout','layered')
G is a directed graph, where node #i refers to the i-th element in the structure array p.FunctionTable where an edge connects node #i to node #j if the function represented by node #i is a parent to the one represented by node #j.
The plot is pretty ugly when applied to my big program but it might be nicer for smaller functions:
Zooming in on a subpart of the graph:
I agree with the m2html answer, I just wanted to say the following the example from the m2html/mdot documentation is good:
mdot('m2html.mat','m2html.dot');
!dot -Tps m2html.dot -o m2html.ps
!neato -Tps m2html.dot -o m2html.ps
But I had better luck with exporting to pdf:
mdot('m2html.mat','m2html.dot');
!dot -Tpdf m2html.dot -o m2html.pdf
Also, before you try the above commands you must issue something like the following:
m2html('mfiles','..\some\dir\with\code\','htmldir','doc_dir','graph','on')
I found the m2html very helpful (in combination with the Graphviz software). However, in my case I wanted to create documentation of a program included in a folder but ignoring some subfolders and .m files. I found that, by adding to the m2html call the "ignoreddir" flag, one can make the program ignore some subfolders. However, I didn't find an analogue flag for ignoring .m files (neither does the "ignoreddir" flag do the job). As a workaround, adding the following line after line 1306 in the m2html.m file allows for using the "ignoreddir" flag for ignoring .m files as well:
d = {d{~ismember(d,{ignoredDir{:}})}};
So, for instance, for generating html documentation of a program included in folder "program_folder" but ignoring "subfolder_1" subfolder and "test.m" file, one should execute something like this:
m2html( 'mfiles', 'program_folder', ... % set program folder
'save', 'on', ... % provide the m2html.mat
'htmldir', './doc', ... % set doc folder
'graph', 'on', ... % produce the graph.dot file to be used for the visualization, for example, as a flux/block diagram
'recursive', 'on', ... % consider also all the subfolders inside the program folders
'global', 'on', ... % link also calls between functions in different folders, i.e., do not link only the calls for the functions which are in the same folder
'ignoreddir', { 'subfolder_1' 'test.m' } ); % ignore the following folders/files
Please note that all subfolders with name "subfolder_1" and all files with name "test.m" inside the "program_folder" will be ignored.