kdb+/q tickerplant: what is `timesym?

I am following the test tickerplant and feedhandler setup instructions. However, when I try to run q tick.q someTable tick_log -p 5555, I get the following error: 'timesym. There is nothing about it in the above-mentioned "C API for kdb+" white paper. What I did:
downloaded https://github.com/KxSystems/kdb-tick/blob/master/tick.q
mkdir tick
put u.q from https://github.com/KxSystems/kdb-tick/blob/master/tick/u.q into tick/
put someTable schema into tick/someTable.q (contains sym and time columns)
mkdir tick_log
ran the above q tick.q someTable tick_log -p 5555 command
Could you please help me understand what the timesym variable means and how I should supply it? Am I missing some steps?
Thank you very much for your help!

'timesym is an error thrown by the tickerplant if the first two columns of the table being consumed are not time and sym respectively. You can spot where the error is thrown on line 30 of tick.q, in the .u.tick function (search for timesym).
To resolve this issue, ensure time and sym are the first two columns of the table, in that order. Alternatively, you could change the tickerplant code to suit your table.
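For example, a minimal tick/someTable.q schema that would pass the check might look like this (the price and size columns are just placeholders for whatever the real table holds):
/ tick/someTable.q - time and sym must be the first two columns, in this order
someTable:([]time:`timespan$();sym:`symbol$();price:`float$();size:`long$())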

Related

how to select the rows between two different time spans in pgadmin4

I have a table in pgAdmin 4; one of the columns is a date-time. I want to extract all rows where the hour of the date-time is greater than or equal to 0 and smaller than 6, and also where it is between 18 and 23.
I tried the following query but the result was an empty table.
select *
from tl
where date_part('hour', tl.tstamp) >= 0 and date_part('hour', tl.tstamp) < 6
and date_part('hour', tl.tstamp) >= 18 and date_part('hour', tl.tstamp) <= 23;
Each of these time spans works well on its own, but when I try to execute them together I get the empty result mentioned above.
I would appreciate it if somebody could tell me what the problem is and how I can fix it.
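For what it's worth, no single hour can fall in both ranges at once, so the two conditions presumably need to be combined with OR rather than AND. A sketch against the same table and column (not a tested answer, just the obvious adjustment):
select *
from tl
where (date_part('hour', tl.tstamp) >= 0 and date_part('hour', tl.tstamp) < 6)
   or (date_part('hour', tl.tstamp) >= 18 and date_part('hour', tl.tstamp) <= 23);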

Definition of COBOL variable to use for DB2 date arithmetic

I have cases where I want to add or subtract a variable number of days from a timestamp.
The simplest example is this:
SELECT CURRENT_TIMESTAMP - :MOD-DAY DAY
INTO :MYTIMESTAMP
FROM SYSIBM.SYSDUMMY1
My problem is figuring out the right COBOL definition for MOD-DAY.
As far as I'm aware, we are running DB2 version 11.
According to https://www.ibm.com/support/knowledgecenter/SSEPEK_11.0.0/sqlref/src/tpc/db2z_datearithmetic.html
the DB2 definition of the variable must be DECIMAL(8,0)
That could be 9(08) or S9(08), but in both cases, and with any other variation I have thought up so far, I get the compile error
DSNH312I E DSNHSMUD LINE 1181 COL 49 UNDEFINED OR UNUSABLE HOST VARIABLE "MOD-DAY"
I have of course made certain that MOD-DAY has been defined, so the operative word must be UNUSABLE.
The error code definition for DSNH312I is pretty generic:
https://www.ibm.com/support/knowledgecenter/SSEPEK_10.0.0/msgs/src/tpc/dsnh312i.html
So does anyone know the right COBOL variable definition to use in this case?
DECIMAL in mainframe DB2 means COMP-3, so the field should be defined as S9(08) COMP-3.
If you look at the COBOL copybooks generated by DB2 for DB2 tables/views, you will see both the DB2 definition and the generated COBOL fields. That can be another way to answer questions like this.
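A minimal sketch of how that might look in the program (the level numbers, the PIC X(26) timestamp host variable and the literal 5 are assumptions, not from the original post):
       WORKING-STORAGE SECTION.
      * Day offset: DB2 DECIMAL(8,0) maps to signed packed decimal.
       01  MOD-DAY          PIC S9(08) COMP-3.
      * Receiving host variable for the timestamp (26-character form).
       01  MYTIMESTAMP      PIC X(26).

       PROCEDURE DIVISION.
           MOVE 5 TO MOD-DAY
           EXEC SQL
               SELECT CURRENT_TIMESTAMP - :MOD-DAY DAY
                 INTO :MYTIMESTAMP
                 FROM SYSIBM.SYSDUMMY1
           END-EXEC.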

SQL Server performance function vs no function

I have a query (a relationship between CONTRACT <-> ORDERS) that I decided to break up into two parts (contract and orders) so I can reuse one of them in another stored procedure.
Before the break-up the code took around 10 seconds to run; however, when I use a function to get the contract, pump its data into a temp table first and then join to the other parts, it takes 2m 30s. Why the difference in time?
The function takes less than a second to run and returns only one row i.e. details of one contract (contract_id is the parameter supplied to the function).
The part most affecting the performance is ORDERS, the largest table in the query, with 4.1 million rows and joins to a few other tables. However, if I run the orders subquery in isolation with a particular filter (i.e. the contract id), it takes less than a second and happens to return zero records for the contract I am testing on (due to filtering on the type of order it is looking for).
Based on the above information you would think 1 second at most for the function + 1 second at most to get the orders + summarising = 2 seconds at most, not two and a half minutes!
Where am I going wrong, and how do I begin to isolate the cause of the time difference?
I know someone is going to tell me to paste the code, but surely it is a question of the database vs indexes, or perhaps of how the optimiser treats raw code versus code broken up into parts. Is there an area of the code I can look at before having to post the whole thing? I have tried variations of OUTER APPLY vs LEFT JOIN from the contract temp table to the orders subquery and both give me about the same result. Any ideas?
I don't think the issue was with the code but with the network I was running it on, although it is bizarre: I had two versions of the proc running side by side, and before the weekend one was running in 10 seconds (it is still running in 10 seconds three days later) while my new version (using the function) was taking anywhere between 2 and 3 minutes. This morning it is running in 2 or 3 seconds! So I don't know whether switching from SELECT ... INTO #Contract to declaring the table structure first and using a table variable made the difference, or the network, or precompilation. Whatever it was, it is no longer an issue. Should I delete this post?
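For reference, the two patterns contrasted above look roughly like this; the names (#Contract, @Contract, dbo.fn_GetContract, @contract_id) and columns are illustrative guesses, not the poster's actual code:
DECLARE @contract_id INT = 123;            -- illustrative parameter

-- Pattern 1: materialise the function result with SELECT ... INTO a temp table
SELECT *
INTO   #Contract
FROM   dbo.fn_GetContract(@contract_id);

-- Pattern 2: declare the structure up front and fill a table variable
DECLARE @Contract TABLE (contract_id INT PRIMARY KEY /* , other columns */);
INSERT INTO @Contract (contract_id)
SELECT contract_id
FROM   dbo.fn_GetContract(@contract_id);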

loading multiple non-CSV tables into R and performing a function on each file

First day on R. I may be expecting too much from it but here is what I'm looking for:
I have multiple files (140 tables), and each table has two columns (V1 = values and V2 = frequencies). I use the following code to get the average from each table:
sum(V1*V2)/sum(V2)
I was wondering if it's possible to do this once instead of 140 times, i.e. to load all the files and get an exported file that shows the average of each table next to the original file name.
I use read.table to load the files, as read.csv doesn't work well for some reason.
I'll appreciate any input!
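A minimal sketch of that kind of batch run, assuming the 140 whitespace-delimited files sit in one directory (the directory name, file pattern and output file name are made up):
# Sketch: adjust the path and read.table arguments to match the real files.
files <- list.files("data_dir", full.names = TRUE)

avg_per_file <- sapply(files, function(f) {
  tab <- read.table(f)                 # two columns: V1 = values, V2 = frequencies
  sum(tab$V1 * tab$V2) / sum(tab$V2)   # weighted average for this file
})

# one row per file: original file name and its average
result <- data.frame(file = basename(files), avg = avg_per_file)
write.csv(result, "averages.csv", row.names = FALSE)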

MATLAB - How to load and handle a big TXT file (32GB)

First of all, sorry about my English...
I would like to know a better way to load and handle a big TXT file (around 32 GB, an 83,000,000 x 66 matrix). I have already tried some experiments with textscan, import (out of memory), fgets, fgetl, ... Except for the import approach, all methods work but take too much time (much more than a week).
I aim to use this database to execute my sampling process and, after that, a neural network for learning the behaviour.
Does anyone know how to import this type of data faster? I am thinking of dumping the database in another format (instead of TXT), for example SQL Server, and handling the data by querying the database.
Another question: after loading all the data, can I save it in .MAT format and work with that format in my experiments? Any better ideas?
Thanks in advance.
It's impossible to hold such a big matrix (5,478,000,000 values) in your workspace/memory (unless you've got tons of RAM), so the file format (.mat or .csv) doesn't matter!
You definitely have to use a database, or split the file into several smaller ones and calculate step by step (which also takes very long).
Personally, I only have experience with sqlite3; I did something similar with a 1.47 million x 23 matrix/CSV file:
http://git.osuv.de/markus/sqlite-demo (remember that my csv2sqlite.m was only designed to run with GNU Octave; it took 19k seconds overnight... well, it was badly scripted too :) ).
After everything was imported into the sqlite3 database, I can access just the data I need within 8-12 seconds (take a look at the comment header of leistung.m).
If your CSV file is straightforward, you can simply import it with sqlite3 itself.
For example:
┌─[markus#x121e]─[/tmp]
└──╼ cat file.csv
0.9736834199195674,0.7239387515366997,0.3382008456696883
0.6963824911102146,0.8328410999877027,0.5863203843393815
0.2291736458336333,0.1427739134201017,0.8062332551565472
┌─[markus#x121e]─[/tmp]
└──╼ sqlite3 csv.db
SQLite version 3.8.4.3 2014-04-03 16:53:12
Enter ".help" for usage hints.
sqlite> CREATE TABLE csvtest (col1 TEXT NOT NULL, col2 TEXT NOT NULL, col3 TEXT NOT NULL);
sqlite> .separator ","
sqlite> .import file.csv csvtest
sqlite> select * from csvtest;
0.9736834199195674,0.7239387515366997,0.3382008456696883
0.6963824911102146,0.8328410999877027,0.5863203843393815
0.2291736458336333,0.1427739134201017,0.8062332551565472
sqlite> select col1 from csvtest;
0.9736834199195674
0.6963824911102146
0.2291736458336333
All of this is done with https://github.com/markuman/go-sqlite (MATLAB and Octave compatible, but I guess no one but me has ever used it!).
However, I recommend the version 2 beta in branch 2 (git checkout -b 2 origin/2) running in coop mode (you'll hit the maximum string length from sqlite3 in ego mode). There's an HTML doc for version 2 too: http://go-sqlite.osuv.de/doc/
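If a full database is overkill, the "calculate step by step" route mentioned above can also be done without splitting the file, by reading it in blocks with textscan. A rough sketch, assuming 66 whitespace-delimited numeric columns (the file name, block size and per-block processing are placeholders):
% Sketch only: stream the 32 GB file in blocks instead of loading it at once.
blockRows = 1e6;                    % rows per block; tune to available RAM
fmt = repmat('%f', 1, 66);          % 66 numeric columns

fid = fopen('bigfile.txt', 'r');
while ~feof(fid)
    C = textscan(fid, fmt, blockRows, 'CollectOutput', true);
    block = C{1};                   % up to blockRows x 66 double matrix
    if isempty(block), break; end
    % ... run the sampling / accumulation step on "block" here ...
end
fclose(fid);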