Say we have two tables both sorted on the time column:
t1:`time xasc ([]time:5?100;v:5?1000)
t2:`time xasc ([]time:5?100;v:5?1000)
Is there an efficient way to get the same result as `time xasc t1,t2, using the fact that the two tables are already sorted? I looked at aj, but I wasn't able to find the "combine two tables" functionality I need here.
There is no native merge sort or binary sort in kdb, so the best available approach is asc x,y (for your tables, `time xasc t1,t2). If you try to replicate a merge/binary sort in q, you're unlikely to get it faster than the native asc x,y. Alternatively, you could write a merge/binary sort in C and load it into kdb as a shared library.
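For reference, this is roughly what a linear-time merge of two pre-sorted inputs looks like. It is a minimal sketch in Python rather than q, using the standard library's heapq.merge; the row layout (dicts with time and v keys) and the name merge_sorted are assumptions made for illustration, and it says nothing about how such a merge would perform inside kdb.

```python
import heapq

def merge_sorted(t1, t2, key="time"):
    """Merge two lists of row dicts, each already sorted on `key`.

    heapq.merge walks both inputs once, so the cost is linear in the
    total number of rows instead of the n log n of a full re-sort.
    """
    return list(heapq.merge(t1, t2, key=lambda row: row[key]))

# Hypothetical sample data standing in for t1 and t2.
t1 = [{"time": 3, "v": 10}, {"time": 40, "v": 20}, {"time": 77, "v": 30}]
t2 = [{"time": 5, "v": 11}, {"time": 40, "v": 21}, {"time": 90, "v": 31}]

merged = merge_sorted(t1, t2)
```

A version written in C for use as a shared library would follow the same two-pointer idea.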
I have two tables:
CompanyCases is a table of companies and case numbers like this:
CompanyId, CaseNumber
CaseRelations is a table of cases and their related cases like this:
CaseNumber, RelatedCase
There can be several companies related to one case, and one case can be related to several companies.
I need a query that will give me all related cases for a company id. The trick is that when I find the related case, that can also have a related case, which can have a related case etc.
My first assumption was that it would not be that deep, so I could just do self joins like:
Select
cc.CompanyId,
cc.CaseNumber,
CR1.CaseNumber,
CR1.RelatedCase,
CR2.CaseNumber,
CR2.RelatedCase
FROM CompanyCases cc
LEFT JOIN CaseRelations CR1 ON CR1.CaseNumber = cc.CaseNumber
LEFT JOIN CaseRelations CR2 ON CR2.CaseNumber = CR1.RelatedCase
And then keep joining as many levels as is needed. The problem is that the cases loop. So it can go like this:
CaseNumber RelatedCase
1 2
2 3
3 1
So I can keep joining forever without reaching a full column of nulls. Also it is at least 5 levels deep so this is not a great solution. I don't mind using recursive CTEs either but I think I will get the same problem with the circular cases.
I hope I described it well enough - Does anyone know how to solve this?
Thanks in advance :)
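One way to make the circular cases terminate is to remember which cases have already been visited and never expand a case twice. The following is a minimal sketch in Python rather than SQL. The function name related_cases and the sample rows are assumptions made for illustration, and relations are followed only in the CaseNumber to RelatedCase direction.

```python
from collections import defaultdict, deque

def related_cases(company_cases, case_relations, company_id):
    """Return every case reachable from the company's own cases.

    company_cases:  iterable of (CompanyId, CaseNumber) pairs
    case_relations: iterable of (CaseNumber, RelatedCase) pairs
    """
    graph = defaultdict(set)
    for case, related in case_relations:
        graph[case].add(related)

    start = {case for cid, case in company_cases if cid == company_id}
    seen = set(start)          # cases already visited, which breaks the loops
    queue = deque(start)
    while queue:
        case = queue.popleft()
        for nxt in graph[case]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - start        # related cases only, not the company's own

# Hypothetical sample data, including the 1 -> 2 -> 3 -> 1 loop from the question.
company_cases = [(100, 1), (200, 7)]
case_relations = [(1, 2), (2, 3), (3, 1), (7, 8)]
result = related_cases(company_cases, case_relations, 100)
```

Because each case is added to seen at most once, the loop ends after visiting every reachable case, however deep the chain goes.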
I put together a few lines to partition my kdb table, which contains string columns of course and thus must be enumerated.
I wonder if this code is completely correct or if it can be simplified further. In particular, I have some doubt about the need to create a partitioned table schema given the memory table and the disk table will have exactly the same layout. Also, there might be a way to avoid creating the temporary tbl_mem and tbl_mem_enum tables:
...
tbl_mem: select ts,sym,msg_type from oms_mem lj sym_mem;
tbl_mem_enum: .Q.en[`$sym_path] tbl_mem;
delete tbl_mem from `.;
(`$db;``!((17;2;9);(17;2;9))) set ([]ts:`time$(); ticker:`symbol$(); msg_type:`symbol$());
(`$db) upsert (select ts,ticker:sym,msg_type from tbl_mem_enum)
delete tbl_mem_enum from `.;
PS: I know I shouldn't use "_" to name variables, but then what do I use to separate words in a variable or function name? "." is also a kdb function.
I think you mean that your table contains symbol columns - these are the columns that you need to enumerate (strings don't need enumeration). You can do the write and enumeration in a single step. Also if you are using the same compression algo/level on all columns then it may be easier to just use .z.zd:
.z.zd:17 2 9i;
(`$db) set .Q.en[`$sym_path] select ts, ticker:sym, msg_type from oms_mem lj sym_mem;
It's generally recommended to use camelCase instead of '_'. Some useful info here: http://www.timestored.com/kdb-guides/q-coding-standards
I have a select statement and a cursor to iterate over the rows I get. The problem is that I have many columns (more than 500), so "fetch .. into #variable" is impossible for me. How can I iterate over the columns one by one? I need to process the data.
Thanks in advance,
n.b
Two choices.
1/ Use SSIS or ADO.Net to work through your dataset row by row.
2/ Consider what you're actually needing to achieve and find a set-based approach.
My preference is for option 2. Let us know what you need done and we'll find a way.
Rob
You can build a SQL string using sys.columns or INFORMATION_SCHEMA queries. Here's a post I wrote on that.
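As a rough illustration of building the SQL string from column metadata, here is a sketch in Python using SQLite. SQLite's PRAGMA table_info stands in for sys.columns, and the table wide, its columns, and the helper build_unpivot_sql are all made up for the example. The generated query returns one row per (key, column, value), so each column can be processed without naming 500 variables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wide (id INTEGER, a TEXT, b TEXT, c TEXT)")
conn.executemany("INSERT INTO wide VALUES (?, ?, ?, ?)",
                 [(1, "x", "y", "z"), (2, "p", "q", "r")])

def build_unpivot_sql(conn, table, key):
    """Build one SELECT per non-key column and glue them with UNION ALL.

    Assumes table and column names are trusted, simple identifiers;
    they are interpolated into the SQL text without quoting.
    """
    # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
    cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    parts = [
        f"SELECT {key} AS k, '{c}' AS col, {c} AS val FROM {table}"
        for c in cols if c != key
    ]
    return " UNION ALL ".join(parts) + " ORDER BY k, col"

sql = build_unpivot_sql(conn, "wide", "id")
cells = conn.execute(sql).fetchall()
```

On SQL Server the column list would come from sys.columns or INFORMATION_SCHEMA.COLUMNS instead, and the string would be run as dynamic SQL.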
Aheo asks if it is ok to have a table with just one column. How about one with no columns, or, given that this seems difficult to do in most modern "relational" DBMSes, a relation with no attributes?
There are exactly two relations with no attributes, one with an empty tuple, and one without. In The Third Manifesto, Date and Darwen (somewhat) humorously name them TABLE_DEE and TABLE_DUM (respectively).
They are useful to the extent that they are the identity of a variety of relational operators, playing roles equivalent to 1 and 0 in ordinary algebra.
A table with a single column is a set -- as long as you don't care about ordering the values, or associating any other info with them, it seems fine. You can check for membership in it, and basically that's all you can do. (If you don't have a UNIQUE constraint on the single column I guess you could also count number of occurrences... a multiset).
But what in blazes would a table with no columns (or a relation with no attributes) mean -- or, how would it be any good?!
Relations under cartesian product (more generally, natural join) form a monoid with DEE as the identity element. In practice, if you have Date's relational summarize operator, you'd use DEE as your grouping relation to obtain grand totals. There are many other examples where DEE is practically useful, e.g. in a functional setting with a binary join operator you'd get n-ary join = foldr join dee.
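The fold can be sketched in a few lines of Python. This is an illustration under stated assumptions, not anyone's reference implementation: a relation is modelled as a frozenset of tuples, each tuple as a frozenset of (attribute, value) pairs, and the helpers rel, join and join_all are hypothetical names. functools.reduce is a left fold, which gives the same result as foldr here because natural join is associative.

```python
from functools import reduce

DEE = frozenset({frozenset()})   # no attributes, one (empty) tuple
DUM = frozenset()                # no attributes, no tuples

def rel(*rows):
    """Build a relation from dicts mapping attribute name to value."""
    return frozenset(frozenset(r.items()) for r in rows)

def join(r, s):
    """Natural join: combine tuples that agree on all shared attributes."""
    out = set()
    for a in r:
        for b in s:
            da, db = dict(a), dict(b)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                out.add(frozenset({**da, **db}.items()))
    return frozenset(out)

def join_all(relations):
    """n-ary join as a fold over binary join, seeded with DEE."""
    return reduce(join, relations, DEE)

# Hypothetical sample relations.
emp = rel({"emp": "ann", "dept": 1}, {"emp": "bob", "dept": 2})
dept = rel({"dept": 1, "name": "ops"})
joined = join_all([emp, dept])
```

Seeding the fold with DEE is what makes the empty case well defined: joining zero relations yields DEE, just as an empty product yields 1.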
"There are exactly two relations with no attributes, one with an empty tuple, and one without. In The Third Manifesto, Date and Darwen (somewhat) humorously name them TABLE_DEE and TABLE_DUM (respectively).
They are useful to the extent that they are the identity of a variety of relational operators, playing roles equivalent to 1 and 0 in ordinary algebra."
And of course they also play the role of "TRUE" and "FALSE" in boolean algebra. Meaning that they are useful when propositions such as "The shop is open" and "The alarm is set" are to be represented in a database.
A consequence of this is that they can also be usefully employed in any expression of the relational algebra for their property of "acting as an IF/ELSE": joining to TABLE_DUM means retaining no tuples at all from the other argument, and joining to TABLE_DEE means retaining them all. So joining R to a relvar S that can be equal to either TABLE_DEE or TABLE_DUM is the RA equivalent of "if S then R else FI", with FI standing for the empty relation.
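The IF/ELSE behaviour is easy to check with a toy model. In this Python sketch a relation is a frozenset of tuples and a tuple is a frozenset of (attribute, value) pairs; the relation R and the variable shop_is_open are invented for the example.

```python
DEE = frozenset({frozenset()})   # plays the role of TRUE
DUM = frozenset()                # plays the role of FALSE

def join(r, s):
    """Natural join on tuples stored as frozensets of (attribute, value) pairs."""
    out = set()
    for a in r:
        for b in s:
            merged = dict(a)
            # Keep the pair only if b agrees with a on every shared attribute.
            if all(merged.setdefault(k, v) == v for k, v in b):
                out.add(frozenset(merged.items()))
    return frozenset(out)

# Hypothetical one-attribute relation R.
R = frozenset({frozenset({("item", "tea")}), frozenset({("item", "milk")})})

shop_is_open = DEE
when_open = join(R, shop_is_open)     # "if S then R": all of R is retained

shop_is_open = DUM
when_closed = join(R, shop_is_open)   # "else": the empty relation
```

In the same model, join of two zero-attribute relations behaves like AND and set union behaves like OR.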
Hm. So the lack of "real-world examples" got to me, and I tried my best. Perhaps surprisingly, I got halfway there!
cjs=> CREATE TABLE D ();
CREATE TABLE
cjs=> SELECT COUNT (*) FROM D;
count
-------
0
(1 row)
cjs=> INSERT INTO D () VALUES ();
ERROR: syntax error at or near ")"
LINE 1: INSERT INTO D () VALUES ();
A table with a single column would make sense as a simple lookup. Let's say you have a list of strings you want to filter user-entered text against. That table would store the words you want to filter out.
It is difficult to see the utility of TABLE_DEE and TABLE_DUM from a SQL database perspective. After all, it is not guaranteed that your favorite db vendor allows you to create one or the other.
It is also difficult to see the utility of TABLE_DEE and TABLE_DUM in relational algebra. One has to look beyond that. To get a flavor of how these constants can come alive, consider relational algebra put into proper mathematical shape, that is, as close as possible to Boolean algebra. D&D Algebra A is a step in this direction. Then one can express classic relational algebra operations via more fundamental ones, and those two constants become really handy.