How do I add a substring to every column name in kdb - kdb

My current columns in kdb are (Time;Buy;Sell). How can I change the column names to (Time_hist;Buy_hist;Sell_hist)?
Thank you!

The following expression appends "_hist" to all column names:
(`$(string cols t),\:"_hist") xcol t
where t is the table.
string cols t retrieves all column names and converts them to strings.
(string cols t),\:"_hist" appends "_hist" to each column name: ,\: is join (,) applied with each-left (\:), so every name on the left is joined with the same suffix.
`$ casts the resulting strings back to symbols.
colnames xcol t renames the columns of t to colnames. See xcol for more details.
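For readers less familiar with q, the same three steps (take the names, append the suffix to each, rebuild the table under the new names) can be sketched in Python. This is only an illustration, not kdb code: the function name add_suffix and the dictionary-of-columns representation are assumptions made for this sketch.

```python
def add_suffix(table, suffix):
    """Return a new column dictionary with suffix appended to every column name."""
    # Equivalent in spirit to: (`$(string cols t),\:suffix) xcol t
    return {name + suffix: column for name, column in table.items()}

# A table modelled as a dict of column name -> list of values (assumption).
tab = {
    "Time": ["15:51:50.746", "15:51:50.751", "15:51:50.756"],
    "Buy": [23, 35, 42],
    "Sell": [22, 33, 40],
}
renamed = add_suffix(tab, "_hist")
print(list(renamed))  # ['Time_hist', 'Buy_hist', 'Sell_hist']
```

The column data is untouched; only the keys change, which mirrors what xcol does.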

You can use xcol to rename columns in kdb:
q)tab:([] Time:(.z.t-10;.z.t-5;.z.t);Buy:23 35 42;Sell:22 33 40)
q)tab
Time         Buy Sell
---------------------
15:51:50.746 23  22
15:51:50.751 35  33
15:51:50.756 42  40
q)`Time_hist`Buy_hist`Sell_hist xcol tab
Time_hist    Buy_hist Sell_hist
-------------------------------
15:51:50.746 23       22
15:51:50.751 35       33
15:51:50.756 42       40
More documentation can be found at:
https://code.kx.com/q/ref/cols/

Related

query specific table columns

I have a table in which some column names have the prefix 'file_'.
For example:
Column Name | Value
----------- | ----------
name        | somename
date        | 2000-01-01
size        | 15
file_type1  | 1
file_type2  | 34
...         | ...
file_typeN  | 12
The file type columns 'file_typeN' can be added to the table by another team (and may even be deleted).
So I want to write an SQL query that selects only the values of the columns with the prefix 'file_'.
It should be a single query against the table my_files_description_table, which can have a varying number of columns with the 'file_' prefix.
Something like:
select <only columns with 'file_' prefix> from my_files_description_table;
I can query the names of all columns with the 'file_' prefix:
SELECT column_name FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'my_files_description_table' and column_name like 'file_%';
But I don't know what to do with that.
I need a query that, for this table
Column Name | Value
----------- | ----------
name        | somename
date        | 2000-01-01
size        | 15
file_type1  | 1
file_type2  | 34
should return
Column Name | Value
----------- | -----
file_type1  | 1
file_type2  | 34
And for this table
Column Name | Value
----------- | ----------
name        | somename
date        | 2000-01-01
size        | 15
file_type1  | 2
file_type2  | 5
file_type3  | 134
file_type4  | 12
should return
Column Name | Value
----------- | -----
file_type1  | 2
file_type2  | 5
file_type3  | 134
file_type4  | 12
I use PostgreSQL 9.6.

Usage of DISTINCT in reversed int pairs duplicates elimination

I have the following question:
create table memorization_word_translation
(
    id serial not null,
    from_word_id integer not null,
    to_word_id integer not null
);
This table stores pairs of integers that often appear in both orders, for example:
35 36
35 37
36 35
37 35
37 39
39 37
Question is - if I make a query, for example:
select * from memorization_word_translation
where from_word_id = 35 or to_word_id = 35
I would get
35 36
35 37
36 35 - duplicate of 35 36
37 35 - duplicate of 35 37
How can I use DISTINCT in this example to filter out all duplicates, even if they are reversed?
I want to keep it only like this:
35 36
35 37
You can do it with the ROW_NUMBER() window function:
select from_word_id, to_word_id
from (
    select *,
        row_number() over (
            partition by least(from_word_id, to_word_id),
                         greatest(from_word_id, to_word_id)
            order by (from_word_id > to_word_id)::int
        ) rn
    from memorization_word_translation
    where 35 in (from_word_id, to_word_id)
) t
where rn = 1
You could try it with a small sorting step (here a single comparison) in combination with DISTINCT ON.
The DISTINCT ON clause works on arbitrary columns or expressions, e.g. on a tuple. The CASE expression below orders the two columns within a tuple, so that a pair and its reverse produce the same tuple and only one of them is kept. Without an ORDER BY, which of the two rows survives is not specified. The source columns can still be returned in your SELECT statement:
select distinct on (
    CASE
        WHEN (from_word_id >= to_word_id) THEN (from_word_id, to_word_id)
        ELSE (to_word_id, from_word_id)
    END
)
    *
from memorization_word_translation
where from_word_id = 35 or to_word_id = 35
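Both answers rely on the same idea: normalise each pair so that order no longer matters, then keep one row per normalised key. A minimal Python sketch of that logic follows. It is an illustration only: the function name dedupe_reversed_pairs is an assumption, the rows are the sample pairs from the question, and the tie-break (prefer the row that is already in ascending order) follows the ROW_NUMBER() answer.

```python
def dedupe_reversed_pairs(rows, word_id):
    """Keep one row per unordered pair among the rows that mention word_id."""
    best = {}
    for a, b in rows:
        if word_id not in (a, b):
            continue  # same filter as: where 35 in (from_word_id, to_word_id)
        key = (min(a, b), max(a, b))  # least(...), greatest(...)
        # Prefer the row already in ascending order, mirroring
        # order by (from_word_id > to_word_id)::int
        if key not in best or (a <= b and best[key][0] > best[key][1]):
            best[key] = (a, b)
    return sorted(best.values())

rows = [(35, 36), (35, 37), (36, 35), (37, 35), (37, 39), (39, 37)]
result = dedupe_reversed_pairs(rows, 35)
print(result)  # [(35, 36), (35, 37)]
```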

KDB selecting first row from each group

Very silly question... Consider the table t1 below which is sorted by sym.
t1:([]sym:(3#`A),(2#`B),(4#`C);val:10 40 12 50 58 75 22 103 108)
sym val
A   10
A   40
A   12
B   50
B   58
C   75
C   22
C   103
C   108
I want to select the first row corresponding to each sym, like this:
(`sym`val)!(`A`B`C;10j, 50j, 75j)
sym val
A   10
B   50
C   75
There's got to be a one-liner to do this. To get the LAST row for each sym, it would be as simple as select by sym from t1. Any hints?
select first val by sym from t1
Or for multiple columns, you can reverse the table and run your query:
select by sym from reverse t1
You could use fby
q)select from t1 where i=(first;i) fby sym
sym val
-------
A   10
B   50
C   75
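The logic behind all three q answers is "keep the first row seen for each sym". A small Python sketch of that idea, using the data from t1, is given here for illustration; the function name first_row_per_group and the list-of-dicts table representation are assumptions of this sketch.

```python
def first_row_per_group(rows, key):
    """Return the first row encountered for each distinct value of row[key]."""
    firsts = {}
    for row in rows:
        # setdefault stores the row only if this key has not been seen yet
        firsts.setdefault(row[key], row)
    return list(firsts.values())

syms = "AAABBCCCC"
vals = [10, 40, 12, 50, 58, 75, 22, 103, 108]
t1 = [{"sym": s, "val": v} for s, v in zip(syms, vals)]

first_rows = first_row_per_group(t1, "sym")
print(first_rows)
```

Replacing setdefault with plain assignment would instead keep the last row per group, which corresponds to select by sym from t1.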

kdb: dynamically denormalize a table (convert key values to column names)

I have a table like this:
q)t:([sym:(`EURUSD`EURUSD`AUDUSD`AUDUSD);server:(`S01`S02`S01`S02)];volume:(20;10;30;50))
q)t
sym    server| volume
-------------| ------
EURUSD S01   | 20
EURUSD S02   | 10
AUDUSD S01   | 30
AUDUSD S02   | 50
I need to de-normalize it to display the data nicely. The resulting table should look like this:
sym   | S01 S02
------| -------
EURUSD| 20  10
AUDUSD| 30  50
How do I dynamically convert the original table using distinct values from server column as column names for the new table?
Thanks!
Basically, you want a 'pivot' table. The following page has a very good solution for your problem:
http://code.kx.com/q/cookbook/pivoting-tables/
Here are the commands to get the required table:
q) P:asc exec distinct server from t
q) exec P#(server!volume) by sym:sym from t
One tricky thing about pivoting a table: the keys of the dictionary must be of type symbol, otherwise the result will not have the pivot table structure.
E.g. the following table has a column dt of type date.
t:([sym:(`EURUSD`EURUSD`AUDUSD`AUDUSD);dt:(0 1 0 1+.z.d)];volume:(20;10;30;50))
Now, if we pivot it with the dates as columns, it generates a structure like this:
q)P:asc exec distinct dt from t
q)exec P#(dt!volume) by sym:sym from t
(`s#flip (enlist `sym)!enlist `s#`AUDUSD`EURUSD)!((`s#2018.06.22 2018.06.23)!30j, 50j;(`s#2018.06.22 2018.06.23)!20j, 10j)
To get the dates as the columns, the dt column has to be cast to symbol:
q)show P:asc exec distinct `$string dt from t
`s#`2018.06.22`2018.06.23
q)exec P#((`$string dt)!volume) by sym:sym from t
sym   | 2018.06.22 2018.06.23
------| ---------------------
AUDUSD| 30         50
EURUSD| 20         10
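The pivot itself (collect the distinct values of one key column, then build one row per remaining key with a cell for each of those values) can be sketched in Python as follows. This is an illustration of the idea rather than of q semantics; the function name pivot and the list-of-dicts representation are assumptions, and the data is the sym/server/volume table from the question.

```python
def pivot(rows, row_key, col_key, value_key):
    """Turn distinct values of col_key into columns, one output row per row_key."""
    # Like: P:asc exec distinct server from t
    columns = sorted({r[col_key] for r in rows})
    out = {}
    for r in rows:
        # Start each output row with an empty cell for every pivot column
        line = out.setdefault(r[row_key], {c: None for c in columns})
        line[r[col_key]] = r[value_key]
    return columns, out

t = [
    {"sym": "EURUSD", "server": "S01", "volume": 20},
    {"sym": "EURUSD", "server": "S02", "volume": 10},
    {"sym": "AUDUSD", "server": "S01", "volume": 30},
    {"sym": "AUDUSD", "server": "S02", "volume": 50},
]
columns, pivoted = pivot(t, "sym", "server", "volume")
print(columns)   # ['S01', 'S02']
print(pivoted)
```

Pre-filling every row with all pivot columns plays the role of P# in the q version: each row ends up with the same set of keys even when a combination is missing.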

need help writing a date sensitive T-SQL query

I need help writing a T-SQL query that will generate 52 rows of data per franchise from a table that will often contain gaps in the 52 week sequence per franchise (i.e., the franchise may have reported data bi-weekly or has not been in business for a full year).
The table I'm querying against looks something like this:
FranchiseId | Date | ContractHours | PrivateHours
and I need to join it to a table similar to this:
FranchiseId | Name
The output of the query needs to look like this:
Name  Date        ContractHours  PrivateHours
----  ----------  -------------  ------------
AZ1   08-02-2011  292            897
AZ1   07-26-2011  0              0             -- default to 0's for gaps in sequence
...
AZ1   08-03-2010  45             125           -- row 52 for AZ1
AZ2   08-02-2011  382            239
...
AZ2   07-26-2011  0              0             -- row 52 for AZ2
I need this style of output for every franchise, i.e., 52 rows of data with default rows for any gaps in the 52 week sequence, in a single result set. Thus, if there are 100 franchises, the result set should be 5200 rows.
What I've Tried
I've tried the typical suggestions of:
Create a table with all possible dates
LEFT OUTER JOIN this to the table of data needed
The problems I'm running into are:
ensuring that for every franchise there are 52 rows, and
filling in gaps with the franchise name and 0 for hours. I can't have the following in the result set:
Name  Date        ContractHours  PrivateHours
----  ----------  -------------  ------------
NULL  08-02-2011  NULL           NULL
I don't know where to go from here. Is there an efficient way to write a T-SQL query that will produce the required output?
The bare bones are:
Generate 52 week ranges
Cross join with Franchise
LEFT JOIN the actual date
ISNULL to substitute zeroes
So, something like this (untested):
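;WITH cDATE AS
(
    -- Anchor: first week range. The date type does not support "+ 6",
    -- so DATEADD is used for the arithmetic.
    SELECT
        CAST('20100101' AS date) AS StartOfWeek,
        DATEADD(day, 6, CAST('20100101' AS date)) AS EndOfWeek
    UNION ALL
    -- Recursive step: move both ends forward by one week
    SELECT DATEADD(day, 7, StartOfWeek), DATEADD(day, 7, EndOfWeek)
    FROM cDATE
    WHERE DATEADD(day, 7, StartOfWeek) < '20110101'
), Possibles AS
(
    -- Every week for every franchise; EndOfWeek is needed by the join below
    SELECT
        StartOfWeek, EndOfWeek, FranchiseID
    FROM
        cDATE CROSS JOIN Franchise
)
SELECT
    P.FranchiseID,
    P.StartOfWeek,
    ISNULL(SUM(O.ContractHours), 0) AS ContractHours,
    ISNULL(SUM(O.PrivateHours), 0) AS PrivateHours
FROM
    Possibles P
LEFT JOIN
    TheOtherTable O ON P.FranchiseID = O.FranchiseID AND
                       O.Date BETWEEN P.StartOfWeek AND P.EndOfWeek
GROUP BY
    P.FranchiseID,
    P.StartOfWeek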
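The four steps (generate the weeks, cross join with the franchises, look up the reported data, default to zero) can also be sketched in Python to show the shape of the expected result. This is a hedged illustration, not a replacement for the T-SQL: the function name weekly_report, the franchise ids 1 and 2, and the assumption that reports are keyed by the exact week date are all hypothetical. The hours are the 08-02-2011 sample values from the question.

```python
from datetime import date, timedelta

def weekly_report(franchises, reported, last_week, weeks=52):
    """Return `weeks` rows per franchise, newest first, with zeros for gaps.

    franchises: {franchise_id: name}
    reported:   {(franchise_id, week_date): (contract_hours, private_hours)}
    """
    rows = []
    for fid, name in franchises.items():          # cross join: franchise x week
        for k in range(weeks):
            week = last_week - timedelta(weeks=k)
            # Left join with a default, like ISNULL(..., 0)
            contract, private = reported.get((fid, week), (0, 0))
            rows.append((name, week, contract, private))
    return rows

franchises = {1: "AZ1", 2: "AZ2"}  # hypothetical ids
reported = {
    (1, date(2011, 8, 2)): (292, 897),
    (2, date(2011, 8, 2)): (382, 239),
}
report = weekly_report(franchises, reported, last_week=date(2011, 8, 2))
print(len(report))  # 104
print(report[:2])
```

Because the rows are driven by the franchise and week lists rather than by the reported data, the name is never NULL and every franchise gets exactly 52 rows.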