kdb+: Save table into a csv file

I have the table "dates" below. It has a sym column of symbols and a d column in which each entry is a list of strings. I would like to save it to a regular CSV file but couldn't find a good way to do it. Any suggestions?
q)dates
sym d
----------------------------------------------------------------------------
6AH0 "1970.03.16" "1980.03.17" "1990.03.19" "2010.03.15"
6AH6 "1976.03.15" "1986.03.17" "1996.03.18" "2016.03.14"
6AH7 "1977.03.14" "1987.03.16" "1997.03.17" "2017.03.13"
6AH8 "1978.03.13" "1988.03.14" "1998.03.16" "2018.03.19"
6AH9 "1979.03.19" "1989.03.13" "1999.03.15" "2019.03.18"
When I try the regular save, I get the error below:
q)save `:dates.csv
k){$[t&77>t:#y;$y;x;-14!'y;y]}
'type
q))

The built-in table-to-CSV conversion in kdb+ cannot handle this level of nesting. Each entry of the d column is a list of strings, that is, a list of lists of chars. The conversion can, however, handle a column whose entries are plain strings (one level of nesting).
Therefore, you can convert each entry of the d column to a single string and then save to CSV using the built-in function:
/ generate a table of dummy data
q)show dates:flip `sym`d!(`6AH0`6AH6`6AH7;string (3;0N)#12?.z.d)
sym d
--------------------------------------------------------
6AH0 "2008.02.04" "2015.01.02" "2003.07.05" "2005.02.25"
6AH6 "2012.10.25" "2008.08.28" "2017.01.25" "2007.12.27"
6AH7 "2004.02.01" "2005.06.06" "2013.02.11" "2010.12.20"
/ convert each entry of 'd' to a single string in place - (" " sv') joins each list with spaces
q)@[`dates;`d;" " sv']
`dates
/ review what was done
q)show dates
sym d
--------------------------------------------------
6AH0 "2008.02.04 2015.01.02 2003.07.05 2005.02.25"
6AH6 "2012.10.25 2008.08.28 2017.01.25 2007.12.27"
6AH7 "2004.02.01 2005.06.06 2013.02.11 2010.12.20"
/ save to csv
q)save `:dates.csv
`:dates.csv
/ review saved csv
q)\cat dates.csv
"sym,d"
"6AH0,2008.02.04 2015.01.02 2003.07.05 2005.02.25"
"6AH6,2012.10.25 2008.08.28 2017.01.25 2007.12.27"
"6AH7,2004.02.01 2005.06.06 2013.02.11 2010.12.20"

As per the CSV specification, you'll want to flatten each list into a single field, separate its elements with commas, and double-quote the field.
'save' is limited in that the file must have the same name as the global variable you are saving.
If I were tasked with your question, I'd do it like so:
`:myFileNamedWhatever.csv 0: csv 0: select sym,csv sv'd from dates
Explanation:
csv 0: table /csv is a variable, literally defined as "," and used for readability. csv 0: table converts the table to a list of comma-separated strings
`:file 0: listOfStrings /this takes a LIST of strings and writes them to the file handle. Each element of the list becomes a new line in the file
I prefer this approach because it is general and allows saving various types. You can also use it within a function.
If I later decided that I wanted a pipe-separated (or anything-separated) file instead:
`:myNewFile.psv 0: "|" 0: select sym,"|"sv'd from dates
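To see the flatten-then-quote behaviour outside q, here is a small Python sketch (not q, and not the answerer's code) that models the same idea with the standard csv module. The function name table_to_delimited and the two-row sample are made up for illustration.

```python
import csv
import io


def table_to_delimited(rows, sep=","):
    """Render (sym, list_of_strings) rows as delimited text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=sep, lineterminator="\n")
    writer.writerow(["sym", "d"])
    for sym, values in rows:
        # Join the nested list into one field; the writer quotes the field
        # because it now contains the delimiter.
        writer.writerow([sym, sep.join(values)])
    return buf.getvalue()


# Hypothetical sample shaped like the "dates" table in the question.
dates = [
    ("6AH0", ["1970.03.16", "1980.03.17"]),
    ("6AH6", ["1976.03.15", "1986.03.17"]),
]
csv_text = table_to_delimited(dates)
psv_text = table_to_delimited(dates, sep="|")
```

In this sketch the joined field is written in double quotes, so a reader that follows the CSV quoting rules still sees two columns per line.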

Related

How To Take Left Side String From A Particular position using PostgreSql

I have a table with a column called eq1.sources whose entries look like the ones below. I would like to extract the string from the left up to and including the card slot number.
Example:
fdn:realm:pam:network:55.150.40.841:shelf-1:cardSlot-1:card:daughterCardSlot-1:daughterCard
for this entry I need only
fdn:realm:pam:network:55.150.40.841:shelf-1:cardSlot-1
similarly
fdn:realm:sam:network:35.250.40.834:shelf-1:cardSlot-1:card
for this entry I need
fdn:realm:sam:network:35.250.40.834:shelf-1:cardSlot-1
I have tried substring(eq1.sources,0,position (':card:daughter' in eq1.sources)). This works only for rows 1, 2, 4, 5, 6, 7, 9 and 10; it fails for rows 3, 8 and 11 because those entries do not continue with ':card:daughter'.
The entries of the eq1.sources column are listed below.
1.fdn:realm:pam:network:55.150.40.841:shelf-1:cardSlot-1:card:daughterCardSlot-1:daughterCard
2.fdn:realm:pam:network:35.250.40.824:shelf-1:cardSlot-1:card:daughterCardSlot-1:daughterCard
3.fdn:realm:sam:network:35.250.40.834:shelf-1:cardSlot-1:card
4.fdn:realm:pam:network:55.159.40.994:shelf-1:cardSlot-2:card:daughterCardSlot-1:daughterCard
5.fdn:realm:pam:network:35.250.140.104:shelf-1:cardSlot-2:card:daughterCardSlot-1:daughterCard
6.fdn:realm:pam:network:55.170.40.1:shelf-1:cardSlot-2:card:daughterCardSlot-1:daughterCard
7.fdn:realm:pam:network:35.450.40.24:shelf-1:cardSlot-3:card:daughterCardSlot-1:daughterCard
8.fdn:realm:sam:network:35.250.40.14:shelf-1:cardSlot-3:card
9.fdn:realm:pam:network:55.150.40.854:shelf-1:cardSlot-4:card:daughterCardSlot-1:daughterCard
10.fdn:realm:pam:network:35.250.40.84:shelf-1:cardSlot-5:card:daughterCardSlot-1:daughterCard
11.fdn:realm:sam:network:35.250.40.84:shelf-1:cardSlot-6:card
I am looking for a PostgreSQL query that extracts the left-hand substring up to a particular position in each row.
Expected output is
1.fdn:realm:sam:network:35.250.40.834:shelf-1:cardSlot-1
2.fdn:realm:sam:network:35.250.40.14:shelf-1:cardSlot-3
from
1.fdn:realm:sam:network:35.250.40.834:shelf-1:cardSlot-1:card:daughterCardSlot-1:daughterCard
2.fdn:realm:sam:network:35.250.40.14:shelf-1:cardSlot-3:card
First split the string into an array with : as the delimiter (this is the t subquery), then pick the first 7 array elements and join them back into a string with : as the delimiter.
select array_to_string(arr[1:7], ':') as sources
from
(
select string_to_array(sources, ':') as arr
from the_table
) as t;
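The same split, slice and re-join logic can be checked outside the database. The following is a minimal Python sketch of the idea, not PostgreSQL; the function name trim_to_card_slot is hypothetical, and the two inputs are taken from the question.

```python
def trim_to_card_slot(source, keep=7):
    """Keep the first `keep` colon-separated parts of `source`."""
    return ":".join(source.split(":")[:keep])


with_daughter = (
    "fdn:realm:pam:network:55.150.40.841:shelf-1:cardSlot-1"
    ":card:daughterCardSlot-1:daughterCard"
)
without_daughter = "fdn:realm:sam:network:35.250.40.834:shelf-1:cardSlot-1:card"

# Both inputs are cut after the seventh part, the cardSlot element.
trimmed = [trim_to_card_slot(s) for s in (with_daughter, without_daughter)]
```

Because the cut is by position rather than by searching for ':card:daughter', it handles entries with and without the daughter-card suffix, provided the card slot is always the seventh element.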

Convert a csv string to a table in csv

If we have a file containing CSV data, we can read it using 0:
Say we have a file x.csv on disk; converting it to a table is as easy as
("SFJ";enlist",")0:`:/x.csv
But how can we convert a CSV string to a table?
string:
"sym,px,vol
GG,10.2,100
AA,11.2,1000"
Expected output: table
sym px vol
"GG" 10.2 100
"AA" 11.2 1000
A list of strings can be passed to 0: instead of a file handle, and the table will be created as normal:
q)s:("sym,px,vol";"GG,10.2,100";"AA,11.2,1000")
q)s
"sym,px,vol"
"GG,10.2,100"
"AA,11.2,1000"
q)("SFJ";enlist",")0:s
sym px vol
-------------
GG 10.2 100
AA 11.2 1000
If you need to get to the s in Eliot's answer programmatically from one big CSV string, there are a few options depending on the format of the string.
// \n delimited
s:` vs "sym,px,vol\nGG,10.2,100\nAA,11.2,1000"
// if you know the row and col count: split into fields, reshape, then re-join each row
s:"," sv' 3 3#"," vs "sym,px,vol,GG,10.2,100,AA,11.2,1000"
// if you just know the col count
s:"sym,px,vol,GG,10.2,100,AA,11.2,1000"
f:{[str;noCol]
 str:"," vs str;
 noRow:(count str) div noCol;
 "," sv' (noRow,noCol)#str
 }
f[s;3]
All three output this ("sym,px,vol";"GG,10.2,100";"AA,11.2,1000")
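The "only the column count is known" case boils down to splitting into fields, grouping them by column count, and joining each group back into one line. Here is a Python sketch of that logic (not q); the function name rows_from_flat_csv is hypothetical, and it assumes no field contains a comma.

```python
def rows_from_flat_csv(text, n_cols):
    """Split a flat comma-separated string into row strings of n_cols fields each."""
    fields = text.split(",")
    if len(fields) % n_cols != 0:
        raise ValueError("field count is not a multiple of the column count")
    # Take n_cols fields at a time and re-join them as one CSV line.
    return [
        ",".join(fields[i:i + n_cols])
        for i in range(0, len(fields), n_cols)
    ]


rows = rows_from_flat_csv("sym,px,vol,GG,10.2,100,AA,11.2,1000", 3)
```

The result is a list of row strings, one per line of the CSV, which is the shape of input that the parser in the answer above is given.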

kdb+: Save table with a column with a list of float into a csv file

I have a table "floats" with two columns: sym and prices. The sym elements are symbols and the prices elements are lists of floats.
q)LF:((3.0;1.0;2.0);(5.0;7.0;4.0);(2.0;8.0;9.0))
q)show floats:flip `sym`prices!(`6AH0`6AH6`6AH7;LF)
sym prices
-----------
6AH0 3 1 2
6AH6 5 7 4
6AH7 2 8 9
I want to export the table "floats" to a CSV file but I get this error:
q)save `:floats.csv
'type
[0] save `:floats.csv
I followed the post "kdb+: Save table into a csv file", which solves the problem when the column holds lists of strings. Unfortunately, when I try to convert the "prices" column to strings and then save to CSV using the internal function, I get errors:
q))@[`floats;`prices;" " sv']
'type
[7] @[`floats;`prices;" " sv']
^
q))@[`floats;`prices;string]
'noamend: `. `floats
[10] @[`floats;`prices;string]
^
q))@[`floats;string `prices;" " sv']
'noamend: `. `floats
[10] @[`floats;string `prices;" " sv']
^
Please help me convert the "prices" column to strings and then save it to CSV using the internal function, or suggest a valid alternative for exporting the table to a text file.
First, you need to convert the floats to strings, because sv joins strings and raises a type error when given floats. Then apply sv to each row using the each-right adverb, denoted by /: .
floats: update " " sv/: string each prices from floats
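The two steps (stringify each float, then join each row) can be modelled in Python as a sanity check. This is a sketch, not q: the name join_prices is hypothetical, and the "g" format is only an assumption chosen so that 3.0 prints as 3, roughly as q displays it.

```python
def join_prices(prices, sep=" "):
    """Turn a list of floats into one space-separated string."""
    # Step 1: convert each float to text; step 2: join the pieces.
    return sep.join(format(p, "g") for p in prices)


# Sample data copied from the question.
floats = [
    ("6AH0", [3.0, 1.0, 2.0]),
    ("6AH6", [5.0, 7.0, 4.0]),
    ("6AH7", [2.0, 8.0, 9.0]),
]
flattened = [(sym, join_prices(prices)) for sym, prices in floats]
```

Once every prices entry is a plain string, the table has only one level of nesting and can be written out like the "dates" table in the linked post.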

Get substring into a new column

I have a table with a column that holds data in the following format. Let's call the column "title" and the table "s".
title
ab.123
ab.321
cde.456
cde.654
fghi.789
fghi.987
I am trying to get a unique list of the characters that come before the "." so that I end up with this:
ab
cde
fghi
I have tried selecting the initial column into a table and then doing an update to create a new column holding the position of the dot, using "ss".
Something like this:
t: select title from s
update thedot: (title ss `.)[0] from t
I was then going to add a third column containing the first N characters of "title", where N is the value stored in the "thedot" column.
All I get when I try the update is a "type" error.
Any ideas? I am very new to kdb, so no doubt I am doing something simple in a very silly way.
The reason you get the type error is that ss only works on strings, not symbols. In addition, ss is not a vector function, so you need to combine it with each-both (').
q)update thedot:string[title] ss' "." from t
title thedot
---------------
ab.123 2
ab.321 2
cde.456 3
cde.654 3
fghi.789 4
There are a few ways to solve your problem:
q)select distinct(`$"." vs' string title)[;0] from t
x
----
ab
cde
fghi
q)select distinct(` vs' title)[;0] from t
x
----
ab
cde
fghi
You can read here for more info: http://code.kx.com/q/ref/casting/#vs
An alternative is to make use of the 0: operator, to parse around the "." delimiter. This operator is especially useful if you have a fixed number of 'columns' like in a csv file. In this case where there is a fixed number of columns and we only want the first, a list of distinct characters before the "." can be returned with:
exec distinct raze("S ";".")0:string title from t
`ab`cde`fghi
OR:
distinct raze("S ";".")0:string t`title
`ab`cde`fghi
Here "S " defines the type of each column (the space skips the second column) and "." is the field delimiter. For records with differing numbers of columns, it would be better to use vs.
A variation of WooiKent's answer using each-right (/:):
q)exec distinct (` vs/:title)[;0] from t
`ab`cde`fghi
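All of the answers above follow the same pattern: split each title at the dot, keep the first piece, and de-duplicate. Here is a Python sketch of that pattern (not q) for comparison; the function name distinct_prefixes is hypothetical.

```python
def distinct_prefixes(titles, sep="."):
    """Return the distinct text before the first `sep`, in first-seen order."""
    # dict.fromkeys drops duplicates while preserving insertion order.
    return list(dict.fromkeys(t.split(sep, 1)[0] for t in titles))


# Sample data copied from the question.
titles = ["ab.123", "ab.321", "cde.456", "cde.654", "fghi.789", "fghi.987"]
prefixes = distinct_prefixes(titles)
```

Splitting once at the first separator means no position lookup is needed, which is why the vs-based answers avoid ss entirely.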

concatenating text to a column in pig

I have a day column and a month column and would like to concatenate the year to it and store it in CHARARRAY format with the hyphens.
so I have: month:CHARARRAY, day:CHARARRAY
Meaning, for example, if the day column contains '03' and the month column contains '04', I would like to create a date column that contains: '2014-04-03'
This is my code:
CONCAT('2014-',month,'-',day) as date;
It doesn't work and I'm not quite sure how to concatenate additional text onto the column.
I would like to note that I'm not sure converting to date format is an option for me. I would prefer to keep it in CHARARRAY format since I would like to join with another file that has date stored in CHARARRAY format.
Assuming this is the data file called dateExample.csv:
Surender,02,03,1988
Raja,12,09,1998
Raj,05,10,1986
This is the script for pig:
A = LOAD 'dateExample.csv' USING PigStorage(',') AS(name:chararray,day:chararray,month:long,year:chararray);
X = FOREACH A GENERATE CONCAT((chararray)day,CONCAT('-',CONCAT((chararray)month,CONCAT('-',(chararray)year))));
dump X;
You will get the desired output:
(02-3-1988)
(12-9-1998)
(05-10-1986)
Explanation:
When we try to concatenate like this:
X = FOREACH A GENERATE CONCAT(day,CONCAT('-',CONCAT(month,CONCAT('-',year))));
we get the following exception:
ERROR 1045:
<line 2, column 45> Could not infer the matching function for org.apache.pig.builtin.CONCAT as multiple or none of them fit. Please use an explicit cast.
So we need to explicitly cast the day, month and year values to chararray, and then it works.