Remove empty string from a list of strings in kdb

I have a list of strings:
q)l:("abc";"";"def";"");
How can we remove empty strings from the list l?
Desired Output:
("abc";"def")
My failed attempts:
q)l except ""
q)l except\: ""
q)l except 1#""

Using enlist on the empty string will work:
q)l except enlist""
"abc"
"def"
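For an atom, 1# and enlist give the same result, but they differ on lists, and the empty list is the case that matters here. Applying 1# to an empty list returns a one-item list holding the null of that list's type, so 1#"" is a one-character string containing a space, not a list containing the empty string: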
q)1#`long$()
,0N
q)1#`symbol$()
,`
q)1#""
," "
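As a minimal sketch of the points above (the names r1, r2 and same are illustrative only, not from the original answer), the following compares the enlist form with a length-based filter and confirms that 1#"" is not the same value as enlist "":

```q
l:("abc";"";"def";"")
r1:l except enlist ""        / right argument is a one-item list holding the empty string
r2:l where 0<count each l    / alternative: keep only the items with non-zero length
same:(1#"")~enlist ""        / 0b, because 1#"" is ," " rather than a list holding ""
show r1
```

The where form may be handy when the items to drop are defined by a condition rather than by a fixed value.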

Related

kdb+: Save table with a column with a list of float into a csv file

I have a table "floats" with two columns: sym and prices. The sym elements are symbols and the prices elements are lists of floats.
q)LF:((3.0;1.0;2.0);(5.0;7.0;4.0);(2.0;8.0;9.0))
q)show floats:flip `sym`prices!(`6AH0`6AH6`6AH7;LF)
sym prices
-----------
6AH0 3 1 2
6AH6 5 7 4
6AH7 2 8 9
I want to export the table "floats" to a CSV file, but I get this error:
q)save `:floats.csv
'type
[0] save `:floats.csv
I followed the post "kdb+: Save table into a csv file", which solves the problem when the column is a list of strings. Unfortunately, when I try to convert the "prices" column to strings and then save to CSV using the internal function, I get errors:
q))#[`floats;`prices;" " sv']
'type
[7] #[`floats;`prices;" " sv']
^
q))#[`floats;`prices;string]
'noamend: `. `floats
[10] #[`floats;`prices;string]
^
q))#[`floats;string `prices;" " sv']
'noamend: `. `floats
[10] #[`floats;string `prices;" " sv']
^
Please help me convert the "prices" column to strings and then save to CSV using the internal function, or suggest a valid alternative for exporting the table to a text file.
First convert the floats to strings with string, then join each row's strings with sv, applied with the each-right adverb (/:):
floats: update " " sv/: string each prices from floats
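Putting the pieces together, a sketch of the whole flow might look like this. It rebuilds the table from the question, names the updated column explicitly (an optional choice made here for clarity), and assumes the file should be written to the current working directory:

```q
LF:(3 1 2f;5 7 4f;2 8 9f)
floats:flip `sym`prices!(`6AH0`6AH6`6AH7;LF)
/ turn each list of floats into one space-separated string
floats:update prices:" " sv/: string each prices from floats
/ save writes the global table floats to floats.csv in the current directory
save `:floats.csv
```

Once prices is a column of strings, save no longer hits the nested float column that caused the 'type error.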

Get substring into a new column

I have a table that contains a column with data in the following format. Let's call the column "title" and the table "s":
title
ab.123
ab.321
cde.456
cde.654
fghi.789
fghi.987
I am trying to get a unique list of the characters that come before the ".", so that I end up with this:
ab
cde
fghi
I have tried selecting the initial column into a table, then doing an update to create a new column holding the position of the dot using "ss", something like this:
t: select title from s
update thedot: (title ss `.)[0] from t
I was then going to add a third column holding the first N characters of "title", where N is the value stored in the "thedot" column.
All I get when I try the update is a "type" error.
Any ideas? I am very new to kdb, so no doubt I am doing something simple in a very silly way.
You get the type error because ss only works on strings, not symbols. In addition, ss does not apply itemwise to a list of strings, so you need to combine it with each ('):
q)update thedot:string[title] ss' "." from t
title thedot
---------------
ab.123 2
ab.321 2
cde.456 3
cde.654 3
fghi.789 4
There are a few ways to solve your problem:
q)select distinct(`$"." vs' string title)[;0] from t
x
----
ab
cde
fghi
q)select distinct(` vs' title)[;0] from t
x
----
ab
cde
fghi
You can read here for more info: http://code.kx.com/q/ref/casting/#vs
An alternative is to use the 0: operator to parse around the "." delimiter. This operator is especially useful when every record has a fixed number of fields, as in a CSV file. Here each title has exactly two fields and we only want the first, so the distinct prefixes before the "." can be returned with:
exec distinct raze("S ";".")0:string title from t
`ab`cde`fghi
OR:
distinct raze("S ";".")0:string t`title
`ab`cde`fghi
Here "S " gives the type of each field (S parses the first field as a symbol, and the blank skips the second) and "." is the field delimiter. For records with a differing number of fields it would be better to use vs.
A variation of WooiKent's answer using each-right (/:):
q)exec distinct (` vs/:title)[;0] from t
`ab`cde`fghi
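Since the question title asks for the substring in a new column, here is a short sketch that does that with the same ` vs approach. It assumes title is a symbol column, as in the answers above; the column name prefix and the variables u and p are made up for illustration:

```q
t:([] title:`ab.123`ab.321`cde.456`cde.654`fghi.789`fghi.987)
/ split each symbol on "." and keep the first part as a new column
u:update prefix:(` vs' title)[;0] from t
/ the distinct prefixes as a plain symbol list
p:exec distinct prefix from u
show p
```

If title were a string column instead, "." vs' title would play the same role, with a cast via `$ where symbols are wanted.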

insert string with special characters into KDB+

What type should I use to create a table in KDB+ and insert a string with special characters (spaces, #, -, etc.)? It looks like KDB+ treats these and similar characters specially, because when I create a table like this:
t: ([] str: ())
and insert the string "abc # efgf - ABC.FS #.... TEST TEST" (a long string with various characters, including spaces, - and #) like this:
`t insert "abc # efgf - ABC.FS #.... TEST TEST"
KDB+ returns a type error.
Your problem here doesn't come from the special characters, it comes from the fact that a string is a list of characters. You need to use enlist to insert the string as a single element into the table.
In fact, this case is a bit atypical because you only have one column in the table, so you actually need to use enlist twice, as kdb expects a list of column data as the second argument in insert. So for this table use
`t insert enlist enlist "blah blah # # #"
If you had a table with more than one column, then you only need one enlist for the string, e.g.
t:([]id:(); str:())
`t insert (1; enlist "blah blah # # #")
I'm not sure why you get a 'type error, though; I get a 'length error.
In any case, insert expects its right-hand argument to be a list with one item per column, e.g.:
q)t:([]a:();b:())
q)`t insert (1;2) / single record matching 2 columns
,0
To insert multiple records, the right-hand side is a nested list in which each item holds one column's values:
q)`t insert (2 3;4 5)
1 2
So in your case, to insert a single record of string, you need a singleton list that contains an enlisted string:
q)t: ([] str: ())
q)`t insert enlist enlist "abc # efgf - ABC.FS #.... TEST TEST"
,0
q)t
str
-------------------------------------
"abc # efgf - ABC.FS #.... TEST TEST"
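The same rule extends to several strings at once. As a sketch (the example strings and the names rows and strs are made up), the single column's data is a list of strings, and that list is enlisted once so that insert sees a one-item list of column data:

```q
t:([] str:())
/ one column, two records: the inner list holds the two strings
rows:`t insert enlist ("first # string";"second - string")
strs:t`str
show t
```

Here rows holds the indices of the newly inserted records.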

Getting the values in between

In Report Builder 3.0 I have a string of comma-separated values, e.g. "Value1, Value2, Value3, Value4". I have used Split() to get the first and the last position, but how do I get everything in between "Value1" and "Value4"? Can I remove the first and last position in the array that the split function creates? The result I am looking for is "Value2, Value3".
I can think of a couple of methods:
=Trim(Split(Fields!valueString.Value, ",")(1))
& ", "
& Trim(Split(Fields!valueString.Value, ",")(2))
This is similar to what you're already doing; it just concatenates the two middle values back together. Trim helps avoid any issues with whitespace.
=Trim(Mid(Fields!valueString.Value
, InStr(Fields!valueString.Value, ",") + 1
, InStrRev(Fields!valueString.Value, ",") - InStr(Fields!valueString.Value, ",") - 1))
This just uses string manipulation based on comma positions. Again, Trim is used to clean up whitespace.
Neither of these is robust when the input string has an unexpected format, so depending on your data extra checks might be required, but for "Value1, Value2, Value3, Value4" both expressions return "Value2, Value3" as required.

regexp_matches better way to get rid of returning curly brackets

Is there some better way to trim {""} in result of regexp_matches than:
trim(trailing '"}' from trim(leading '{"' from regexp_matches(note, '[0-9a-z \r\n]+', 'i')::text))
regexp_matches() returns its matches as a text array (one row per match, and only the first match unless the 'g' flag is given). The string representation of an array includes the curly braces, which is why you get them when casting to text.
If you just want the matched items as plain text, you can use array_to_string() to convert the array into a "simple" text value:
array_to_string(regexp_matches(note, '[0-9a-z \r\n]+', 'i'), ';')
If you are only interested in the first match, you can select the first element of the array:
(regexp_matches(note, '[0-9a-z \r\n]+', 'i'))[1]