Underscore fill in KDB - kdb

Is there another way to fill _ to either left or right of the given string than the following?
q)a:"Apple"
q)"_"^rotate[5;a#til 12]
"_______Apple"
q)"_"^a#til 12
"Apple_______"`

The $ operator is overloaded to pad the string with null(space):
q)"_"^12$"Apple"
"Apple_______"
q)"_"^-12$"Apple"
"_______Apple"

You can append any string by using ","
q) (12#"-"),"Apple"
"____________Apple"
q) "Apple",12#"_"
"Apple____________"

Related

how do we iterate and store the results in a variable in kdb

I have a string say example "https://www.google.com" and a count of paging say 5
how do i iterate the URL to append p/page=incrementing numbers of paging and store the result in a variable as list?
"https://www.google.com/page=1"
"https://www.google.com/page=2"
"https://www.google.com/page=3"
"https://www.google.com/page=4"
"https://www.google.com/page=5"
So the end result will look like this having a variable query_var which will hold a list of string example below
query_var:("https://www.google.com/page=1";"https://www.google.com/page=2";"https://www.google.com/page=3";"https://www.google.com/page=4";"https://www.google.com/page=5");
count query_var \\5
You can use the join function , with the each-right adverb /:
query_var: "https://www.google.com/page=" ,/: string 1_til 6
Make it a function to support a varied count of pages:
f:{"https://www.google.com/page=" ,/: string 1_til x+1}
q)10=count f[10]
1b
Q1
You don’t even need a lambda to make this a function. It’s a sequence of unary functions, so you can compose them.
q)f: "https://www.google.com/page=",/: string ::
q)f 1+til 6
"https://www.google.com/page=1"
"https://www.google.com/page=2"
"https://www.google.com/page=3"
"https://www.google.com/page=4"
"https://www.google.com/page=5"
"https://www.google.com/page=6"
Q2
q)("https://www.google.com";;"info/history")
enlist["https://www.google.com";;"info/history"]
q)"/"sv'("https://www.google.com";;"info/history")#/: string `aapl`msft`nftx
"https://www.google.com/aapl/info/history"
"https://www.google.com/msft/info/history"
"https://www.google.com/nftx/info/history"
List notation is syntactic sugar for enlist. The list with a missing item is a projection of enlist and can be iterated.
Again, a sequence of unaries is all composable without a lambda:
q)g: "/"sv'("https://www.google.com";;"info/history")#/: string ::
q)g `aapl`msft`nftx
"https://www.google.com/aapl/info/history"
"https://www.google.com/msft/info/history"
"https://www.google.com/nftx/info/history"
If you can assume a unique character that doesn't appear elsewhere in your url (e.g. #) then ssr is a simple and easily-readable approach:
ssr["https://www.google.com/page=#";"#";]each string 1+til 5
ssr["https://www.google.com/#/info/history";"#";]each string`aapl`msft`nftx

Regex expression in q to match specific integer range following string

Using q’s like function, how can we achieve the following match using a single regex string regstr?
q) ("foo7"; "foo8"; "foo9"; "foo10"; "foo11"; "foo12"; "foo13") like regstr
>>> 0111110b
That is, like regstr matches the foo-strings which end in the numbers 8,9,10,11,12.
Using regstr:"foo[8-12]" confuses the square brackets (how does it interpret this?) since 12 is not a single digit, while regstr:"foo[1[0-2]|[1-9]]" returns a type error, even without the foo-string complication.
As the other comments and answers mentioned, this can't be done using a single regex. Another alternative method is to construct the list of strings that you want to compare against:
q)str:("foo7";"foo8";"foo9";"foo10";"foo11";"foo12";"foo13")
q)match:{x in y,/:string z[0]+til 1+neg(-/)z}
q)match[str;"foo";8 12]
0111110b
If your eventual goal is to filter on the matching entries, you can replace in with inter:
q)match:{x inter y,/:string z[0]+til 1+neg(-/)z}
q)match[str;"foo";8 12]
"foo8"
"foo9"
"foo10"
"foo11"
"foo12"
A variation on Cillian’s method: test the prefix and numbers separately.
q)range:{x+til 1+y-x}.
q)s:"foo",/:string 82,range 7 13 / include "foo82" in tests
q)match:{min(x~/:;in[;string range y]')#'flip count[x]cut'z}
q)match["foo";8 12;] s
00111110b
Note how unary derived functions x~/: and in[;string range y]' are paired by #' to the split strings, then min used to AND the result:
q)flip 3 cut's
"foo" "foo" "foo" "foo" "foo" "foo" "foo" "foo"
"82" ,"7" ,"8" ,"9" "10" "11" "12" "13"
q)("foo"~/:;in[;string range 8 12]')#'flip 3 cut's
11111111b
00111110b
Compositions rock.
As the comments state, regex in kdb+ is extremely limited. If the number of trailing digits is known like in the example above then the following can be used to check multiple patterns
q)str:("foo7"; "foo8"; "foo9"; "foo10"; "foo11"; "foo12"; "foo13"; "foo3x"; "foo123")
q)any str like/:("foo[0-9]";"foo[0-9][0-9]")
111111100b
Checking for a range like 8-12 is not currently possible within kdb+ regex. One possible workaround is to write a function to implement this logic. The function range checks a list of strings start with a passed string and end with a number within the range specified.
range:{
/ checking for strings starting with string y
s:((c:count y)#'x)like y;
/ convert remainder of string to long, check if within range
d:("J"$c _'x)within z;
/ find strings satisfying both conditions
s&d
}
Example use:
q)range[str;"foo";8 12]
011111000b
q)str where range[str;"foo";8 12]
"foo8"
"foo9"
"foo10"
"foo11"
"foo12"
This could be made more efficient by checking the trailing digits only on the subset of strings starting with "foo".
For your example you can pad, fill with a char, and then simple regex works fine:
("."^5$("foo7";"foo8";"foo9";"foo10";"foo11";"foo12";"foo13")) like "foo[1|8-9][.|0-2]"

XSLT Substring not woring [duplicate]

Let's say I have this string: "123_12345_123456"
I would like to extract everything before the second "_" (underscore)
I tried:
fn:tokenize("123_1234_12345", '_')[position() le 2]
That returns:
123
1234
What I actually want is:
123_1234
How do I achieve that?
I am using XQuery 1.0
Regular expressions are flexible and compact:
replace('123_1234_12345', '_[^_]+$', '')
Another solution that may be better readable is to a) tokenize the string, b) keep the tokens you want to preserve and c) join them again:
string-join(
tokenize('123_1234_12345', '_')[position() = 1 to 2],
'_'
)
Taking the basic idea from Michael Kay's deleted answer, it could be implemented like this:
substring($input, 1, index-of(string-to-codepoints($input), 95)[2] - 1)

Using IN and Startswith operators together

I am trying to add a startswith operator to my formula below, because I need it to return all values starting TRA or MTA.
IF {STK_LOCATION.LOC_CODE}
IN ['TRA', 'MTA']
THEN {STK_LOCATION.LOC_STOCK_CODE}
ELSE {STK_LOCATION.LOC_STOCK_CODE} + LEFT({STK_LOCATION.LOC_CODE},4)
The IN function compares your input with the whole string. Try startswith:
IF({STK_LOCATION.LOC_CODE} startswith["TRA","MTA"]) THEN
{STK_LOCATION.LOC_STOCK_CODE}
ELSE
{STK_LOCATION.LOC_STOCK_CODE} + LEFT({STK_LOCATION.LOC_CODE},4)
And use always double quotes when you use strings, CR is not SQL

How to strip everything except digits from a string in Scala (quick one liners)

This is driving me nuts... there must be a way to strip out all non-digit characters (or perform other simple filtering) in a String.
Example: I want to turn a phone number ("+72 (93) 2342-7772" or "+1 310-777-2341") into a simple numeric String (not an Int), such as "729323427772" or "13107772341".
I tried "[\\d]+".r.findAllIn(phoneNumber) which returns an Iteratee and then I would have to recombine them into a String somehow... seems horribly wasteful.
I also came up with: phoneNumber.filter("0123456789".contains(_)) but that becomes tedious for other situations. For instance, removing all punctuation... I'm really after something that works with a regular expression so it has wider application than just filtering out digits.
Anyone have a fancy Scala one-liner for this that is more direct?
You can use filter, treating the string as a character sequence and testing the character with isDigit:
"+72 (93) 2342-7772".filter(_.isDigit) // res0: String = 729323427772
You can use replaceAll and Regex.
"+72 (93) 2342-7772".replaceAll("[^0-9]", "") // res1: String = 729323427772
Another approach, define the collection of valid characters, in this case
val d = '0' to '9'
and so for val a = "+72 (93) 2342-7772", filter on collection inclusion for instance with either of these,
for (c <- a if d.contains(c)) yield c
a.filter(d.contains)
a.collect{ case c if d.contains(c) => c }