In kdb+/q, how do I generate all combinations of a given size n from an alphabet (it doesn't have to be a string; it could be a list of numbers)?
What I am looking for is similar to this in python, but in q.
How to generate all possible strings in python?
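Using cross would be the easiest method. Note that the second argument is the number of times cross is applied, so the results have length y+1; pass n-1 to get strings of size n: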
q){y(x cross)/x}["ABC";1]
"AA"
"AB"
"AC"
"BA"
"BB"
"BC"
"CA"
"CB"
"CC"
q){y(x cross)/x}["ABC";2]
"AAA"
"AAB"
"AAC"
...
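For comparison with the Python approach the question refers to, here is a minimal sketch using itertools.product. The function name all_sequences is hypothetical, and unlike the q lambda above, its second argument is the sequence length itself rather than the number of cross applications.

```python
from itertools import product

def all_sequences(alphabet, n):
    """Return every length-n tuple over alphabet (hypothetical helper)."""
    return list(product(alphabet, repeat=n))

# Strings: join each tuple back into a string.
strings = ["".join(t) for t in all_sequences("ABC", 2)]
print(strings)

# The alphabet does not have to be a string; a list of numbers works too.
numbers = all_sequences([0, 1, 2], 3)
print(len(numbers))
```

The ordering matches the q output above: the last position varies fastest.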
Using q’s like function, how can we achieve the following match using a single regex string regstr?
q) ("foo7"; "foo8"; "foo9"; "foo10"; "foo11"; "foo12"; "foo13") like regstr
>>> 0111110b
That is, like regstr matches the foo-strings which end in the numbers 8,9,10,11,12.
Using regstr:"foo[8-12]" confuses the square brackets (how does it interpret this?) since 12 is not a single digit, while regstr:"foo[1[0-2]|[1-9]]" returns a type error, even without the foo-string complication.
As the other comments and answers mentioned, this can't be done with a single regex. An alternative is to construct the list of strings that you want to compare against:
q)str:("foo7";"foo8";"foo9";"foo10";"foo11";"foo12";"foo13")
q)match:{x in y,/:string z[0]+til 1+neg(-/)z}
q)match[str;"foo";8 12]
0111110b
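The same idea (build the full list of acceptable strings, then test membership) can be cross-checked outside q. This is a Python sketch only, not a q solution; the name match_flags and its argument order are assumptions made for illustration.

```python
def match_flags(strings, prefix, lo, hi):
    """Flag each string that equals prefix followed by a number in lo..hi inclusive."""
    wanted = {f"{prefix}{i}" for i in range(lo, hi + 1)}
    return [s in wanted for s in strings]

strs = ["foo7", "foo8", "foo9", "foo10", "foo11", "foo12", "foo13"]
flags = match_flags(strs, "foo", 8, 12)
print(flags)

# Keeping only the matching entries instead of the boolean mask.
matched = [s for s, keep in zip(strs, flags) if keep]
print(matched)
```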
If your eventual goal is to filter on the matching entries, you can replace in with inter:
q)match:{x inter y,/:string z[0]+til 1+neg(-/)z}
q)match[str;"foo";8 12]
"foo8"
"foo9"
"foo10"
"foo11"
"foo12"
A variation on Cillian’s method: test the prefix and numbers separately.
q)range:{x+til 1+y-x}.
q)s:"foo",/:string 82,range 7 13 / include "foo82" in tests
q)match:{min(x~/:;in[;string range y]')@'flip count[x]cut'z}
q)match["foo";8 12;] s
00111110b
Note how the unary derived functions x~/: and in[;string range y]' are paired by @' with the split strings, then min is used to AND the results:
q)flip 3 cut's
"foo" "foo" "foo" "foo" "foo" "foo" "foo" "foo"
"82" ,"7" ,"8" ,"9" "10" "11" "12" "13"
q)("foo"~/:;in[;string range 8 12]')@'flip 3 cut's
11111111b
00111110b
Compositions rock.
As the comments state, regex in kdb+ is extremely limited. If the number of trailing digits is known, as in the example above, then the following can be used to check multiple patterns:
q)str:("foo7"; "foo8"; "foo9"; "foo10"; "foo11"; "foo12"; "foo13"; "foo3x"; "foo123")
q)any str like/:("foo[0-9]";"foo[0-9][0-9]")
111111100b
Checking for a range like 8-12 is not currently possible within kdb+ regex. One possible workaround is to write a function to implement this logic. The function range checks that each string in a list starts with the passed string and ends with a number within the specified range.
range:{
  / checking for strings starting with string y
  s:((c:count y)#'x)like y;
  / convert remainder of string to long, check if within range
  d:("J"$c _'x)within z;
  / find strings satisfying both conditions
  s&d
  }
Example use:
q)range[str;"foo";8 12]
011111000b
q)str where range[str;"foo";8 12]
"foo8"
"foo9"
"foo10"
"foo11"
"foo12"
This could be made more efficient by checking the trailing digits only on the subset of strings starting with "foo".
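A sketch of that refinement, written in Python rather than q purely to illustrate the control flow: the numeric check only runs for strings that already passed the prefix test. The name in_range and the use of str.isdecimal to reject non-numeric remainders are assumptions, not part of the answer above.

```python
def in_range(strings, prefix, lo, hi):
    """Flag strings that start with prefix and end with an integer in lo..hi inclusive."""
    flags = []
    for s in strings:
        rest = s[len(prefix):]
        # 'and' short-circuits, so int() is only reached when the prefix matches
        # and the remainder consists solely of decimal digits.
        ok = s.startswith(prefix) and rest.isdecimal() and lo <= int(rest) <= hi
        flags.append(ok)
    return flags

strs = ["foo7", "foo8", "foo9", "foo10", "foo11", "foo12", "foo13", "foo3x", "foo123"]
result = in_range(strs, "foo", 8, 12)
print(result)
```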
For your example you can pad, fill with a char, and then simple regex works fine:
("."^5$("foo7";"foo8";"foo9";"foo10";"foo11";"foo12";"foo13")) like "foo[1|8-9][.|0-2]"
I want to check which of two strings is greater. I was using the logic below, but it fails in a few cases:
q){$[1b in x>=y;x;y]}["b";"b"]
"b"
q){$[1b in x>=y;x;y]}["c";"b"]
"c"
q){$[1b in x>=y;x;y]}["azz";"dff"] // Wrong output (Reason for failure - "azz">"dff" --> 011b)
"azz" / desired output dff
Please suggest another way to get the greatest of the provided strings.
Since the comparison should proceed character by character, for "azz" and "dff" the result should be "dff" as soon as "a" is compared with "d", because "a" is less than "d".
You can convert the strings to symbols and use <, >, etc. These operators perform lexicographic comparisons for symbols.
https://code.kx.com//q4m3/4_Operators/
q) `azz < `dff
1b
If you insist on strings then you can leverage iasc to create a "smaller-or-equal"-like function:
q) not first iasc ("azz"; "dff")
1b
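To illustrate the difference between an element-wise comparison and a lexicographic one, here is a small Python sketch (not q). Python compares whole strings lexicographically by default, which corresponds to the symbol comparison above; the helper name greatest is hypothetical.

```python
def greatest(x, y):
    """Return the lexicographically greater of two strings (x on ties)."""
    return x if x >= y else y

# Element-wise comparison, analogous to applying >= to two q strings item by item.
elementwise = [a >= b for a, b in zip("azz", "dff")]
print(elementwise)

# Whole-string lexicographic comparison: decided at the first differing character.
print(greatest("azz", "dff"))
```

The element-wise result contains two true values, which is why a test of the form "is any element true" picks the wrong string.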
Using 1b in is equivalent to any in this case, as "azz">="dff" evaluates to 011b. Your conditional evaluates as true because 2 letters in "azz" are greater than the corresponding letters in "dff".
It is better to cast x and y to symbols and compare, as this evaluates to a single boolean:
(`$"azz")>=`$"dff"
0b
{$[(`$x)>=`$y;x;y]}["azz";"dff"]
"dff"
Alternatively, you could sort in descending order and take the first result:
{first desc(x;y)}["azz";"dff"]
"dff"
I'm trying to filter based on substrings within a string. These strings can contain A through E, or any combination of the five (such as ["C"] or ["A","C","D","E"]). Is there a way I could search through the entire string for each letter before returning a value?
The code I have currently (below) stops when the first IF statement is true. My goal is to be able to classify the entries by the letters in the string and use this calculation as a filter. So, an entry with the string ["A"] would be filtered under "A", but the string ["C","E"] would be filtered under both "C" and "E". Thank you for your help.
IF CONTAINS([Q2.6],"A") then "A"
ELSEIF CONTAINS([Q2.6],"B") then "B"
ELSEIF CONTAINS([Q2.6],"C") then "C"
ELSEIF CONTAINS([Q2.6],"D") then "D"
ELSEIF CONTAINS([Q2.6],"E") then "E"
END
This function: Trim(In.Col, Right(In.Col, 2), 'T') works unless more than two of the trailing characters are the same.
What I want:
abczzzz -> abczz
What I get:
abczzzz -> abc
How do I solve this?
The "T" option removes all trailing occurrences. Since you are limiting your input to only two characters with the Right() function, the second occurrence will never be a trailing char.
It sounds, though, like you are just taking a substring. If so, then you might want to use the substring operator [ ] instead.
expression [ [ start, ] length ]
In.Col[(string length) - 2]
I prefer the Left() function, although it's equivalent here, as it's self-documenting.
Left(InLink.MyString, Len(InLink.MyString) - 2)
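The difference between trimming trailing occurrences and dropping a fixed number of characters can be shown with a Python analogue. This is a sketch of the logic only, not DataStage syntax: str.rstrip stands in for the trailing-trim behaviour described above, and drop_last_two is a hypothetical name.

```python
def drop_last_two(s):
    """Return s without its final two characters (empty if s is shorter than two)."""
    return s[:-2]

s = "abczzzz"

# Stripping every trailing occurrence of the last character removes all four z's.
stripped = s.rstrip(s[-1])
print(stripped)

# Dropping a fixed count keeps the remaining z's, like Left(s, Len(s) - 2).
shortened = drop_last_two(s)
print(shortened)
```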
In Thunderbird you have a field after each rule like:
"Subject" - "Contains" - "Value Field"
I tried searching for documentation of some sort but failed. What exactly is the Value Field? That is, if I type in "P a r t y", will it match the subject "Hello this is a P a r t y", or will it match any subject containing the letters "P", "a", "r", "t" and "y"?
Only "Hello this is a P a r t y". The value is meant to be entered without quotes when you define the filter.