I am trying to add a startswith operator to my formula below, because I need it to return all values starting with TRA or MTA.
IF {STK_LOCATION.LOC_CODE}
IN ['TRA', 'MTA']
THEN {STK_LOCATION.LOC_STOCK_CODE}
ELSE {STK_LOCATION.LOC_STOCK_CODE} + LEFT({STK_LOCATION.LOC_CODE},4)
The IN function compares your input with the whole string. Try startswith:
IF({STK_LOCATION.LOC_CODE} startswith["TRA","MTA"]) THEN
{STK_LOCATION.LOC_STOCK_CODE}
ELSE
{STK_LOCATION.LOC_STOCK_CODE} + LEFT({STK_LOCATION.LOC_CODE},4)
Also, always use double quotes for strings; CR is not SQL.
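The difference between whole-string membership and prefix matching is not specific to Crystal Reports. Here is a minimal Python sketch of the same logic; the sample codes and the `label` helper are made up for illustration and are not part of the original report.

```python
# Hypothetical sample values; the real ones come from STK_LOCATION.
PREFIXES = ("TRA", "MTA")

def label(loc_code: str, stock_code: str) -> str:
    """Mirror the formula: bare stock code for TRA/MTA locations, else append 4 chars."""
    if loc_code.startswith(PREFIXES):   # prefix test, like startswith
        return stock_code
    return stock_code + loc_code[:4]    # like LEFT(loc_code, 4)

# Whole-string membership (what IN does) misses codes that carry a suffix.
print("TRA01" in PREFIXES)            # False
print("TRA01".startswith(PREFIXES))   # True
print(label("TRA01", "ABC"))          # ABC
print(label("WHS123", "ABC"))         # ABCWHS1
```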
Using q’s like function, how can we achieve the following match using a single regex string regstr?
q) ("foo7"; "foo8"; "foo9"; "foo10"; "foo11"; "foo12"; "foo13") like regstr
>>> 0111110b
That is, like regstr matches the foo-strings which end in the numbers 8,9,10,11,12.
Using regstr:"foo[8-12]" confuses the square brackets (how does it interpret this?) since 12 is not a single digit, while regstr:"foo[1[0-2]|[1-9]]" returns a type error, even without the foo-string complication.
As the other comments and answers mentioned, this can't be done using a single regex. Another alternative method is to construct the list of strings that you want to compare against:
q)str:("foo7";"foo8";"foo9";"foo10";"foo11";"foo12";"foo13")
q)match:{x in y,/:string z[0]+til 1+neg(-/)z}
q)match[str;"foo";8 12]
0111110b
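For readers less familiar with q, the idea of building the candidate strings and testing membership can be sketched in Python as follows. The `candidates` helper is a hypothetical name introduced here, not something from the answer.

```python
def candidates(prefix: str, lo: int, hi: int) -> set[str]:
    """Build every string prefix+n for n in the inclusive range lo..hi."""
    return {f"{prefix}{n}" for n in range(lo, hi + 1)}

strs = ["foo7", "foo8", "foo9", "foo10", "foo11", "foo12", "foo13"]
wanted = candidates("foo", 8, 12)

# Boolean mask, analogous to the q result 0111110b
mask = [s in wanted for s in strs]
# Filtered list, analogous to keeping only the matching entries
matches = [s for s in strs if s in wanted]

print(mask)
print(matches)
```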
If your eventual goal is to filter on the matching entries, you can replace in with inter:
q)match:{x inter y,/:string z[0]+til 1+neg(-/)z}
q)match[str;"foo";8 12]
"foo8"
"foo9"
"foo10"
"foo11"
"foo12"
A variation on Cillian’s method: test the prefix and numbers separately.
q)range:{x+til 1+y-x}.
q)s:"foo",/:string 82,range 7 13 / include "foo82" in tests
q)match:{min(x~/:;in[;string range y]')@'flip count[x]cut'z}
q)match["foo";8 12;] s
00111110b
Note how the unary derived functions x~/: and in[;string range y]' are paired by @' with the split strings; min then ANDs the results:
q)flip 3 cut's
"foo" "foo" "foo" "foo" "foo" "foo" "foo" "foo"
"82" ,"7" ,"8" ,"9" "10" "11" "12" "13"
q)("foo"~/:;in[;string range 8 12]')@'flip 3 cut's
11111111b
00111110b
Compositions rock.
As the comments state, regex in kdb+ is extremely limited. If the number of trailing digits is known, as in the example above, the following can be used to check multiple patterns:
q)str:("foo7"; "foo8"; "foo9"; "foo10"; "foo11"; "foo12"; "foo13"; "foo3x"; "foo123")
q)any str like/:("foo[0-9]";"foo[0-9][0-9]")
111111100b
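The "match any of several fixed-width patterns" idea can be sketched in Python with the standard `fnmatch` module, whose glob syntax is only loosely comparable to q's `like`. This is an illustration of the technique, not a claim about how kdb+ evaluates patterns.

```python
from fnmatch import fnmatchcase

strs = ["foo7", "foo8", "foo9", "foo10", "foo11", "foo12", "foo13",
        "foo3x", "foo123"]
patterns = ["foo[0-9]", "foo[0-9][0-9]"]   # one or two trailing digits

# A string passes if it matches at least one pattern in full
mask = [any(fnmatchcase(s, p) for p in patterns) for s in strs]
print(mask)
```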
Checking for a range like 8-12 is not currently possible within kdb+ regex. One possible workaround is to write a function that implements this logic. The function range below checks that each string in a list starts with a given string and ends with a number within the specified range.
range:{
  / checking for strings starting with string y
  s:((c:count y)#'x)like y;
  / convert remainder of string to long, check if within range
  d:("J"$c _'x)within z;
  / find strings satisfying both conditions
  s&d
  }
Example use:
q)range[str;"foo";8 12]
011111000b
q)str where range[str;"foo";8 12]
"foo8"
"foo9"
"foo10"
"foo11"
"foo12"
This could be made more efficient by checking the trailing digits only on the subset of strings starting with "foo".
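The same prefix-plus-numeric-range check, including the suggested optimisation of parsing digits only when the prefix matches, might look like this in Python. The function name `in_range` is hypothetical.

```python
def in_range(s: str, prefix: str, lo: int, hi: int) -> bool:
    """True if s is prefix followed by an integer within lo..hi inclusive."""
    if not s.startswith(prefix):
        return False                     # skip the numeric parse entirely
    rest = s[len(prefix):]
    # Require plain ASCII digits so int() cannot fail on odd input
    if not (rest.isascii() and rest.isdigit()):
        return False
    return lo <= int(rest) <= hi

strs = ["foo7", "foo8", "foo9", "foo10", "foo11", "foo12", "foo13",
        "foo3x", "foo123"]
mask = [in_range(s, "foo", 8, 12) for s in strs]
print(mask)
print([s for s, ok in zip(strs, mask) if ok])
```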
For your example, you can pad the strings to a fixed width, fill the padding with a character, and then a simple pattern works fine:
("."^5$("foo7";"foo8";"foo9";"foo10";"foo11";"foo12";"foo13")) like "foo[1|8-9][.|0-2]"
Similarly to this question about json_extract_path_text, when I run this query in Redshift, I would expect json_extract_array_element_text to remove the backslashes from the "\"b\"" value:
select
j,
json_extract_array_element_text(j, 0) as a
from (select '["\\"b\\""]' as j);
Instead, it appears that the string value is extracted verbatim, and the results look like this:
["\"b\""]
\"b\"
Is this intentional? If yes, what would be the idiomatic way to remove the backslashes?
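For comparison, here is what a standard JSON parser does with the same input. This Python sketch only shows the behaviour the question expects; it says nothing about how Redshift implements json_extract_array_element_text.

```python
import json

# Raw string holding the same characters as column j in the result above
j = r'["\"b\""]'

element = json.loads(j)[0]   # decode the array, take element 0
print(element)               # the decoded value keeps its quotes but has no backslashes
```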
I would like to have an expression equal to MyParameter + '_' + utcnow()
My current attempt is: @{pipeline().parameters.Col}_{utcnow()}
but it fails.
Use the concat function:
@concat(pipeline().parameters.Col,'_',utcNow())
I need to find the index of the first character that is not ]. Normally to match any character except X, you use the pattern [^X]. The problem is that [^]] simply closes the first bracket too early. The first part, [^], will match any character.
In the documentation for the LIKE operator, the section "Using Wildcard Characters As Literals" shows a table of methods to indicate literal characters like [ and ] inside a pattern. It makes no mention of using [ or ] inside double brackets. If the pattern is being used with the LIKE operator, you would use the ESCAPE clause. However, LIKE doesn't return an index, and PATINDEX doesn't seem to have a parameter for an escape clause.
Is there no way to do this?
(This may seem arbitrary. To put some context around it, I need to match ] immediately followed by a character that is not ] in order to locate the end of a quoted identifier. ]] is the only character escape inside a quoted identifier.)
This isn't possible. The Connect item PATINDEX Missing ESCAPE Clause is closed as won't fix.
I'd probably use CLR and regular expressions.
A simple implementation might be
using System.Data.SqlTypes;
using System.Text.RegularExpressions;
public partial class UserDefinedFunctions
{
    [Microsoft.SqlServer.Server.SqlFunction]
    public static SqlInt32 PatIndexCLR(SqlString pattern, SqlString expression)
    {
        // Return SQL NULL if either argument is NULL
        if (pattern.IsNull || expression.IsNull)
            return new SqlInt32();
        Match match = Regex.Match(expression.ToString(), pattern.ToString());
        if (match.Success)
        {
            // Convert the 0-based .NET index to a 1-based position, as PATINDEX returns
            return new SqlInt32(match.Index + 1);
        }
        else
        {
            // 0 means no match, mirroring PATINDEX
            return new SqlInt32(0);
        }
    }
}
With example usage
SELECT [dbo].[PatIndexCLR] ( N'[^]]', N']]]]]]]]ABC[DEF');
If that is not an option, a possible (flaky) workaround is to substitute a character that is unlikely to appear in the data and has no special significance in the pattern grammar.
WITH T(Value) AS
(
SELECT ']]]]]]]]ABC[DEF'
)
SELECT PATINDEX('%[^' + char(7) + ']%', REPLACE(Value,']', char(7)))
FROM T
(Returns 9)
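To make the intended result concrete, here is a small Python sketch that returns the 1-based position of the first character that is not ], using an escaped bracket inside the character class (which is exactly what PATINDEX lacks). The helper name is hypothetical, and the return convention (0 for no match) simply mirrors PATINDEX.

```python
import re

def first_non_bracket(value: str) -> int:
    """1-based index of the first character other than ']', or 0 if none."""
    m = re.search(r"[^\]]", value)      # \] is a literal ] inside the class
    return m.start() + 1 if m else 0

print(first_non_bracket("]]]]]]]]ABC[DEF"))   # 9
print(first_non_bracket("]]]"))               # 0
```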
I have a string and I need two characters to be returned.
I tried with strsplit but the delimiter must be a string and I don't have any delimiters in my string. Instead, I always want to get the second number in my string. The number is always 2 digits.
Example: 001a02.jpg. I use the fileparts function to strip the image extension (jpg), so I get this string: 001a02
The expected return value is 02
Another example: 001A43a. Return value: 43
Another one: 002A12. Return value: 12
All the filenames are in a matrix 1002x1. Maybe I can use textscan but in the second example, it gives "43a" as a result.
(Just so this question doesn't remain unanswered, here's a possible approach: )
One way to go about this uses splitting with regular expressions (MATLAB's strsplit which you mentioned):
str = '001a02.jpg';
C = strsplit(str,'[a-zA-Z.]','DelimiterType','RegularExpression');
Results in:
C =
'001' '02' ''
In older versions of MATLAB, before strsplit was introduced, similar functionality was achieved using regexp(...,'split').
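The same split-on-a-character-class idea can be sketched in Python. Here `+` collapses consecutive delimiters, which is assumed to correspond to strsplit's default delimiter collapsing.

```python
import re

name = "001a02.jpg"
# Split wherever one or more letters or dots occur
parts = re.split(r"[a-zA-Z.]+", name)
print(parts)        # ['001', '02', '']
print(parts[1])     # 02
```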
If you want to learn more about regular expressions (abbreviated as "regex" or "regexp"), there are many online resources (JGI..)
In your case, if you only need to take the 5th and 6th characters from the string you could use:
D = str(5:6);
... and if you want to convert those into numbers you could use:
E = str2double(str(5:6));
If your number is always at a certain position in the string, you can simply index this position.
In the examples you gave, the number is always the 5th and 6th characters in the string.
filename = '002A12';
num = str2num(filename(5:6));
Otherwise, if the formatting is more complex, you may want to use a regular expression. There is a similar question, matlab - extracting numbers from (odd) string. Modifying the code found there, you can do the following:
all_num = regexp(filename, '\d+', 'match'); %Find all numbers in the filename
num = str2num(all_num{2}) %Convert second number from str
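As a cross-check of the "take the second run of digits" approach, here is an equivalent Python sketch run against the three filenames from the question. `second_number` is a hypothetical helper; it returns the digits as a string so that a leading zero such as 02 is preserved.

```python
import re

def second_number(name: str) -> str:
    """Return the second run of digits in name, as a string."""
    numbers = re.findall(r"\d+", name)   # all digit runs, in order
    return numbers[1]

for name in ["001a02.jpg", "001A43a", "002A12"]:
    digits = second_number(name)
    print(name, digits, int(digits))     # int() gives the numeric value
```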