What would be the best way to shorten the SQL code below?
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(TRIM(MYFIELD),'-R1',''),'-R2',''),'-R3',''),'-R4',''),'-R5',''),'-R6',''),'-R7',''),'-R8',''),'-R9',''),'-RA',''),'-RB',''),'-RC',''),'-RD',''),'-RE',''),'-RF','') AS TESTFIELD
Here is what I have tried:
REGEXP_REPLACE(MYFIELD,'-R[0-100][a-fA-F]','')
Original Data
N-RX ABCD
GROUP OPTION -01
ADVANTAGE 65 SELECT B-R11
ADVANTAGE 65 SELECT B-RA
ADVANTAGE 65 SELECT B-R09
ADVANTAGE 65 SELECT B-RB
ADVANTAGE 65 SELECT B/2A
Result Needed:
N-RX ABCD
GROUP OPTION -01
ADVANTAGE 65 SELECT B
Solution:
REGEXP_REPLACE(Trim(MyField), '[-|/]R[0-9a-zA-Z*][0-9a-zA-Z*]*$', '')
Your regular expression is your current issue. Try something like:
REGEXP_REPLACE(DACL_PDLV_5_DE, '-R[0-9a-fA-F][0-9]*$', '')
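This matches '-R' followed by one character that is a digit, a-f or A-F, then zero or more further digits, but only at the end of the string. Your attempt fails because [0-100] is a character class, not a numeric range: it matches a single character, either 0 or 1.
If the suffix can be a two-digit hex value, you will want to adjust the second character class accordingly.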
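As a rough check of the pattern above, here is a minimal Python sketch. It uses Python's re module rather than the database's REGEXP_REPLACE, so dialect details may differ. The function name strip_revision_suffix is hypothetical; the sample rows come from the question. The ATTEMPT pattern is included only to show that the expression from the question leaves '-R11' untouched.

```python
import re

# Pattern from the answer: '-R', one hex digit, then any further digits,
# anchored at the end of the string.
SUFFIX = re.compile(r"-R[0-9a-fA-F][0-9]*$")

# The pattern tried in the question, kept only for comparison.
# [0-100] is a character class that matches a single '0' or '1'.
ATTEMPT = re.compile(r"-R[0-100][a-fA-F]")


def strip_revision_suffix(value: str) -> str:
    """Trim the value, then drop a trailing -R<hex><digits> suffix if present."""
    return SUFFIX.sub("", value.strip())


rows = [
    "N-RX ABCD",
    "GROUP OPTION -01",
    "ADVANTAGE 65 SELECT B-R11",
    "ADVANTAGE 65 SELECT B-RA",
    "ADVANTAGE 65 SELECT B-R09",
    "ADVANTAGE 65 SELECT B-RB",
    "ADVANTAGE 65 SELECT B/2A",
]

for row in rows:
    print(repr(row), "->", repr(strip_revision_suffix(row)))
```

Note that, unlike the nested REPLACE calls, this only removes the suffix at the end of the value.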
Given a symbol, how to check whether it has a particular prefix?
I have the code below. It checks whether a symbol begins with aaaaa, but it returns 1b for aaa, which is wrong. I can add a length check, but that seems verbose. Is there a cleaner way?
{"aaaaa"~-5#string x}[`$"aaa"]
Could you use like?
q)`aaa like "aaa*"
1b
q)`aaa like "aaaaa*"
0b
It seems like the issue is with "take" (#): since "aaa" is shorter than 5 characters, take wraps around to the start of the string and repeats characters until it reaches that length, which gives "aaaaa".
You could modify your function so you have the following:
q){"aaaaa"~(x) til 5}["aaa"]
0b
q){"aaaaa"~(x) til 5}["aaaaaaaa"]
1b
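The wrap-around described above can be modelled outside q. This is a Python sketch, not q itself: q_take is a hypothetical helper that imitates n#x only for a positive count on a non-empty string, and has_prefix shows a length-safe prefix check of the kind the question is after.

```python
from itertools import cycle, islice


def q_take(n: int, s: str) -> str:
    """Rough model of q's n#s for positive n and non-empty s: cycle through s."""
    return "".join(islice(cycle(s), n))


def has_prefix(s: str, prefix: str) -> bool:
    """Length-safe prefix test: a string shorter than the prefix never matches."""
    return s.startswith(prefix)


print(q_take(5, "aaa"))             # the short input is stretched by cycling
print(has_prefix("aaa", "aaaaa"))   # the prefix check does not stretch anything
```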
Expanding on Matthew's answer, if you want to make a function out of it, do the following:
q)f:{x like "aaaaa*"}
q)f[`aaa]
0b
q)f[`aaaaa]
1b
q)f[`aaaaabcde]
1b
And if you want to make it more dynamic you could add a second variable for the matching prefix.
q)f2:{x like y,"*"}
q)f2[`aaa;"aaaaa"]
0b
q)f2[`aaa;"aaa"]
1b
Let me know if you see any issues.
I have text like this in different rows of a column:
xxxxxxxxxxx ab_88_2018 xxxxxx
ab_88_2018 xxxxxx
AB_88_2018 xxxxxx
ab_2018_88 XXXXXX
I want to extract only the 88 from the text into another column.
What would the query be?
It's not always 88, but some two-digit number in that position.
Is the 88 always a 2 digit number? If so, this is working for me for Postgres and Redshift and I believe gets you what you want:
SELECT
    CASE
        WHEN LOWER(column) ~ '.*[a-z]{2}\_[0-9]{2}\_[0-9]{4}.*'
            THEN SPLIT_PART(column, '_', 2)
        WHEN LOWER(column) ~ '[a-z]{2}\_[0-9]{4}\_[0-9]{2}.*'
            THEN LEFT(SPLIT_PART(column, '_', 3), 2)
    END AS get_two_digit_number
The ~ (tilde) is similar to LIKE but lets you do pattern matching with a regex. Paste your examples and the pattern between the quotes into regexr.com to see what it matches.
SPLIT_PART takes the string that matched the pattern and breaks it on a delimiter of my choosing, here the '_'. The last argument is which piece to return.
Using 'xxxxxxxxxxx ab_88_2018 xxxxxx' as an example, SPLIT_PART('xxxxxxxxxxx ab_88_2018 xxxxxx','_',2) will return '88', as 88 is the second part when splitting on '_'. If you entered 1, it would return everything before the first '_'.
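The same idea can be sketched with a single regular expression that captures the two digits directly. This is a Python illustration only, not the Postgres/Redshift query; two_digit_code is a hypothetical name, and it assumes the code is always two letters plus a two-digit and a four-digit number joined by underscores, in either order, as in the sample rows.

```python
import re
from typing import Optional

# Two alternatives, one per layout seen in the question.
PATTERN = re.compile(
    r"[a-z]{2}_([0-9]{2})_[0-9]{4}"    # e.g. ab_88_2018
    r"|[a-z]{2}_[0-9]{4}_([0-9]{2})",  # e.g. ab_2018_88
    re.IGNORECASE,
)


def two_digit_code(text: str) -> Optional[str]:
    """Return the two-digit part of the code, or None if no code is found."""
    match = PATTERN.search(text)
    if match is None:
        return None
    # Exactly one of the two groups is filled, depending on the layout.
    return match.group(1) or match.group(2)


for row in [
    "xxxxxxxxxxx ab_88_2018 xxxxxx",
    "ab_88_2018 xxxxxx",
    "AB_88_2018 xxxxxx",
    "ab_2018_88 XXXXXX",
]:
    print(row, "->", two_digit_code(row))
```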
I am REALLY confused about the definitions of pack and unpack in Perl.
Below is the excerpt from perldoc.perl.org:
The pack function converts values to a byte sequence containing
representations according to a given specification, the so-called
"template" argument. unpack is the reverse process, deriving some values
from the contents of a string of bytes.
So I get the idea that pack takes human-readable things (such as A) and turns them into a binary format. Am I wrong in this interpretation?
That is my interpretation, but the same doc immediately follows with this example, which seems to say exactly the opposite:
my( $hex ) = unpack( 'H*', $mem );
print "$hex\n";
What am I missing?
The pack function puts one or more things together in a single string. It represents things as octets (bytes) in a way that it can unpack reliably in some other program. That program might be far away (like, the distance to Mars far away). It doesn't matter if it starts as something human readable or not. That's not the point.
Consider some task where you have a numeric ID that's up to about 65,000 and a string that might be up to six characters.
print pack 'S A6', 137, $ARGV[0];
It's easier to see what this is doing if you run it through a hex dumper as you run it:
$ perl test.pl Snoopy | hexdump -C
00000000 89 00 53 6e 6f 6f 70 79 |..Snoopy|
The first column counts the position in the output, so ignore that. The first two octets represent the S format (short, 'word', whatever you call it, but two octets). I gave it the number 137 (0x0089) and it stored that as the octets 89 00, low byte first, because S uses the machine's native byte order and this machine is little-endian. Then it stored 'Snoopy' in the next six octets.
Now try it with a shorter name:
$ perl test.pl Linus | hexdump -C
00000000 89 00 4c 69 6e 75 73 20 |..Linus |
Now there's a space character at the end (0x20). The packed data still has six octets. Try it with a longer name:
$ perl test.pl 'Peppermint Patty' | hexdump -C
00000000 89 00 50 65 70 70 65 72 |..Pepper|
Now it truncates the string to fit the six available spaces.
Consider the case where you immediately send this through a socket or some other way of communicating with something else. The thing on the other side knows it's going to get eight octets. It also knows that the first two will be the short and the next six will be the name. Suppose the other side stored that in $tidy_little_package. It gets the separate values by unpacking them:
my( $id, $name ) = unpack 'S A6', $tidy_little_package;
That's the idea. You can represent many values of different types in a binary format that's completely reversible. You send that packed string wherever it needs to be used.
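For comparison, the same round trip can be sketched with Python's struct module. This is an analogy, not Perl's pack: the format "<H6s" fixes little-endian byte order (Perl's S uses the native order), and struct's 6s pads with NUL bytes, so the space padding of A6 is reproduced by hand. The names pack_record and unpack_record are hypothetical. The last line is the rough counterpart of unpack 'H*': it only renders the packed bytes as hex text for a human to read.

```python
import struct

# Little-endian unsigned 16-bit integer followed by six raw bytes.
RECORD = "<H6s"


def pack_record(ident: int, name: str) -> bytes:
    """Pack an ID and a name into exactly eight bytes."""
    # Imitate Perl's A6: truncate to six bytes and pad with spaces.
    field = name.encode("ascii")[:6].ljust(6, b" ")
    return struct.pack(RECORD, ident, field)


def unpack_record(data: bytes):
    """Recover the ID and name from an eight-byte record."""
    ident, field = struct.unpack(RECORD, data)
    # Perl's A format strips trailing spaces and NULs when unpacking.
    return ident, field.rstrip(b" \x00").decode("ascii")


packed = pack_record(137, "Linus")
print(packed)                 # the raw eight-byte record
print(unpack_record(packed))  # back to the separate values
print(packed.hex())           # hex text view of the same bytes
```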
I have many more examples of pack in Learning Perl and Programming Perl.
I have a problem where I have a column of data (codes) in a .csv file (can change format to .xlsx or anything else if needed) that is not all correct. For example, a cell contains the following:
"E86 F03 R64 03 R 64 86 F U "
And I would like to keep ONLY the entries that are in the format <1 character><2-3 digit integer> and remove everything else. Using the above example, I would like to update the cell to look like the following:
"E86 F03 R64"
My major issue is that I cannot seem to figure out how to search the file for a generic format like <1 character><2-3 digit integer>. I would also be open to suggestions outside of PowerShell such as using an Excel formula. Would anyone be able to assist me with such an issue?
("E86 F03 R64 03 R 64 86 F U ".split() -match '^[a-z]\d{2,3}$') -join ' '
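The one-liner above splits the cell on whitespace, keeps the tokens that match the pattern, and joins them back together. The same steps can be sketched in Python, under the assumption that "1 character" means one ASCII letter; keep_valid_codes is a hypothetical name.

```python
import re

# One letter followed by two or three digits, and nothing else.
TOKEN = re.compile(r"[A-Za-z][0-9]{2,3}")


def keep_valid_codes(cell: str) -> str:
    """Keep only whitespace-separated tokens shaped like E86 or R123."""
    tokens = cell.split()  # split() with no argument also drops empty pieces
    return " ".join(t for t in tokens if TOKEN.fullmatch(t))


print(keep_valid_codes("E86 F03 R64 03 R 64 86 F U "))
```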
For example
Input: 0123BBB123456 Output: 123456
Input: ABC00123 Output: 00123
Input: 123AB0345 Output: 0345
In other words, the code should collect characters starting from the right and stop when a character that is not 0-9 is encountered.
I have to run this against several million records, so I am looking for an efficient set-based approach, not a cursor approach that performs substring functions in a loop for each record.
I am having issues trying to format this for reading. Give me a few minutes.
Frustrating... I think the browser I am using, IE6 (mandated by my company), is making this challenging. This site doesn't work well with IE6.
How about:
;with test(value) as (
    select '0123BBB123456' union
    select 'ABC00123' union
    select '123AB0345' union
    select '123'
)
select
    value,
    right(value, patindex('%[^0-9]%', reverse('?' + value)) - 1)
from test
0123BBB123456 123456
123 123
123AB0345 0345
ABC00123 00123
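To see why the expression works, here is a step-by-step Python sketch of the same logic (an illustration only, not a replacement for the set-based SQL). The '?' is prepended as a sentinel so that a value made entirely of digits still contains one non-digit; after reversing, the position of the first non-digit minus one is the number of trailing digits. The name trailing_digits is hypothetical.

```python
DIGITS = "0123456789"


def trailing_digits(value: str) -> str:
    """Return the run of 0-9 characters at the end of value."""
    # reverse('?' + value): the sentinel ends up last, so a non-digit always exists.
    reversed_value = ("?" + value)[::-1]
    # patindex('%[^0-9]%', ...): 1-based position of the first non-digit.
    position = next(
        i for i, ch in enumerate(reversed_value, start=1) if ch not in DIGITS
    )
    count = position - 1
    # right(value, count): the last `count` characters (empty when count is 0).
    return value[len(value) - count:]


for sample in ["0123BBB123456", "ABC00123", "123AB0345", "123"]:
    print(sample, "->", trailing_digits(sample))
```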