PostgreSQL similar to operator behaviour - postgresql

I find that PostgreSQL's SIMILAR TO operator behaves a little strangely. I accidentally checked for a space in the queries below and was surprised by the result.
select 'Device Reprocessing' similar to '%( )%' -- returns true
select 'Device Reprocessing' similar to '%()%' -- returns true
select 'DeviceReprocessing' similar to '%()%' -- returns true
Why do the 2nd and 3rd queries return true? Does an empty pattern always return true?
My understanding is that the SIMILAR TO operator returns true or false depending on whether its pattern matches the given string.

You have defined a group with nothing in it. An empty group matches the empty string, so the pattern matches anything. You will find that any string matches %()%, even an empty string.
Normally you would use this grouping to list options so:
select 'DeviceReprocessing' similar to '%(Davinci|Dog)%'
Would return false since it contains neither "Davinci" nor "Dog", but this:
select 'DeviceReprocessing' similar to '%(vice|Dog)%'
would return true since it does contain at least one of the options.
Your first condition is true because the string does contain a space.
I actually prefer the Regular Expression notation that does not require the % wildcards:
select 'DeviceReprocessing' ~ 'vice|Dog'
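The empty-group behaviour can be reproduced outside the database. The following is a minimal sketch in Python, assuming its `re` module as a stand-in for PostgreSQL's `~` operator (the two engines differ in details, but an empty group and simple alternation behave the same way in both). The helper name `contains` is hypothetical.

```python
import re


def contains(text: str, pattern: str) -> bool:
    """Unanchored match, roughly analogous to `text ~ pattern` in PostgreSQL."""
    return re.search(pattern, text) is not None


# An empty group matches the empty string, which occurs in every string.
empty_group = [
    contains(s, "()")
    for s in ("Device Reprocessing", "DeviceReprocessing", "")
]

# Alternation inside a group only matches when one of the options is present.
no_option = contains("DeviceReprocessing", "(Davinci|Dog)")
one_option = contains("DeviceReprocessing", "(vice|Dog)")

print(empty_group, no_option, one_option)
```

Because the search is unanchored, no leading or trailing wildcard is needed, which mirrors the difference between `~` and SIMILAR TO described above.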

Related

postgres regexp_matches strange behavior

Following the short docs on regexp_matches:
Return all captured substrings resulting from matching a POSIX regular expression against the string.
Example: regexp_matches('foobarbequebaz', '(bar)(beque)') returns {bar,beque}
With that in mind, I'd expect the result of regexp_matches('barbarbar', '(bar)') to be {bar,bar,bar}
However, only {bar} is returned.
Is this the expected behavior? Am I missing something?
Note:
calling regexp_matches('barbarbar', '(bar)', 'g') does return all 3 bars, but in table form:
regexp_matches text[]
{bar}
{bar}
{bar}
This behavior is described in more detail in section 9.7.3, POSIX Regular Expressions:
The regexp_matches function returns a set of text arrays of captured
substring(s) resulting from matching a POSIX regular expression
pattern to a string. It has the same syntax as regexp_match. This
function returns no rows if there is no match, one row if there is a
match and the g flag is not given, or N rows if there are N matches
and the g flag is given. Each returned row is a text array containing
the whole matched substring or the substrings matching parenthesized
subexpressions of the pattern, just as described above for
regexp_match. regexp_matches accepts all the flags shown in Table
9.24, plus the g flag which commands it to return all matches, not just the first one.
This is expected behavior. The function returns a set of text[] which means that multiple matches are presented in multiple rows. Why is it organized this way? The goal is to make it possible to find more than one token from a single match. In this case, they are presented in the form of an array. The documentation delivers a telling example:
SELECT regexp_matches('foobarbequebazilbarfbonk', '(b[^b]+)(b[^b]+)', 'g');
regexp_matches
----------------
{bar,beque}
{bazil,barf}
(2 rows)
The query returns two matches, each of them containing two tokens found.
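The one-row-per-match, one-array-element-per-group shape has a close analogue in other regex libraries. As a rough sketch, assuming Python's `re` module in place of PostgreSQL's engine, `re.search` plays the role of the call without the g flag and `re.findall` the role of the call with it:

```python
import re

# Without the 'g' flag: only the first match, as one tuple of captured groups.
first = re.search(r"(bar)", "barbarbar").groups()

# With the 'g' flag: one entry per match (a single group gives plain strings).
all_single = re.findall(r"(bar)", "barbarbar")

# Two groups per match: each match becomes a tuple of two captured tokens,
# like each text[] row returned by regexp_matches.
pairs = re.findall(r"(b[^b]+)(b[^b]+)", "foobarbequebazilbarfbonk")

print(first)       # ('bar',)
print(all_single)  # ['bar', 'bar', 'bar']
print(pairs)       # [('bar', 'beque'), ('bazil', 'barf')]
```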

Issue with string comparison contains "_" in Postgres

I have an issue with string comparison in Postgres (version 11) when the pattern contains the _ character.
I have a string (shown below) and want to check whether the word _WIN_ exists in it. If it does, the result should be true.
But when I search with '%_WIN_%', the result is TRUE even though the string doesn't contain exactly this word.
Can anyone please suggest what I'm doing wrong?
select 'New_vit_Vitamin_D_IND-tonline_WINTERSEAS_2020.02.09' ILIKE '%_WIN_%'
Note: the expected result is FALSE, but the query returns TRUE.
In LIKE and ILIKE patterns, the underscore is the wildcard for any single character. If you want to search for the character itself, you need to escape it:
ILIKE '%\_WIN\_%' ESCAPE '\'
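To see why the unescaped pattern matches, it can help to translate the LIKE pattern into a regular expression. This is only an illustrative sketch in Python: the helpers `like_to_regex` and `ilike` are hypothetical names, and the translation covers just `%`, `_` and a single escape character, not every detail of PostgreSQL's ILIKE.

```python
import re


def like_to_regex(pattern: str, escape: str = "\\") -> str:
    """Translate a LIKE pattern into a regex: % -> .*, _ -> ., rest literal."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape and i + 1 < len(pattern):
            # An escaped character is always taken literally.
            out.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            out.append(".*")
        elif ch == "_":
            out.append(".")
        else:
            out.append(re.escape(ch))
        i += 1
    return "".join(out)


def ilike(text: str, pattern: str) -> bool:
    """Case-insensitive whole-string match, in the spirit of ILIKE."""
    regex = like_to_regex(pattern)
    return re.fullmatch(regex, text, re.IGNORECASE | re.DOTALL) is not None


s = "New_vit_Vitamin_D_IND-tonline_WINTERSEAS_2020.02.09"
unescaped = ilike(s, "%_WIN_%")     # '_' matches any character: "_WINT" fits
escaped = ilike(s, r"%\_WIN\_%")    # literal "_WIN_" does not occur
print(unescaped, escaped)
```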

Regex query in DB2-LUW

I need a regex query to match any string containing a given sequence of characters. For example, I tried:
SELECT wt.CHGUSER FROM "CDB"."WTBALL" wt where REGEXP_LIKE (wt.CHGUSER, '^\d*115*$');
I am expecting to fetch all the strings that have 115 somewhere in them. I tried many combinations, but I get either an empty column or unexpected matches.
Are you sure you need a regex? You write "all the strings having 115 somewhere in between each string", but you test for an all-digit string with "115" somewhere...
By the way, this can also be done without a regex:
WHERE LOCATE('115', wt.CHGUSER) > 0
AND TRANSLATE(wt.CHGUSER, '', '0123456789') = '' --if you really want to test for an all-digit string
Why not use the native "LIKE" expression?
where wt.CHGUSER like '%115%'
This will give different results than your regexp, because your expression only matches strings that consist of optional digits, then 11, then zero or more 5s (the * applies only to the final 5). A more generic regexp, which matches your question, would be '.*115.*'
What about:
REGEXP_LIKE (wt.CHGUSER, '^\d*115\d*$');
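The differences between these patterns are easy to check on a few sample values. The sketch below uses Python's `re` module rather than DB2's regex engine, on the assumption that `^`, `$`, `\d`, `*` and `.` behave the same for these simple patterns. The sample strings are made up for illustration. It compares the pattern from the question, the anchored all-digit form `^\d*115\d*$`, and the generic `.*115.*`.

```python
import re

patterns = {
    "original": r"^\d*115*$",      # digits, then "11", then zero or more "5"
    "all_digits": r"^\d*115\d*$",  # all-digit string containing "115"
    "anywhere": r".*115.*",        # any string containing "115"
}

# Hypothetical sample values, not taken from the CHGUSER column.
samples = ["115", "9115", "11", "1155", "91157", "AB115CD"]

results = {
    name: [s for s in samples if re.search(p, s)]
    for name, p in patterns.items()
}

for name, matched in results.items():
    print(name, matched)
```

Note that the original pattern accepts "11" (no 5 at all) and rejects "91157", which is why it returns unexpected rows.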

In DB2 SQL RegEx, how can a conditional replacement be done without CASE WHEN END..?

I have a DB2 v7r3 SQL SELECT statement with three instances of REGEXP_SUBSTR(), all with the same regex pattern string, each of which extract one of three groups.
I'd like to change the first SUBSTR to REGEXP_REPLACE() to do a conditional replacement if there's no match, to insert a default value similarly to the ELSE section of a CASE...END. But I can't make it work. I could easily use a CASE, but it seems more compact & efficient to use RegEx.
For example, I have descriptions of food container sizes, in various states of completeness:
12X125
6X350
1X1500
1500ML
1000
The last two don't have the 'nnX' part at the beginning, in which case '1X' is assumed and needs to be inserted.
This is my current working pattern string:
^(?:(\d{1,3})(?:X))?((?:\d{1,4})(?:\.\d{1,3})?)(L|ML|PK|Z|)$
The groups returned are: quantity, size, and unit.
But only the first group needs the conditional replacement:
(?:(\d{1,3})(?:X))?
This RexEgg webpage describes the (?=...) operator, and it seems to be what I need, but I'm not sure. It's in the list of operators for my version of DB2, but I can't make it work. Frankly, it's a bit deeper than my regex knowledge, and I can't even make it work in my favorite online regex tester, Regex101.
So...does anyone have any idea or suggestions..? Thanks.
Try this (replace "digits not followed by X_or_digit"):
with t(s) as (values
'12X125'
, '6X350'
, '1X1500'
, '1500'
, '1125'
)
select regexp_replace(s, '^([\d]+(?![X\d]))', '1X\1')
from t;
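The negative lookahead can be tried outside DB2 as well. Here is a small sketch using Python's `re.sub` with the same pattern and replacement, assuming the lookahead `(?!...)` and the `\1` backreference behave as they do in DB2's REGEXP_REPLACE. The function name `with_default_quantity` is hypothetical.

```python
import re


def with_default_quantity(size: str) -> str:
    """Prefix '1X' when the leading digits are not followed by 'X' or a digit.

    Because of the lookahead, the leading number only matches when it is the
    whole run of digits and no 'X' comes right after it, i.e. when the
    'nnX' quantity part is missing.
    """
    return re.sub(r"^(\d+(?![X\d]))", r"1X\1", size)


samples = ["12X125", "6X350", "1X1500", "1500ML", "1000"]
normalized = [with_default_quantity(s) for s in samples]
print(normalized)
# ['12X125', '6X350', '1X1500', '1X1500ML', '1X1000']
```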

Tableau Filter Formula

I am trying to filter for workgroup names that contain BL or CL, so I used the formula...
STARTSWITH([wrkgrp_shrt_nm], "BL") or STARTSWITH([wrkgrp_shrt_nm], "CL" )
I get the little green check, but when I hit Apply the result is blank and nothing comes through.
I tried another formula...
if right([wrkgrp_shrt_nm],2) = 'BL' then 1 elseif
right([wrkgrp_shrt_nm],2) = 'CL' then 1 elseif
right([wrkgrp_shrt_nm],2) then 0
end
but I only get an error.
Any suggestions?
If you want "contains", you can just call contains()
contains(wrkgrp_shrt_nm, 'BL') or contains(wrkgrp_shrt_nm, 'CL')
Does the same thing as the find() solution Fred posted, just a little easier to read in this case. I'm not sure why Fred says you cannot use IF. I use IF all the time without problems.
BTW, in case you were wondering, the square brackets around field names are optional if the field name does not include spaces or punctuation, and function names are not case sensitive.
To clarify, you're asking for "contains BL or CL", but your formula specifies STARTSWITH, which will be true only if your field [wrkgrp_shrt_nm] starts with the string "BL" or the string "CL".
If you want "contains", you could use FIND:
FIND([wrkgrp_shrt_nm], 'BL' ) > 0 OR FIND([wrkgrp_shrt_nm], 'CL' ) > 0
You cannot use IF in a condition field, but you can use inline IF (IIF); however, it's not necessary in your case.
Edit:
I may well be wrong about IF (I'm still new to Tableau), but I tried IF in a condition field of a filter (as the OP asked) and I couldn't make it work. I use IF all the time in Calculated Fields, however. I'll try again...