Check whether variable conforms to string format - tsql

I need a way to check whether the string contents of a variable conform to a certain format. An example of the format I need is 52M-14Jun04-1, i.e. 11A-11Aaa11-1.
Occasionally there are also strings that contain an asterisk in place of the first letter, i.e. 11*11Aaa11-1.
Many thanks,
Jens

Here's what I came up with using LIKE:
DECLARE @Input varchar(20) = '52M-14Jun04-1'
DECLARE @Result varchar(20)
SELECT @Result =
(CASE WHEN
@Input LIKE '[0-9][0-9][A-Z*]-[0-9][0-9][A-Z][a-z][a-z][0-9][0-9]-[0-9]'
THEN
'Matches'
ELSE
'Does not match'
END)
An explanation of the pattern:
[0-9] Any digit between 0 and 9
[A-Z*] Any character A through Z (uppercase) or *
- A hyphen
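[a-z] Any character a through z (lowercase)
Note that LIKE follows the collation in effect, so under a case-insensitive collation [A-Z] and [a-z] match either case. If case must be enforced, apply a binary collation to the comparison (for example, COLLATE Latin1_General_BIN).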
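To try the pattern outside the database, here is a minimal Python sketch of the same check. It assumes a case-sensitive match, and the names FORMAT and conforms are made up for illustration; it is not a replacement for the LIKE expression above.

```python
import re

# Case-sensitive counterpart of the LIKE pattern:
# two digits, a letter or *, hyphen, two digits, Aaa, two digits, hyphen, one digit.
FORMAT = re.compile(r"[0-9]{2}[A-Z*]-[0-9]{2}[A-Z][a-z]{2}[0-9]{2}-[0-9]")

def conforms(value: str) -> bool:
    """Return True only when the whole string matches the format."""
    return FORMAT.fullmatch(value) is not None

for sample in ["52M-14Jun04-1", "52*-14Jun04-1", "52M-14jun04-1", "52M-14Jun04-12"]:
    print(sample, conforms(sample))
```

Like LIKE without wildcards, fullmatch requires the entire string to fit, so a trailing extra digit is rejected.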

Related

How to get last part of nvarchar with variable size in T-SQL?

Imagine that I have the following value in my nvarchar variable:
DECLARE @txt nvarchar(255)
SET @txt = '32|foo|foo2|123'
Is there a way to easily get the last part just after the last | that is 123 in this case ?
I could write a split function but I'm not interested in the first parts of this string. Is there another way to get that last part of the string without getting the first parts ?
Note that all parts of my string have variable sizes.
You can use a combination of LEFT, REVERSE and CHARINDEX for this.
The query below reverses the string, finds the first occurrence of |, keeps only the characters before it, and then reverses the result back.
DECLARE @txt nvarchar(255)
SET @txt = '32|foo|foo2|123'
SELECT REVERSE(LEFT(REVERSE(@txt),CHARINDEX('|',REVERSE(@txt))-1))
Output
123
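The reverse/find/cut/reverse steps can be traced one at a time in a short Python sketch. The function name last_part is hypothetical, and the comments map each line to the T-SQL call it imitates.

```python
def last_part(txt: str, sep: str = "|") -> str:
    """Mirror REVERSE(LEFT(REVERSE(txt), CHARINDEX(sep, REVERSE(txt)) - 1))."""
    reversed_txt = txt[::-1]            # REVERSE(@txt)
    pos = reversed_txt.find(sep) + 1    # CHARINDEX is 1-based; 0 means not found
    if pos == 0:
        raise ValueError("separator not found")
    head = reversed_txt[:pos - 1]       # LEFT(..., pos - 1)
    return head[::-1]                   # REVERSE back to the original order

print(last_part("32|foo|foo2|123"))
```

The explicit check for a missing separator is an addition in this sketch; the T-SQL expression as written assumes at least one | is present.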
Edit
If your string has four parts or fewer and . isn't a valid character in it, you can also use PARSENAME for this.
DECLARE @txt nvarchar(255)
SET @txt = '32|foo|foo2|123'
SELECT PARSENAME(REPLACE(@txt,'|','.'),1)
You can reverse your string to get the desired result:
DECLARE @txt nvarchar(255) = '32|foo|foo2|123'
SELECT REVERSE(SUBSTRING(REVERSE(@txt), 1, CHARINDEX('|', REVERSE(@txt)) -1))

remove non-numeric characters in a column (character varying), postgresql (9.3.5)

I need to remove non-numeric characters in a column (character varying) and keep numeric values in postgresql 9.3.5.
Examples:
1) "ggg" => ""
2) "3,0 kg" => "3,0"
3) "15 kg." => "15"
4) ...
There are a few problems; some values look like:
1) "2x3,25"
2) "96+109"
3) ...
These need to remain as is (i.e. when non-numeric characters appear between numeric characters, do nothing).
Using regexp_replace is simpler:
# select regexp_replace('test1234test45abc', '[^0-9]+', '', 'g');
regexp_replace
----------------
123445
(1 row)
The ^ means not, so any character that is not in the range 0-9 will be replaced with an empty string, ''.
The 'g' is a flag that means all matches will be replaced, not just the first match.
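The same negated character class can be tried in Python, where re.sub replaces every match by default (the counterpart of the 'g' flag). This is only a sketch with a made-up function name, digits_only.

```python
import re

def digits_only(text: str) -> str:
    """Drop every run of characters that are not ASCII digits."""
    return re.sub(r"[^0-9]+", "", text)

print(digits_only("test1234test45abc"))
print(digits_only("2x3,25"))
```

Note that this also strips separators inside values such as "2x3,25", so on its own it does not leave those values untouched as the question asks.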
For modifying strings in PostgreSQL, take a look at the String Functions and Operators section of the documentation. The function substring(string from pattern) uses POSIX regular expressions for pattern matching and works well for removing unwanted characters from your string.
(Note that the VALUES clause inside the parentheses is just there to provide the example data; you can replace it with any SELECT statement or table that provides the data):
SELECT substring(column1 from '(([0-9]+.*)*[0-9]+)'), column1 FROM
(VALUES
('ggg'),
('3,0 kg'),
('15 kg.'),
('2x3,25'),
('96+109')
) strings
The regular expression explained in parts:
[0-9]+ - the string has at least one digit, example: '789'
[0-9]+.* - at least one digit followed by anything, example: '12smth'
([0-9]+.*)* - the pattern from the previous line, zero or more times, example: '12smth22smth'
(([0-9]+.*)*[0-9]+) - the pattern from the previous line followed by at least one digit at the end, example: '12smth22smth345'
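To see what the pattern keeps for each sample value, here is a Python sketch that applies the same expression with re.search. Python's regex engine is not PostgreSQL's, so treat this as an approximation that happens to agree on these inputs; the function name keep_numeric_span is hypothetical.

```python
import re

PATTERN = re.compile(r"(([0-9]+.*)*[0-9]+)")

def keep_numeric_span(value: str):
    """Return the span from the first digit to the last digit, or None if there is no digit."""
    match = PATTERN.search(value)
    return match.group(1) if match else None

for sample in ["ggg", "3,0 kg", "15 kg.", "2x3,25", "96+109"]:
    print(repr(sample), "->", repr(keep_numeric_span(sample)))
```

A value with no digits yields None here, which corresponds to substring returning NULL rather than the empty string shown in the question.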

how to get a substring from right with T-sql

Suppose I have a string like:
abc.efg.hijk.lmnop.leaf
I want the substring: abc.efg.hijk.lmnop.
That is: find the first period (.) from the right, then take the substring from the left up to that period.
How can I return this substring with a single expression using T-SQL string functions?
First you'll need to reverse the string and find the character index of the first period, then subtract this number from the length of the entire string. This value is then used as the length parameter of the SUBSTRING function.
Try this:
DECLARE @S VARCHAR(55) = 'abc.efg.hijk.lmnop.leaf'
SELECT SUBSTRING(@S, 1, LEN(@S) - CHARINDEX('.', REVERSE(@S)))
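The arithmetic is easy to get off by one, so here is a small Python sketch that follows the same steps. The function name and the keep_period flag are assumptions added for illustration and are not part of the T-SQL above.

```python
def up_to_last_period(s: str, keep_period: bool = False) -> str:
    """Mirror SUBSTRING(s, 1, LEN(s) - CHARINDEX('.', REVERSE(s)))."""
    pos = s[::-1].find(".") + 1      # 1-based position in the reversed string, 0 if absent
    length = len(s) - pos            # length argument passed to SUBSTRING
    if keep_period and pos > 0:
        length += 1                  # include the period itself
    return s[:length]

print(up_to_last_period("abc.efg.hijk.lmnop.leaf"))
print(up_to_last_period("abc.efg.hijk.lmnop.leaf", keep_period=True))
```

As written, the expression drops the final period. If the trailing period is wanted in the result, adding 1 to the length (the keep_period branch here) keeps it.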

PostgreSQL convert a string with commas into an integer

I want to convert a column of type "character varying" that has integers with commas to a regular integer column.
I want to support numbers from '1' to '10,000,000'.
I've tried to use: to_number(fieldname, '999G999G999'), but it only works if the format matches the exact length of the string.
Is there a way to do this that supports from '1' to '10,000,000'?
select replace(fieldname,',','')::numeric ;
To do it the way you originally attempted, which is not advised:
select to_number( fieldname,
regexp_replace( replace(fieldname,',','G') , '[0-9]' ,'9','g')
);
The inner replace changes commas to G. The outer regexp_replace changes digits to 9. This does not account for decimal or negative numbers.
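To make the two replacement steps concrete, this Python sketch builds the format mask that would be passed to to_number for a given input. The function name build_mask is made up, and the sketch only reproduces the string manipulation, not to_number itself.

```python
import re

def build_mask(value: str) -> str:
    """Turn '10,000,000' into the matching to_number format mask."""
    with_groups = value.replace(",", "G")       # inner replace: commas become G
    return re.sub(r"[0-9]", "9", with_groups)   # outer replace: every digit becomes 9

for sample in ["1", "1,234", "10,000,000"]:
    print(sample, "->", build_mask(sample))
```

Because the mask is derived from each value, its length always matches the input, which is what the fixed '999G999G999' mask in the question could not do.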
You can just strip out the commas with the REPLACE() function:
CREATE TABLE Foo
(
Test NUMERIC
);
insert into Foo VALUES (REPLACE('1,234,567', ',', '')::numeric);
select * from Foo; -- Will show 1234567
You can replace the commas by an empty string as suggested, or you could use to_number with the FM prefix, so the query would look like this:
SELECT to_number(my_column, 'FM99G999G999')
One thing to note:
When using REPLACE("fieldName", ',', '') on a table that a VIEW depends on, the function will not work properly. You must drop the view before using it.

Test for numeric value?

The vendor data we load in our staging table is rather dirty. One column in particular captures number data but 40% of the time has garbage characters or random strings.
I have to create a report that filters out value ranges in that column. So, I tried playing with a combination of replace/translate like so
select replace(translate(upper(str),' ','all possible char'),' ','')
from table
but it fails whenever it encounters a char I did not code. Therefore, the report can never be automated.
JavaScript has the isNaN() function to determine whether a value is an illegal number (true if it is, false if not).
How can I do the same thing with DB2? Do you have any idea?
Thanks in advance.
A fairly reliable (but somewhat hackish) way is to compare the string to its upper- and lower-case self (numbers don't have different cases). As long as any letters in your data are Latin characters, you should be fine:
SELECT input, CASE
WHEN UPPER(input) = LOWER(input) THEN TO_NUMBER(input)
ELSE 0
END AS output
FROM source
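The case-comparison idea is easy to probe in Python. This sketch (the function name has_no_letters is an assumption) shows what the test accepts and where it is too permissive.

```python
def has_no_letters(value: str) -> bool:
    """True when upper- and lower-casing give the same string, i.e. no cased letters."""
    return value.upper() == value.lower()

for sample in ["12.5", "x2", "--", "3,0 %"]:
    print(repr(sample), has_no_letters(sample))
```

The check only rules out letters. Strings made of punctuation such as "--" pass it as well, so the numeric conversion that follows still has to cope with them.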
Another option would be to use the TRANSLATE function:
SELECT input,
CASE
WHEN TRANSLATE(CAST(input as CHAR(10)), '~~~~~~~~~~~~~', '0123456789-. ') = '~~~~~~~~~~' THEN CAST(input AS DECIMAL(12, 2))
ELSE 0
END AS num
FROM x
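The TRANSLATE check amounts to a whitelist of characters. Here is a Python sketch of the same idea, assuming that CAST(... AS CHAR(10)) pads short values with spaces; the names ALLOWED and looks_numeric are hypothetical, and longer inputs are simply cut to the width in this sketch.

```python
ALLOWED = "0123456789-. "                            # characters mapped to ~
TABLE = str.maketrans(ALLOWED, "~" * len(ALLOWED))

def looks_numeric(value: str, width: int = 10) -> bool:
    """True when every character of the space-padded value is in the whitelist."""
    padded = value.ljust(width)[:width]              # imitate CHAR(10) padding
    return padded.translate(TABLE) == "~" * width

for sample in ["-4.0", "x2", "2.2.", ""]:
    print(repr(sample), looks_numeric(sample))
```

Since only the character set is checked, malformed values such as "2.2." and the empty string also pass, so this is a coarse filter rather than a full validation.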
WITH x (stringval) AS
(
VALUES ('x2'),(''),('2.2.'),('5-'),('-5-'),('--5'),('.5'),('2 2'),('0.5-'),(' 1 '),('2 '),('3.'),('-4.0')
)
SELECT stringval,
CASE WHEN (
-- Whitespace must not appear in the middle of a number
-- (but trailing and/or leading whitespace is permitted)
RTRIM(LTRIM( stringval )) NOT LIKE '% %'
-- A number cannot start with a decimal point
AND LTRIM( stringval ) NOT LIKE '.%'
-- A negative decimal number must contain at least one digit between
-- the negative sign and the decimal point
AND LTRIM( stringval ) NOT LIKE '-.%'
-- The negative sign may only appear at the beginning of the number
AND LOCATE( '-', LTRIM(stringval)) IN ( 0, 1 )
-- A number must contain at least one digit
AND TRANSLATE( stringval, '000000000', '123456789') LIKE '%0%'
-- Allow up to one negative sign, followed by up to one decimal point
AND REPLACE(
TRANSLATE( RTRIM(LTRIM(stringval)), '000000000', '123456789'),
'0', '') IN ('','-','.','-.')
)
THEN 'VALID'
ELSE 'INVALID'
END AS stringisvalidnumber
FROM x
;
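The rules in that query can be restated as a Python sketch and run against the same sample values. The function name is_valid_number is made up, and each condition is commented with the SQL rule it is meant to mirror.

```python
DIGITS = "0123456789"

def is_valid_number(s: str) -> bool:
    trimmed = s.strip(" ")
    left_trimmed = s.lstrip(" ")
    # What remains once digits are removed: at most one '-' and one '.'
    leftover = "".join(ch for ch in trimmed if ch not in DIGITS)
    return (
        " " not in trimmed                         # no whitespace inside the number
        and not left_trimmed.startswith(".")       # cannot start with a decimal point
        and not left_trimmed.startswith("-.")      # need a digit between '-' and '.'
        and left_trimmed.find("-") in (-1, 0)      # '-' only at the beginning
        and any(ch in DIGITS for ch in s)          # at least one digit
        and leftover in ("", "-", ".", "-.")
    )

samples = ["x2", "", "2.2.", "5-", "-5-", "--5", ".5", "2 2",
           "0.5-", " 1 ", "2 ", "3.", "-4.0"]
for sample in samples:
    print(repr(sample), "VALID" if is_valid_number(sample) else "INVALID")
```

With these rules only ' 1 ', '2 ', '3.' and '-4.0' from the sample list are accepted.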
Check this out:
SELECT Mobile,
TRANSLATE(Mobile, '~~~~~~~~~~', '0123456789') AS FirstPass,
TRANSLATE(TRANSLATE(Mobile, '~~~~~~~~~~', '0123456789'), '', '~') AS Erroneous,
REPLACE(TRANSLATE(Mobile, '', TRANSLATE(TRANSLATE(Mobile, '~~~~~~~~~~', '0123456789'), '', '~')), ' ', '') AS Corrected
FROM Person WHERE Mobile <> '' FETCH FIRST 100 ROWS ONLY
The table is "Person" and the field to check is "Mobile".
With a little more work, you can build an UPDATE statement from this to fix the entire table.
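The three derived columns can be followed step by step in Python. This sketch assumes that TRANSLATE with an empty replacement string blanks out the listed characters, which the final REPLACE then removes; the function name correct_mobile is hypothetical.

```python
DIGITS = "0123456789"

def correct_mobile(mobile: str) -> str:
    # FirstPass: every digit becomes '~'
    first_pass = "".join("~" if ch in DIGITS else ch for ch in mobile)
    # Erroneous: whatever is left once the '~' markers are removed
    erroneous = first_pass.replace("~", "")
    # Corrected: drop the erroneous characters and any spaces
    return "".join(ch for ch in mobile if ch not in erroneous and ch != " ")

print(correct_mobile("+44 (0)20-1234"))
```

One caveat carried over from the approach: a literal ~ in the input is treated as if it were a digit marker and is not removed.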