I have a question about matching the first 5 digits in a string in a regular expression - numbers

I would like to limit a string of numbers to 14 digits, and require that the first 5 digits are: 26173. The rest of the digits can be any number between 1-9. Example: 26173000740380.

The regexp 26173\d{9} specifies the first 5 characters must be 26173 and the following 9 characters must be any decimal number \d.
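If the remaining 9 characters must each be a digit from 1 to 9, you could use 26173[1-9]{9} instead. Note that your example 26173000740380 contains zeros, so only the \d version matches it. Both examples use Java regexp syntax.
RegexPlanet is a good site for testing regular expressions: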
https://www.regexplanet.com/advanced/java/index.html
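To make the difference between the two patterns concrete, here is a minimal sketch using Python's re module instead of Java (the patterns are the same). The helper names is_valid_any_digit and is_valid_nonzero are made up for illustration. re.fullmatch is used so the whole string has to match, which is what enforces the 14-digit limit; re.ASCII keeps \d restricted to 0-9.

```python
import re

# Patterns from the answer above. fullmatch() requires the entire string to match,
# so anything longer or shorter than 14 digits is rejected.
ANY_DIGIT = re.compile(r"26173\d{9}", re.ASCII)
NONZERO_DIGIT = re.compile(r"26173[1-9]{9}")


def is_valid_any_digit(s: str) -> bool:
    """True if s is 26173 followed by exactly nine digits 0-9."""
    return ANY_DIGIT.fullmatch(s) is not None


def is_valid_nonzero(s: str) -> bool:
    """True if s is 26173 followed by exactly nine digits 1-9."""
    return NONZERO_DIGIT.fullmatch(s) is not None


print(is_valid_any_digit("26173000740380"))  # True
print(is_valid_nonzero("26173000740380"))    # False, the tail contains zeros
```

In Java, String.matches() also requires the whole string to match, so no explicit ^ and $ anchors are needed there either.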

Related

how to format number separated by comma for every three integer digits in presto

I want to format a number with a comma after every three integer digits, for example 12345.894 --> 12,345.894. I have no clue how to format it. I tried the following, but with no luck:
format('%,.2f', 12345.894)
The above code will round decimal to 2 digits so it returns 12,345.89. In my case, I want to keep the decimal 12,345.894.
You could use regular expression:
SELECT regexp_replace(cast(123456.8943 as VARCHAR), '(\d)(?=(\d{3})+\.)', '$1,')
Results:
-------
123,456.8943
Some explanation:
First we cast to varchar as regex works on string.
The regex says: match any digit \d, but only if it is followed by one or more (+) groups of 3 digits \d{3} that end right before the "." (dot) sign \.. The matched digit is replaced by the same digit $1 with a comma after it.
If you want 3 decimal places, you can use %,.3f as the format string:
presto> select format('%,.3f', 12345.894);
_col0
------------
12,345.894
(1 row)
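If you want to try the lookahead regex from the first answer outside Presto, here is a small sketch in Python. The pattern is the same; only the replacement syntax differs (\1 instead of $1). The function name add_thousands_separators is just an illustrative choice.

```python
import re


def add_thousands_separators(number_text: str) -> str:
    """Insert commas into the integer part of a decimal string.

    A digit gets a comma after it only when it is followed by one or more
    complete groups of three digits that end right before the dot.
    """
    return re.sub(r"(\d)(?=(\d{3})+\.)", r"\1,", number_text)


print(add_thousands_separators("123456.8943"))  # 123,456.8943
print(add_thousands_separators("12345.894"))    # 12,345.894
```

Because the lookahead requires a literal dot, a string without a decimal point is returned unchanged, and the digits after the dot are never touched.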

Inserting hyphens into length limited String using regex

Within a Swift project I have some regex which at present ensures that an input can only be 10 characters long:
"^[\\da-zA-Z]{10,10}$"
I need to tweak this slightly, so that the string which this is working on will have the below format:
#####-####
i.e, inserting a character after the fifth character.
So far I have tried combining what I have with some other regex, however this is incorrect and I can't figure out what I need to do differently to make this work:
"^[\\da-zA-Z]{10,10}$(.{5}),$1-$2"
If you have a string of 10 characters and you want to replace the sixth character with a hyphen, you could use 2 capturing groups.
Capture the first 5 characters in the first group, then match the sixth character which you want to replace and capture the last 4 in the second group.
^([\\da-zA-Z]{5})[\\da-zA-Z]([\\da-zA-Z]{4})$
In the replacement, use $1-$2, which gives 10 characters in total, matching your desired pattern #####-####.
Note that {10,10} can be written as {10}.
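For a quick check of the two-group replacement, here is a sketch in Python rather than Swift. The pattern is the one above with single backslashes, and Python writes the replacement as \1-\2 instead of $1-$2. The name hyphenate is hypothetical.

```python
import re

# Group 1: first five characters; the sixth character is matched but not captured;
# group 2: last four characters.
PATTERN = re.compile(r"^([\da-zA-Z]{5})[\da-zA-Z]([\da-zA-Z]{4})$")


def hyphenate(code: str) -> str:
    """Replace the sixth character of a 10-character alphanumeric string with '-'."""
    return PATTERN.sub(r"\1-\2", code)


print(hyphenate("12345X6789"))  # 12345-6789
```

Input that is not exactly 10 alphanumeric characters does not match and comes back unchanged.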

PATINDEX does not recognize dot and comma

I have a column that should contain phone numbers but it contains whatever the user wanted. I need to create an update to remove all the characters after an invalid character.
To do this I am using a regex as PATINDEX('%[^0-9+-/()" "]%', [MobilNr]) and it seemed to work until I had some numbers as +1235, 36446 and to my surprise the result is 0 instead of 6. Also if the number contains . it returns 0.
Does PATINDEX ignore dot (".") and comma (",")? Are there other characters that PATINDEX will ignore?
It's not that PATINDEX ignores the comma and the dot, it's your pattern that created this problem.
With PATINDEX, the hyphen char (-) has a special meaning - it's in fact an operator that denotes an inclusive range - like 0-9 denotes all digits between 0 and 9 - so when you do +-/ it means all the chars between + and / (inclusive, of course). The comma and dot chars are within this range, that's why you get this result.
Fixing the pattern is easy: move the hyphen to the end of the bracket expression, where it is treated as a literal character rather than a range operator:
SELECT PATINDEX('%[^0-9/()" "+-]%', '+1235, 36446') -- Result: 6
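To see why the comma and dot fall inside the +-/ range, here is a rough illustration in Python. It uses Python's re as a stand-in, since a hyphen inside a character class creates a range there too; it is not T-SQL, and PATINDEX ranges follow the column's collation, so treat it as an analogy under ASCII ordering. The helper first_invalid_position is made up to mimic PATINDEX's 1-based result.

```python
import re

# Every character from '+' to '/' in ASCII order: this is what "+-/" covers.
in_range = [chr(c) for c in range(ord("+"), ord("/") + 1)]
print(in_range)  # ['+', ',', '-', '.', '/']


def first_invalid_position(pattern: str, text: str) -> int:
    """Return the 1-based position of the first match, or 0 if there is none."""
    m = re.search(pattern, text)
    return m.start() + 1 if m else 0


BROKEN = r'[^0-9+-/()" ]'  # "+-/" is a range, so ',' and '.' count as allowed
FIXED = r'[^0-9/()" +-]'   # hyphen last, so it is a literal '-'

print(first_invalid_position(BROKEN, "+1235, 36446"))  # 0
print(first_invalid_position(FIXED, "+1235, 36446"))   # 6
```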

Regex for currency - Exclude commas from the count limit

I am using the following regex in my app:
^(([0-9|(\\,)]{0,10})?)?(\\.[0-9]{0,2})?$
So it allows 10 characters before the decimal point and 2 characters after it.
But I am inserting one additional functionality of formatting textfield as currency while typing. So if I have 1234567 it becomes 1,234,567 after formatting. The regex fails when I enter 10 characters instead of 10 digits. Ideally it should be that regex ignores the commas when counting 10.
I tried this too ^(([0-9|(\\,)]{0,13})?)?(\\.[0-9]{0,2})?$ but it doesn't seem the right approach.
Can anyone help me get a proper regex instead of using this tweak?
You may use
"^(?:,*[0-9]){0,10}(?:\.[0-9]{0,2})?$"
Or, if there must be a digit after . in the fractional part use
"^(?:,*[0-9]){0,10}(?:\.[0-9]{1,2})?$"
The (?:,*[0-9]){0,10} part is what does the job: it matches zero or more , chars followed by a single digit, 0 to 10 times, so commas never count toward the limit of 10. If , can also appear just before the ., add ,* after (?:,*[0-9]){0,10}.
Details
^ - start of string
(?:,*[0-9]){0,10} - 0 to 10 occurrences of 0+ commas followed with a digit
(?:\.[0-9]{0,2})? - an optional sequence of:
  \. - a period
  [0-9]{0,2} - 0 to 2 digits (if there must be a digit after . use [0-9]{1,2})
$ - end of string.
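Here is a small sketch of how the first pattern behaves, written in Python for convenience (the regex itself is unchanged apart from dropping ^ and $, because re.fullmatch already requires the whole string to match). The name is_valid_amount is hypothetical.

```python
import re

# Up to 10 digits, each optionally preceded by commas, then an optional
# fractional part of 0 to 2 digits.
AMOUNT = re.compile(r"(?:,*[0-9]){0,10}(?:\.[0-9]{0,2})?")


def is_valid_amount(text: str) -> bool:
    """True if text has at most 10 integer digits (commas ignored) and at most 2 decimals."""
    return AMOUNT.fullmatch(text) is not None


print(is_valid_amount("1,234,567"))         # True
print(is_valid_amount("1,234,567,890.12"))  # True, exactly 10 digits before the dot
print(is_valid_amount("12,345,678,901"))    # False, 11 digits
```

Note that the pattern also accepts an empty string and does not check that the commas sit at thousands boundaries, which suits validation while the user is still typing.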

Using sed to replace a number located between two other numbers

I need to replace a numeric value, that occurs in a specific line of a series of config files in a pattern like this:
string number_1 number_to_replace number_2
I want to obtain something like this:
string number_1 number_replaced number_2
The difficulties I encountered are:
number_1 or number_2 can be equal to number_to_replace, so a simple replacement is not possible.
number_1 and number_2 vary between config files so I don't know them in advance.
The closest attempt I got until now is:
echo "field 4 4 4" | sed 's/\s4\s/3/'
Which outputs:
field34 4
This is close. Since I want to replace the intermediate number, I added another "\s", trying to use the known fact that the line starts with a character:
echo "field 4 4 4" | sed 's/\s\s4\s/3/'
Which gives:
field 4 4 4
So, nothing is replaced this time. How can I proceed? A somewhat detailed explanation would be ideal, because my knowledge of replacing expressions that involve patterns is nearly zero.
Thanks.
You can do something like below, which matches your exact sequence of digits as in the example. You could replace 3 with any digit of your choice.
sed 's/\([0-9]\{1,\}\)[[:space:]]\([0-9]\{1,\}\)[[:space:]]\([0-9]\{1,\}\)/\1 3 \3/'
Notice that I've used the POSIX bracket expression to match the whitespace character which should be supported in any variant of sed you are using. Note that \s is supported in only the GNU variants.
The regex matches a run of one or more digits followed by a whitespace character, then a second run of digits and a whitespace character, and then a third run of digits. The three runs are captured as groups \1, \2 and \3. Since your intention is to replace the 2nd number, the replacement keeps \1 and \3 and puts the value of your choice between them.
If the extra escapes make it unreadable, use the -E flag for extended regex (ERE) support. I've used the default BRE syntax here.
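If it helps to see the same capture-group idea outside sed, here is a minimal sketch in Python. It is not a replacement for the sed command, just the same three-group logic; the function name replace_middle_number is made up for illustration, and count=1 mirrors sed replacing only the first match on a line.

```python
import re

# Three runs of digits, each pair separated by one whitespace character.
THREE_NUMBERS = re.compile(r"([0-9]+)\s([0-9]+)\s([0-9]+)")


def replace_middle_number(line: str, new_value: str) -> str:
    """Keep the first and third numbers and swap the middle one for new_value."""
    return THREE_NUMBERS.sub(
        lambda m: f"{m.group(1)} {new_value} {m.group(3)}",
        line,
        count=1,
    )


print(replace_middle_number("field 4 4 4", "3"))  # field 4 3 4
```

Because the middle number is identified by its position between the other two rather than by its value, it does not matter whether number_1 or number_2 happen to be equal to it.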