I'm struggling to write a regex that matches the following requirements:
up to 20 characters (English letters and numbers)
may have one optional dash ( - ) but can't start or end with it
I could come up with this pattern: ^[a-zA-Z0-9-]{0,20}$ but it allows multiple dashes, and the dash can appear at the beginning or end of the input string.
You can use
^(?=.{0,20}$)(?:[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)?)?$
See the regex demo.
Details:
^ - start of string
(?=.{0,20}$) - zero to twenty chars allowed in the string
(?: - a non-capturing group start:
[a-zA-Z0-9]+ - one or more alphanumeric chars
(?:-[a-zA-Z0-9]+)? - an optional sequence of a - and one or more alphanumeric chars
)? - end of the non-capturing group, repeat one or zero times (i.e. the pattern match is optional)
$ - end of string.
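To check the pattern above against the stated requirements, here is a minimal Python sketch. The helper name is_valid and the sample inputs are my own, chosen only for illustration; the pattern itself is copied unchanged from the answer.

```python
import re

# Pattern from the answer: the lookahead caps the total length at 20,
# the body allows at most one hyphen, and only between alphanumerics.
PATTERN = re.compile(r"^(?=.{0,20}$)(?:[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)?)?$")


def is_valid(text: str) -> bool:
    """Return True when the whole input satisfies the pattern."""
    return PATTERN.fullmatch(text) is not None


# Hypothetical sample inputs covering the edge cases from the question.
samples = ["abc-123", "abc", "", "-abc", "abc-", "a-b-c", "a" * 20, "a" * 21]
for sample in samples:
    print(repr(sample), is_valid(sample))
```

Note that this pattern accepts the empty string, because the whole body is optional; drop the outer (?:...)? if at least one character is required.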
Have a try with:
^(?:[^\W_]{1,20}|(?!.{22})[^\W_]+-[^\W_]+)$
See an online demo
^ - Start-line anchor;
(?: - Open non-capture group;
[^\W_]{1,20} - Match between 1-20 alphanumeric characters;
| - Or;
(?!.{22})[^\W_]+-[^\W_]+ - Negative lookahead to assert that the position is not followed by 22 characters, then match 1+ alphanumeric characters on either side of a single hyphen;
)$ - Close non-capture group before matching end-line anchor.
Note that the above assumes up to 20 alphanumeric characters plus one optional hyphen, which takes the maximum length to 21 characters.
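A small Python sketch of the length behaviour described in the note, assuming the input is a plain str. The re.ASCII flag is my addition: without it, [^\W_] in Python 3 also matches non-English letters and digits. The helper name matches is hypothetical.

```python
import re

# Pattern from the answer; re.ASCII restricts [^\W_] to [A-Za-z0-9]
# (an assumption on my side, matching "English letters and numbers").
PATTERN = re.compile(r"^(?:[^\W_]{1,20}|(?!.{22})[^\W_]+-[^\W_]+)$", re.ASCII)


def matches(text: str) -> bool:
    """Return True when the whole input satisfies the pattern."""
    return PATTERN.fullmatch(text) is not None


print(matches("a" * 20))                   # 20 chars, no hyphen
print(matches("a" * 21))                   # 21 chars, no hyphen
print(matches("a" * 10 + "-" + "b" * 10))  # 21 chars including the hyphen
print(matches("a" * 11 + "-" + "b" * 10))  # 22 chars including the hyphen
```

Unlike the first answer, this one rejects the empty string, and the hyphen does not count toward the 20-character budget.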
Another idea: use a lookahead at the start and a word boundary at the end.
^(?!.{21})[A-Za-z\d]+-?[A-Za-z\d]*\b$
^(?!.{21}) the lookahead checks at start for max 20 characters
[A-Za-z\d]+ starting with one or more alphanumeric characters
-?[A-Za-z\d]* optional hyphen followed by any number of alphanumeric characters
\b$ the word boundary forces to end with an alphanumeric char
See this demo at regex101
FYI: If \pL (letter) is supported by your regex flavor, it can be used to shorten the pattern: ^(?!.{21})[\pL\d]+-?[\pL\d]*\b$
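Here is a rough Python check of the word-boundary variant (the [A-Za-z\d] form, since Python's built-in re module does not support \pL). The re.ASCII flag and the helper name ends_clean are my assumptions, added so that \d only matches 0-9.

```python
import re

# Lookahead limits the length to 20; the trailing \b rejects a final hyphen,
# because there is no word boundary between "-" and the end of the string.
PATTERN = re.compile(r"^(?!.{21})[A-Za-z\d]+-?[A-Za-z\d]*\b$", re.ASCII)


def ends_clean(text: str) -> bool:
    """Return True when the whole input satisfies the pattern."""
    return PATTERN.fullmatch(text) is not None


# Hypothetical sample inputs.
for sample in ["abc-123", "abc", "abc-", "-abc", "a-b-c", "a" * 21]:
    print(repr(sample), ends_clean(sample))
```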
I would like a single regex that captures words separated by one space character and, in combination, the opposite case: occurrences of more than one space character.
I would like to have the following example covered:
This line with sometimes more than 1 space needs to be captured in 3 matches with 2 groups.
I expect the following groups:
([This line with][ ])([sometimes more than][ ])([1][ ])space needs to be captured in 3 matches with 2 groups.
To capture one of the two is no problem.
i.e.
to capture more than one space-char:
([\s]{2,})
and to capture words separated by only one space-char (see https://stackoverflow.com/a/60288115/3710053):
\S+(?:\s\S+)*
You might use an alternation to match either a word followed by a repeating pattern of a single space and a word OR match 2 or more spaces
\S+(?: \S+)*| {2,}
Explanation
\S+ Match 1+ non whitespace chars
(?: \S+)* Repeat 0+ times matching a space and 1+ non whitespace chars
| Or
{2,} Repeat 2 or more times matching a space
Regex demo
If you want to match whitespace chars instead, you could replace the space with \s but note that it could also possibly match newlines.
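As a quick illustration of the alternation, here is a Python sketch. The sample text is my own reconstruction with explicit runs of spaces, not the asker's exact input.

```python
import re

# Either a run of words joined by single spaces, or 2+ consecutive spaces.
PATTERN = re.compile(r"\S+(?: \S+)*| {2,}")

# Hypothetical input: the multi-space runs are built explicitly so they are visible.
text = "This line with" + " " * 2 + "sometimes more than" + " " * 3 + "1" + " " * 2 + "space needs"

pieces = PATTERN.findall(text)
print(pieces)
# ['This line with', '  ', 'sometimes more than', '   ', '1', '  ', 'space needs']
```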
Edit
For the updated question, you could use 2 capturing groups:
(\S+(?: \S+)*)( {2,})
Explanation
( Capture group 1
\S+ Match 1+ non whitespace chars
(?: \S+)* Repeat 0+ times matching a space and 1+ non whitespace chars
) Close group 1
( Capture group 2
{2,} Match 2 or more spaces
) Close group 2
Regex demo
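The two-group version can be tried the same way. This is only a sketch, and the sample text is again my own, with the space runs built explicitly.

```python
import re

# Group 1: words joined by single spaces; group 2: the run of 2+ spaces after them.
PATTERN = re.compile(r"(\S+(?: \S+)*)( {2,})")

# Hypothetical input modelled on the question's example sentence.
text = "This line with" + " " * 2 + "sometimes more than" + " " * 3 + "1" + " " * 2 + "space needs"

pairs = PATTERN.findall(text)
for words, gap in pairs:
    print(repr(words), len(gap))
# 'This line with' 2
# 'sometimes more than' 3
# '1' 2
```

The trailing "space needs" is not returned, because it is not followed by two or more spaces.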
I have this input:
2019-12-04T21:24:24 or 2019-12-04 21:24:24
I want to match whether "T" or " " is present.
I see two solutions:
match everything with a length between 10 and 11
match only letters and whitespace
I tried these, but nothing matches:
^[a-zA-Z]{10,11}$
^.{10,11}$
I think there's a misunderstanding in your regex: what you've written means "Is the input a sequence of exactly 10 or 11 characters?", which will always be false for a DateTime. You should select the 11th character, then check whether it matches (T|\s) (either the letter T or a whitespace character).
You want
^[0-9]{4}-[0-9]{2}-[0-9]{2}[T ][0-9]{2}:[0-9]{2}:[0-9]{2}$
See the regex demo.
Details:
^ - start of string
[0-9]{4}-[0-9]{2}-[0-9]{2} - four digits, -, two digits, -, two digits
[T ] - T or a space
[0-9]{2}:[0-9]{2}:[0-9]{2} - two digits, :, two digits, :, two digits
$ - end of string.
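A short Python sketch of this pattern against the two inputs from the question. The helper name looks_like_timestamp is hypothetical. Keep in mind the regex only checks the shape, so something like month 13 would still pass.

```python
import re

# Date, then either "T" or a single space, then time.
PATTERN = re.compile(r"^[0-9]{4}-[0-9]{2}-[0-9]{2}[T ][0-9]{2}:[0-9]{2}:[0-9]{2}$")


def looks_like_timestamp(text: str) -> bool:
    """Return True when the input has the yyyy-mm-dd[T ]hh:mm:ss shape."""
    return PATTERN.fullmatch(text) is not None


print(looks_like_timestamp("2019-12-04T21:24:24"))  # True
print(looks_like_timestamp("2019-12-04 21:24:24"))  # True
print(looks_like_timestamp("2019-12-04X21:24:24"))  # False
```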
Just add this to your TextField, for example:
inputFormatters: [FilteringTextInputFormatter.allow(RegExp("[ آ-ی]"))],
To allow spaces, just include a space in the character class, as in the RegExp above.
I have a column that should contain phone numbers but it contains whatever the user wanted. I need to create an update to remove all the characters after an invalid character.
To do this I am using the pattern PATINDEX('%[^0-9+-/()" "]%', [MobilNr]), and it seemed to work until I hit numbers such as +1235, 36446: to my surprise the result is 0 instead of 6. It also returns 0 if the number contains a dot.
Does PATINDEX ignore dot (".") and comma (",")? Are there other characters that PATINDEX will ignore?
It's not that PATINDEX ignores the comma and the dot, it's your pattern that created this problem.
With PATINDEX, the hyphen char (-) has a special meaning - it's in fact an operator that denotes an inclusive range - like 0-9 denotes all digits between 0 and 9 - so when you do +-/ it means all the chars between + and / (inclusive, of course). The comma and dot chars are within this range, that's why you get this result.
Fixing the pattern is easy: move the hyphen to the end of the bracket expression, where it is treated as a literal character rather than a range operator:
SELECT PATINDEX('%[^0-9/()" "+-]%', '+1235, 36446') -- Result: 6
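The same range effect can be reproduced outside SQL Server. The Python sketch below is only an analogy, assuming plain ASCII ordering (SQL Server orders a range according to the collation, which may differ); the helper first_invalid is hypothetical and mimics PATINDEX's 1-based result.

```python
import re

# Characters covered by the range "+-/" under ASCII ordering (assumption).
covered = [chr(code) for code in range(ord("+"), ord("/") + 1)]
print(covered)  # ['+', ',', '-', '.', '/']


def first_invalid(char_class: str, value: str) -> int:
    """Return the 1-based position of the first char matching char_class, or 0."""
    found = re.search(char_class, value)
    return found.start() + 1 if found else 0


# Hyphen between "+" and "/" forms a range, so "," and "." count as allowed.
print(first_invalid(r'[^0-9+-/()" "]', "+1235, 36446"))  # 0
# Hyphen moved to the end is a literal, so the comma is reported.
print(first_invalid(r'[^0-9/()" "+-]', "+1235, 36446"))  # 6
```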
I am using the following regex in my app:
^(([0-9|(\\,)]{0,10})?)?(\\.[0-9]{0,2})?$
So it allows 10 characters before the decimal point and 2 characters after it.
But I am inserting one additional functionality of formatting textfield as currency while typing. So if I have 1234567 it becomes 1,234,567 after formatting. The regex fails when I enter 10 characters instead of 10 digits. Ideally it should be that regex ignores the commas when counting 10.
I tried this too ^(([0-9|(\\,)]{0,13})?)?(\\.[0-9]{0,2})?$ but it doesn't seem the right approach.
Can anyone help me get a proper regex instead of using this tweak?
You may use
"^(?:,*[0-9]){0,10}(?:\.[0-9]{0,2})?$"
Or, if there must be a digit after . in the fractional part use
"^(?:,*[0-9]){0,10}(?:\.[0-9]{1,2})?$"
See the regex demo. The (?:,*[0-9]){0,10} part is what does the job: it matches any 0+ , chars followed with a single digit, 0 to 10 times. If , can also appear before ., add ,* after (?:,*[0-9]){0,10}.
Details
^ - start of string
(?:,*[0-9]){0,10} - 0 to 10 occurrences of 0+ commas followed with a digit
(?:\.[0-9]{0,2})? - an optional sequence of:
\. - a period
[0-9]{0,2} - 0 to 2 digits (if there must be a digit after . use [0-9]{1,2})
$ - end of string.
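To see how the commas are skipped when counting digits, here is a minimal Python sketch of both variants. The names LENIENT and STRICT and the sample values are mine, for illustration only.

```python
import re

# Up to 10 digits, each optionally preceded by commas, then an optional fraction.
LENIENT = re.compile(r"^(?:,*[0-9]){0,10}(?:\.[0-9]{0,2})?$")
# Same, but a "." must be followed by at least one digit.
STRICT = re.compile(r"^(?:,*[0-9]){0,10}(?:\.[0-9]{1,2})?$")

# Hypothetical inputs as they might look after currency formatting.
samples = ["1,234,567", "1,234,567,890", "12,345,678,901", "1,234.56", "1,234.567", "1,234."]
for sample in samples:
    lenient_ok = LENIENT.fullmatch(sample) is not None
    strict_ok = STRICT.fullmatch(sample) is not None
    print(f"{sample!r}: lenient={lenient_ok} strict={strict_ok}")
```

Only the last sample differs between the two: "1,234." passes the lenient pattern and fails the strict one.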
I have tried to create regex for the below:
STRING sou_u02_mlpv0747_CCF_ASB001_LU_FW_ALERT|/opt/app/medvhs/mvs/applications/cm_vm5/fwhome/UnifiedLogging|UL_\d{8}_CCF_ASB001_LU_sou_u02_mlpv0747_Primary.log.csv|FATAL|red|1h||fw_alert
REGEX----> /^[^#]\w+\|[^\|]+\|\w+\|\w+\|\w*\|\w*\|([^\|]+|)\|\w*$/
I am unable to figure out the mistake here.
I created the above by referring to another regex, given below, which works fine:
/^[^#]\w+\|[^\|]+\|([^\|]+|)\|[rm]\|(in|out|old|new|arch|missing)\|\w+\|([0-9-,]+|)\|\w*\|\w*$/
sou_u02_mlpv0747_CCF_ASB001_LU_ODR|/opt/app/medvhs/mvs/applications/cm_vm5/components/CCF_ASB001_LU/SPOOL/ODR||r|out|30m|0400-1959|30m|gprs_in_stag
Can someone please help me? Any leads would be highly appreciated.
Let's start from a brief look at your source text (the first that you included).
It is composed of "sections" separated with | char.
This char (|) must be matched by \|. Remember about the preceding
backslash, otherwise, a "bare" | would mean the alternative separator
(you used it in one place).
And now take a look at each section (between |):
Some of them contain only a sequence of word chars (and can be matched
by \w+).
Other sections, however, contain also other chars, e.g. slashes,
backslash, braces and dots, so each such section is actually a sequence
of chars other than "|" and must be matched by [^|]+ (here,
between [ and ], the vertical bar may be unescaped).
Now let's write each section and its "type":
1. sou_u02_..._FW_ALERT - word chars.
2. /opt/app/.../UnifiedLogging - other chars (because of slashes).
3. UL_\d{8}_..._Primary.log.csv - other chars (because of \d{8} and dots).
4. FATAL|red|1h - 3 sections composed of word chars.
5. An empty section, between 2 consecutive | chars.
6. fw_alert - word chars.
And now, how to match these groups, and the separating |:
Point 1: \w+\| - word chars and (escaped) vertical bar.
Points 2 and 3 (together): (?:[^|]+\|){2} - a non-capturing
group - (?:...), containing a sequence of "other" chars - [^|]+
and a vertical bar - \|, occurring 2 times {2}.
Point 4 (three "word char" groups): (?:\w+\|){3} - similar to
the previous point.
Point 5: Just as in your solution - ([^|]+|)\|, a capturing group -
(...), with 2 alternatives ...|.... The first alternative is
[^|]+ (a sequence of "other" chars), and the second alternative
is empty. After the capturing group there is \| to match the vertical
bar.
Point 6: \w+ - word chars. This time no \|, as this is the last
section.
The regex assembled so far must be:
prepended with a ^ (start of string) and
appended with a $ (end of string).
So the whole regex, matching your source text can be:
^\w+\|(?:[^|]+\|){2}(?:\w+\|){3}([^|]+|)\|\w+$
Actually, the only capturing group can be written another way,
as ([^|]*) - without alternatives, but with * as the
repetition count, allowing also empty content.
Your choice, which variant to apply.
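A quick Python check of the assembled regex against the source text from the question. This is just a sketch; the variable names are mine, and both variants of the capturing group are tried.

```python
import re

# Source line copied from the question (raw string, so \d stays literal text).
line = (
    r"sou_u02_mlpv0747_CCF_ASB001_LU_FW_ALERT"
    r"|/opt/app/medvhs/mvs/applications/cm_vm5/fwhome/UnifiedLogging"
    r"|UL_\d{8}_CCF_ASB001_LU_sou_u02_mlpv0747_Primary.log.csv"
    r"|FATAL|red|1h||fw_alert"
)

# Variant with an empty alternative, and the equivalent variant using *.
with_alternative = re.compile(r"^\w+\|(?:[^|]+\|){2}(?:\w+\|){3}([^|]+|)\|\w+$")
with_star = re.compile(r"^\w+\|(?:[^|]+\|){2}(?:\w+\|){3}([^|]*)\|\w+$")

for pattern in (with_alternative, with_star):
    found = pattern.match(line)
    # Group 1 is the seventh field, which is empty in this line.
    print(found is not None, repr(found.group(1)) if found else None)
```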
The third field
UL_\d{8}_CCF_ASB001_LU_sou_u02_mlpv0747_Primary.log.csv
contains a backslash (\), braces ({ }) and dots (.). None of these can be matched by \w.
Note also that there is no need to escape a pipe | inside a character class: [^|]+ is fine.
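This is easy to confirm in Python. The sketch below tests only the third field from the question; the variable names are mine.

```python
import re

# Third field from the question (raw string keeps the backslash literal).
third_field = r"UL_\d{8}_CCF_ASB001_LU_sou_u02_mlpv0747_Primary.log.csv"

# \w+ cannot cover the whole field, a negated class can.
print(re.fullmatch(r"\w+", third_field) is not None)    # False
print(re.fullmatch(r"[^|]+", third_field) is not None)  # True

# The characters that \w rejects in this field.
non_word = sorted(set(re.findall(r"\W", third_field)))
print(non_word)  # ['.', '\\', '{', '}']

# Escaping the pipe inside the class is allowed but changes nothing.
print(re.fullmatch(r"[^\|]+", third_field) is not None)  # True
```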