I have the following behavior of a desktop version of a software:
a text field is validated with a regex abc|xyz
this means that a user is allowed to type a, ab, x, xy, abc, xyz, other symbols will not be shown in a text field
I have to port such behavior to iphone.
Strings a, ab, x, xy does not match to regex abc|xyz that's why matching user input will restrict typing any char to a text filed. Is there any way to match a user input as a beginning of a regex string? Initial regex should not be modified (^$ could be added).
The desktop version uses QRegExpValidator class from Qt. QRegExpValidator has Intermediate state that is a, ab, x, xy in my case. It uses matchedLength() to determine Intermediate state
Also I had a look through NSRegularExpression and RegexKitLite
Have a look at PCRE - Perl Compatible Regular Expressions. It is rumored to be used in Apple products.
Related
I'm deobfuscating some code and I forget the operator to use a wildcard while searching for text in VSCode. By this I mean in VSCode whenever you search for code (CMD/CTRL + F), what is the character for a wild card (i.e searching for "date{WILDCARD HERE}" would return "date1","date2","date", etc.)
I don't recall a wildcard option (I've never used it at least). But the search feature supports using regular expressions.
Given your examples of date1, date2, date, etc. assuming it followed a pattern of date<n> where n is a number (or nothing in the case of just "date"), the regular expression of date[1-9]* should achieve what you want.
You can test the expression out on this site. Input the regular expression and some sample data and see how it matches.
I'm having trouble finding the proper Word wildcard string to find numbers that fit the following patterns:
"NN NN NN" or "NN NN NN.NN" (where N is any number 0-9)
The trouble is the first string is a subset of the second string. My goal is to find a single wildcard string that will capture both. Unfortunately, I need to use an operator that is zero or more occurrences for the ".NN" portion and that doesn't exist.
I'm having to do two searches, and I'm using the following patterns:
[0-9]{2}[^s ][0-9]{2}[^s ][0-9]{2}?[!0-9]
[0-9]{2}[^s ][0-9]{2}[^s ][0-9]{2}.[0-9]{2}
The problem is that first pattern (in bold). It works well unless I have the number in a table or something and there is nothing after it to match (or not match, if you will) the [!0-9].
You could use a single wildcard Find:
[0-9]{2}[^s ][0-9]{2}[^s ][0-9][0-9.]{1,4}
or:
[0-9]{2}[^s ][0-9]{2}[^s ][0-9][0-9.]{1;4}
to capture both. Which you use depends on your regional settings.
I have the user input "What is the hostname of serial GX0211229342?". The serial can be a numeric or alphanumeric mix (e.g. 7842344 or H52WBD1 etc).
How can I extract GX0211229342 from the sentence and set it into context in Watson assistant (Watson Conversation)?
Your case is tricky because if the ID is only letters it could be any part of the sentence. Using the $, you have told the regex processor to look for the pattern at the end of the sentence. Hence, it only works for those cases.
What you could do is to make use of a non-capturing group provided by the RE2 syntax. There are some examples of non-capturing group here on SO. Basically, search for something like the following (not tested):
(?:serial)(?:number)?[0-9a-zA-Z]+
The first ("serial") would be detected and ignored, the "number" is optional and would be ignored, then comes the alphanumeric.
If the serial number can be defined by 1 or 2, any number of regular expressions then you have the option of creating a serial number entity based on those regular expressions.
The conversation service will be able to identify the serial numbers based on the entity pattern matching.
I figure it out, using Watson entity pattern, and the regular expression should be this: ([0-9]+[a-zA-Z]+|[a-zA-Z]+[0-9]+)[0-9a-zA-Z]*
it will be used to extract alphanumeric from input.
and one more pattern is [0-9]+ it was used to extract numbers.
Thank you all help.
Background
I have search indexes containing Greek characters. Many people don't know how to type Greek so they enter something called "beta-code". Beta-code can be converted into Greek. For example, beta-code "NO/MOU" would be converted to "νόμου". Characters such as a slash or parenthesis is used to indicate an accent.
Desired Behavior
I want users to be able to search using either beta-code or text in the Greek script. I figured out that the Whoosh Variations class provides the mechanism I need and it almost solves my problem.
Problem
The Variation class works well except for when a slash or a parenthesis are used to indicate an accent in a users' query. The problem is the query are parsed such that the the special characters used to denote the accent result in the words being split up. For example, a search for "NO/MOU" results in the Variations class being asked to find variations of "no" and "mou" instead of "NO/MOU".
Question
Is there a way to influence how the query is parsed such that slashes and parentheses are included in the search words (i.e. that a search for "NO/MOU" results in a search for a token of ""NO/MOU" instead of "no" and "mou")?
The search parser uses a Tokenizer class for breaking up the search string into individual terms. Whoosh will use the class that is associated with the schema. For example, the case below, the SimpleAnalyzer() will be used when searching the "content" field.
Schema( verse_id = NUMERIC(unique=True, stored=True),
content = TEXT(analyzer=SimpleAnalyzer()) )
By default, the SimpleAnalyzer() uses the following regular expression to tokenize search terms: "\w+(.?\w+)*"
To use a different regular expression, assign the first argument to the SimpleAnalyzer to another regular expression. For example, to include beta-code characters (slashes, parentheses, etc.) in tokens, use the following SimpleAnalyzer:
SimpleAnalyzer( rcompile(r"[\w/*()=\+|&']+(\.?[\w/*()=\+|&']+)*") )
Searches will now allow terms to include the special beta-code characters and the Variations class will be able to convert the term to the unicode version.
Can actions in Lex access individual regex groups?
(NOTE: I'm guessing not, since the group characters - parentheses - are according to the documentation used to change precedence. But if so, do you recommend an alternative C/C++ scanner generator that can do this? I'm not really hot on writing my own lexical analyzer.)
Example:
Let's say I have this input: foo [tagName attribute="value"] bar and I want to extract the tag using Lex/Flex. I could certainly write this rule:
\[[a-z]+[[:space:]]+[a-z]+=\"[a-z]+\"\] printf("matched %s", yytext);
But let's say I would want to access certain parts of the string, e.g. the attribute but without having to parse yytext again (as the string has already been scanned it doesn't really make sense to scan part of it again). So something like this would be preferable (regex groups):
\[[a-z]+[[:space:]]+[a-z]+=\"([a-z]+)\"\] printf("matched attribute %s", $1);
You can separate it to start conditions. Something like this:
%x VALUEPARSE ENDSTATE
%%
char string_buf[100];
<INITIAL>\[[a-z]+[[:space:]]+[a-z]+=\" {BEGIN(VALUEPARSE);}
<VALUEPARSE>([a-z]+) (strncpy(string_buf, yytext, yyleng);BEGIN(ENDSTATE);} //getting value text
<ENDSTATE>\"\] {BEGIN(INITIAL);}
%%
About an alternative C/C++ scanner generator - I use QT class QRegularExpression for same things, it can very easy get regex group after match.
Certainly at least some forms of them do.
But the default lex/flex downloadable from sourceforge.org do not seem to list it in their documentation, and this example leaves the full string in yytext.
From IBM's LEX documentation for AIX:
(Expression)
Matches the expression in the parentheses.
The () (parentheses) operator is used for grouping and causes the expression within parentheses to be read into the yytext array. A group in parentheses can be used in place of any single character in any other pattern.
Example: (ab|cd+)?(ef)* matches such strings as abefef, efefef, cdef, or cddd; but not abc, abcd, or abcdef.