Regex pattern to find force unwrapped variables - swift

I want to find all the force unwrapped variables in my Xcode project. For example anything that's similar to:
variableName!.property
or
variableName!,
or
variableName! : otherVariable
or
variableName!)
Or any other similar occurrences of force unwrapped variables. What would be a regex pattern for that that I can use in the Xcode search?

This one matches only valid variable names (alphanumeric strings starting with a letter) followed by an ! that is in turn followed by a space, tab, newline, period, comma, colon, or closing parenthesis. It also excludes try! and as!.
(\b(?!(?:try|as)!)[A-Za-z][A-Za-z0-9]*![.,:)\n\t\r ])
This next pattern will match try! and as! if you are interested in finding those as well.
([A-Za-z][A-Za-z0-9]*![.,:)\n\t\r ])
Note that both patterns will also match types declared as implicitly unwrapped optionals (commonly seen on @IBOutlet properties).
A really good resource for writing and testing regular expressions is regexr.com
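To sanity-check a pattern like this outside Xcode, here is a minimal sketch in Python. It assumes a variant of the first pattern written with a word boundary and a negative lookahead, and it uses Python's re engine rather than the one behind Xcode's search, so treat it as an approximation. The sample lines and the function name find_force_unwraps are made up for illustration.

```python
import re

# Assumed variant of the pattern: the word boundary plus negative lookahead
# keeps `try!` and `as!` out of the results.
FORCE_UNWRAP = re.compile(r"\b(?!(?:try|as)!)[A-Za-z][A-Za-z0-9]*![.,:)\s]")


def find_force_unwraps(source):
    """Return every substring of `source` that looks like a force unwrap."""
    return FORCE_UNWRAP.findall(source)


# Hypothetical Swift lines, one per case from the question plus some
# lines that should not be reported.
sample = "\n".join([
    "let a = variableName!.property",
    "foo(variableName!, other)",
    "let b = flag ? variableName! : otherVariable",
    "bar(variableName!)",
    "let c = try! load()",
    "let d = value as! String",
    "if !valid || x != 10 { }",
])

for hit in find_force_unwraps(sample):
    print(repr(hit))
```

Only the four variableName! lines are reported; try!, as!, !valid and != are skipped.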

Not fool-proof (that would require a full reading of the Swift syntax) but good enough for most cases:
\w[\w\d]*!
Of course, you can simply search for !. There are only a couple of uses other than force unwrap: the inequality test (!=) and boolean negation (!valid). You may scoop up some string literals in the search, but unless you are writing an automatic tool, it hardly matters.

Search for .+!(\.|,| :|\)) using the Find > Regular Expression tool.

Related

What's the common denominator for regex "pattern" in OpenAPI?

I'm using FastAPI, which allows pattern=re.compile("(?P<foo>[42a-z]+)...").
https://editor.swagger.io/ shows an error for this pattern.
My guess is that Python's named group syntax (?P<name>...) is different from ES2018 (?<name>...).
But, come to think of it, the idea of OpenAPI is interoperability, and some other language, esp. a compiled language may use yet another notation, or may not support named groups in the regular expressions at all.
What common denominator of regular expression syntax should I use?
OpenAPI uses JSON Schema, and the JSON Schema spec defines regex as "A regular expression, which SHOULD be valid according to the ECMA-262 regular expression dialect." Here is the relevant ECMA-262 section.
Of course, non-JavaScript implementations probably won't care too much about it and will just use the default regex library of their platform. So good luck with figuring out the common denominator :)
I suggest just using regexes that are as simple as possible, and adding some tests for them that use the library you run in production.
JSON Schema recommends a specific subset of regular expressions because the authors accept that most implementations will not support the full ECMA-262 syntax:
https://json-schema.org/understanding-json-schema/reference/regular_expressions.html
A single unicode character (other than the special characters below) matches itself.
.: Matches any character except line break characters. (Be aware that what constitutes a line break character is somewhat dependent on your platform and language environment, but in practice this rarely matters).
^: Matches only at the beginning of the string.
$: Matches only at the end of the string.
(...): Group a series of regular expressions into a single regular expression.
|: Matches either the regular expression preceding or following the | symbol.
[abc]: Matches any of the characters inside the square brackets.
[a-z]: Matches the range of characters.
[^abc]: Matches any character not listed.
[^a-z]: Matches any character outside of the range.
+: Matches one or more repetitions of the preceding regular expression.
*: Matches zero or more repetitions of the preceding regular expression.
?: Matches zero or one repetitions of the preceding regular expression.
+?, *?, ??: The *, +, and ? qualifiers are all greedy; they match as much text as possible. Sometimes this behavior isn’t desired and you want to match as few characters as possible.
(?!x), (?=x): Negative and positive lookahead.
{x}: Match exactly x occurrences of the preceding regular expression.
{x,y}: Match at least x and at most y occurrences of the preceding regular expression.
{x,}: Match x occurrences or more of the preceding regular expression.
{x}?, {x,y}?, {x,}?: Lazy versions of the above expressions.
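Named groups are not in the subset above, so one option is to drop the names before the pattern reaches the schema. A minimal sketch, assuming the group names are not needed by API consumers; to_portable is a hypothetical helper, and only the named-group part of the pattern from the question is used here:

```python
import re

# Opening of a Python-style named group, e.g. "(?P<foo>".
PY_NAMED_GROUP_OPEN = re.compile(r"\(\?P<[A-Za-z_][A-Za-z0-9_]*>")


def to_portable(pattern):
    """Turn every Python named group into a plain (...) group.

    Illustration only: backreferences such as (?P=foo) and escaped
    parentheses in front of "?P<" are not handled.
    """
    return PY_NAMED_GROUP_OPEN.sub("(", pattern)


python_pattern = r"(?P<foo>[42a-z]+)"
portable = to_portable(python_pattern)
print(portable)  # ([42a-z]+)

# Both spellings should accept and reject the same inputs.
for text in ["abc42", "ABC", ""]:
    same = bool(re.fullmatch(python_pattern, text)) == bool(re.fullmatch(portable, text))
    print(repr(text), same)
```

The loop at the end is the kind of test suggested above: run it with whatever regex library you actually use in production.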
P.S. Kudos to @erosb for the idea of how to find this recommendation.

Why Swift cannot remove spaces when compiling code?

I'm quite new to Swift, coming from Java and C++. I'm wondering why Swift doesn't ignore spaces when compiling code such as the following:
if x!=10 {...} //I have to add space before and after != to get rid of issue.
Likewise, an increment such as increment++ is not accepted as the increment step of a for loop unless I put a space between increment++ and the { of the loop body.
In Java or C++, spaces and tabs make no difference to compilation. Is Swift like Python in treating spaces or tabs as part of the code?
Swift does not consider spaces as important, however it uses them when separating characters into lexemes.
Consider the following:
a != 1
a! =1
a!= 1
a!=1
The first one compiles because lexical analysis correctly recognizes the lexemes a, != and 1, with != being an infix operator.
In the second one, lexical analysis recognizes the lexeme a with a postfix operator ! and a 1 with a prefix operator =.
The third one is the lexeme a with a postfix operator != and the lexeme 1.
The last one is ambiguous because it can be either a! = 1 or a != 1. The compiler, probably based on operator priority, decides to use a! = 1.
Spaces are ignored, but they still matter when distinguishing between ambiguous cases. The same is true in many languages. The fact that you can define your own operators limits your coding style a bit.
To compare, try a+++b in Java or C++. Will it be a++ + b or a + ++b?
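For the a+++b puzzle, C-family lexers take the longest operator they can at each position, so the answer is a++ + b. A toy sketch of that longest-match rule in Python; the tokenizer and its two-operator table are made up for illustration and are not how Swift's whitespace-sensitive lexer works:

```python
# Operators known to this toy lexer, longest first so "++" wins over "+".
OPERATORS = ["++", "+"]


def tokenize(text):
    """Split `text` into identifiers and operators using longest match."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch.isalnum() or ch == "_":
            j = i
            while j < len(text) and (text[j].isalnum() or text[j] == "_"):
                j += 1
            tokens.append(text[i:j])
            i = j
        else:
            for op in OPERATORS:
                if text.startswith(op, i):
                    tokens.append(op)
                    i += len(op)
                    break
            else:
                raise ValueError("unexpected character: " + repr(ch))
    return tokens


print(tokenize("a+++b"))    # ['a', '++', '+', 'b']
print(tokenize("a + ++b"))  # ['a', '+', '++', 'b']
```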
The exclamation mark is not only used for negation, for example. It is also used to unwrap an optional variable.
There are other syntactic differences from other languages as well.

Swift Not Equals, Forced Unwrap, and Whitespace

I've been enjoying Swift for a while now, but I found one syntax that is incredibly problematic.
Start with the assumption that:
let foo : String = ""
This is a fairly simple check:
if foo!="value" {
But alas, it won't compile. The compiler complains about trying to unwrap a value that is not an optional. I then change that line to:
if foo != "value" {
The compiler is happy and the code behaves as expected. This is a case of significant whitespace, and I'm not content with it. I suspect there are situations that this may compile and behave contrary to my intention. Is there an alternative syntax that I should be using to avoid this type of error?
The alternate syntax is to put spaces around infix operators. They are required. Without spaces, it is treated as a prefix or postfix operator. With spaces it is an infix operator. Swift is very consistent about this. I know you realize this is what's happening; I just don't believe there's any way around it, and any cure would be worse than the disease (I can't come up with any examples where this would likely lead to real-world bugs).
Swift will be forgiving if there is no conflict, and allow 1+1 for instance, but you shouldn't do this, either. I believe good Swift style is to just put the spaces in. Yes, it's a case of significant whitespace. The whitespace here is significant, just as you can't say structFoo when you mean struct Foo.
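The whitespace rule can be written down in a few lines. This is a sketch in Python of the general rule as I understand it from the Swift language reference: whitespace on both sides or on neither side means infix, on the left only means prefix, on the right only means postfix. It does not model any special cases, such as how the compiler ends up reading foo!="value", and the function names are made up for illustration.

```python
def operator_role(space_before, space_after):
    """Classify an operator from the whitespace on each side of it."""
    if space_before == space_after:
        return "infix"
    return "prefix" if space_before else "postfix"


def describe(expression, operator):
    """Find the first `operator` in `expression` and classify it.

    The start and end of the expression count as whitespace here.
    """
    index = expression.index(operator)
    end = index + len(operator)
    space_before = index == 0 or expression[index - 1].isspace()
    space_after = end == len(expression) or expression[end].isspace()
    return operator_role(space_before, space_after)


print(describe("foo != 1", "!="))  # infix
print(describe("1+1", "+"))        # infix
print(describe("foo! = 1", "!"))   # postfix
print(describe("a = !b", "!"))     # prefix
```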

Scala - Why does dotless not apply to this case

I'm parsing some XML, and I'm chaining calls without a dot. All of these methods take no parameters (except \\, which takes one), so it should be fairly possible to chain them without a dot, right?
This is the code that does not work:
val angle = filteredContextAttributes.head \\ "contextValue" text toDouble
The error is: not found: value toDouble
However, it does work like this:
(filteredContextAttributes.head \\ "contextValue" text) toDouble
text returns only a String and does not take parameters, and I don't see any other parameters needed in \\ to cause an issue.
What am I missing? I don't want to hack around it, but to understand what the problem is.
And also I can't use head without the dot. When removing the dot it says: Cannot resolve symbol head
It's because text is used in postfix notation, meaning a method that follows the object and takes no parameters. The trick with postfix is that it can only appear at the end of an expression. That's why it works when you add parentheses (the expression is then bounded by the parentheses and you get two postfix notations, one ending with text and the second one ending with toDouble). In your example that's not the case, as you are trying to call a method further along the chain.
That's also the reason why you need to do filteredContextAttributes.head and not filteredContextAttributes head. I'm sure if you do (filteredContextAttributes head) it will work as again the postfix notation will be at the end of the expression!
There are also prefix and infix notations in Scala and I urge you to read about them to get a hang of when you can skip . and () (for instance why you need () when using the map method etc.).
To add to what @Mateusz already answered, this is because of mixing postfix notation and arity-0 suffix notation.
There's also a great write up in another answer: https://stackoverflow.com/a/5597154/125901
You can even see a warning on your shorter example:
scala> filteredContextAttributes.head \\ "contextValue" text
<console>:10: warning: postfix operator text should be enabled
by making the implicit value scala.language.postfixOps visible.
This can be achieved by adding the import clause 'import scala.language.postfixOps'
or by setting the compiler option -language:postfixOps.
See the Scala docs for value scala.language.postfixOps for a discussion
why the feature should be explicitly enabled.
Which is a pretty subtle hint that this isn't the best construct style-wise. So, if you aren't specifically working in a DSL, then you should prefer adding in explicit dots and parentheses, especially when mixing infix, postfix and/or suffix notations.
For example, you can prefer doc \\ "child" over doc.\\("child"), but once you step outside the DSL (in this example, when you get your NodeSeq), prefer adding in parens.

Can actions in Lex access individual regex groups?

Can actions in Lex access individual regex groups?
(NOTE: I'm guessing not, since the group characters - parentheses - are according to the documentation used to change precedence. But if so, do you recommend an alternative C/C++ scanner generator that can do this? I'm not really hot on writing my own lexical analyzer.)
Example:
Let's say I have this input: foo [tagName attribute="value"] bar and I want to extract the tag using Lex/Flex. I could certainly write this rule:
\[[a-z]+[[:space:]]+[a-z]+=\"[a-z]+\"\] printf("matched %s", yytext);
But let's say I would want to access certain parts of the string, e.g. the attribute but without having to parse yytext again (as the string has already been scanned it doesn't really make sense to scan part of it again). So something like this would be preferable (regex groups):
\[[a-z]+[[:space:]]+[a-z]+=\"([a-z]+)\"\] printf("matched attribute %s", $1);
You can separate it to start conditions. Something like this:
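%{
#include <string.h>
char string_buf[100];
%}
%x VALUEPARSE ENDSTATE
%%
<INITIAL>\[[a-z]+[[:space:]]+[a-z]+=\" {BEGIN(VALUEPARSE);}
<VALUEPARSE>[a-z]+ { /* getting value text; the copy is bounded and NUL-terminated */
    strncpy(string_buf, yytext, sizeof string_buf - 1);
    string_buf[sizeof string_buf - 1] = '\0';
    BEGIN(ENDSTATE);
}
<ENDSTATE>\"\] {BEGIN(INITIAL);}
%%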
As for an alternative C/C++ scanner generator: I use the Qt class QRegularExpression for this kind of thing. It makes it very easy to get a regex group after a match.
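To show what "getting a regex group after a match" looks like, here is a minimal sketch using Python's re module rather than Qt. The pattern is an assumption adapted from the rule in the question: the names are widened to [A-Za-z]+ so that tagName (capital N) matches, and parse_tag is a hypothetical helper.

```python
import re

# One capture group each for the tag name, the attribute name and its value.
TAG = re.compile(r'\[([A-Za-z]+)\s+([A-Za-z]+)="([a-z]+)"\]')


def parse_tag(text):
    """Return the pieces of the first [tag attribute="value"] in `text`."""
    match = TAG.search(text)
    if match is None:
        return None
    name, attribute, value = match.groups()
    return {"tag": name, "attribute": attribute, "value": value}


print(parse_tag('foo [tagName attribute="value"] bar'))
```

The input is scanned once, and each part is then available by group index, which is what the $1 in the question was asking for.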
Certainly at least some forms of them do.
But the default lex/flex downloadable from sourceforge.org does not seem to list this in its documentation, and this example leaves the full string in yytext.
From IBM's LEX documentation for AIX:
(Expression)
Matches the expression in the parentheses.
The () (parentheses) operator is used for grouping and causes the expression within parentheses to be read into the yytext array. A group in parentheses can be used in place of any single character in any other pattern.
Example: (ab|cd+)?(ef)* matches such strings as abefef, efefef, cdef, or cddd; but not abc, abcd, or abcdef.