Match "com.project.name" but not when it contains something else - swift

I have the following code:
var i = "test"
and
var i = "com.project.name.test"
print("something else")
fatalError("some error")
I have a regex:
"((?!com\.project\.name).)*"
to match any string that does NOT contain "com.project.name".
However, I want to modify it so that it keeps the above condition but also does not match when the line contains print\(.*?\) or fatalError\(.*?\).
Why do I want to do this? Because I can only use regex for SwiftLint custom rules, and right now my regex is too greedy and matches every single string in the project that the developers forgot to localize.
What I've tried:
"((?!com\\.project\\.name).)*(?!print)(?!fatalError)"
but it does not work and instead matches the same as the original expression.

You may use this regex with a negative lookahead assertion:
^(?!.*(?:com\.project\.name|print\(|fatalError\()).*
This negative lookahead assertion uses an alternation to fail the match if any of these 3 patterns occurs anywhere in the input:
com\.project\.name
print\(
fatalError\(
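As a quick sanity check of the lookahead logic, here is a minimal sketch in Python run against the four sample lines from the question. Python's `re` module is a different engine from the one SwiftLint uses, so this only illustrates how the pattern behaves; the helper name `is_flagged` is made up for the example.

```python
import re

# Same pattern as in the answer; Python's `re` accepts this syntax unchanged.
PATTERN = re.compile(r'^(?!.*(?:com\.project\.name|print\(|fatalError\()).*')

# Sample lines taken from the question.
lines = [
    'var i = "test"',
    'var i = "com.project.name.test"',
    'print("something else")',
    'fatalError("some error")',
]

def is_flagged(line: str) -> bool:
    """Hypothetical helper: True when the lookahead lets the line through."""
    return PATTERN.match(line) is not None

for line in lines:
    print(f"{is_flagged(line)!s:5}  {line}")
```

Only the first line passes the lookahead; the other three each contain one of the excluded substrings.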

Related

Count leading tabs in Swift string

I need to count the number of leading tabs in a Swift string. I know there are fairly simple solutions (e.g. looping over the string until a non-tab character is encountered) but I am looking for a more elegant solution.
I have attempted to use a regex such as ^\\t* along with the .numberOfMatches method but this detects all the tab characters as one match. For example, if the string has three leading tabs then that method just returns 1. Is there a way to write a regex that treats each individual tab character as a single match?
Also open to other ways of approaching this without using a regex.
Here is a non-regex solution
let count = someString.prefix(while: {$0 == "\t"}).count
You may use
\G\t
Here,
\G - matches a string start position or end of the previous match position, and
\t - matches a single tab.
Swift test:
let string = "\t\t123"
let regex = try! NSRegularExpression(pattern: "\\G\t", options: [])
let numberOfOccurrences = regex.numberOfMatches(in: string, range: NSRange(string.startIndex..., in: string))
print(numberOfOccurrences) // => 2
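To illustrate why a pattern like ^\t* reports a single match, here is a small sketch in Python (chosen only as a neutral language; Python's `re` has no \G anchor, so the \G\t approach itself cannot be reproduced there). The run of tabs is one match, and the count is the length of that match. The function names are invented for the example.

```python
import re

def leading_tabs_regex(text: str) -> int:
    # `\t*` at the start always yields exactly one (possibly empty) match;
    # the number of tabs is the length of that match, not the match count.
    return len(re.match(r'\t*', text).group())

def leading_tabs_plain(text: str) -> int:
    # Non-regex counterpart of the prefix(while:) idea:
    # strip the leading tabs and compare lengths.
    return len(text) - len(text.lstrip('\t'))

sample = "\t\t123"
print(leading_tabs_regex(sample), leading_tabs_plain(sample))
```

Both helpers ignore tabs that appear after the first non-tab character.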

Check if string contains any of the following

I'm trying to check if a string contains one of four sub strings in a simpler way than this:
if (imageUrl.contains('.jpg') ||
imageUrl.contains('.png') ||
imageUrl.contains('.tif') ||
imageUrl.contains('.gif')) {
}
Is there a way to do this? For example checking against a list?
You can use a regex pattern instead of a simple string:
imageUrl.contains(new RegExp(r"\.(jpg|png|tif|gif)"))
Might be somewhat simpler.
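The "checking against a list" idea from the question can be sketched as follows. This is Python rather than the question's language, and the names `IMAGE_EXTENSIONS`, `mentions_image_extension` and `ends_with_image_extension` are invented for illustration.

```python
# The four substrings from the question, kept in one place.
IMAGE_EXTENSIONS = ('.jpg', '.png', '.tif', '.gif')

def mentions_image_extension(url: str) -> bool:
    # True if any of the substrings occurs anywhere in the URL,
    # which mirrors the chain of contains() calls in the question.
    return any(ext in url for ext in IMAGE_EXTENSIONS)

def ends_with_image_extension(url: str) -> bool:
    # Stricter variant: the URL must end with one of the extensions.
    return url.endswith(IMAGE_EXTENSIONS)

print(mentions_image_extension("https://example.com/a.png?size=2"))
print(ends_with_image_extension("https://example.com/a.png?size=2"))
```

The stricter variant may be closer to the intent when the extension is expected at the end, but whether that applies depends on how the URLs are formed.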
A regular expression can solve your problem. Regular expressions are used to search for patterns in strings.
Regex examples:
^The matches any string that starts with The
end$ matches a string that ends with end
^The end$ exact string match (starts and ends with The end)
abc* matches a string that has ab followed by zero or more c

How to use regex quantifiers * and + for SwiftLint custom rule

I'm trying to write a custom rule for SwiftLint. Following the directions in the readme, I've added the following to .swiftlint.yml:
custom_rules:
  multi_clause_guard:
    regex: 'guard .*,'
However, this regex is not matching any lines in my project, despite there being plenty of lines where it should match, for example:
guard let x = Int(s), let y = Int(t) else { return }
I've tried various other values for the regex, and it works until you introduce a quantifier.
✅ 'guard .,' will match the line guard a,
✅ 'guard ..,' will match the line guard _a,
❌ 'guard .*,' will not match the line guard a,
❌ 'guard .+,' will not match the line guard a,
Is there a way I can use * and + in a SwiftLint custom rule?
It seems that quantifiers can be applied to character sets that you define explicitly. In this case, it was enough for me to replace . with [\h\S] (which includes horizontal whitespace characters and any other character that's not a whitespace character).
custom_rules:
  multi_clause_guard:
    regex: 'guard [\h\S]*,'
If anyone knows how to make quantifiers work with ., I'm still interested to know!

Could I specify pattern match priority in lex code?

I have a related thread on this site (My lex pattern doesn't work to match my input file, how to correct it?).
The problem I ran into is about how greedily lex matches patterns. For example, this is my lex file:
$ cat b.l
%{
#include<stdio.h>
%}
%%
"12" {printf("head\n");}
"34" {printf("tail\n");}
.* {printf("content\n");}
%%
What I want is: on "12", print "head"; on "34", print "tail"; otherwise print "content" for the longest match that doesn't contain either "12" or "34".
But in fact ".*" is a greedy match, so whatever I input, it prints "content".
My requirement is that when I use
12sdf2dfsd3sd34
as input, the output should be
head
content
tail
So there seem to be 2 possible ways:
1. Specify a match priority so that ".*" applies only when neither "12" nor "34" matches. Does lex support "priority"?
2. Change the 3rd expression to match any contiguous string that doesn't contain the substring "12" or "34". But how do I write this regular expression?
Does (f)lex support priority?
(F)lex always produces the longest possible match. If more than one rule matches the same longest match, the first one is chosen, so in that case it supports priority. But it does not support priority for shorter matches, nor does it implement non-greedy matching.
How to match a string which does not contain one or more sequences?
You can, with some work, create a regular expression which matches a string not containing specified substrings, but it is not particularly easy and (f)lex does not provide a simple syntax for such regular expressions.
A simpler (but slightly less efficient) solution is to match the string in pieces. As a rough outline, you could do the following:
"12"    { return HEAD; }
"34"    { if (yyleng > 2) {
            yyless(yyleng - 2);
            return CONTENT;
          }
          else
            return TAIL;
        }
.|\n    { yymore(); }
This could be made more efficient by matching multiple characters when there is no chance of skipping a delimiter; change the last rule to:
.|[^13]+ { yymore(); }
yymore() causes the current token to be retained, so that the next match appends to the current token rather than starting a new token. yyless(x) returns all but the first x characters to the input stream; in this case, that is used to cause the end delimiter 34 to be rescanned after the CONTENT token is identified.
(That assumes you actually want to tokenize the input stream, rather than just print a debugging message, which is why I called it an outline solution.)
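For comparison, the desired head/content/tail split can be sketched outside of flex with a backtracking regex engine. The following is Python, not lex; it relies on a negative lookahead, which (f)lex patterns do not offer, so it illustrates the intended tokenization rather than providing a replacement rule. The names `TOKEN` and `tokenize` are invented for the example.

```python
import re

# Alternation order matters: the two delimiters are tried before the catch-all.
# The catch-all consumes one character at a time, but only while neither
# delimiter starts at the current position.
TOKEN = re.compile(
    r'(?P<head>12)|(?P<tail>34)|(?P<content>(?:(?!12|34)[\s\S])+)'
)

def tokenize(text: str) -> list:
    # `lastgroup` is the name of the alternative that matched.
    return [m.lastgroup for m in TOKEN.finditer(text)]

# Input taken from the question.
for kind in tokenize("12sdf2dfsd3sd34"):
    print(kind)
```

A lone 1, 2, 3 or 4 inside the content is kept as content, because the lookahead only rejects positions where a full delimiter begins.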

Scala string pattern matching for mathematical symbols

I have the following code:
val z: String = tree.symbol.toString
z match {
  case "method +" | "method -" | "method *" | "method ==" =>
    println("no special op")
    false
  case "method /" | "method %" =>
    println("we have the special div operation")
    true
  case _ =>
    false
}
Is it possible to create a match for the primitive operations in Scala:
"method *".matches("(method) (+-*==)")
I know that the (+-*) signs are used as quantifiers. Is there a way to match them anyway?
Thanks from an avid Scala scholar!
Sure.
val z: String = tree.symbol.toString
val noSpecialOp = "method (?:[-+*]|==)".r
val divOp = "method [/%]".r
z match {
  case noSpecialOp() =>
    println("no special op")
    false
  case divOp() =>
    println("we have the special div operation")
    true
  case _ =>
    false
}
Things to consider:
I choose to match against single characters using [abc] instead of (?:a|b|c).
Note that - has to be the first (or last) character when using [], or it will be interpreted as a range. Likewise, ^ cannot be the first character inside [], or it will be interpreted as negation.
I'm using (?:...) instead of (...) because I don't want to extract the contents. If I did want to extract the contents -- so I'd know what was the operator, for instance, then I'd use (...). However, I'd also have to change the matching to receive the extracted content, or it would fail the match.
It is important not to forget () on the matches -- like divOp(). If you forget them, a simple assignment is made (and Scala will complain about unreachable code).
And, as I said, if you are extracting something, then you need something inside those parentheses. For instance, "method ([%/])".r would match divOp(op), but not divOp().
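The character-class and grouping points above are not specific to Scala. As a rough cross-check, here is the same pair of patterns in Python, using `fullmatch` to approximate the whole-string matching that the extractor patterns perform. The function names are invented for the example, and Python's `re` is a different engine from the Java one Scala uses.

```python
import re

# `-` comes first inside the class so it is a literal, not a range.
NO_SPECIAL_OP = re.compile(r'method (?:[-+*]|==)')  # non-capturing group
DIV_OP = re.compile(r'method ([/%])')               # capturing group

def is_special_div(symbol: str) -> bool:
    # Same decision as the match expression: only / and % count as special.
    if NO_SPECIAL_OP.fullmatch(symbol):
        return False
    return DIV_OP.fullmatch(symbol) is not None

def div_operator(symbol: str):
    # With a capturing group the operator itself can be extracted.
    m = DIV_OP.fullmatch(symbol)
    return m.group(1) if m else None

print(is_special_div("method /"), div_operator("method %"))
```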
Much the same as in Java. To escape a character in a regular expression, you prefix the character with \. However, backslash is also the escape character in standard Java/Scala strings, so to pass it through to the regular expression processing you must again prefix it with a backslash. You end up with something like:
scala> "+".matches("\\+")
res1 : Boolean = true
As James Iry points out in the comment below, Scala also has support for 'raw strings', enclosed in three quotation marks: """Raw string in which I don't need to escape things like \!""" This allows you to avoid the second level of escaping, that imposed by Java/Scala strings. Note that you still need to escape any characters that are treated as special by the regular expression parser:
scala> "+".matches("""\+""")
res1 : Boolean = true
Escaping characters in Strings works like in Java.
If you have larger Strings which need a lot of escaping, consider Scala's """.
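E.g. """String without needing to escape anything \n \d"""
If you put three """ around your regular expression, you no longer need to double the backslashes for the string literal; characters that are special to the regex engine still need a single backslash.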
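The two levels of escaping can be seen in isolation with a small sketch. This uses Python, whose r"..." raw literals play roughly the role that triple-quoted strings play in Scala; it is an analogy, not Scala code, and the variable names are invented.

```python
import re

escaped = "\\+"  # ordinary literal: the backslash is doubled for the string parser
raw = r"\+"      # raw literal: the backslash is written once

# Both literals produce the same two-character pattern text.
print(escaped == raw)

# The regex engine still needs that one backslash to treat + literally.
print(re.fullmatch(raw, "+") is not None)

# Without it, + is a quantifier with nothing to repeat and the pattern is rejected.
try:
    re.compile("+")
    unescaped_ok = True
except re.error:
    unescaped_ok = False
print(unescaped_ok)
```

A raw literal removes only the string-level escaping; the regex-level escape remains.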