Is there a function to escape all regex-relevant characters? - swift

The regex I'm using in my application is a combination of user input and code. Because I don't want to restrict the user, I would like to escape all regex-relevant characters like "+", brackets, slashes, etc. in the entry.
Is there a function for that or at least an easy way to get all those characters in an array so that I can do something like this:
for regexChar in regexCharacterArray {
    myCombinedRegex = myCombinedRegex.replacingOccurrences(of: regexChar, with: "\\" + regexChar)
}

Yes, there is NSRegularExpression.escapedPattern(for:):
Returns a string by adding backslash escapes as necessary to protect any characters that would match as pattern metacharacters.
Example:
let escaped = NSRegularExpression.escapedPattern(for: "[*]+")
print(escaped) // \[\*]\+
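As a minimal sketch of how this might be used for the case in the question (user input combined with a pattern from code), the following escapes only the user-supplied part before concatenating. The names userInput, pattern and text, and the sample values, are hypothetical.

```swift
import Foundation

// Hypothetical user input that contains regex metacharacters.
let userInput = "1+1=2 (really?)"

// Escape the user-supplied part so that it is matched literally.
let escapedInput = NSRegularExpression.escapedPattern(for: userInput)

// Combine it with a fragment written in code (here: leading digits and a colon).
let pattern = #"\d+: "# + escapedInput

let text = "42: 1+1=2 (really?)"
let matchRange = text.range(of: pattern, options: .regularExpression)
print(matchRange != nil) // true
```

Only the user-supplied part is escaped, so the metacharacters in the code-supplied fragment keep their meaning.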

Related

Why I cannot use \ or backslash in a String in Swift?

I have a string like the one below, and I want to replace each space with a backslash followed by a space.
let test: String = "Hello world".replacingOccurrences(of: " ", with: "\ ")
print(test)
But Xcode reports this error:
Invalid escape sequence in literal
The code above works for any other character or word, but not for a backslash. Why?
Backslash is used to escape characters. So to print a backslash itself, you need to escape it. Use \\.
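Applied to the code from the question, a minimal sketch of the fix could look like this:

```swift
import Foundation

// "\\ " is a two-character string: one backslash followed by a space.
let test = "Hello world".replacingOccurrences(of: " ", with: "\\ ")
print(test) // Hello\ world
```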
For Swift 5 or later you can avoid needing to escape backslashes using the enhanced string delimiters:
let backSlashSpace = #"\ "#
If you need String interpolation as well:
let value = 5
let backSlashSpaceWithValue = #"\\#(value) "#
print(backSlashSpaceWithValue) // \5
You can use as many pound signs as you wish. Just make sure to match the same number in your string interpolation:
let value = 5
let backSlashSpaceWithValue = ###"\\###(value) "###
print(backSlashSpaceWithValue) // \5
Note: For more information, see the already implemented Swift Evolution proposal SE-0200, Enhancing String Literals Delimiters to Support Raw Text.

Regular expression with backslash in swift

I am having problems using replacingOccurrences to replace a word after some specific keywords inside a text view in Swift 5 and Xcode 12.
For example:
My text view will have the following string "NAME\JOHN PHONE\555444333"
"NAME" and "PHONE" will be unique, so any time I change the corresponding field I want to change the name or phone inside this text view.
For example, let's change JOHN to CLOE with this code:
txtOther.text = txtOther.text.replacingOccurrences(of: "NAME(.*?)\\s", with: "NAME\\\(new_value) ", options: .regularExpression)
print (string)
output: "NAMECLOE " instead of "NAME\CLOE "
I can't get the backslash to appear in the output with this regular expression.
Alternatively, maybe the regular expression could be changed so that it only replaces JOHN with CLOE after "NAME".
Thanks!!!
Ariel
You can solve this by using a raw string for your regular expression, that is, a string delimited with #:
let pattern = #"(NAME\\)(.*?)\s"#
Note that NAME and the \ are in a separate group that can be referenced when replacing:
let output = string.replacingOccurrences(of: pattern, with: "$1\(new_value) ", options: .regularExpression)
Use
"NAME\\JOHN PHONE\\555444333".replacingOccurrences(
of: #"NAME\\(\S+)"#,
with: "NAME\\\\\(new_value)",
options: .regularExpression
)
Double the backslashes in the replacement: a backslash is a special metacharacter inside a replacement template, so a literal backslash must itself be escaped.
\S+ matches one or more characters different from whitespace, this is shorter and more efficient than .*?\s, and you do not have to worry about how to put back the whitespace.
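As an alternative to doubling the backslashes by hand, Foundation also provides NSRegularExpression.escapedTemplate(for:), which escapes a string for literal use as a replacement template. The following is a sketch under that assumption; newValue and the sample text are hypothetical.

```swift
import Foundation

let text = #"NAME\JOHN PHONE\555444333"#
let newValue = "CLOE" // hypothetical replacement value

// The literal text we want in the output: NAME, one backslash, the new value.
let literalReplacement = "NAME\\" + newValue

// Escape "\" and "$" so the template is inserted literally.
let template = NSRegularExpression.escapedTemplate(for: literalReplacement)

let result = text.replacingOccurrences(
    of: #"NAME\\\S+"#, // NAME, a literal backslash, then non-whitespace characters
    with: template,
    options: .regularExpression
)
print(result) // NAME\CLOE PHONE\555444333
```

Escaping the whole replacement this way also protects against a new value that happens to contain "$" or "\".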

Extracting range of unpadded string

I'd like to extract the Range<String.Index> of a sentence within its whitespace padding. For example,
let padded = " El águila (🦅). "
let sentenceRangeInPadded = ???
assert(padded[sentenceRangeInPadded] == "El águila (🦅).") // The test!
Here's some regex that I started with, but looks like variable length lookbehinds aren't supported.
let sentenceRangeInPadded = padded.range(of: #"(?<=^\s*).*?(?=\s*$)"#, options: .regularExpression)!
I'm not looking to extract the sentence (could just use trimmingCharacters(in:) for that), just the Range.
Thanks for reading!
You may use
#"(?s)\S(?:.*\S)?"#
Details
(?s) - a DOTALL modifier making . match any char, including line break chars
\S - the first non-whitespace char
(?:.*\S)? - an optional non-capturing group matching
    .* - any 0+ chars, as many as possible
    \S - up to the last non-whitespace char.
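Plugged into the code from the question, this pattern might be used as follows (a minimal sketch; range(of:options:) returns an optional Range<String.Index>):

```swift
import Foundation

let padded = " El águila (🦅). "

// First match of the pattern, as a Range<String.Index>?.
let sentenceRangeInPadded = padded.range(of: #"(?s)\S(?:.*\S)?"#, options: .regularExpression)

if let range = sentenceRangeInPadded {
    print(padded[range]) // El águila (🦅).
}
```

The result is nil when the string is empty or contains only whitespace, so it is safer to unwrap it than to force it with !.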

Sed replacing Special Characters in a string

I am having difficulties replacing a string containing special characters using sed. My old and new string are shown below
oldStr = "# td=(nstates=20) cam-b3lyp/6-31g geom=connectivity"
newStr = "# opt b3lyp/6-31g geom=connectivity"
My sed command is the following
sed -i 's/\# td\=\(nstates\=20\) cam\-b3lyp\/6\-31g geom\=connectivity/\# opt b3lyp\/6\-31g geom\=connectivity/g' myfile.txt
I don't get any errors, but nothing matches. Any ideas on how to fix my pattern?
Thanks
Try sed -i 's|# td=(nstates=20) cam-b3lyp/6-31g geom=connectivity|# opt b3lyp/6-31g geom=connectivity|g' myfile.txt
You can use almost any character as the delimiter after s instead of /. Since your expression contains slashes, I used | instead. The characters -, = and # don't have to be escaped (a minus only needs care inside character sets [...]). In sed's default basic regex syntax, escaped parentheses indicate a group, while unescaped parentheses are literals.

Clean string from html tags and special characters

I want to clean my text of HTML tags, HTML special characters, and characters like < > [ ] / \ * ,
I used $str = preg_replace("/&#?[a-zA-Z0-9]+;/i", "", $str);
It works well for HTML special characters, but some characters are not removed, such as:
( /*/*]]>*/ )
how can I remove these characters?
If you are really using php as it looks like, you can just use:
$str = htmlspecialchars($str);
All HTML special characters will be escaped (which could be better than just stripping them). If you really want to filter out these characters, you need to escape them inside the character class:
$str = preg_replace("/[\&#\?\]\[\/\\\<\>\*\:\(\);]*/i","",$str);
Notice there's just one character class, "/[...]*/i". I removed a-zA-Z0-9, since you presumably want to keep those characters. You can also list only the characters that are allowed in your string (this gets awkward with accented characters like á é ü, because you have to specify every accepted character):
$str = preg_replace("/[^a-zA-Z0-9áÁéÉíÍãÃüÜõÕñÑ\.\+\-\_\%\$\#\!\=;]*/","",$str);
Notice also that over-escaping punctuation does no harm, except in ranges: a-z works as a range, whereas a\-z would match a, or -, or z.
I hope it helps. :)
A regular expression for HTML tags is:
/<[^>]*>/
so use something like this:
// The regular expression that matches HTML tags
$htmltagsregex = '/<[^>]*>/';
// what will replace them
$nothing = '';
// the string I want to apply it to
$string = 'this is a string with <b>HTML tags</b> that I want to <strong>remove</strong>';
// DO IT
$result = preg_replace($htmltagsregex, $nothing, $string);
and it will return
this is a string with HTML tags that I want to remove
That's all