Convert a string JSON to a dictionary in Swift when curly quotes are used - swift

I have a JSON string, but it contains fancy curly quotes, which make JSONSerialization fail.
let str = "{“title”:\"this is a “test” example\"}"
try! JSONSerialization.jsonObject(with: str.data(using: .utf8)!) // Error
The quotes around title are curly double quotes, and apparently JSONSerialization cannot handle them and fails. A naive approach would be to simply replace every curly quote with a straight one. The problem with that approach is that it would also change the curly quotes around test, which must be preserved. The quotes around title are OK to change, but the ones around test are not.
What can I do to get around this issue?

To fix this, you talk to whoever created the string, which does not contain JSON at the moment, and convince them to create a string that does contain JSON.
For JSON the rule is: If your parser can't parse it, then it's broken and you don't touch it.
The problem isn't that JSONSerialization cannot handle it. The problem is that JSONSerialization absolutely must not under any circumstances handle it. Because it's not JSON.
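As a minimal sketch of that point, the snippet below feeds the string from the question to JSONSerialization inside do/catch and records whether it parsed. The variable name parsedOK is only illustrative.

```swift
import Foundation

// The string from the question: the key is wrapped in curly quotes, so this is not JSON.
let str = "{“title”:\"this is a “test” example\"}"

var parsedOK = false
do {
    _ = try JSONSerialization.jsonObject(with: Data(str.utf8))
    parsedOK = true
} catch {
    // The parser rejects the input instead of guessing what was meant.
    print("Rejected as invalid JSON: \(error)")
}
print(parsedOK) // false
```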

If curly quotes are only used for the keys, this regex will do the job:
let str = "{“title”:\"this is a “test” example\"}"
let strFixed = str.replacingOccurrences(
of: #"“(.[^”]*)”:\"(.[^\"]*)\""#,
with: "\"$1\":\"$2\"",
options: .regularExpression
)
// It's strongly recommended to use try/catch instead of force unwrapping.
let json = try! JSONSerialization.jsonObject(with: strFixed.data(using: .utf8)!)
If we print json, we get the correct result:
{
title = "this is a \U201ctest\U201d example";
}
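Following the comment about avoiding force unwrapping, here is a hedged sketch of the same fix wrapped in do/catch. The helper name parseCurlyKeyedJSON is hypothetical, and it still assumes that curly quotes appear only around keys.

```swift
import Foundation

// Hypothetical helper: rewrites curly-quoted keys with the regex above, then parses.
// Returns nil instead of crashing when the result is still not valid JSON.
func parseCurlyKeyedJSON(_ raw: String) -> [String: Any]? {
    let fixed = raw.replacingOccurrences(
        of: #"“(.[^”]*)”:\"(.[^\"]*)\""#,
        with: "\"$1\":\"$2\"",
        options: .regularExpression
    )
    guard let data = fixed.data(using: .utf8) else { return nil }
    do {
        return try JSONSerialization.jsonObject(with: data) as? [String: Any]
    } catch {
        print("Still not valid JSON: \(error)")
        return nil
    }
}

let parsed = parseCurlyKeyedJSON("{“title”:\"this is a “test” example\"}")
// The curly quotes inside the value are preserved.
print(parsed?["title"] as? String ?? "nil")
```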
Explanation
“(.[^”]*)”:\"(.[^\"]*)\"
------------------------
“(.[^”]*)” match everything between curly quotes,
except the closing curly quote character
: separator between keys and values
\"(.[^\"]*)\" match everything between double quotes,
except the double quote character
\"$1\":\"$2\"
-------------
\"$1\" place the first captured group between double quotes
: separator between keys and values
\"$2\" place the second captured group between double quotes

Related

regex works in online tool but doesn't agree with NSRegularExpression

do {
// initialization failed, looks like I can not use "\\" here
let regex = try NSRegularExpression.init(pattern: "(?<!\\)\n")
let string = """
aaabbb
zzz
"""
// expect "aaabbb\nzzz"
print(regex.stringByReplacingMatches(in: string, options: [], range: NSMakeRange(0, string.count), withTemplate: "\\n"))
} catch let error {
print(error)
}
Here I want to replace "\n" in my string with "\\n", but it fails at the very beginning. The error message is:
// NSRegularExpression did not recognize the pattern correctly.
Error Domain=NSCocoaErrorDomain Code=2048 "The value “(?<!\)
” is invalid." UserInfo={NSInvalidValue=(?<!\)
}
The regex has been tested on regex101, so it should be right; it just doesn't work in Swift for some reason.
How can I do this?
Based on Larme's comment:
in Swift, \\ (double backslash) in a String literal produces a single \. As you see in the error, you have (?<!\), which means you are escaping the closing ), so the closing ) is missing. I'd say you should write "(?<!\\\\)\n" instead.
I finally figured out what's going on and how to fix it.
The problem is backslash.
In Swift, a backslash inside a double-quoted string literal starts an escape sequence, so this does not compile:
// won't compile
// error: Invalid escape sequence in literal
let regex = try NSRegularExpression.init(pattern: "(?<!\)\n")
If we add another backslash, does it work?
No, because Swift turns the two backslashes into a single backslash, which the regex engine then treats as an escape for the closing ).
// compile but get a runtime error
let regex = try NSRegularExpression.init(pattern: "(?<!\\)\n")
Hence the runtime error
NSRegularExpression did not recognize the pattern correctly.
Error Domain=NSCocoaErrorDomain Code=2048 "The value “(?<!\)
” is invalid." UserInfo={NSInvalidValue=(?<!\)
To tell the regex engine that we want a literal backslash, we actually need four backslashes:
let regex = try NSRegularExpression.init(pattern: "(?<!\\\\)\n")
In the Swift literal, each pair of backslashes becomes one backslash, so the regex engine receives \\: the first is its escape character and the second is the literal backslash being escaped.
This is quite cumbersome.
Better Approach
Fortunately, starting with Swift 5, we can use a raw string literal, delimited by a pair of #, so the pattern is passed to the regex engine exactly as written:
// works like in online tool
let regex = try NSRegularExpression.init(pattern: #"(?<!\\)\n"#)
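To make the difference concrete, the following sketch compares what the regex engine actually receives from each literal. The helper matchCount is hypothetical and returns -1 when the pattern fails to compile.

```swift
import Foundation

let escapedLiteral = "(?<!\\\\)\n"  // pattern text: (?<!\\) followed by a real newline character
let rawLiteral = #"(?<!\\)\n"#      // pattern text: (?<!\\)\n, with \n left for the regex engine

print(escapedLiteral.count) // 8
print(rawLiteral.count)     // 9

// Hypothetical helper: number of matches, or -1 if the pattern is invalid.
func matchCount(_ pattern: String, in text: String) -> Int {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return -1 }
    let range = NSRange(text.startIndex..., in: text)
    return regex.numberOfMatches(in: text, options: [], range: range)
}

let sample = "aaabbb\nzzz"
print(matchCount(escapedLiteral, in: sample)) // 1
print(matchCount(rawLiteral, in: sample))     // 1
// Only two backslashes in a regular literal: the engine sees (?<!\) and rejects it.
print(matchCount("(?<!\\)\n", in: sample))    // -1
```

Both valid forms match the same newline; the raw literal simply lets the pattern look the way it does in an online tester.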
Another thing
It’s worth noticing that the pattern isn’t the only thing that requires special handling: the replacement template treats backslash as an escape character too.
// withTemplate
print(regex.stringByReplacingMatches(in: string, options: [], range: NSMakeRange(0, string.count), withTemplate: #"\\n"#))
// As a comparison, this is OK
print(string.replacingOccurrences(of: "\n", with: "\\n"))
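Putting the pieces together, a self-contained sketch might look like this. The function name escapeNewlines is hypothetical. The range is built from string indices because NSRange counts UTF-16 code units, which can differ from string.count for text outside plain ASCII.

```swift
import Foundation

// Hypothetical helper: replaces each newline that is not preceded by a backslash
// with the two characters backslash + n.
func escapeNewlines(in text: String) throws -> String {
    let regex = try NSRegularExpression(pattern: #"(?<!\\)\n"#)
    let range = NSRange(text.startIndex..., in: text)
    // In the template, \\ stands for one literal backslash.
    return regex.stringByReplacingMatches(in: text, options: [], range: range, withTemplate: #"\\n"#)
}

let input = "aaabbb\nzzz"
let output = (try? escapeNewlines(in: input)) ?? input
print(output) // aaabbb\nzzz
```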

How to convert a JavaScript regex to Swift for email validation?

Hello, I am new to Swift and I am working on a project that requires regex-based email validation. Here's the code:
static func validateEmail(string: String?) -> Style.ValidationResult {
guard let string = string else {
return .init(
isValid: false,
error: ValidationErrors.noEmail
)
}
let format = "^[^!-/[-_{-~]*(?:[0-9A-Za-z](?:[0-9A-Za-z]+|([.])(?!\1)))*([^!-/[-_{-~]){1,256}@[a-zA-Z0-9][a-zA-Z0-9-]{0,64}(\.[a-zA-Z0-9][a-zA-Z0-9-]{0,25})+"
let predicate = NSPredicate(format: "SELF MATCHES %@", format)
return .init(
isValid: predicate.evaluate(with: string),
error: ValidationErrors.noEmail
)
}
When I build the app and actually test this part, it either returns Can't do regex matching, reason: Can't open pattern U_REGEX_MISSING_CLOSE_BRACKET or simply cannot build.
I am aware that this is due to the escape character, but I have tried many times and still couldn't work out how to solve it. Can anyone tell me the rules for converting a JavaScript regex to a Swift regex?
You need to do two things.
Use a "raw" string literal to avoid double escaping backslashes that form regex escapes (you have \1 backreference and \. regex escape in the pattern), so use let format = #"..."#
Escape square brackets inside character classes in an ICU regex pattern, because [ and ] are both special there (they are used to create character class intersections and unions).
Thus, you need to use
let format = #"^[^!-/\[-_{-~]*(?:[0-9A-Za-z](?:[0-9A-Za-z]+|(\.)(?!\1)))*([^!-/\[-_{-~]){1,256}@[a-zA-Z0-9][a-zA-Z0-9-]{0,64}(?:\.[a-zA-Z0-9][a-zA-Z0-9-]{0,25})+"#
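The two rules can be seen in isolation on much smaller patterns. This is only an illustrative sketch: the patterns and sample strings below are made up for the demonstration and are not part of the email regex.

```swift
import Foundation

// Rule 1: in a raw string literal, \. and \1 reach the regex engine unchanged.
let doubleDot = #"(\.)\1"#   // a dot followed by the same captured dot
let hasDoubleDot = "a..b@example.com".range(of: doubleDot, options: .regularExpression) != nil
let noDoubleDot = "a.b@example.com".range(of: doubleDot, options: .regularExpression) != nil
print(hasDoubleDot) // true
print(noDoubleDot)  // false

// Rule 2: inside an ICU character class, [ and ] are escaped.
let brackets = #"[\[\]]"#
let stripped = "x[1]".replacingOccurrences(of: brackets, with: "", options: .regularExpression)
print(stripped) // x1
```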

Swift: Provide default value inside string literal

I want to provide a default value for an optional String that is used inside a string literal as an interpolated parameter (if I've used the right keywords!).
How can I set a default value for it using the "??" operator?
I think I should put escape characters before the double quotes, but I don't know the right syntax.
Here is my syntax, which leads to an error:
print ("Error \(response.result.error ?? "default value") ")
//------------------------ what should be ^here^^^^^^^^
Just wrap it in parentheses:
print("Error \((response.result.error ?? "default value"))")
The cleaner way is to use @Alladinian's answer and put the string in a variable before printing it.
You have to literally substitute default value with your message.
There is no need to escape the double quotes, since you are inside \(<expression>), where ordinary Swift expression syntax applies.
If you need a cleaner approach then do it in two steps:
let msg = response.result.error ?? "whatever"
print("Error: \(msg)")
Finally, if you want to print only non-nil errors (avoiding logs for responses that did not produce any errors) you could do:
if let error = response.result.error { print("Error: \(error)") }
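A self-contained sketch of these options follows. SampleError and describe are hypothetical stand-ins for the response object in the question. The sketch also assumes the error may be an Error? rather than a String?, in which case it has to be converted to a String before ?? can supply a String default.

```swift
// Hypothetical error type standing in for whatever the response carries.
enum SampleError: Error { case timeout }

// Optional String: ?? works directly inside the interpolation.
let message: String? = nil
let inline = "Error \(message ?? "default value")"
print(inline) // Error default value

// Optional Error: convert to String first, then fall back to the default.
func describe(_ error: Error?) -> String {
    let msg = error.map { String(describing: $0) } ?? "default value"
    return "Error: \(msg)"
}
print(describe(nil))                 // Error: default value
print(describe(SampleError.timeout)) // Error: timeout
```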

Unable to find matches using email regular expression with unicode characters

We have a regular expression that is used on our backend for email validation:
/^((([a-z]|\d|[!#\$%&'*+-/=\?\^{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+(\.([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+)*)|((\x22)((((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(([\x01-\x08\x0b\x0c\x0e-\x1f\x7f]|\x21|[\x23-\x5b]|[\x5d-\x7e]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(\([\x01-\x09\x0b\x0c\x0d-\x7f]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]))))*(((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(\x22)))#((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|.||~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]))).)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))$/i
I tested it using online tool, works fine:
https://regexr.com/3u9tn
However, I am having difficulty using it in our Swift project. I had to escape the special characters and wrap all Unicode escapes in {} to use it as a Swift string literal.
Here is playground:
import Foundation
let pattern = "/^((([a-z]|\\d|[!#\\$%&'\\*\\+\\-\\/=\\?\\^_`{\\|}~]|[\u{00A0}-\u{D7FF}\u{F900}-\u{FDCF}\u{FDF0}-\u{FFEF}])+(\\.([a-z]|\\d|[!#\\$%&'\\*\\+\\-\\/=\\?\\^_`{\\|}~]|[\u{00A0}-\u{D7FF}\u{F900}-\u{FDCF}\u{FDF0}-\u{FFEF}])+)*)|((\\x22)((((\\x20|\\x09)*(\\x0d\\x0a))?(\\x20|\\x09)+)?(([\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x7f]|\\x21|[\\x23-\\x5b]|[\\x5d-\\x7e]|[\u{00A0}-\u{D7FF}\u{F900}-\u{FDCF}\u{FDF0}-\u{FFEF}])|(\\\\([\\x01-\\x09\\x0b\\x0c\\x0d-\\x7f]|[\u{00A0}-\u{D7FF}\u{F900}-\u{FDCF}\u{FDF0}-\u{FFEF}]))))*(((\\x20|\\x09)*(\\x0d\\x0a))?(\\x20|\\x09)+)?(\\x22)))#((([a-z]|\\d|[\u{00A0}-\u{D7FF}\u{F900}-\u{FDCF}\u{FDF0}-\u{FFEF}])|(([a-z]|\\d|[\u{00A0}-\u{D7FF}\u{F900}-\u{FDCF}\u{FDF0}-\u{FFEF}])([a-z]|\\d|-|\\.|_|~|[\u{00A0}-\u{D7FF}\u{F900}-\u{FDCF}\u{FDF0}-\u{FFEF}])*([a-z]|\\d|[\u{00A0}-\u{D7FF}\u{F900}-\u{FDCF}\u{FDF0}-\u{FFEF}])))\\.)+(([a-z]|[\u{00A0}-\u{D7FF}\u{F900}-\u{FDCF}\u{FDF0}-\u{FFEF}])|(([a-z]|[\u{00A0}-\u{D7FF}\u{F900}-\u{FDCF}\u{FDF0}-\u{FFEF}])([a-z]|\\d|-|\\.|_|~|[\u{00A0}-\u{D7FF}\u{F900}-\u{FDCF}\u{FDF0}-\u{FFEF}])*([a-z]|[\u{00A0}-\u{D7FF}\u{F900}-\u{FDCF}\u{FDF0}-\u{FFEF}])))$/i"
extension String {
func matches(_ pattern: String) -> Bool {
do {
let internalExpression = try NSRegularExpression(pattern: pattern, options: .allowCommentsAndWhitespace)
let matches = internalExpression.matches(in: self, options: NSRegularExpression.MatchingOptions.reportCompletion, range:NSMakeRange(0, self.count))
return matches.count > 0
} catch let error {
print(error)
return false
}
}
}
let matches = "test#gmail.com".matches(pattern)
print(matches)
I tried different match options but I still get an error, and I'm a bit confused about how to make this work:
The value
“/^((([a-z]|\d|[!#\$%&'*+-/=\?\^{\|}~]|[ -퟿豈-﷏ﷰ-￯])+(\.([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_{\|}~]|[ -퟿豈-﷏ﷰ-￯])+)*)|((\x22)((((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(([\x01-\x08\x0b\x0c\x0e-\x1f\x7f]|\x21|[\x23-\x5b]|[\x5d-\x7e]|[ -퟿豈-﷏ﷰ-￯])|(\([\x01-\x09\x0b\x0c\x0d-\x7f]|[ -퟿豈-﷏ﷰ-￯]))))*(((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(\x22)))#((([a-z]|\d|[ -퟿豈-﷏ﷰ-￯])|(([a-z]|\d|[ -퟿豈-﷏ﷰ-￯])([a-z]|\d|-|.||~|[ -퟿豈-﷏ﷰ-￯])([a-z]|\d|[ -퟿豈-﷏ﷰ-￯]))).)+(([a-z]|[ -퟿豈-﷏ﷰ-￯])|(([a-z]|[ -퟿豈-﷏ﷰ-￯])([a-z]|\d|-|.|_|~|[ -퟿豈-﷏ﷰ-￯])([a-z]|[ -퟿豈-﷏ﷰ-￯])))$/i”
is invalid.
I have already checked similar questions. My question is not about which regular expression to use but about how to make this one work, since I'd like to keep it consistent with the backend.
Any help appreciated.
There are a lot of changes you need to make for the pattern to work with the ICU regex engine (the ICU library provides the regex capabilities in Swift/Objective-C).
Rewriting \uXXXX as \u{XXXX} is wrong: ICU regex supports the \uXXXX notation itself, so you only need to double the backslash in the Swift string literal.
Do not escape chars that do not need escaping inside a character class (only escape \, [ and ]; the others can be placed inside unescaped, e.g. - should be put at the start or end of the character class).
Merge the separate character classes, since they all match a single char (i.e. ([a-c]|[e-g]|\d) = [a-ce-g\d]).
(\x22): there is no need to wrap a single atom in a capturing group, so remove the parentheses in such cases (a grouping is needed when you want to match several sequences or alternatives).
(\x20|\x09)*: single-char alternatives should be merged into a character class for better efficiency, [\x20\x09]*.
You may use (after double escaping)
^([-a-z\d!#\$%&'*+/=?^_`{|}~\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]+(\.[-a-z\d!#$%&'*+/=?^_`{|}~\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]+)*|(\x22((([\x20\x09]*\x0d\x0a)?[\x20\x09]+)?([\x01-\x08\x0b\x0c\x0e-\x1f\x7f\x21\x23-\x5b\x5d-\x7e\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]|\\[\x01-\x09\x0b\x0c\x0d-\x7f\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]))*(([\x20\x09]*\x0d\x0a)?[\x20\x09]+)?\x22))@(([a-z\d\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]|([a-z\d\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF][-a-z\d._~\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]*[a-z\d\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]))\.)+([a-z\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]|([a-z\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF][-a-z\d._~\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]*[a-z\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]))$
Like this:
let str = "дима@gmail.com"
let pattern = "(?i)^([-a-z\\d!#\\$%&'*+/=?^_`{|}~\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF]+(\\.[-a-z\\d!#$%&'*+/=?^_`{|}~\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF]+)*|(\\x22((([\\x20\\x09]*\\x0d\\x0a)?[\\x20\\x09]+)?([\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x7f\\x21\\x23-\\x5b\\x5d-\\x7e\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0d-\\x7f\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF]))*(([\\x20\\x09]*\\x0d\\x0a)?[\\x20\\x09]+)?\\x22))@(([a-z\\d\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF]|([a-z\\d\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF][-a-z\\d._~\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF]*[a-z\\d\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF]))\\.)+([a-z\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF]|([a-z\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF][-a-z\\d._~\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF]*[a-z\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF]))$"
print(str.range(of: pattern, options: .regularExpression) != nil) // => true
Note that (?i) turns on case insensitivity for the regex.
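The \uXXXX notation and the case-insensitivity switch can be tried on a much smaller pattern. In this sketch the pattern, the Cyrillic range U+0400 to U+04FF, and the helper name matches are all illustrative assumptions. It also shows that the NSRegularExpression option .caseInsensitive is an alternative to the inline (?i) flag.

```swift
import Foundation

// Raw string literal: \u0400 is passed to ICU as written, no \u{...} and no doubled backslashes.
let pattern = #"^[a-z\u0400-\u04FF]+$"#

// Hypothetical helper: true if the whole text matches the pattern.
func matches(_ text: String, _ pattern: String, caseInsensitive: Bool = false) -> Bool {
    let options: NSRegularExpression.Options = caseInsensitive ? [.caseInsensitive] : []
    guard let regex = try? NSRegularExpression(pattern: pattern, options: options) else { return false }
    let range = NSRange(text.startIndex..., in: text)
    return regex.firstMatch(in: text, options: [], range: range) != nil
}

print(matches("дима", pattern))                       // true
print(matches("ABC", pattern))                        // false
print(matches("ABC", pattern, caseInsensitive: true)) // true
print(matches("ABC", "(?i)" + pattern))               // true
```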

Using regex in Swift

I'm trying to replace the occurrences of all spaces and special characters within a string in Swift.
I don't know what the RegEx would be in Swift.
I am trying to use the built-in method .replacingOccurrences(of:) and passing in my RegEx as the string argument. The code compiles, but no changes are made.
There are many resources online on how to replace occurrences in Swift; however, the solutions are far more complicated than seems necessary.
How can I properly return this output (Current Strategy):
let word = "Hello World!!"
func regEx(word: String) -> String {
return word.replacingOccurrences(of: "/[^\\w]/g", with: "")
}
// return --> "HelloWorld"
You may use
return word.replacingOccurrences(of: "\\W+", with: "", options: .regularExpression)
Note the options: .regularExpression argument that actually enables regex-based search in the .replacingOccurrences method.
Your pattern is [^\w]. It is a negated character class that matches any char but a word char. So, it is equal to \W.
The /.../ are regex delimiters. In a Swift pattern string they are parsed as literal forward slashes, which is why your pattern did not match.
The g is a "global" modifier that lets a regex engine match multiple occurrences, but it only works where it is supported (e.g. in JavaScript). Since delimiters and trailing flags are not part of the pattern in Swift, the replace-all behavior comes from the .replacingOccurrences method itself, whose documentation says:
Returns a new string in which all occurrences of a target string in the receiver are replaced by another given string.
If you need to check ICU regex syntax, consider referring to ICU User Guide > Regular Expressions; ICU is the regex library used in Swift/Objective-C.
Additionally, you could extend String and accomplish the same thing. But if this is homework and a function was requested, stick with the function.
var word = "Hello World!!!"
extension String {
func regEx() -> String {
return self.replacingOccurrences(of: "\\W+", with: "", options: .regularExpression, range: nil)
}
}
word.regEx()
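To see why the options argument matters, this small sketch runs the same call with and without .regularExpression. The variable names are illustrative.

```swift
import Foundation

let sample = "Hello World!!"

// Without the option, "\\W+" is searched for as literal text, which never occurs here.
let literalResult = sample.replacingOccurrences(of: "\\W+", with: "")
print(literalResult) // Hello World!!

// With the option, \W+ is treated as a regex and every run of non-word chars is removed.
let regexResult = sample.replacingOccurrences(of: #"\W+"#, with: "", options: .regularExpression)
print(regexResult) // HelloWorld
```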