trimmingCharacters not working as expected when characters include hyphen Swift - swift

I am trying to understand what is going wrong in a playground with the following example:
let result = "+-----+".trimmingCharacters(in: CharacterSet(charactersIn: "+").inverted)
result is "+-----+"
expected result is "++"
because the method reference says: "Returns a new string made by removing from both ends of the String characters contained in a given character set."
Examples that work how I expect:
let result = "D123ABC".trimmingCharacters(in: CharacterSet(charactersIn: "01234567890.").inverted)
result is "123"
let result = "+-----+".trimmingCharacters(in: CharacterSet(charactersIn: "*").inverted)
result is ""

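trimmingCharacters only removes leading and trailing characters. It stops at the first character on each end that is not in the set, and "+-----+" already begins and ends with "+", so nothing is removed.
If you want to remove all characters that are not "+", you can use: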
"+-----+".replacingOccurrences(of: "[^+]", with: "", options: .regularExpression)

Agree with rmaddy.
For more explanation check this:
let result = "123+--+abc".trimmingCharacters(in: CharacterSet(charactersIn: "+").inverted)
Result: +--+
let result = "+--+".trimmingCharacters(in: CharacterSet(charactersIn: "+").inverted)
Result: +--+
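The behaviour described in these answers can be checked with a short sketch. The input "-+---+-" is made up for illustration and is not from the question; the filter call is one possible non-regex way to drop every character that is not "+".

```swift
import Foundation

let notPlus = CharacterSet(charactersIn: "+").inverted

// Trimming stops at the first "+" on each end, so the inner hyphens survive.
let trimmed = "-+---+-".trimmingCharacters(in: notPlus)

// Filtering inspects every character, not only the two ends.
let onlyPlus = "+-----+".filter { $0 == "+" }

print(trimmed)   // +---+
print(onlyPlus)  // ++
```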

Related

Split String or Substring with Regex pattern in Swift

First, let me point out that I want to split a String or Substring on any character that is not a letter, a number, # or #. That is, I want to split on whitespace (spaces and line breaks) and on special characters or symbols, excluding # and #.
In Android Java, I am able to achieve this with:
String[] textArr = text.split("[^\\w_##]");
Now, I want to do the same in Swift. I added an extension to String and Substring classes
extension String {}
extension Substring {}
In both extensions, I added a method that returns an array of Substring
func splitWithRegex(by regexStr: String) -> [Substring] {
    //let string = self (for String extension) | String(self) (for Substring extension)
    let regex = try! NSRegularExpression(pattern: regexStr)
    let range = NSRange(string.startIndex..., in: string)
    return regex.matches(in: string, options: .anchored, range: range)
        .map { match -> Substring in
            let range = Range(match.range(at: 1), in: string)!
            return string[range]
        }
}
And when I tried to use it, (Only tested with a Substring, but I also think String will give me the same result)
let textArray = substring.splitWithRegex(by: "[^\\w_##]")
print("substring: \(substring)")
print("textArray: \(textArray)")
This is the output:
substring: This,is a #random #text written for debugging
textArray: []
Can someone please help me? I don't know whether the problem comes from my regex [^\\w_##] or from the splitWithRegex method.
The main reason why the code doesn't work is range(at: 1) which returns the content of the first captured group, but the pattern does not capture anything.
With just range the regex returns the ranges of the found matches, but I suppose you want the characters between.
To accomplish that you need a dynamic index starting at the first character. In the map closure return the string from the current index to the lowerBound of the found range and set the index to its upperBound. Finally you have to add manually the string from the upperBound of the last match to the end.
The Substring type is a helper type for slicing strings. It should not be used beyond a temporary scope.
extension String {
    func splitWithRegex(by regexStr: String) -> [String] {
        guard let regex = try? NSRegularExpression(pattern: regexStr) else { return [] }
        let range = NSRange(startIndex..., in: self)
        var index = startIndex
        var array = regex.matches(in: self, range: range)
            .map { match -> String in
                let range = Range(match.range, in: self)!
                let result = self[index..<range.lowerBound]
                index = range.upperBound
                return String(result)
            }
        array.append(String(self[index...]))
        return array
    }
}
let text = "This,is a #random #text written for debugging"
let textArray = text.splitWithRegex(by: "[^\\w_##]")
print(textArray) // ["This", "is", "a", "#random", "#text", "written", "for", "debugging"]
However, in macOS 13 and iOS 16 there is a new API that is quite similar to the Java API:
let text = "This,is a #random #text written for debugging"
let textArray = Array(text.split(separator: /[^\w_##]/))
print(textArray)
The forward slashes indicate a regex literal.
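If neither NSRegularExpression nor regex literals are an option, a rough equivalent can be sketched with a CharacterSet. This rests on two assumptions: CharacterSet.alphanumerics is only an approximation of \w, and empty pieces are filtered out, which the regex-based versions do not do.

```swift
import Foundation

// Characters allowed inside a token: letters, digits, "_" and "#".
var tokenCharacters = CharacterSet.alphanumerics
tokenCharacters.insert(charactersIn: "_#")

let text = "This,is a #random #text written for debugging"
let pieces = text
    .components(separatedBy: tokenCharacters.inverted)
    .filter { !$0.isEmpty }   // drop empty strings between adjacent separators

print(pieces)
```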

Regular expressions in swift

I'm a bit confused by NSRegularExpression in Swift. Can anyone help me?
task:1 given ("name","john","name of john")
then I should get ["name","john","name of john"]. Here I should avoid the brackets.
task:2 given ("name"," john","name of john")
then I should get ["name","john","name of john"]. Here I should avoid the brackets and extra spaces and finally get array of strings.
task:3 given key = value // comment
then I should get ["key","value","comment"]. Here I should get only strings in the line by avoiding = and //
I tried the code below for task 1, but it does not pass.
let string = "(name,john,string for user name)"
let pattern = "(?:\\w.*)"
do {
    let regex = try NSRegularExpression(pattern: pattern, options: .caseInsensitive)
    let matches = regex.matches(in: string, options: [], range: NSRange(location: 0, length: string.utf16.count))
    for match in matches {
        if let range = Range(match.range, in: string) {
            let name = string[range]
            print(name)
        }
    }
} catch {
    print("Regex was bad!")
}
Thanks in advance.
RegEx in Swift
These posts might help you to explore regular expressions in swift:
Does a string match a pattern?
Swift extract regex matches
How can I use String slicing subscripts in Swift 4?
How to use regex with Swift?
Swift 3 - How do I extract captured groups in regular expressions?
How to group search regular expressions using swift?
Task 1 & 2
This expression might help you to match your desired outputs for both Task 1 and 2:
"(\s+)?([a-z\s]+?)(\s+)?"
Based on Rob's advice, you could much reduce the boundaries, such as the char list [a-z\s]. For example, here, we can also use:
"(\s+)?(.*?)(\s+)?"
or
"(\s+)?(.+?)(\s+)?"
to simply pass everything in between two " and/or space.
RegEx
If this wasn't your desired expression, you can modify/change your expressions in regex101.com.
RegEx Circuit
You can also visualize your expressions in jex.im:
JavaScript Demo
const regex = /"(\s+)?([a-z\s]+?)(\s+)?"/gm;
const str = `"name","john","name of john"
"name"," john","name of john"
" name "," john","name of john "
" name "," john"," name of john "`;
const subst = `\n$2`;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log('Substitution result: ', result);
Task 3
This expression might help you to design an expression for the third task:
(.*?)([a-z\s]+)(.*?)
const regex = /(.*?)([a-z\s]+)(.*?)/gm;
const str = `key = value // comment
key = value with some text // comment`;
const subst = `$2,`;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log('Substitution result: ', result);
Split the string on non-alphanumeric characters other than whitespace, then trim the whitespace from each element.
extension String {
    func words() -> [String] {
        return self.components(separatedBy: CharacterSet.alphanumerics.inverted.subtracting(.whitespaces))
            .filter({ !$0.isEmpty })
            .map({ $0.trimmingCharacters(in: .whitespaces) })
    }
}
let string1 = "(name,john,string for user name)"
let string2 = "(name, john,name of john)"
let string3 = "key = value // comment"
print(string1.words())//["name", "john", "string for user name"]
print(string2.words())//["name", "john", "name of john"]
print(string3.words())//["key", "value", "comment"]
Here is what I ended up with after working through all of the comments above.
let text = """
Capturing and non-capturing groups are somewhat advanced topics. You’ll encounter examples of capturing and non-capturing groups later on in the tutorial
"""
extension String {
    func rex(_ expr: String) -> [String] {
        return try! NSRegularExpression(pattern: expr, options: [.caseInsensitive])
            // NSRange is measured in UTF-16 code units, so derive it from the string's own indices.
            .matches(in: self, options: [], range: NSRange(startIndex..., in: self))
            .map {
                String(self[Range($0.range, in: self)!])
            }
    }
}
let r = text.rex("(?:\\w+-\\w+)") // pass any regex pattern
A single pattern, works for test:1...3, in Swift.
let string =
//"(name,john,string for user name)" //test:1
//#"("name"," john","name of john")"# //test:2
"key = value // comment" //test:3
let pattern = #"(?:\w+)(?:\s+\w+)*"# //Swift 5+ only
//let pattern = "(?:\\w+)(?:\\s+\\w+)*"
do {
    let regex = try NSRegularExpression(pattern: pattern)
    let matches = regex.matches(in: string, range: NSRange(0..<string.utf16.count))
    let matchingWords = matches.map {
        String(string[Range($0.range, in: string)!])
    }
    print(matchingWords) //(test:3)->["key", "value", "comment"]
} catch {
    print("Regex was bad!")
}
Let’s consider:
let string = "(name,José,name is José)"
I’d suggest a regex that looks for strings where:
It’s the substring either after the ( at the start of the full string or after a comma, i.e., look behind assertion of (?<=^\(|,);
It’s the substring that does not contain , within it, i.e., [^,]+?;
It’s the substring that is terminated by either a comma or ) at the end of the full string, i.e., look ahead assertion of (?=,|\)$), and
If you want to have it skip white space before and after the substrings, throw in the \s*+, too.
Thus:
let pattern = #"(?<=^\(|,)\s*+([^,]+?)\s*+(?=,|\)$)"#
let regex = try! NSRegularExpression(pattern: pattern)
regex.enumerateMatches(in: string, range: NSRange(string.startIndex..., in: string)) { match, _, _ in
    if let nsRange = match?.range(at: 1), let range = Range(nsRange, in: string) {
        let substring = String(string[range])
        // do something with `substring` here
    }
}
Note, I’m using the Swift 5 extended string delimiters (starting with #" and ending with "#) so that I don’t have to escape my backslashes within the string. If you’re using Swift 4 or earlier, you’ll want to escape those backslashes:
let pattern = "(?<=^\\(|,)\\s*+([^,]+?)\\s*+(?=,|\\)$)"
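As a quick sanity check, the two spellings of the pattern produce the same string. This is only a small sketch; the constant names are chosen here for illustration.

```swift
// Extended delimiters: backslashes are taken literally.
let rawPattern = #"(?<=^\(|,)\s*+([^,]+?)\s*+(?=,|\)$)"#

// Ordinary literal: every backslash meant for the regex engine is doubled.
let escapedPattern = "(?<=^\\(|,)\\s*+([^,]+?)\\s*+(?=,|\\)$)"

print(rawPattern == escapedPattern)  // true
```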

How can I extract an unknown substring from a string in Swift 4?

Using Swift 4, I need to parse a string to get substrings that will always be different. For example:
let str = "[33376:7824459] Device Sarah's Phone (Hardware: D21AP, ECID: 8481036056622, UDID: 76e6bc436fdcfd6c4e39c11ed2fe9236bb4ec, Serial: F2LVP5JCLY)"
let strRange = str.range(of: "(?<=Device )(?= (Hardware)", options: .regularExpression)
print(strRange!)
I would think this would output "Sarah's Phone"
I'm not getting any errors on this code, but it's also not working. What am I doing wrong?
Several problems:
You have a lookahead and lookbehind here, but nothing that would actually match any characters, so it'll never match anything except an empty string.
You didn't properly escape the parenthesis in your lookahead.
You should use if let or guard let, rather than !, to unwrap the optional. Otherwise, you'll get a crash when you encounter an input string that doesn't match the pattern.
I'm not sure why you'd expect print(strRange) to output text. strRange is a range, not a string or a substring.
This sample will fix your problems:
import Foundation
let str = "[33376:7824459] Device Sarah's Phone (Hardware: D21AP, ECID: 8481036056622, UDID: 76e6bc436fdcfd6c4e39c11ed2fe9236bb4ec, Serial: F2LVP5JCLY)"
if let strRange = str.range(of: "(?<=Device ).*(?= \\(Hardware)", options: .regularExpression) {
    print(str[strRange])
} else {
    print("Not Found")
}
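An alternative to the lookbehind/lookahead pair is a capture group read through NSRegularExpression. The following is a minimal sketch, not the answer's method: the helper name deviceName(in:) is hypothetical, and the log line is a shortened stand-in for the one in the question.

```swift
import Foundation

// Shortened, hypothetical log line in the same shape as the one in the question.
let line = "[1:2] Device Sarah's Phone (Hardware: D21AP, Serial: F2LVP5JCLY)"

func deviceName(in line: String) -> String? {
    // Group 1 captures whatever sits between "Device " and " (Hardware".
    guard let regex = try? NSRegularExpression(pattern: #"Device (.*) \(Hardware"#) else { return nil }
    let fullRange = NSRange(line.startIndex..., in: line)
    guard let match = regex.firstMatch(in: line, range: fullRange),
          let nameRange = Range(match.range(at: 1), in: line) else { return nil }
    return String(line[nameRange])
}

print(deviceName(in: line) ?? "Not Found")  // Sarah's Phone
```

Returning an optional keeps the "not found" case explicit, in line with the advice above to avoid force unwrapping.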

String Indexes in swift

let greeting = "Hello World"
greeting[greeting.startIndex]
This code will print first character of string "H"
let greeting = "Hello World"
greeting.startIndex
greeting.endIndex
What's the difference? These lines don't seem to do anything. I don't even get an error for
greeting.endIndex
Of course, I am not retrieving a string value through subscript syntax here, but in the Substring topic I found
greeting.endIndex
and it only prints
string.index
OK, let me explain why I am asking about string.index and greeting.endIndex:
let greeting = "Hello_world!"
let index = greeting.index(of: "_") ?? greeting.endIndex
let beginning = greeting[..<index]
// beginning is "Hello"
This code is related to substrings. I understand what it is doing, and the second line uses the nil-coalescing operator, whose purpose is clear. But what if:
let greeting = "Hello world!"
let index = greeting.index(of: "_") ?? greeting.endIndex
let beginning = greeting[..<index]
// beginning is "Hello world!"
If there is no "_" in the string, greeting.index(of: "_") is nil, so the nil-coalescing operator should return the default value greeting.endIndex, right? So why does using greeting.endIndex return "Hello world!"?
Your code does nothing. This code, however,
let greeting = "Hello"
print(greeting[greeting.startIndex]) // Prints 'H'
print(greeting[greeting.endIndex])
causes a fatal error, because greeting.endIndex actually "points" to just after the end of the String, so the last statement is an out-of-bounds access.
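To reach the last character safely, one option is to step back once from endIndex. A minimal sketch, assuming a nonempty string:

```swift
let greeting = "Hello"

// endIndex is one past the last character, so step back once to reach it.
let lastIndex = greeting.index(before: greeting.endIndex)
print(greeting[lastIndex])  // o

// endIndex is still valid as the upper bound of a half-open range.
let whole = greeting[greeting.startIndex..<greeting.endIndex]
print(whole)  // Hello
```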
As you describe, startIndex returns the position of the first character in a nonempty string, and endIndex returns the position one greater than the last valid subscript argument.
Also, as of Swift 4, String is a Collection.
So when you do:
greeting[greeting.startIndex]
you ask the greeting string to return the element in the first position, which is "H".
The startIndex is the start of the string. The endIndex is the end of the string, i.e. the position just past the last character. You can use these when building ranges or when creating new indexes as an offsetBy from one or both of these.
So, for example, if you want a Substring that excluded the first and last characters, you could set fromIndex to be startIndex plus 1 and toIndex to be endIndex less 1, yielding:
let string = "Hello World"
let fromIndex = string.index(string.startIndex, offsetBy: 1)
let toIndex = string.index(string.endIndex, offsetBy: -1)
let substring = string[fromIndex..<toIndex] // "ello Worl"
You then ask:
If there is no "_" in the string, greeting.index(of: "_") is nil, so the nil-coalescing operator should return the default value greeting.endIndex, right? So why does using greeting.endIndex return "Hello world!"?
Make sure you don't conflate the string[index] syntax (which returns a Character) and the string[..<index] which returns a Substring from the startIndex up to index (i.e. "return Substring of everything up to index"):
let beginning = greeting[..<index]
This partially bound range is just short-hand for:
let beginning = greeting[greeting.startIndex..<index]
So, when you default index to the greeting.endIndex, that's like saying you want a substring that is, effectively, the whole string:
let beginning = greeting[greeting.startIndex..<greeting.endIndex]
So, that's why the syntax presented in your question, greeting[..<index], returns the whole string if _ was not found and it used endIndex.
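Both cases can be put side by side in a small sketch. The helper name beginning(of:upTo:) is hypothetical, and it uses firstIndex(of:), the current name for index(of:).

```swift
// Returns everything before `separator`, or the whole string if it is absent.
func beginning(of text: String, upTo separator: Character) -> Substring {
    let index = text.firstIndex(of: separator) ?? text.endIndex
    return text[..<index]   // same as text[text.startIndex..<index]
}

print(beginning(of: "Hello_world!", upTo: "_"))  // Hello
print(beginning(of: "Hello world!", upTo: "_"))  // Hello world!
```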
As an aside, if you wanted different behavior, namely for it to return an empty substring if the _ is not found, you could do the following:
let index = greeting.index(of: "_") ?? greeting.startIndex
let beginning = greeting[..<index]
Or, if you think that's a little too cute, just adopt standard safe unwrapping patterns, e.g.:
guard let index = greeting.index(of: "_") else {
    // handle unfound _ here, then leave the current scope
    return
}
let beginning = greeting[..<index]

How can we remove every characters other than numbers, dot and colon in swift?

I am stuck at getting a string from html body
<html><head>
<title>Uaeexchange Mobile Application</title></head><body>
<div id='ourMessage'>
49.40:51.41:50.41
</div></body></html>
I would like to get the string containing 49.40:51.41:50.41. I don't want to do it by string advance or index. Can I get this string by specifying that I need only numbers, dot (.) and colon (:) in Swift? I mean some numbers and some special characters.
I tried
let stringArray = response.componentsSeparatedByCharactersInSet(
NSCharacterSet.decimalDigitCharacterSet().invertedSet)
let newString = stringArray.joinWithSeparator("")
print("Trimmed\(newString)and count\(newString.characters.count)")
but this obviously trims away the dot and colon too. Any suggestions?
The simple answer to your question is that you need to include "." & ":" in the set that you want to keep.
let response: String = "<html><head><title>Uaeexchange Mobile Application</title></head><body><div id='ourMessage'>49.40:51.41:50.41</div></body></html>"
var s: CharacterSet = CharacterSet.decimalDigits
s.insert(charactersIn: ".:")
let stringArray: [String] = response.components(separatedBy: s.inverted)
let newString: String = stringArray.joined(separator: "")
print("Trimmed '\(newString)' and count=\(newString.count)")
// "Trimmed '49.40:51.41:50.41' and count=17\n"
Without more information on what else your response might be, I can't really give a better answer, but fundamentally this is not a good solution. What if the response had been
<html><head><title>Uaeexchange Mobile Application</title></head><body>
<div id='2'>Some other stuff: like this</div>
<div id='ourMessage'>49.40:51.41:50.41</div>
</body></html>
Using a replace/remove solution to this is a hack, not an algorithm - it will work until it doesn't.
I think you should probably be looking for the <div id='ourMessage'> and reading from there to the next <, but again, we'd need more information on the specification of the format of the response.
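That idea could be sketched as follows. This is only an illustration under the assumption that the marker is always exactly <div id='ourMessage'> and that the message contains no nested tags; the helper name message(in:after:) is hypothetical.

```swift
import Foundation

// Hypothetical helper: returns the text between `marker` and the next "<".
func message(in html: String, after marker: String = "<div id='ourMessage'>") -> String? {
    guard let open = html.range(of: marker) else { return nil }
    let rest = html[open.upperBound...]
    guard let close = rest.firstIndex(of: "<") else { return nil }
    return rest[..<close].trimmingCharacters(in: .whitespacesAndNewlines)
}

let html = """
<html><head><title>Uaeexchange Mobile Application</title></head><body>
<div id='2'>Some other stuff: like this</div>
<div id='ourMessage'>
49.40:51.41:50.41
</div></body></html>
"""
print(message(in: html) ?? "not found")  // 49.40:51.41:50.41
```

Unlike the character-stripping approach, the extra div with id='2' does not leak digits or colons into the result.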
I'd recommend using an HTML parser; nevertheless, this is a simple solution with a regular expression:
let extractedString = response.replacingOccurrences(of: "[^\\d:.]+", with: "", options: .regularExpression)
Or the positive regex search which is more code but also more reliable:
let pattern = ">\\s?([\\d:.]+)\\s?<"
let regex = try! NSRegularExpression(pattern: pattern)
// NSRange counts UTF-16 code units, so build it from the string's own indices.
let fullRange = NSRange(response.startIndex..., in: response)
if let match = regex.firstMatch(in: response, range: fullRange),
    let range = Range(match.range(at: 1), in: response) {
    // Capture group 1 holds only the digits, dots and colons between the tags.
    let extractedString = String(response[range])
    print(extractedString)
}
While the simple (negative) regex search removes all characters that are not digits, dots or colons, the positive search also considers the closing (>) and opening (<) tag delimiters around the desired result, so an accidental digit, dot or colon elsewhere doesn't match the pattern.
You can also use the String.replacingOccurrences() method in other ways, without regex, as follows:
import Foundation
var response: String = "<html><head><title>Uaeexchange Mobile Application</title></head><body><div id='ourMessage'>49.40:51.41:50.41</div></body></html>"
let charsNotToBeTrimmed = (0...9).map{String($0)} + ["." ,":"] // you can add any character you want here, that's the advantage
for i in response {
    if !charsNotToBeTrimmed.contains(String(i)) {
        response = response.replacingOccurrences(of: String(i), with: "")
    }
}
print(response)
Basically, this creates an array of characters that should not be removed, and any character that is not in that array gets removed in the for-loop.
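The same allow-list idea can be written more compactly with a Set of characters and filter. This is a sketch with a shortened, made-up response string rather than the full HTML from the question.

```swift
// Shortened stand-in for the HTML response in the question.
let response = "<div id='ourMessage'>49.40:51.41:50.41</div>"

// Characters to keep: digits, dot and colon.
let allowed = Set("0123456789.:")

// A single pass over the string, keeping only allowed characters.
let cleaned = response.filter { allowed.contains($0) }
print(cleaned)  // 49.40:51.41:50.41
```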
But you have to be warned that what you're trying to do isn't quite right...