To be clear, I want to split a String or Substring on any character that is not a letter, a digit, or #. In other words, I want to split on whitespace (spaces and line breaks) and on special characters or symbols other than #.
In Android Java, I am able to achieve this with:
String[] textArr = text.split("[^\\w_##]");
Now, I want to do the same in Swift. I added an extension to the String and Substring types:
extension String {}
extension Substring {}
In both extensions, I added a method that returns an array of Substring
func splitWithRegex(by regexStr: String) -> [Substring] {
//let string = self (for String extension) | String(self) (for Substring extension)
let regex = try! NSRegularExpression(pattern: regexStr)
let range = NSRange(string.startIndex..., in: string)
return regex.matches(in: string, options: .anchored, range: range)
.map { match -> Substring in
let range = Range(match.range(at: 1), in: string)!
return string[range]
}
}
And when I tried to use it (only tested with a Substring, but I think String will give the same result):
let textArray = substring.splitWithRegex(by: "[^\\w_##]")
print("substring: \(substring)")
print("textArray: \(textArray)")
This is the output:
substring: This,is a #random #text written for debugging
textArray: []
Can someone please help me? I don't know whether the problem comes from my regex [^\\w_##] or from the splitWithRegex method.
The main reason why the code doesn't work is range(at: 1) which returns the content of the first captured group, but the pattern does not capture anything.
With just range the regex returns the ranges of the found matches, but I suppose you want the characters between.
To accomplish that you need a dynamic index starting at the first character. In the map closure return the string from the current index to the lowerBound of the found range and set the index to its upperBound. Finally you have to add manually the string from the upperBound of the last match to the end.
The Substring type is a helper type for slicing strings. It should not be used beyond a temporary scope.
extension String {
    func splitWithRegex(by regexStr: String) -> [String] {
        guard let regex = try? NSRegularExpression(pattern: regexStr) else { return [] }
        let range = NSRange(startIndex..., in: self)
        var index = startIndex
        var array = regex.matches(in: self, range: range)
            .map { match -> String in
                let range = Range(match.range, in: self)!
                let result = self[index..<range.lowerBound]
                index = range.upperBound
                return String(result)
            }
        array.append(String(self[index...]))
        return array
    }
}
let text = "This,is a #random #text written for debugging"
let textArray = text.splitWithRegex(by: "[^\\w_##]")
print(textArray) // ["This", "is", "a", "#random", "#text", "written", "for", "debugging"]
However, in macOS 13 and iOS 16 there is a new API that is quite similar to the Java API:
let text = "This,is a #random #text written for debugging"
let textArray = Array(text.split(separator: /[^\w_##]/))
print(textArray)
The forward slashes indicate a regex literal.
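To see what the question's original attempt was working with, it can help to inspect what `matches(in:options:range:)` returns. The following is only an illustrative sketch (the names `sample`, `noGroup`, and `withGroup` are made up here). It relies on the documented behavior of `.anchored`, which limits matches to the start of the search range, and it shows that a capture group 1 exists only when the pattern contains parentheses.

```swift
import Foundation

let sample = "This,is a #random #text written for debugging"
let fullRange = NSRange(sample.startIndex..., in: sample)

// Pattern from the question: no parentheses, so no capture group.
let noGroup = try! NSRegularExpression(pattern: "[^\\w_##]")

// `.anchored` only accepts a match at the very start of the search range.
// The first character "T" is not a separator, so nothing is found.
let anchoredCount = noGroup.matches(in: sample, options: .anchored, range: fullRange).count

// Without `.anchored`, every separator in the text is found.
let plainMatches = noGroup.matches(in: sample, range: fullRange)

// numberOfRanges counts the whole match plus one entry per capture group.
let rangesWithoutGroup = plainMatches[0].numberOfRanges

// The same pattern wrapped in parentheses: group 1 now exists.
let withGroup = try! NSRegularExpression(pattern: "([^\\w_##])")
let rangesWithGroup = withGroup.firstMatch(in: sample, range: fullRange)!.numberOfRanges

print(anchoredCount, plainMatches.count, rangesWithoutGroup, rangesWithGroup)
```

Under these assumptions, the `.anchored` option alone would explain an empty result for this input, independently of the `range(at: 1)` issue.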
I'm a bit confused by NSRegularExpression in Swift. Can anyone help me?
Task 1: given ("name","john","name of john")
I should get ["name","john","name of john"]. Here the brackets should be dropped.
Task 2: given ("name"," john","name of john")
I should get ["name","john","name of john"]. Here the brackets and the extra spaces should be dropped, leaving an array of strings.
Task 3: given key = value // comment
I should get ["key","value","comment"]. Here I want only the strings in the line, without the = and the //.
I tried the code below for task 1, but it does not pass.
let string = "(name,john,string for user name)"
let pattern = "(?:\\w.*)"
do {
let regex = try NSRegularExpression(pattern: pattern, options: .caseInsensitive)
let matches = regex.matches(in: string, options: [], range: NSRange(location: 0, length: string.utf16.count))
for match in matches {
if let range = Range(match.range, in: string) {
let name = string[range]
print(name)
}
}
} catch {
print("Regex was bad!")
}
Thanks in advance.
RegEx in Swift
These posts might help you to explore regular expressions in Swift:
Does a string match a pattern?
Swift extract regex matches
How can I use String slicing subscripts in Swift 4?
How to use regex with Swift?
Swift 3 - How do I extract captured groups in regular expressions?
How to group search regular expressions using swift?
Task 1 & 2
This expression might help you to match your desired outputs for both Task 1 and 2:
"(\s+)?([a-z\s]+?)(\s+)?"
Following Rob's advice, you could loosen the restrictions considerably, such as the character list [a-z\s]. For example, here we could also use:
"(\s+)?(.*?)(\s+)?"
or
"(\s+)?(.+?)(\s+)?"
to simply capture everything between the two quotes, apart from any surrounding spaces.
RegEx
If this wasn't your desired expression, you can modify/change your expressions in regex101.com.
RegEx Circuit
You can also visualize your expressions in jex.im:
JavaScript Demo
const regex = /"(\s+)?([a-z\s]+?)(\s+)?"/gm;
const str = `"name","john","name of john"
"name"," john","name of john"
" name "," john","name of john "
" name "," john"," name of john "`;
const subst = `\n$2`;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log('Substitution result: ', result);
Task 3
This expression might help you to design an expression for the third task:
(.*?)([a-z\s]+)(.*?)
const regex = /(.*?)([a-z\s]+)(.*?)/gm;
const str = `key = value // comment
key = value with some text // comment`;
const subst = `$2,`;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log('Substitution result: ', result);
Separate the string by non-alphanumeric characters other than white space. Then trim white space from the elements and drop the empty ones.
import Foundation
extension String {
    func words() -> [String] {
        return self.components(separatedBy: CharacterSet.alphanumerics.inverted.subtracting(.whitespaces))
            // Trim first, so that pieces made only of white space are dropped too.
            .map({ $0.trimmingCharacters(in: .whitespaces) })
            .filter({ !$0.isEmpty })
    }
}
let string1 = "(name,john,string for user name)"
let string2 = "(name, john,name of john)"
let string3 = "key = value // comment"
print(string1.words())//["name", "john", "string for user name"]
print(string2.words())//["name", "john", "name of john"]
print(string3.words())//["key", "value", "comment"]
Here is what I ended up with after working through all of the comments above.
let text = """
Capturing and non-capturing groups are somewhat advanced topics. You’ll encounter examples of capturing and non-capturing groups later on in the tutorial
"""
extension String {
    func rex(_ expr: String) -> [String] {
        return try! NSRegularExpression(pattern: expr, options: [.caseInsensitive])
            // NSRange counts UTF-16 code units, so build it from the string's indices rather than from `count`.
            .matches(in: self, options: [], range: NSRange(startIndex..., in: self))
            .map {
                String(self[Range($0.range, in: self)!])
            }
    }
}
let r = text.rex("(?:\\w+-\\w+)") // pass any rex
A single pattern, works for test:1...3, in Swift.
let string =
//"(name,john,string for user name)" //test:1
//#"("name"," john","name of john")"# //test:2
"key = value // comment" //test:3
let pattern = #"(?:\w+)(?:\s+\w+)*"# //Swift 5+ only
//let pattern = "(?:\\w+)(?:\\s+\\w+)*"
do {
let regex = try NSRegularExpression(pattern: pattern)
let matches = regex.matches(in: string, range: NSRange(0..<string.utf16.count))
let matchingWords = matches.map {
String(string[Range($0.range, in: string)!])
}
print(matchingWords) //(test:3)->["key", "value", "comment"]
} catch {
print("Regex was bad!")
}
Let’s consider:
let string = "(name,José,name is José)"
I’d suggest a regex that looks for strings where:
It’s the substring either after the ( at the start of the full string or after a comma, i.e., look behind assertion of (?<=^\(|,);
It’s the substring that does not contain , within it, i.e., [^,]+?;
It’s the substring that is terminated by either a comma or ) at the end of the full string, i.e., look ahead assertion of (?=,|\)$), and
If you want to have it skip white space before and after the substrings, throw in the \s*+, too.
Thus:
let pattern = #"(?<=^\(|,)\s*+([^,]+?)\s*+(?=,|\)$)"#
let regex = try! NSRegularExpression(pattern: pattern)
regex.enumerateMatches(in: string, range: NSRange(string.startIndex..., in: string)) { match, _, _ in
if let nsRange = match?.range(at: 1), let range = Range(nsRange, in: string) {
let substring = String(string[range])
// do something with `substring` here
}
}
Note, I’m using the Swift 5 extended string delimiters (starting with #" and ending with "#) so that I don’t have to escape my backslashes within the string. If you’re using Swift 4 or earlier, you’ll want to escape those back slashes:
let pattern = "(?<=^\\(|,)\\s*+([^,]+?)\\s*+(?=,|\\)$)"
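As a usage sketch, the pattern above could be wrapped in a small function that returns the captured fields as an array. The function name `fields(in:)` is hypothetical and not part of the answer; it uses `matches(in:range:)` instead of `enumerateMatches` only to keep the example short.

```swift
import Foundation

// Hypothetical helper (not from the answer): returns the trimmed fields of a
// parenthesised, comma-separated string using the pattern shown above.
func fields(in string: String) -> [String] {
    let pattern = #"(?<=^\(|,)\s*+([^,]+?)\s*+(?=,|\)$)"#
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let range = NSRange(string.startIndex..., in: string)
    return regex.matches(in: string, range: range).compactMap { match -> String? in
        // Capture group 1 holds the field without the surrounding white space.
        guard let fieldRange = Range(match.range(at: 1), in: string) else { return nil }
        return String(string[fieldRange])
    }
}

print(fields(in: "(name,José,name is José)"))
print(fields(in: "(name, john ,name of john)"))
```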
Sorry if the title is not clear.
What I mean is this:
If I have a variable, we'll call that a, with a value of "Hello\nWorld", it would be written as
var a = "Hello\nWorld"
And if I were to print it, I'd get
Hello
World
How could I print it as:
Hello\nWorld
I know this is a little old; however, I was looking for a solution to the same problem and figured out something easy.
If you want to print a string so that it shows its escape characters, like "\nThis Thing\nAlso this":
print(myString.debugDescription)
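One detail to keep in mind is that `debugDescription` also wraps the result in double quotes. A minimal sketch of stripping them again, with made-up names `greeting`, `quoted`, and `unquoted`, and assuming you do not want the quotes in the output:

```swift
let greeting = "Hello\nWorld"

// debugDescription escapes special characters and wraps the result in quotes.
let quoted = greeting.debugDescription

// Dropping the first and the last character removes the surrounding quotes.
let unquoted = String(quoted.dropFirst().dropLast())

print(quoted)    // "Hello\nWorld" (with the quotes)
print(unquoted)  // Hello\nWorld
```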
Here's a more complete version of @Pedro Castilho's answer.
import Foundation
extension String {
    // The backslash must be handled first; otherwise the backslashes
    // added by the later replacements would be doubled.
    static let escapeSequences = [
        (original: "\\", escaped: "\\\\"),
        (original: "\0", escaped: "\\0"),
        (original: "\t", escaped: "\\t"),
        (original: "\n", escaped: "\\n"),
        (original: "\r", escaped: "\\r"),
        (original: "\"", escaped: "\\\""),
        (original: "\'", escaped: "\\'"),
    ]
    mutating func literalize() {
        self = self.literalized()
    }
    func literalized() -> String {
        return String.escapeSequences.reduce(self) { string, seq in
            string.replacingOccurrences(of: seq.original, with: seq.escaped)
        }
    }
}
let a = "Hello\0\\\t\n\r\"\'World"
print("Original: \(a)\r\n\r\n\r\n")
print("Literalized: \(a.literalized())")
You can't, not without changing the string itself. The \n character sequence only exists in your code as a representation of a newline character; the compiler turns it into an actual newline.
In other words, the issue here is that the "raw" string is the string with the actual newline.
If you want it to appear as an actual \n, you'll need to escape the backslash. (Change it into \\n)
You could also use the following function to automate this:
func literalize(_ string: String) -> String {
    return string.replacingOccurrences(of: "\n", with: "\\n")
        .replacingOccurrences(of: "\t", with: "\\t")
}
And so on. You can add more replacingOccurrences calls for every escape sequence you want to literalize.
If "Hello\nWorld" is literally the string you're trying to print, then all you do is this:
var str = "Hello\\nWorld"
print(str)
I tested this in the Swift Playgrounds!
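If the text is a literal in your source code and you are on Swift 5 or later, extended string delimiters are another option worth considering. The sketch below uses the made-up names `raw` and `escapedByHand` and only shows that both spellings produce the same string:

```swift
// Extended delimiters (Swift 5+): backslashes are kept exactly as written.
let raw = #"Hello\nWorld"#

// The same string written with an escaped backslash.
let escapedByHand = "Hello\\nWorld"

print(raw)                   // Hello\nWorld
print(raw == escapedByHand)  // true
```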
Late to the party but the answer to this question is to map the String UnicodeScalarView Unicode.Scalar elements converting them to escaped ascii strings. Then you can simply join back the string:
extension Unicode.Scalar {
var asciiEscaped: String { escaped(asASCII: true) }
}
extension StringProtocol {
var asciiEscaped: String {
unicodeScalars.map(\.asciiEscaped).joined()
}
}
print("Hello\nWorld".asciiEscaped) // Hello\nWorld
Just use a double backslash (\\):
var a = "Hello\\nWorld"
Every example of trimming strings in Swift removes both leading and trailing whitespace, but how can only the trailing whitespace be removed?
For example, if I have a string:
" example "
How can I end up with:
" example"
Every solution I've found shows trimmingCharacters(in: CharacterSet.whitespaces), but I want to retain the leading whitespace.
RegEx is a possibility, or a range can be derived to determine index of characters to remove, but I can't seem to find an elegant solution for this.
With regular expressions:
let string = " example "
let trimmed = string.replacingOccurrences(of: "\\s+$", with: "", options: .regularExpression)
print(">" + trimmed + "<")
// > example<
\s+ matches one or more whitespace characters, and $ matches the end of the string.
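A small sketch of how this replacement behaves on a few inputs. The function name `trimmingTrailingWhitespace(_:)` and the sample strings are assumptions made for illustration:

```swift
import Foundation

// Hypothetical helper name; it wraps the replacement shown above.
func trimmingTrailingWhitespace(_ s: String) -> String {
    return s.replacingOccurrences(of: "\\s+$", with: "", options: .regularExpression)
}

// Leading whitespace is kept; trailing spaces, tabs and newlines are removed.
let samples = ["  example  ", "tabs\t\t", "line\n", "  only leading", "   "]
for s in samples {
    print(">" + trimmingTrailingWhitespace(s) + "<")
}
```

Because \s also matches tabs and line breaks, this removes more than plain spaces.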
In Swift 4 & Swift 5
This code will also remove trailing new lines.
It is based on the isWhitespace property of Character.
extension String {
    var trailingSpacesTrimmed: String {
        var newString = self
        while newString.last?.isWhitespace == true {
            newString = String(newString.dropLast())
        }
        return newString
    }
}
This short Swift 3 extension of String uses the .anchored and .backwards options of rangeOfCharacter and then calls itself recursively if it needs to loop. Because the compiler expects a CharacterSet as the parameter, you can just supply the static member when calling, e.g. "1234 ".trailingTrim(.whitespaces) will return "1234". (I've not done timings, but would expect it to be faster than regex.)
extension String {
func trailingTrim(_ characterSet : CharacterSet) -> String {
if let range = rangeOfCharacter(from: characterSet, options: [.anchored, .backwards]) {
return self.substring(to: range.lowerBound).trailingTrim(characterSet)
}
return self
}
}
In Foundation you can get ranges of indices matching a regular expression. You can also replace subranges. Combining this, we get:
import Foundation
extension String {
func trimTrailingWhitespace() -> String {
if let trailingWs = self.range(of: "\\s+$", options: .regularExpression) {
return self.replacingCharacters(in: trailingWs, with: "")
} else {
return self
}
}
}
You can also have a mutating version of this:
import Foundation
extension String {
mutating func trimTrailingWhitespace() {
if let trailingWs = self.range(of: "\\s+$", options: .regularExpression) {
self.replaceSubrange(trailingWs, with: "")
}
}
}
If we match against \s* (as Martin R. did at first) we can skip the if let guard and force-unwrap the optional since there will always be a match. I think this is nicer since it's obviously safe, and remains safe if you change the regexp. I did not think about performance.
Handy String extension In Swift 4
extension String {
func trimmingTrailingSpaces() -> String {
var t = self
while t.hasSuffix(" ") {
t = "" + t.dropLast()
}
return t
}
mutating func trimmedTrailingSpaces() {
self = self.trimmingTrailingSpaces()
}
}
Swift 4
extension String {
var trimmingTrailingSpaces: String {
if let range = rangeOfCharacter(from: .whitespacesAndNewlines, options: [.anchored, .backwards]) {
return String(self[..<range.lowerBound]).trimmingTrailingSpaces
}
return self
}
}
Demosthese's answer is a useful solution to the problem, but it's not particularly efficient. This is an upgrade to their answer, extending StringProtocol instead, and utilizing Substring to remove the need for repeated copying.
extension StringProtocol {
    @inline(__always)
    var trailingSpacesTrimmed: Self.SubSequence {
        var view = self[...]
        while view.last?.isWhitespace == true {
            view = view.dropLast()
        }
        return view
    }
}
No need to create a new string when dropping from the end each time.
extension String {
func trimRight() -> String {
String(reversed().drop { $0.isWhitespace }.reversed())
}
}
This operates on the collection and only converts the result back into a string once.
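Another way to avoid the repeated copying, offered here only as a sketch and not taken from any of the answers, is to look up the index of the last non-whitespace character and slice once. The method name `droppingTrailingWhitespace()` is hypothetical:

```swift
extension StringProtocol {
    /// Hypothetical helper: the receiver without trailing whitespace, as a slice.
    func droppingTrailingWhitespace() -> SubSequence {
        // Find the last character that should be kept.
        guard let lastKept = lastIndex(where: { !$0.isWhitespace }) else {
            // Empty or whitespace-only input: return an empty slice.
            return self[startIndex..<startIndex]
        }
        return self[...lastKept]
    }
}

let padded = "  example \n"
print(">" + String(padded.droppingTrailingWhitespace()) + "<")
```

The result is a slice of the original storage, so convert it to String only when it has to outlive the original.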
It's a little bit hacky :D
let message = " example "
var trimmed = ("s" + message).trimmingCharacters(in: .whitespacesAndNewlines)
trimmed = String(trimmed.dropFirst())
Without a regular expression there is no single built-in call for this. Alternatively, you can use the function below to get the required result:
func removeTrailingSpaces(with spaces: String) -> String {
    // Walk backwards from the end while the preceding character is a space.
    var end = spaces.endIndex
    while end > spaces.startIndex {
        let previous = spaces.index(before: end)
        if spaces[previous] != " " {
            break
        }
        end = previous
    }
    // Everything before `end` is kept, including leading and inner spaces.
    return String(spaces[..<end])
}
You can use this function as follows:
let str = " Himanshu "
print(removeTrailingSpaces(with : str))
One line solution with Swift 4 & 5
As a beginner in Swift and iOS programming I really like @demosthese's solution above with the while loop, as it's very easy to understand. However, the example code seems longer than necessary. The following uses essentially the same logic but implements it as a single-line while loop.
// Remove trailing spaces from myString
while myString.last == " " { myString = String(myString.dropLast()) }
This can also be written using the .isWhitespace property, as in @demosthese's solution, as follows:
while myString.last?.isWhitespace == true { myString = String(myString.dropLast()) }
This has the benefit (or disadvantage, depending on your point of view) that this removes all types of whitespace, not just spaces but (according to Apple docs) also including newlines, and specifically the following characters:
“\t” (U+0009 CHARACTER TABULATION)
“ ” (U+0020 SPACE)
U+2029 PARAGRAPH SEPARATOR
U+3000 IDEOGRAPHIC SPACE
Note: Even though .isWhitespace is a Boolean, it can't be used directly as the while condition: chaining through the optional .last property (which is nil when the String or collection is empty) makes the result a Bool?. The == true comparison gets around this, since nil == true evaluates to false.
I'd love to get some feedback on this, esp. in case anyone sees any issues or drawbacks with this simple single line approach.
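The optional-chaining point in the note can be checked with a few lines. This is only a sketch; the names `empty`, `spaced`, `emptyCheck`, `spacedCheck`, and `others` are made up for illustration:

```swift
let empty = ""
let spaced = "abc "

// Optional chaining yields Bool?, which cannot be a loop condition by itself.
let emptyCheck: Bool? = empty.last?.isWhitespace    // nil: there is no last character
let spacedCheck: Bool? = spaced.last?.isWhitespace  // true

// Comparing with `true` collapses Bool? to Bool; nil compares as false.
print(emptyCheck == true, spacedCheck == true)

// isWhitespace covers more than U+0020, including the characters listed above.
let others: [Character] = ["\t", "\n", "\u{2029}", "\u{3000}"]
print(others.allSatisfy { $0.isWhitespace })
```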
Swift 5
extension String {
func trimTrailingWhiteSpace() -> String {
guard self.last == " " else { return self }
var tmp = self
repeat {
tmp = String(tmp.dropLast())
} while tmp.last == " "
return tmp
}
}
I'm trying to remove the last punctuation character of a string in Swift 2.0.
var str: String = "This is a string, but i need to remove this comma, \n"
var trimmedstr: String = str.stringByTrimmingCharactersInSet(NSCharacterSet.whitespaceAndNewlineCharacterSet())
First I remove the white space and newline characters at the end, and then I need to check whether the last character of trimmedstr is a punctuation mark. It can be a period, comma, dash, etc., and if it is, I need to remove it.
How can I accomplish this?
There are multiple ways to do it. You can use contains to check if the last character is in the set of expected characters, and use dropLast() on the String to construct a new string without the last character:
let str = "This is a string, but i need to remove this comma, \n"
let trimmedstr = str.trimmingCharacters(in: .whitespacesAndNewlines)
if let lastchar = trimmedstr.last {
if [",", ".", "-", "?"].contains(lastchar) {
let newstr = String(trimmedstr.dropLast())
print(newstr)
}
}
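If the set of punctuation marks is not known in advance, one possible variation is to ask the character itself instead of listing candidates. The sketch below assumes Swift 5 (for `Character.isPunctuation`), and the function name `removingFinalPunctuation(_:)` is hypothetical:

```swift
import Foundation

// Hypothetical helper: trims whitespace and newlines at both ends, then
// removes one trailing punctuation character if there is one.
func removingFinalPunctuation(_ s: String) -> String {
    let trimmed = s.trimmingCharacters(in: .whitespacesAndNewlines)
    guard let last = trimmed.last, last.isPunctuation else { return trimmed }
    return String(trimmed.dropLast())
}

print(removingFinalPunctuation("This is a string, but i need to remove this comma, \n"))
```

Unlike the version above, this always returns a string, with the input only trimmed when there is nothing to remove.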
You could use .trimmingCharacters(in: .whitespacesAndNewlines) and .trimmingCharacters(in: .punctuationCharacters), for example, to remove whitespace and punctuation on both ends of the String:
let str = "\n This is a string, but i need to remove this comma and whitespaces, \t\n"
let trimmedStr = str
    .trimmingCharacters(in: .whitespacesAndNewlines)
    .trimmingCharacters(in: .punctuationCharacters)
Result -
This is a string, but i need to remove this comma and whitespaces