Regex replace spaces at each new lines - swift

I am saving user input to a database as a string, and I would like to remove the leading spaces on each line.
Input from user:
Hi!
My name is:
   Bob
I am from the USA.
I want to remove the spaces before "Bob", so the result will be:
Hi!
My name is:
Bob
I am from the USA.
I am trying to do it with the following code:
let regex = try! NSRegularExpression(pattern: "\n[\\s]+", options: .caseInsensitive)
a = regex.stringByReplacingMatches(in: a, options: [], range: NSRange(0..<a.utf16.count), withTemplate: "\n")
but this code also replaces multiple newlines "\n", which I don't want.
After I run the above code: "1\n\n\n 2" -> "1\n2". The result I need: "1\n\n\n2" (only the spaces are removed, not the newlines).

No need for regex: split the string on the newline character into an array, trim each line, and join the lines together again:
let trimmed = string.components(separatedBy: .newlines)
    .map { $0.trimmingCharacters(in: .whitespaces) }
    .joined(separator: "\n")
Or you can use reduce (note that this version appends a trailing newline after the last line):
let trimmed = string.components(separatedBy: .newlines)
    .reduce(into: "") { $0 += "\($1.trimmingCharacters(in: .whitespaces))\n" }
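As a quick check against the case from the question, here is a minimal sketch of the split, trim, and join approach. The helper name trimmingEachLine is hypothetical, and it splits on "\n" only (an assumption; .newlines would also split on other line terminators). Note that it trims trailing spaces as well as leading ones.

```swift
import Foundation

// Hypothetical helper: trims spaces and tabs at both ends of every line,
// leaving the line breaks themselves (including empty lines) untouched.
func trimmingEachLine(_ text: String) -> String {
    return text.components(separatedBy: "\n")
        .map { $0.trimmingCharacters(in: .whitespaces) }
        .joined(separator: "\n")
}

let sample = "1\n\n\n 2"
let cleanedSample = trimmingEachLine(sample)
print(cleanedSample == "1\n\n\n2") // true: the three newlines survive
```

Because empty lines become empty array elements, joining with "\n" puts every newline back where it was.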

You can use
let regex = try! NSRegularExpression(pattern: "(?m)^\\h+", options: .caseInsensitive)
Actually, as there are no cased characters in the pattern, you can drop .caseInsensitive and use:
let regex = try! NSRegularExpression(pattern: "(?m)^\\h+", options: [])
The pattern means:
(?m) - turn on multiline mode
^ - due to (?m), it matches any line start position
\h+ - one or more horizontal whitespaces.
Swift code example:
let txt = "Hi!\n\nMy name is:\n Bob\n\nI am from the USA."
let regex = "(?m)^\\h+"
print( txt.replacingOccurrences(of: regex, with: "", options: [.regularExpression]) )
Output:
Hi!
My name is:
Bob
I am from the USA.
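The same pattern can be applied through NSRegularExpression, as in the question's code. A minimal sketch, assuming the input from the question; the .anchorsMatchLines option is used here as an alternative to the inline (?m) flag:

```swift
import Foundation

let input = "1\n\n\n 2"

// .anchorsMatchLines makes ^ match at the start of every line,
// which is what the inline (?m) flag does inside the pattern.
let lineStartSpaces = try! NSRegularExpression(pattern: "^\\h+", options: [.anchorsMatchLines])

// Build the NSRange from the string's indices so it covers the whole string in UTF-16 units.
let wholeRange = NSRange(input.startIndex..., in: input)
let stripped = lineStartSpaces.stringByReplacingMatches(in: input, options: [], range: wholeRange, withTemplate: "")

print(stripped == "1\n\n\n2") // true: only the space before "2" is removed
```

Since \h matches horizontal whitespace only, the newlines are never part of a match.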

Related

break line regex

How can I match a break line from OCR text using regex?
For example I have this text:
"NAME JESUS LASTNAME"
I want to find a match with NAME and then get the next two lines
if (line.text.range(of: "^NAME+\\n", options: .regularExpression) != nil){
    let name = line.text
    print(name)
}
You can use a positive lookbehind to find NAME followed by a newline, and then match one line followed by a second line that ends at a newline or at the end of the string: "(?s)(?<=NAME\n).*\n.*(?=$|\n)"
Playground testing:
let str = "NAME\nJESUS\nLASTNAME"
let pattern = "(?s)(?<=NAME\n).*\n.*(?=$|\n)"
if let range = str.range(of: pattern, options: .regularExpression) {
    let text = String(str[range])
    print(text)
}
This will print
JESUS
LASTNAME
You can use
(?m)(?<=^NAME\n).*\n.*
Details:
(?m) - a multiline option making ^ match the start of a line
(?<=^NAME\n) - a positive lookbehind that matches a location immediately preceded by the start of a line, NAME, and then a line feed char
.*\n.* - two consecutive lines (.* matches zero or more chars other than line break chars, as many as possible).
Swift example:
import Foundation
let line_text = "NAME\nJESUS\nLASTNAME"
if let rng = line_text.range(of: #"(?m)(?<=^NAME\n).*\n.*"#, options: .regularExpression) {
    print(String(line_text[rng]))
}
// => JESUS
//    LASTNAME
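If the two lines are needed as separate values rather than as one string, capture groups are one way to get them. A minimal sketch, assuming the same sample text with a hypothetical extra line (OTHER) appended to show that matching stops after two lines:

```swift
import Foundation

let ocrText = "NAME\nJESUS\nLASTNAME\nOTHER"

// Group 1 is the line right after NAME, group 2 is the line after that.
// Without (?s), "." does not cross line breaks, so each group stays on one line.
let nameRegex = try! NSRegularExpression(pattern: #"(?m)^NAME\n(.*)\n(.*)"#)

var followingLines: [String] = []
if let match = nameRegex.firstMatch(in: ocrText, range: NSRange(ocrText.startIndex..., in: ocrText)) {
    for group in 1...2 {
        // Convert each group's NSRange back into a String range before slicing.
        if let groupRange = Range(match.range(at: group), in: ocrText) {
            followingLines.append(String(ocrText[groupRange]))
        }
    }
}

print(followingLines) // ["JESUS", "LASTNAME"]
```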

Create an NSPredicate with a line break as part of a string

I need to create a predicate that will look for the following string:
"fred\n5" where \n is a newline.
At least, this is the string that is returned when reading the metadata back.
You can do it with a regular expression:
let string = """
fred
5
"""
let predicate = NSPredicate(format: "self MATCHES %@", "fred\\n5")
predicate.evaluate(with: string) // true
It's also possible to use the pattern fred(\\n|\\r)5, which accepts either a line feed or a carriage return.
Alternatively, remove the newline character (in fact, all whitespace and newline characters):
let trimmedString = string.replacingOccurrences(of: "\\s", with: "", options: .regularExpression)
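A minimal sketch of the predicate with an alternation that also accepts a CR LF pair. This assumes Foundation on an Apple platform, where NSPredicate(format:) is available; the variable names and the extra \r\n branch are additions for illustration.

```swift
import Foundation

// Assumption: the metadata may come back with "\n", "\r", or "\r\n" between the parts.
let lineBreakPredicate = NSPredicate(format: "SELF MATCHES %@", "fred(\\r\\n|\\n|\\r)5")

let unixStyle = "fred\n5"
let windowsStyle = "fred\r\n5"
let noBreak = "fred5"

print(lineBreakPredicate.evaluate(with: unixStyle))    // true
print(lineBreakPredicate.evaluate(with: windowsStyle)) // true
print(lineBreakPredicate.evaluate(with: noBreak))      // false
```

MATCHES requires the whole string to match the pattern, so no ^ or $ anchors are needed.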

Regular expressions in swift

I'm a bit confused by NSRegularExpression in Swift. Can anyone help me?
Task 1: given ("name","john","name of john")
then I should get ["name","john","name of john"]. Here I should drop the parentheses.
Task 2: given ("name"," john","name of john")
then I should get ["name","john","name of john"]. Here I should drop the parentheses and the extra spaces and end up with an array of strings.
Task 3: given key = value // comment
then I should get ["key","value","comment"]. Here I should get only the strings in the line, leaving out = and //
I have tried the code below for task 1, but it does not pass.
let string = "(name,john,string for user name)"
let pattern = "(?:\\w.*)"
do {
    let regex = try NSRegularExpression(pattern: pattern, options: .caseInsensitive)
    let matches = regex.matches(in: string, options: [], range: NSRange(location: 0, length: string.utf16.count))
    for match in matches {
        if let range = Range(match.range, in: string) {
            let name = string[range]
            print(name)
        }
    }
} catch {
    print("Regex was bad!")
}
Thanks in advance.
RegEx in Swift
These posts might help you to explore regular expressions in swift:
Does a string match a pattern?
Swift extract regex matches
How can I use String slicing subscripts in Swift 4?
How to use regex with Swift?
Swift 3 - How do I extract captured groups in regular expressions?
How to group search regular expressions using swift?
Task 1 & 2
This expression might help you to match your desired outputs for both Task 1 and 2:
"(\s+)?([a-z\s]+?)(\s+)?"
Based on Rob's advice, you can relax the restrictions considerably, such as the character class [a-z\s]. For example, we can also use:
"(\s+)?(.*?)(\s+)?"
or
"(\s+)?(.+?)(\s+)?"
to simply capture everything between the two quotation marks, ignoring any surrounding spaces.
RegEx
If this isn't the expression you wanted, you can modify it at regex101.com.
RegEx Circuit
You can also visualize your expressions at jex.im.
JavaScript Demo
const regex = /"(\s+)?([a-z\s]+?)(\s+)?"/gm;
const str = `"name","john","name of john"
"name"," john","name of john"
" name "," john","name of john "
" name "," john"," name of john "`;
const subst = `\n$2`;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log('Substitution result: ', result);
Task 3
This expression might help you to design an expression for the third task:
(.*?)([a-z\s]+)(.*?)
const regex = /(.*?)([a-z\s]+)(.*?)/gm;
const str = `key = value // comment
key = value with some text // comment`;
const subst = `$2,`;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log('Substitution result: ', result);
Split the string on non-alphanumeric characters other than whitespace, then trim whitespace from each element.
extension String {
    func words() -> [String] {
        return self.components(separatedBy: CharacterSet.alphanumerics.inverted.subtracting(.whitespaces))
            // Trim first, so that pieces made up only of whitespace become empty and are filtered out.
            .map({ $0.trimmingCharacters(in: .whitespaces) })
            .filter({ !$0.isEmpty })
    }
}
let string1 = "(name,john,string for user name)"
let string2 = "(name, john,name of john)"
let string3 = "key = value // comment"
print(string1.words())//["name", "john", "string for user name"]
print(string2.words())//["name", "john", "name of john"]
print(string3.words())//["key", "value", "comment"]
Here is what I came up with after reading all of the comments above.
let text = """
Capturing and non-capturing groups are somewhat advanced topics. You’ll encounter examples of capturing and non-capturing groups later on in the tutorial
"""
extension String {
    func rex(_ expr: String) -> [String] {
        let regex = try! NSRegularExpression(pattern: expr, options: [.caseInsensitive])
        // Build the NSRange from String indices: NSRegularExpression counts UTF-16 units, not Characters.
        return regex.matches(in: self, options: [], range: NSRange(startIndex..., in: self))
            .compactMap { Range($0.range, in: self) }
            .map { String(self[$0]) }
    }
}
let r = text.rex("(?:\\w+-\\w+)") // pass any rex
A single pattern that works for tests 1 to 3, in Swift:
let string =
    //"(name,john,string for user name)" //test:1
    //#"("name"," john","name of john")"# //test:2
    "key = value // comment" //test:3
let pattern = #"(?:\w+)(?:\s+\w+)*"# //Swift 5+ only
//let pattern = "(?:\\w+)(?:\\s+\\w+)*"
do {
    let regex = try NSRegularExpression(pattern: pattern)
    let matches = regex.matches(in: string, range: NSRange(0..<string.utf16.count))
    let matchingWords = matches.map {
        String(string[Range($0.range, in: string)!])
    }
    print(matchingWords) //(test:3)->["key", "value", "comment"]
} catch {
    print("Regex was bad!")
}
Let’s consider:
let string = "(name,José,name is José)"
I’d suggest a regex that looks for substrings where:
It’s the substring either after the ( at the start of the full string or after a comma, i.e., a lookbehind assertion of (?<=^\(|,);
It’s the substring that does not contain , within it, i.e., [^,]+?;
It’s the substring that is terminated by either a comma or ) at the end of the full string, i.e., a lookahead assertion of (?=,|\)$), and
If you want to have it skip whitespace before and after the substrings, throw in the \s*+, too.
Thus:
let pattern = #"(?<=^\(|,)\s*+([^,]+?)\s*+(?=,|\)$)"#
let regex = try! NSRegularExpression(pattern: pattern)
regex.enumerateMatches(in: string, range: NSRange(string.startIndex..., in: string)) { match, _, _ in
    if let nsRange = match?.range(at: 1), let range = Range(nsRange, in: string) {
        let substring = String(string[range])
        // do something with `substring` here
    }
}
Note that I’m using the Swift 5 extended string delimiters (starting with #" and ending with "#) so that I don’t have to escape my backslashes within the string. If you’re using Swift 4 or earlier, you’ll want to escape those backslashes:
let pattern = "(?<=^\\(|,)\\s*+([^,]+?)\\s*+(?=,|\\)$)"
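To collect the substrings into an array instead of handling them one at a time, the same pattern can be wrapped in a function. A minimal sketch; the function name fields(in:) is hypothetical:

```swift
import Foundation

// Hypothetical helper: returns the comma-separated fields of a "(a,b,c)" style string,
// using the lookbehind/lookahead pattern described above.
func fields(in string: String) -> [String] {
    let pattern = #"(?<=^\(|,)\s*+([^,]+?)\s*+(?=,|\)$)"#
    let regex = try! NSRegularExpression(pattern: pattern)
    let fullRange = NSRange(string.startIndex..., in: string)
    return regex.matches(in: string, range: fullRange).compactMap { match in
        // Capture group 1 holds the field without the surrounding whitespace.
        Range(match.range(at: 1), in: string).map { String(string[$0]) }
    }
}

let joseFields = fields(in: "(name,José,name is José)")
print(joseFields) // ["name", "José", "name is José"]
```

Converting with Range(_:in:) keeps the result correct for non-ASCII text such as "José".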

Remove whitespaces from a string

I referred to this SO post to remove whitespace and newline characters from a string. But my string may contain extra whitespace as well as extra newline characters. I want to remove the unnecessary \n's and whitespace from that string.
But if there is a string like "This \n is a st\tri\rng", then I don't want Thisisastring as the result but instead something like this:
This is a string
To collapse each run of contiguous whitespace into a single space, replace matches of the regular expression \s+ with a single space:
let str = "This \n\n is a string"
if let regex = try? NSRegularExpression(pattern: "\\s+", options: []) {
    // NSRange is measured in UTF-16 code units, so derive it from the string's indices rather than from str.count.
    let range = NSRange(str.startIndex..., in: str)
    let result = regex.stringByReplacingMatches(in: str, options: [], range: range, withTemplate: " ")
    print(result) // output: "This is a string"
}
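The same replacement can also be written without creating an NSRegularExpression explicitly. A minimal sketch, assuming that leading and trailing whitespace should be dropped as well; the helper name collapsingWhitespace is hypothetical:

```swift
import Foundation

// Hypothetical helper: collapses every run of whitespace (spaces, tabs, newlines)
// into one space, then removes whitespace at both ends.
func collapsingWhitespace(_ text: String) -> String {
    return text
        .replacingOccurrences(of: "\\s+", with: " ", options: .regularExpression)
        .trimmingCharacters(in: .whitespacesAndNewlines)
}

let messy = "  This \n\n is \t a   string \n"
let tidy = collapsingWhitespace(messy)
print(tidy) // This is a string
```

Note that whitespace inside a word, as in "st\tri\rng", is also turned into a space by this approach rather than removed.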

Find characters inside quotation marks in String

I'm trying to pull out the parts of a string that are in quotation marks, i.e. in "Rouge One" is an awesome movie I want to extract Rouge One.
This is what I have so far but can't figure out where to go from here: I create a copy of the text so that I can remove the first quotation mark so that I can get the index of the second.
if text.contains("\"") {
guard let firstQuoteMarkIndex = text.range(of: "\"") else {return}
var textCopy = text
let textWithoutFirstQuoteMark = textCopy.replacingCharacters(in: firstQuoteMarkIndex, with: "")
let secondQuoteMarkIndex = textCopy.range(of: "\"")
let stringBetweenQuotes = text.substring(with: Range(start: firstQuoteMarkIndex, end: secondQuoteMarkIndex))
}
There is no need to create copies or to replace substrings for this task.
Here is a possible approach:
Use text.range(of: "\"") to find the first quotation mark.
Use text.range(of: "\"", range:...) to find the second quotation mark, i.e. the first one after the range found in step 1.
Extract the substring between the two ranges.
Example:
let text = " \"Rouge One\" is an awesome movie"
if let r1 = text.range(of: "\""),
   let r2 = text.range(of: "\"", range: r1.upperBound..<text.endIndex) {
    // substring(with:) is deprecated since Swift 4; slice with a range subscript instead.
    let stringBetweenQuotes = String(text[r1.upperBound..<r2.lowerBound])
    print(stringBetweenQuotes) // Rouge One
}
Another option is a regular expression search with "positive lookbehind" and "positive lookahead" patterns:
if let range = text.range(of: "(?<=\\\").*?(?=\\\")", options: .regularExpression) {
    let stringBetweenQuotes = String(text[range])
    print(stringBetweenQuotes)
}
var rouge = "\"Rouge One\" is an awesome movie"
var separated = rouge.components(separatedBy: "\"") // ["", "Rouge One", " is an awesome movie"]
separated.dropFirst().first
I would use .components(separatedBy:):
let stringArray = text.components(separatedBy: "\"")
Check that stringArray.count is > 2 (there are at least 2 quotes).
Check whether stringArray.count is odd, i.e. count % 2 == 1.
If it is odd, every second element (zero-based indices 1, 3, 5, ...) lies between two quotes, and those are what you want.
If it is even, the same holds except for the last element, which follows a quote that is never closed.
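The checks above can be sketched as a small function. The name quotedSubstrings(in:) is hypothetical; the stride visits the zero-based indices 1, 3, 5, ... and stops before the last element, which covers both the odd and the even case:

```swift
import Foundation

// Hypothetical helper: returns every substring enclosed in a pair of double quotes.
func quotedSubstrings(in text: String) -> [String] {
    let parts = text.components(separatedBy: "\"")
    // A part is quoted when it has an odd index and another part follows it
    // (that is, its closing quote exists). The last part is never quoted.
    return stride(from: 1, to: parts.count - 1, by: 2).map { parts[$0] }
}

print(quotedSubstrings(in: "\"Rogue One\" is a \"Star Wars\" movie."))
// ["Rogue One", "Star Wars"]

print(quotedSubstrings(in: "\"closed\" and \"never closed"))
// ["closed"]
```

An empty pair of quotes yields an empty string in the result, which callers may want to filter out.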
This will allow you to also capture multiple sets of quoted strings, like:
"Rogue One" is a "Star Wars" movie.
Another option is to use regular expressions to find pairs of quotes:
let pattern = try! NSRegularExpression(pattern: "\\\"([^\"]+)\\\"")
// Small helper methods making it easier to work with enumerateMatches(in:...)
extension String {
    subscript(utf16Range range: Range<Int>) -> String? {
        get {
            let start = utf16.index(utf16.startIndex, offsetBy: range.lowerBound)
            let end = utf16.index(utf16.startIndex, offsetBy: range.upperBound)
            return String(utf16[start..<end])
        }
    }
    var fullUTF16Range: NSRange {
        return NSRange(location: 0, length: utf16.count)
    }
}
// Loop through *all* quoted substrings in the original string.
let str = "\"Rogue One\" is an awesome movie"
pattern.enumerateMatches(in: str, range: str.fullUTF16Range) { (result, flags, stop) in
    // range(at: 1) is the range representing the characters in the 1st
    // capture group of the regular expression: ([^"]+)
    // (Swift 3 spelled this rangeAt(1).toRange(); Swift 4 and later use range(at:) and Range(_:).)
    if let result = result, let range = Range<Int>(result.range(at: 1)) {
        print("This was in quotes: \(str[utf16Range: range] ?? "<bad range>")")
    }
}