Extracting value with backslash from string value swift [duplicate] - swift

Hi, I am trying to get a value out of a string that is input in the form "40|LQ,FP,MD,GR \"Dinner out\"". The value I want to extract is Dinner out. The text could be different, for example ride in, but it always follows the same pattern. How can I extract this value from the rest of the characters using a regex or any alternative?

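You can try:
import Foundation
let text = "40|LQ,FP,MD,GR \"Dinner out\""
// Match the shortest run of characters enclosed in double quotes
let regex = try? NSRegularExpression(pattern: "\\\".+?\\\"", options: [])
// NSRange counts UTF-16 code units, so build it from the string instead of using text.count
let matches = regex?.matches(in: text, options: [], range: NSRange(text.startIndex..., in: text)) ?? []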
for match in matches {
let r = (text as NSString).substring(with: match.range)
print("found= \(r)")
}
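If the surrounding quotes should not be part of the result, a capture group can isolate the text between them. The following is a minimal sketch rather than part of the original answer; the function name `extractQuoted` is made up for illustration.

```swift
import Foundation

/// Hypothetical helper: returns the text between the first pair of double quotes, or nil.
func extractQuoted(from text: String) -> String? {
    // Group 1 captures everything between the quotes, excluding the quotes themselves
    guard let regex = try? NSRegularExpression(pattern: "\"([^\"]+)\"") else { return nil }
    let fullRange = NSRange(text.startIndex..., in: text)
    guard let match = regex.firstMatch(in: text, options: [], range: fullRange),
          let captured = Range(match.range(at: 1), in: text) else { return nil }
    return String(text[captured])
}

print(extractQuoted(from: "40|LQ,FP,MD,GR \"Dinner out\"") ?? "no match") // Dinner out
```

Returning an optional keeps the "no quoted text" case explicit instead of printing an empty result.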

I would recommend having a look at this question and its answers:
Swift Get string between 2 strings in a string
Quoting the answer here for convenience:
extension String {
func slice(from: String, to: String) -> String? {
return (range(of: from)?.upperBound).flatMap { substringFrom in
(range(of: to, range: substringFrom..<endIndex)?.lowerBound).map { substringTo in
String(self[substringFrom..<substringTo])
}
}
}
}
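Applied to the string from the question, the same idea might be used as follows. This is a sketch that restates the extension with `guard` for readability; it assumes the opening and closing delimiters are both a plain double quote.

```swift
import Foundation

extension String {
    /// Returns the text between the first `from` and the next `to` that follows it.
    func slice(from: String, to: String) -> String? {
        guard let start = range(of: from)?.upperBound,
              let end = range(of: to, range: start..<endIndex)?.lowerBound else { return nil }
        return String(self[start..<end])
    }
}

let text = "40|LQ,FP,MD,GR \"Dinner out\""
print(text.slice(from: "\"", to: "\"") ?? "no match") // Dinner out
```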

Related

Split String or Substring with Regex pattern in Swift

First let me point out... I want to split a String or Substring on any character that is not a letter, a number, # or #. That means I want to split on whitespace (spaces and line breaks) and on special characters or symbols, excluding # and #
In Android Java, I am able to achieve this with:
String[] textArr = text.split("[^\\w_##]");
Now, I want to do the same in Swift. I added an extension to String and Substring classes
extension String {}
extension Substring {}
In both extensions, I added a method that returns an array of Substring
func splitWithRegex(by regexStr: String) -> [Substring] {
//let string = self (for String extension) | String(self) (for Substring extension)
let regex = try! NSRegularExpression(pattern: regexStr)
let range = NSRange(string.startIndex..., in: string)
return regex.matches(in: string, options: .anchored, range: range)
.map { match -> Substring in
let range = Range(match.range(at: 1), in: string)!
return string[range]
}
}
And when I tried to use it, (Only tested with a Substring, but I also think String will give me the same result)
let textArray = substring.splitWithRegex(by: "[^\\w_##]")
print("substring: \(substring)")
print("textArray: \(textArray)")
This is the output:
substring: This,is a #random #text written for debugging
textArray: []
Can someone please help me? I don't know whether the problem comes from my regex [^\\w_##] or from the splitWithRegex method.
The main reason why the code doesn't work is range(at: 1) which returns the content of the first captured group, but the pattern does not capture anything.
With just range the regex returns the ranges of the found matches, but I suppose you want the characters between.
To accomplish that you need a dynamic index starting at the first character. In the map closure return the string from the current index to the lowerBound of the found range and set the index to its upperBound. Finally you have to add manually the string from the upperBound of the last match to the end.
The Substring type is a helper type for slicing strings. It should not be used beyond a temporary scope.
extension String {
func splitWithRegex(by regexStr: String) -> [String] {
guard let regex = try? NSRegularExpression(pattern: regexStr) else { return [] }
let range = NSRange(startIndex..., in: self)
var index = startIndex
var array = regex.matches(in: self, range: range)
.map { match -> String in
let range = Range(match.range, in: self)!
let result = self[index..<range.lowerBound]
index = range.upperBound
return String(result)
}
array.append(String(self[index...]))
return array
}
}
let text = "This,is a #random #text written for debugging"
let textArray = text.splitWithRegex(by: "[^\\w_##]")
print(textArray) // ["This", "is", "a", "#random", "#text", "written", "for", "debugging"]
However, macOS 13 and iOS 16 introduced a new API that is quite similar to the Java API:
let text = "This,is a #random #text written for debugging"
let textArray = Array(text.split(separator: /[^\w_##]/))
print(textArray)
The forward slashes indicate a regex literal.

Swift String Tokenizer / Parser [closed]

Hello there fellow Swift devs!
I am a junior dev, and I'm trying to figure out a best way to tokenize / parse Swift String as an exercise.
What I have is a string which looks like this:
let string = "This is a {B}string{/B} and this is a substring."
What I would like to do is, tokenize the string, and change the "strings / tokens" inside the tags you see.
I can see using NSRegularExpression and its matches, but it feels too generic. I would like to have only, say, 2 of these tags that change the text. What would be the best approach in Swift 5.2 and later?
if let regex = try? NSRegularExpression(pattern: "\\{[a-z0-9]+\\}", options: .caseInsensitive) {
let string = self as NSString
return regex.matches(in: self, options: [], range: NSRange(location: 0, length: string.length)).map {
// now $0 is the result? but it won't work for enclosing the tags :/
}
}
If the option of using html tags instead of {B}{/B} is acceptable, then you can use the StringEx library that I wrote for this purpose.
You can select a substring inside the html tag and replace it with another string like this:
let string = "This is a <b>string</b> and this is a substring."
let ex = string.ex
ex[.tag("b")].replace(with: "some value")
print(ex.rawString) // This is a <b>some value</b> and this is a substring.
print(ex.string) // This is a some value and this is a substring.
If necessary, you can also style the selected substrings and get an NSAttributedString:
ex[.tag("b")].style([
.font(.boldSystemFont(ofSize: 16)),
.color(.black)
])
myLabel.attributedText = ex.attributedString
Not sure whether you have solved it with NLTokenizer, but you can certainly solve it with a regex. Here is how (I have implemented it generically, so if you later have to handle different kinds of tags and substitute different strings for them, a small tweak to the logic should do the job):
override func viewDidLoad() {
super.viewDidLoad()
let regexStr = "(\\{B\\}(\\s*\\w+\\s*)*\\{\\/B\\})"
let regex = try! NSRegularExpression(pattern: regexStr)
var string = "Sandeep {B}Bhandaari{/B} is here{B}Sandeep{/B}"
var foundRanges = [NSRange]()
regex.enumerateMatches(in: string, options: [], range: NSRange(string.startIndex..., in: string)) { (match, flag, stop) in
if let matchRange = match?.range(at: 1) {
foundRanges.append(matchRange)
}
}
let substituteString = "abcd"
var replacedString = string as NSString
let foundRangesCount = foundRanges.count
var currentRange = 0
while foundRangesCount > currentRange {
let range = foundRanges[currentRange]
replacedString = replacedString.replacingCharacters(in: range, with: substituteString) as NSString
reEvaluateAllRanges(ranges: &foundRanges, byOffset: range.length - (substituteString as NSString).length)
currentRange += 1
}
debugPrint(replacedString)
}
func reEvaluateAllRanges(ranges: inout [NSRange], byOffset: Int) {
var newFoundRange = [NSRange]()
for range in ranges {
newFoundRange.append(NSMakeRange(range.location - byOffset, range.length))
}
ranges = newFoundRange
}
Input: "Sandeep {B}Bhandaari{/B} is here"
Output: Sandeep abcd is here
Input: "Sandeep {B}Bhandaari{/B} is here{B}Sandeep{/B}"
Output: Sandeep abcd is hereabcd
Note the edge cases handled: longer strings replaced by shorter substitute strings and vice versa, and detection of a tagged string with or without spaces.
EDIT 1:
The regex (\\{B\\}(\\s*\\w+\\s*)*\\{\\/B\\}) should be self-explanatory; in case you need help understanding it, use a regex cheat sheet.
regex.enumerateMatches(in: string, options: [], range: NSRange(string.startIndex..., in: string)) { (match, flag, stop) in
if let matchRange = match?.range(at: 1) {
foundRanges.append(matchRange)
}
}
I could have modified the string right here, but if there is more than one match and you mutate the string, the remaining ranges become invalid. Hence I save all found ranges in an array and apply the replacement to each of them afterwards.
let substituteString = "abcd"
var replacedString = string as NSString
let foundRangesCount = foundRanges.count
var currentRange = 0
while foundRangesCount > currentRange {
let range = foundRanges[currentRange]
replacedString = replacedString.replacingCharacters(in: range, with: substituteString) as NSString
reEvaluateAllRanges(ranges: &foundRanges, byOffset: range.length - (substituteString as NSString).length)
currentRange += 1
}
Here I iterate through all found match ranges and replace the characters in each range with the substitute string. You can always add a switch or if-else ladder inside the while loop to look for different types of tags and pass a different substitute string for each tag.
func reEvaluateAllRanges(ranges: inout [NSRange], byOffset: Int) {
var newFoundRange = [NSRange]()
for range in ranges {
newFoundRange.append(NSMakeRange(range.location - byOffset, range.length))
}
ranges = newFoundRange
}
This function shifts all the ranges in the array by the offset. Remember that only each range's location needs to change; the length remains the same.
One possible optimisation is to remove from the array the ranges to which the substitute string has already been applied.
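An alternative to re-evaluating the ranges is to apply the replacements from the last match to the first, so that edits never move a range that is still pending. The sketch below is not the answer's method; the function name `replaceTags`, the per-tag dictionary, and the generalized pattern with a backreference are assumptions made for illustration.

```swift
import Foundation

/// Hypothetical helper: replaces each {TAG}...{/TAG} span with the substitute registered for TAG.
func replaceTags(in input: String, substitutes: [String: String]) -> String {
    // Group 1 is the tag name; \1 requires the closing tag to carry the same name
    let pattern = "\\{([A-Za-z0-9]+)\\}(.*?)\\{/\\1\\}"
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return input }
    let matches = regex.matches(in: input, range: NSRange(input.startIndex..., in: input))
    var result = input
    // Walk backwards: text before the current match is still identical to the input,
    // so the ranges computed on the input remain valid for the result
    for match in matches.reversed() {
        guard let tagRange = Range(match.range(at: 1), in: input),
              let whole = Range(match.range, in: result),
              let substitute = substitutes[String(input[tagRange])] else { continue }
        result.replaceSubrange(whole, with: substitute)
    }
    return result
}

print(replaceTags(in: "Sandeep {B}Bhandaari{/B} is here{B}Sandeep{/B}", substitutes: ["B": "abcd"]))
// Sandeep abcd is hereabcd
```

Tags without an entry in the dictionary are left untouched, which gives a natural place to support only a small, fixed set of tags.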

How to remove certain characters in a string?

The string value varies. Sometimes it's
93.93% - 94.13, 85.34, %74.90, 88.21%
I just need to extract the double values, like this:
93.93, 85.34, 74.90, 88.21
You can use regex to extract numbers from your string like this:
let sourceString = "93.93% - 94.13, 85.34, %74.90, 88.21%"
func getNumbers(from string : String) -> [String] {
let pattern = "((\\+|-)?([0-9]+)(\\.[0-9]+)?)|((\\+|-)?\\.?[0-9]+)" // Change this according to your requirement
let regex = try! NSRegularExpression(pattern: pattern)
let matches = regex.matches(in: string, range: NSRange(string.startIndex..., in: string))
let result = matches.map { (match) -> String in
let range = Range(match.range, in: string)!
return String(string[range])
}
return result
}
let numberArray = getNumbers(from: sourceString)
print(numberArray)
Result:
["93.93", "94.13", "85.34", "74.90", "88.21"]
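Since the question asks for double values, the matched strings can be converted with `Double`. This is a minimal sketch under the assumption that the "-" in the input is a separator rather than a sign, so the pattern deliberately matches unsigned numbers only; `extractDoubles` is a name chosen for illustration.

```swift
import Foundation

/// Hypothetical helper: returns every unsigned decimal number found in the string.
func extractDoubles(from string: String) -> [Double] {
    // Digits, optionally followed by a dot and more digits
    let pattern = "[0-9]+(?:\\.[0-9]+)?"
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let range = NSRange(string.startIndex..., in: string)
    return regex.matches(in: string, range: range).compactMap { match in
        // Convert the NSRange back to a Swift range, then parse the slice
        Range(match.range, in: string).flatMap { Double(string[$0]) }
    }
}

let values = extractDoubles(from: "93.93% - 94.13, 85.34, %74.90, 88.21%")
print(values) // [93.93, 94.13, 85.34, 74.9, 88.21]
```

Note that a `Double` does not keep trailing zeros, so 74.90 prints as 74.9; keep the strings if the exact formatting matters.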
You should try a regex like this, for example:
[0-9]{2}\.[0-9]{2}
This regex finds every substring made of two digits, a dot, and two more digits.
For each value such as var str='%74.90'; use this line (JavaScript):
var double=str.match(/[+-]?\d+(\.\d+)?/g).map(function(v) { return parseFloat(v); })[0];
Use Scanner to scan the values. Scanner is highly configurable and designed for scanning string and numeric values from loosely demarcated strings. Below is the example:
let characterSet = CharacterSet.init(charactersIn: "0123456789.").inverted
let scanner = Scanner(string: "93.93% - 94.13, 85.34, %74.90, 88.21%")
scanner.charactersToBeSkipped = characterSet
var numStr: NSString?
while scanner.scanUpToCharacters(from: characterSet, into: &numStr) {
print(numStr ?? "")
}
Output:
93.93
94.13
85.34
74.90
88.21
This is easier to understand than a regex.

Get numbers characters from a string [duplicate]

How can I get the number characters from a string? I don't want to convert to Int.
var string = "string_1"
var string2 = "string_20_certified"
My result has to be formatted like this:
newString = "1"
newString2 = "20"
Pattern matching a String's unicode scalars against Western Arabic Numerals
You could pattern match the unicodeScalars view of a String to a given UnicodeScalar pattern (covering e.g. Western Arabic numerals).
extension String {
var westernArabicNumeralsOnly: String {
let pattern = UnicodeScalar("0")..."9"
return String(unicodeScalars
.compactMap { pattern ~= $0 ? Character($0) : nil })
}
}
Example usage:
let str1 = "string_1"
let str2 = "string_20_certified"
let str3 = "a_1_b_2_3_c34"
let newStr1 = str1.westernArabicNumeralsOnly
let newStr2 = str2.westernArabicNumeralsOnly
let newStr3 = str3.westernArabicNumeralsOnly
print(newStr1) // 1
print(newStr2) // 20
print(newStr3) // 12334
Extending to matching any of several given patterns
The unicode scalar pattern matching approach above is particularly useful when extended to match any of several given patterns, e.g. patterns describing different variations of Eastern Arabic numerals:
extension String {
var easternArabicNumeralsOnly: String {
let patterns = [UnicodeScalar("\u{0660}")..."\u{0669}", // Eastern Arabic
"\u{06F0}"..."\u{06F9}"] // Perso-Arabic variant
return String(unicodeScalars
.compactMap { uc in patterns.contains{ $0 ~= uc } ? Character(uc) : nil })
}
}
This could be used in practice e.g. if writing an Emoji filter, as ranges of unicode scalars that cover emojis can readily be added to the patterns array in the Eastern Arabic example above.
Why use the UnicodeScalar patterns approach over Character ones?
A Character in Swift consists of an extended grapheme cluster, which is made up of one or more Unicode scalar values. This means that Character instances in Swift do not have a fixed size in memory, so random access to a character within a collection of sequentially (/contiguously) stored characters will not be available at O(1), but rather at O(n).
A Unicode scalar, on the other hand, is a fixed-size value (stored as a UInt32), so the unicodeScalars view never has to work out grapheme cluster boundaries. Note that this view is still not a random-access collection, so I'm not entirely sure whether this is the reason for what follows. What is a fact is that when benchmarking the methods above against an equivalent method using the CharacterView (.characters property) on some test String instances, it is very apparent that the UnicodeScalar approach is faster than the Character approach; naive testing showed a factor 10-25 difference in execution times, steadily growing with String size.
Knowing the limitations of working with Unicode scalars vs Characters in Swift
There are drawbacks to the UnicodeScalar approach, however: namely, when working with characters that cannot be represented by a single unicode scalar, but where one of their unicode scalars is contained in the pattern we want to match.
E.g., consider a string holding the four characters "Café". The last character, "é", is represented here by two unicode scalars, "e" and "\u{301}". If we were to implement pattern matching against, say, UnicodeScalar("a")..."e", the filtering method as applied above would allow one of the two unicode scalars to pass.
extension String {
var onlyLowercaseLettersAthroughE: String {
// Lower bound is "a" so that only the lowercase letters a through e match
let patterns = [UnicodeScalar("a")..."e"]
return String(unicodeScalars
.compactMap { uc in patterns.contains{ $0 ~= uc } ? Character(uc) : nil })
}
}
let str = "Cafe\u{301}"
print(str) // Café
print(str.onlyLowercaseLettersAthroughE) // ae
/* possibly we'd want "a" or "aé"
as result here */
In the particular use case queried by the OP in this Q&A, the above is not an issue, but depending on the use case, it will sometimes be more appropriate to work with Character pattern matching rather than UnicodeScalar.
Edit: Updated for Swift 4 & 5
Here's a straightforward method that doesn't require Foundation:
let newstring = string.filter { "0"..."9" ~= $0 }
or, borrowing from @dfri's idea, make it a String extension:
extension String {
var numbers: String {
return filter { "0"..."9" ~= $0 }
}
}
print("3 little pigs".numbers) // "3"
print("1, 2, and 3".numbers) // "123"
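In Swift 5 the same filter can also be written with the `Character` properties `isASCII` and `isNumber` from the standard library. This is a sketch; the property name `asciiDigits` is made up for illustration, and the `isASCII` check is there because `isNumber` on its own also accepts digits from other scripts.

```swift
extension String {
    /// Hypothetical helper: keeps only the ASCII digits 0 through 9.
    var asciiDigits: String {
        // isNumber alone would also accept e.g. Eastern Arabic digits, hence the isASCII check
        return filter { $0.isASCII && $0.isNumber }
    }
}

print("string_1".asciiDigits)            // 1
print("string_20_certified".asciiDigits) // 20
```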
import Foundation
let string = "a_1_b_2_3_c34"
let result = string.components(separatedBy: CharacterSet.decimalDigits.inverted).joined(separator: "")
print(result)
Output:
12334
Here is a Swift 2 example:
let str = "Hello 1, World 62"
let intString = str.componentsSeparatedByCharactersInSet(
NSCharacterSet
.decimalDigitCharacterSet()
.invertedSet)
.joinWithSeparator("") // Return a string with all the numbers
This function iterates through the characters of the string and appends the digits to a new string:
func getNumberFrom(string: String) -> String {
var number = ""
for c in string {
// Int(String(c)) succeeds only when c is one of the ASCII digits 0 through 9
if Int(String(c)) != nil {
number.append(c)
}
}
return number
}
For example, with a regular expression:
let text = "string_20_certified"
let pattern = "\\d+"
let regex = try! NSRegularExpression(pattern: pattern, options: [])
if let match = regex.firstMatch(in: text, options: [], range: NSRange(location: 0, length: text.utf16.count)) {
let newString = (text as NSString).substring(with: match.range)
print(newString)
}
If there are multiple occurrences of the pattern, use matches(in:options:range:):
let matches = regex.matches(in: text, options: [], range: NSRange(location: 0, length: text.utf16.count))
for match in matches {
let newString = (text as NSString).substring(with: match.range)
print(newString)
}

Why does my string cleanup function return the original value? [duplicate]

I have made a func so that I can easily make all letters of a string lowercase while also removing all ! and spaces. I made this func (outside of viewDidLoad):
func cleanLink(linkName: String) -> String {
linkName.stringByReplacingOccurrencesOfString("!", withString: "")
linkName.stringByReplacingOccurrencesOfString(" ", withString: "")
linkName.lowercaseString
return linkName
}
I then used these lines of code:
var theLinkName = cleanLink("AB C!")
print(theLinkName)
The problem is that this just prints AB C!, while I want it to print abc. What am I doing wrong?
The problem is that stringByReplacingOccurrencesOfString returns a new string; it does not perform the replacement in place.
You need to use the return value of the function instead, like this:
func cleanLink(linkName: String) -> String {
return linkName
.stringByReplacingOccurrencesOfString("!", withString: "")
.stringByReplacingOccurrencesOfString(" ", withString: "")
.lowercaseString
}
This "chains" the invocations of functions that produce new strings, and returns the final result of the replacement.
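The method names above are from Swift 2. In Swift 3 and later the same chain would be spelled with `replacingOccurrences(of:with:)` and `lowercased()`. The following is a sketch of that translation; the unlabeled parameter is a stylistic assumption so that the call site from the question still compiles.

```swift
import Foundation

func cleanLink(_ linkName: String) -> String {
    // Each call returns a new string, so the calls are chained and the final result is returned
    return linkName
        .replacingOccurrences(of: "!", with: "")
        .replacingOccurrences(of: " ", with: "")
        .lowercased()
}

let theLinkName = cleanLink("AB C!")
print(theLinkName) // abc
```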