How to search array using unknown characters - Swift 3 for Mac - swift

I am looking for a way to search an array of strings (filenames with extensions) for dots: if a string has the form characters, a dot, characters, print it. To do that I think I have to use something like wildcards (*.*).
So I tried this :
let testString = "*.*"
if Array[x].contains(testString)
{
print (Array[x])
}
or
if Array[x].range(of:testString) != nil
{
print (Array[x])
}
But it does not work. I guess I have to declare it differently but I don't know how and I have not found the right example.
Could someone show some examples? Thank you.

Using this helper method on String:
extension String {
func contains(regex: NSRegularExpression) -> Bool {
let length = self.utf16.count // NSRanges are UTF-16 based!
let wholeString = NSRange(location: 0, length: length)
let matchCount = regex.numberOfMatches(in: self, range: wholeString)
return matchCount > 0
}
}
Then try this:
let fileNameWithExtension = try! NSRegularExpression(pattern: "\\w+[.]\\w+")
if Array[x].contains(regex: fileNameWithExtension) {
print(Array[x])
}
You may need to tweak my pattern above in order to match all cases you have in mind. This NSRegularExpression cheat sheet might help you there ;-)
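To apply the same idea to a whole array rather than one element at a time, a filter over the array may be enough. The following is only a sketch: the helper name `matches(_:)` and the sample filenames are made up for illustration, and the pattern is the one from the answer above.

```swift
import Foundation

extension String {
    /// Hypothetical helper: true when the regex matches anywhere in the string.
    func matches(_ regex: NSRegularExpression) -> Bool {
        // NSRange(_:in:) takes care of the UTF-16 based offsets.
        let wholeString = NSRange(startIndex..., in: self)
        return regex.firstMatch(in: self, range: wholeString) != nil
    }
}

// Sample data, assumed for this example only.
let fileNames = ["report.pdf", "README", "photo.jpeg", "notes"]
let nameDotExtension = try! NSRegularExpression(pattern: "\\w+[.]\\w+")

// Keep only the entries that look like "name.extension".
let withExtension = fileNames.filter { $0.matches(nameDotExtension) }
print(withExtension) // ["report.pdf", "photo.jpeg"]
```

Note that `contains` and `range(of:)` on their own treat "*.*" as literal text, which is why the attempts in the question find nothing.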

Related

Split String or Substring with Regex pattern in Swift

First, let me point out what I want: to split a String or Substring on any character that is not a letter, a digit, # or #. That is, I want to split on whitespace (spaces and line breaks) and on special characters or symbols, excluding # and #.
In Android Java, I am able to achieve this with:
String[] textArr = text.split("[^\\w_##]");
Now, I want to do the same in Swift. I added an extension to String and Substring classes
extension String {}
extension Substring {}
In both extensions, I added a method that returns an array of Substring
func splitWithRegex(by regexStr: String) -> [Substring] {
//let string = self (for String extension) | String(self) (for Substring extension)
let regex = try! NSRegularExpression(pattern: regexStr)
let range = NSRange(string.startIndex..., in: string)
return regex.matches(in: string, options: .anchored, range: range)
.map { match -> Substring in
let range = Range(match.range(at: 1), in: string)!
return string[range]
}
}
And when I tried to use it, (Only tested with a Substring, but I also think String will give me the same result)
let textArray = substring.splitWithRegex(by: "[^\\w_##]")
print("substring: \(substring)")
print("textArray: \(textArray)")
This is the output:
substring: This,is a #random #text written for debugging
textArray: []
Can someone please help me? I don't know if the problem is in my regex [^\\w_##] or in the splitWithRegex method.
The main reason why the code doesn't work is range(at: 1) which returns the content of the first captured group, but the pattern does not capture anything.
With just range the regex returns the ranges of the found matches, but I suppose you want the characters between.
To accomplish that you need a dynamic index starting at the first character. In the map closure return the string from the current index to the lowerBound of the found range and set the index to its upperBound. Finally you have to add manually the string from the upperBound of the last match to the end.
The Substring type is a helper type for slicing strings. It should not be used beyond a temporary scope.
extension String {
func splitWithRegex(by regexStr: String) -> [String] {
guard let regex = try? NSRegularExpression(pattern: regexStr) else { return [] }
let range = NSRange(startIndex..., in: self)
var index = startIndex
var array = regex.matches(in: self, range: range)
.map { match -> String in
let range = Range(match.range, in: self)!
let result = self[index..<range.lowerBound]
index = range.upperBound
return String(result)
}
array.append(String(self[index...]))
return array
}
}
let text = "This,is a #random #text written for debugging"
let textArray = text.splitWithRegex(by: "[^\\w_##]")
print(textArray) // ["This", "is", "a", "#random", "#text", "written", "for", "debugging"]
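One thing to keep in mind with the index-walking approach described above: when two separators follow each other directly (for example a comma followed by a space), the text between them is empty and ends up in the result as "". If that is not wanted, the empty pieces can be skipped while collecting. A minimal sketch, assuming a free function named `splitDroppingEmpty` (a name invented here) and a simplified pattern:

```swift
import Foundation

/// Hypothetical variant: split on a regex and drop empty pieces.
func splitDroppingEmpty(_ text: String, pattern: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    var pieces: [String] = []
    var index = text.startIndex
    let fullRange = NSRange(text.startIndex..., in: text)
    for match in regex.matches(in: text, range: fullRange) {
        guard let range = Range(match.range, in: text) else { continue }
        // Text between the previous separator and this one.
        let piece = text[index..<range.lowerBound]
        if !piece.isEmpty { pieces.append(String(piece)) }
        index = range.upperBound
    }
    // Whatever follows the last separator.
    let tail = text[index...]
    if !tail.isEmpty { pieces.append(String(tail)) }
    return pieces
}

let pieces = splitDroppingEmpty("one,  two;three", pattern: "[^\\w#]")
print(pieces) // ["one", "two", "three"]
```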
However, in macOS 13 and iOS 16 there is a new API quite similar to the Java API:
let text = "This,is a #random #text written for debugging"
let textArray = Array(text.split(separator: /[^\w_##]/))
print(textArray)
The forward slashes indicate a regex literal

Swift String Tokenizer / Parser [closed]

Hello there fellow Swift devs!
I am a junior dev, and I'm trying to figure out a best way to tokenize / parse Swift String as an exercise.
What I have is a string which looks like this:
let string = "This is a {B}string{/B} and this is a substring."
What I would like to do is, tokenize the string, and change the "strings / tokens" inside the tags you see.
I can see using NSRegularExpression and its matches, but it feels too generic. I would like to have only, say, two of these tags that change the text. What would be the best approach in Swift 5.2^?
if let regex = try? NSRegularExpression(pattern: "\\{[a-z0-9]+\\}", options: .caseInsensitive) {
let string = self as NSString
return regex.matches(in: self, options: [], range: NSRange(location: 0, length: string.length)).map {
// now $0 is the result? but it won't work for enclosing the tags :/
}
}
If the option of using html tags instead of {B}{/B} is acceptable, then you can use the StringEx library that I wrote for this purpose.
You can select a substring inside the html tag and replace it with another string like this:
let string = "This is a <b>string</b> and this is a substring."
let ex = string.ex
ex[.tag("b")].replace(with: "some value")
print(ex.rawString) // This is a <b>some value</b> and this is a substring.
print(ex.string) // This is a some value and this is a substring.
If necessary, you can also style the selected substrings and get an NSAttributedString:
ex[.tag("b")].style([
.font(.boldSystemFont(ofSize: 16)),
.color(.black)
])
myLabel.attributedText = ex.attributedString
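For the original {B}…{/B} syntax, capture groups are one way to get at the text between the tags, which is the part the attempt in the question was missing. This is only a sketch: the pattern and the tuple labels `tag` and `text` are assumptions made for illustration, and it does not handle nested tags.

```swift
import Foundation

let source = "This is a {B}string{/B} and this is a substring."

// Group 1 captures the tag name, group 2 the enclosed text.
// The backreference \1 requires the closing tag to repeat the same name.
let tagRegex = try! NSRegularExpression(
    pattern: "\\{([A-Z0-9]+)\\}(.*?)\\{/\\1\\}",
    options: .caseInsensitive)

let nsSource = source as NSString
let wholeRange = NSRange(location: 0, length: nsSource.length)

let tokens = tagRegex.matches(in: source, range: wholeRange).map { match in
    (tag: nsSource.substring(with: match.range(at: 1)),
     text: nsSource.substring(with: match.range(at: 2)))
}

for token in tokens {
    print(token.tag, token.text) // B string
}
```

With the tag name available per match, a switch over `token.tag` could pick a different treatment for each of the two tags.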
Not sure if you have solved it with NLTokenizer or not, but you can certainly solve it with a regex. Here is how (I have implemented it generically, so if you later have to handle different kinds of tags and substitute a different string for each, a small tweak to the logic should do the job):
override func viewDidLoad() {
super.viewDidLoad()
let regexStr = "(\\{B\\}(\\s*\\w+\\s*)*\\{\\/B\\})"
let regex = try! NSRegularExpression(pattern: regexStr)
var string = "Sandeep {B}Bhandaari{/B} is here{B}Sandeep{/B}"
var foundRanges = [NSRange]()
regex.enumerateMatches(in: string, options: [], range: NSMakeRange(0, string.utf16.count)) { (match, flag, stop) in // NSRange counts UTF-16 units
if let matchRange = match?.range(at: 1) {
foundRanges.append(matchRange)
}
}
let substituteString = "abcd"
var replacedString = string as NSString
let foundRangesCount = foundRanges.count
var currentRange = 0
while foundRangesCount > currentRange {
let range = foundRanges[currentRange]
replacedString = replacedString.replacingCharacters(in: range, with: substituteString) as NSString
reEvaluateAllRanges(ranges: &foundRanges, byOffset: range.length - substituteString.count)
currentRange += 1
}
debugPrint(replacedString)
}
func reEvaluateAllRanges(ranges: inout [NSRange], byOffset: Int) {
var newFoundRange = [NSRange]()
for range in ranges {
newFoundRange.append(NSMakeRange(range.location - byOffset, range.length))
}
ranges = newFoundRange
}
Input: "Sandeep {B}Bhandaari{/B} is here"
Output: Sandeep abcd is here
Input: "Sandeep {B}Bhandaari{/B} is here{B}Sandeep{/B}"
Output: Sandeep abcd is hereabcd
Note the edge cases handled: longer strings replaced by shorter substitute strings and vice versa, and detection of text enclosed in tags with or without spaces.
EDIT 1:
The regex (\\{B\\}(\\s*\\w+\\s*)*\\{\\/B\\}) should be self-explanatory; in case you need help understanding it, use a cheat sheet.
regex.enumerateMatches(in: string, options: [], range: NSMakeRange(0, string.utf16.count)) { (match, flag, stop) in // NSRange counts UTF-16 units
if let matchRange = match?.range(at: 1) {
foundRanges.append(matchRange)
}
}
I could have modified the string right here, but if there is more than one match and you mutate the string, the ranges already evaluated become invalid. Hence I save all found ranges in an array and apply the replacement to each of them later.
let substituteString = "abcd"
var replacedString = string as NSString
let foundRangesCount = foundRanges.count
var currentRange = 0
while foundRangesCount > currentRange {
let range = foundRanges[currentRange]
replacedString = replacedString.replacingCharacters(in: range, with: substituteString) as NSString
reEvaluateAllRanges(ranges: &foundRanges, byOffset: range.length - substituteString.count)
currentRange += 1
}
Here I iterate through all found match ranges and replace the characters in each range with the substitute string. You can always put a switch or an if-else ladder inside the while loop to look for different types of tags and pass a different substitute string for each tag.
func reEvaluateAllRanges(ranges: inout [NSRange], byOffset: Int) {
var newFoundRange = [NSRange]()
for range in ranges {
newFoundRange.append(NSMakeRange(range.location - byOffset, range.length))
}
ranges = newFoundRange
}
This function shifts all the ranges in the array by the offset. Remember that only each range's location needs to change; the length stays the same.
One optimisation would be to drop from the array the ranges to which the substitute string has already been applied.
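A different way to avoid the range bookkeeping altogether is to apply the replacements from the last match to the first: a replacement near the end of the string does not move anything in front of it, so the earlier ranges stay valid. The sketch below uses a simplified, non-greedy pattern instead of the one above, which is an assumption on my part, and the same sample input and substitute string.

```swift
import Foundation

let original = "Sandeep {B}Bhandaari{/B} is here{B}Sandeep{/B}"
let tagRegex = try! NSRegularExpression(pattern: "\\{B\\}.*?\\{/B\\}")
let matches = tagRegex.matches(
    in: original,
    range: NSRange(original.startIndex..., in: original))

var result = original
// Walk the matches back to front; everything before the current match
// is still identical to the original, so its NSRange is still correct.
for match in matches.reversed() {
    guard let range = Range(match.range, in: result) else { continue }
    result.replaceSubrange(range, with: "abcd")
}
print(result) // Sandeep abcd is hereabcd
```

If every match gets the same substitute, `stringByReplacingMatches(in:options:range:withTemplate:)` on the regex would likely do the whole job in one call; the loop is mainly useful when each match needs its own replacement.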

How to get the range of the first line in a string?

I would like to change the formatting of the first line of text in an NSTextView (give it a different font size and weight to make it look like a headline). Therefore, I need the range of the first line. One way to go is this:
guard let firstLineString = textView.string.components(separatedBy: .newlines).first else {
return
}
let range = NSRange(location: 0, length: firstLineString.count)
However, I might be working with quite long texts so it appears to be inefficient to first split the entire string into line components when all I need is the first line component. Thus, it seems to make sense to use the firstIndex(where:) method:
let firstNewLineIndex = textView.string.firstIndex { character -> Bool in
return CharacterSet.newlines.contains(character)
}
// Then: Create an NSRange from 0 up to firstNewLineIndex.
This doesn't work and I get an error:
Cannot convert value of type '(Unicode.Scalar) -> Bool' to expected argument type 'Character'
because the contains method accepts not a Character but a Unicode.Scalar as a parameter (which doesn't really make sense to me because then it should be called a UnicodeScalarSet and not a CharacterSet, but nevermind...).
My question is:
How can I implement this in an efficient way, without first slicing the whole string?
(It doesn't necessarily have to use the firstIndex(where:) method, but appears to be the way to go.)
A String.Index range for the first line in string can be obtained with
let range = string.lineRange(for: ..<string.startIndex)
If you need that as an NSRange then
let nsRange = NSRange(range, in: string)
does the trick.
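One detail worth checking before applying attributes: `lineRange(for:)` returns the whole line including its line terminator. If the newline should not be part of the headline range, it can be trimmed off. A small sketch with a made-up sample string:

```swift
import Foundation

let string = "Headline\nBody text\nMore body"

// Range of the first line, terminator included.
let lineRange = string.lineRange(for: ..<string.startIndex)
let nsRange = NSRange(lineRange, in: string)
print(nsRange.location, nsRange.length) // 0 9

// Same line without the trailing newline.
var contentRange = lineRange
if let newline = string[lineRange].rangeOfCharacter(from: .newlines) {
    contentRange = lineRange.lowerBound..<newline.lowerBound
}
let nsContentRange = NSRange(contentRange, in: string)
print(nsContentRange.location, nsContentRange.length) // 0 8
```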
You can use rangeOfCharacter, which returns the Range<String.Index> of the first character from a set in your string:
extension StringProtocol where Index == String.Index {
var partialRangeOfFirstLine: PartialRangeUpTo<String.Index> {
return ..<(rangeOfCharacter(from: .newlines)?.lowerBound ?? endIndex)
}
var rangeOfFirstLine: Range<Index> {
return startIndex..<partialRangeOfFirstLine.upperBound
}
var firstLine: SubSequence {
return self[partialRangeOfFirstLine]
}
}
You can use it like so:
var str = """
some string
with new lines
"""
var attributedString = NSMutableAttributedString(string: str)
let firstLine = NSAttributedString(string: String(str.firstLine))
// change firstLine as you wish
let range = NSRange(str.rangeOfFirstLine, in: str)
attributedString.replaceCharacters(in: range, with: firstLine)
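The `firstIndex(where:)` idea from the question can also be made to compile by testing the `Character` directly instead of going through `CharacterSet`. This sketch assumes Swift 5 or later, where `Character` has an `isNewline` property; the sample text is invented.

```swift
import Foundation

let text = "Headline\nBody text"

// Stop at the first newline character, or at the end if there is none.
let firstLineEnd = text.firstIndex(where: { $0.isNewline }) ?? text.endIndex
let firstLineRange = NSRange(text.startIndex..<firstLineEnd, in: text)

print(text[..<firstLineEnd])                            // Headline
print(firstLineRange.location, firstLineRange.length)   // 0 8
```

Like `rangeOfCharacter(from:)`, this scans only as far as the first line break, so the rest of a long text is never touched.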

Remove suffix from filename in Swift

When trying to remove the suffix from a filename, I'm only left with the suffix, which is exactly not what I want.
What (how many things) am I doing wrong here:
let myTextureAtlas = SKTextureAtlas(named: "demoArt")
let filename = (myTextureAtlas.textureNames.first?.characters.split{$0 == "."}.map(String.init)[1].replacingOccurrences(of: "\'", with: ""))! as String
print(filename)
This prints png which is the most dull part of the whole thing.
If by suffix you mean path extension, there is a method for this:
let filename = "demoArt.png"
let name = (filename as NSString).deletingPathExtension
// name - "demoArt"
Some people here seem to overlook that a filename can have multiple periods in the name and in that case only the last period separates the file extension. So this.is.a.valid.image.filename.jpg and stripping the extension should return this.is.a.valid.image.filename and not this (as two answers here would produce) or anything else in between. The regex answer works correctly but using a regex for that is a bit overkill (probably 10 times slower than using simple string processing). Here's a generic function that works for everyone:
func stripFileExtension ( _ filename: String ) -> String {
var components = filename.components(separatedBy: ".")
guard components.count > 1 else { return filename }
components.removeLast()
return components.joined(separator: ".")
}
print("1: \(stripFileExtension("foo"))")
print("2: \(stripFileExtension("foo.bar"))")
print("3: \(stripFileExtension("foo.bar.foobar"))")
Output:
1: foo
2: foo
3: foo.bar
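One edge case the function above does not cover is a name that starts with a dot, such as ".gitignore": splitting on "." gives an empty first component, so the result is an empty string. A possible variant that looks only at the last dot and leaves a leading dot alone is sketched here; the function name `strippingExtension` is made up for this example.

```swift
/// Hypothetical variant: cut at the last dot, unless that dot is the first character.
func strippingExtension(_ filename: String) -> String {
    guard let dot = filename.lastIndex(of: "."),
          dot != filename.startIndex else { return filename }
    return String(filename[..<dot])
}

print(strippingExtension("foo"))            // foo
print(strippingExtension("foo.bar"))        // foo
print(strippingExtension("foo.bar.foobar")) // foo.bar
print(strippingExtension(".gitignore"))     // .gitignore
```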
You can also split the String using componentsSeparatedBy, like this:
let fileName = "demoArt.png"
var components = fileName.components(separatedBy: ".")
if components.count > 1 { // If there is a file extension
components.removeLast()
return components.joined(separator: ".")
} else {
return fileName
}
To clarify:
fileName.components(separatedBy: ".")
will return an array made up of "demoArt" and "png".
Arrays in Swift are zero-indexed. After splitting the string on ".", the file name without the extension is the first element and the extension is the second.
Simple Example
let fileName = "demoArt.png"
let name = fileName.split(separator: ".").map(String.init).first
If you don't care what the extension is, this is a simple way:
let ss = fileName.prefix(upTo: fileName.lastIndex(where: { $0 == "." }) ?? fileName.endIndex)
You may want to convert the resulting Substring to a String afterwards, with String(ss).
@Confused, with Swift 4 you can do this:
let fileName = "demoArt.png"
// or on your specific case:
// let fileName = myTextureAtlas.textureNames.first
let name = String(fileName.split(separator: ".").first!)
print(name)
Additionally, you should unwrap first safely, but I didn't want to complicate the sample code to solve your problem.
Btw, since I've also needed this recently, if you want to remove a specific suffix you know in advance, you can do something like this:
let fileName = "demoArt.png"
let fileNameExtension = ".png"
if fileName.hasSuffix(fileNameExtension) {
let name = fileName.prefix(fileName.count - fileNameExtension.count)
print(name)
}
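The known-suffix case above can be wrapped in a small extension so that strings without the suffix come back unchanged. This is a sketch; `removingSuffix(_:)` is a name chosen here, not an existing String method.

```swift
extension String {
    /// Hypothetical helper: returns the string without `suffix`,
    /// or the string itself when it does not end with `suffix`.
    func removingSuffix(_ suffix: String) -> String {
        guard hasSuffix(suffix) else { return self }
        return String(dropLast(suffix.count))
    }
}

print("demoArt.png".removingSuffix(".png")) // demoArt
print("demoArt.jpg".removingSuffix(".png")) // demoArt.jpg
```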
How about using .dropLast(k), where k is the number of characters in the suffix you want to drop?
Otherwise, to remove the extension from a filename properly, I strongly recommend using URL with .deletingPathExtension().lastPathComponent.
Maybe a bit overhead but at least it's a rock solid Apple API.
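A short sketch of the URL route, using an invented path with several dots in the name to show that only the last component after the final dot is removed:

```swift
import Foundation

// Hypothetical path, used for illustration only.
let url = URL(fileURLWithPath: "/tmp/this.is.a.valid.image.filename.jpg")

// Drop the extension, then keep just the file name.
let baseName = url.deletingPathExtension().lastPathComponent
print(baseName) // this.is.a.valid.image.filename
```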
You can also use a regex to extract everything before the last dot, like this:
let fileName = "test.png"
let pattern = "^(.*)(\\.[a-zA-Z]+)$"
let regexp = try! NSRegularExpression(pattern: pattern, options: [])
let extractedName = regexp.stringByReplacingMatches(in: fileName, options: [], range: NSMakeRange(0, fileName.utf16.count), withTemplate: "$1")
print(extractedName) //test
let mp3Files = ["alarm.mp3", "bubbles.mp3", "fanfare.mp3"]
let ringtonsArray = mp3Files.compactMap { $0.components(separatedBy: ".").first }
You can return a new string removing a definite number of characters from the end.
let fileName = "demoArt.png"
fileName.dropLast(4)
This code returns "demoArt"
One liner:
let stringWithSuffixDropped = fileName.split(separator: ".").dropLast().joined(separator: ".")

Substrings in Swift

I'm having trouble understanding how to work with substrings in Swift. Basically, I'm getting a JSON value that has a string with the following format:
Something
I'm trying to get rid of the HTML anchor tag with Swift so I'm left with Something. My thought was to find the index of every < and > in the string so then I could just do a substringWithRange and advance up to the right index.
My problem is that I can't figure out how to find the index. I've read that Swift doesn't support the index (unless you extend it.)
I don't want to add CPU cycles unnecessarily. So my question is, how do I find the indexes in a way that is not inefficient? Or, is there a better way of filtering out the tags?
Edit: Converted Andrew's first code sample to a function:
func formatTwitterSource(rawStr: String) -> String {
let unParsedString = rawStr
var midParseString = ""
var parsedString = ""
if let firstEndIndex = find(unParsedString, ">") {
midParseString = unParsedString[Range<String.Index>(start: firstEndIndex.successor(), end: unParsedString.endIndex)]
if let secondStartIndex = find(midParseString, "<") {
parsedString = midParseString[Range<String.Index>(start: midParseString.startIndex, end: secondStartIndex)]
}
}
return parsedString
}
Nothing too complicated. It takes in a String that has the tags in it. Then it uses Andrew's magic to parse everything out. I renamed the variables and made them clearer so you can see which variable does what in the process. Then in the end, it returns the parsed string.
You could do something like this, but it isn't pretty really. Obviously you would want to factor this into a function and possibly allow for various start/end tokens.
let testText = "Something"
if let firstEndIndex = find(testText, ">") {
let testText2 = testText[Range<String.Index>(start: firstEndIndex.successor(), end: testText.endIndex)]
if let secondStartIndex = find(testText2, "<") {
let testText3 = testText2[Range<String.Index>(start: testText2.startIndex, end: secondStartIndex)]
}
}
Edit
Working on this a little further and came up with something a little more idiomatic?
let startSplits = split(testText, { $0 == "<" })
let strippedValues = map(startSplits) { (s) -> String? in
if let endIndex = find(s, ">") {
return s[Range<String.Index>(start: endIndex.successor(), end: s.endIndex)]
}
return nil
}
let strings = map(filter(strippedValues, { $0 != "" })) { $0! }
It uses a little more functional style there at the end. Not sure I much enjoy the Swift style of map/filter compared to Haskell. But anyhow, the one potentially dangerous part is that forced unwrapping in the final map. If you can live with a result of [String?] then it isn't necessary.
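The snippets above use the early Swift API (`find`, `successor()`, free `map` and `filter` functions), which no longer compiles. In current Swift the same "between the first > and the next <" idea might look roughly like this. The function name and the sample anchor tag, including its URL, are invented for illustration.

```swift
/// Hypothetical helper: text between the first ">" and the following "<".
func textInsideFirstTag(_ raw: String) -> String? {
    guard let open = raw.firstIndex(of: ">") else { return nil }
    // Everything after the first ">" as a Substring (no copy made).
    let rest = raw[raw.index(after: open)...]
    guard let close = rest.firstIndex(of: "<") else { return nil }
    return String(rest[..<close])
}

let sample = "<a href=\"https://example.com\">Something</a>" // assumed input
print(textInsideFirstTag(sample) ?? "no tag found") // Something
```

Both searches are single forward scans over the string, so there is no extra splitting or copying beyond the final `String`.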
Even though this question has already been answered, I am adding a solution based on regex.
let pattern = "<.*>(.*)<.*>"
let src = "Something"
var error: NSError? = nil
var regex = NSRegularExpression(pattern: pattern, options: .DotMatchesLineSeparators, error: &error)
if let regex = regex {
var result = regex.stringByReplacingMatchesInString(src, options: nil, range: NSRange(location:0,
length:countElements(src)), withTemplate: "$1")
println(result)
}
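The regex code above is also written against the Swift 1 API (`error:` parameter, `countElements`, `println`). With a current Foundation, a regex-based removal of the tags can be done without creating an NSRegularExpression at all. This is a sketch with an assumed sample string and a simple pattern that removes anything of the form <...>:

```swift
import Foundation

let src = "<a href=\"https://example.com\">Something</a>" // assumed input

// Remove every "<...>" run; what is left is the text between the tags.
let stripped = src.replacingOccurrences(
    of: "<[^>]+>",
    with: "",
    options: .regularExpression)
print(stripped) // Something
```

A pattern like this is fine for a single known anchor tag, but it is not a general HTML parser.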