How do you remove only punctuation within words in a string?

PROBLEM:
Remove all punctuation that appears inside words in a string, but keep punctuation at the beginning or end of a word (e.g. we'll -> well, We're -> Were).
WHAT I'VE TRIED (only handles the one punctuation character I name, so I would have to repeat it for every kind of punctuation; a bad solution):
INPUT: let testString = "What's with this project I'm trying to build, it's cra!zy!"
if let resultRange = testString.range(of: "'") {
    let startIndex = resultRange.lowerBound
    let endIndex = resultRange.upperBound
    let range = startIndex..<endIndex
    let result = testString.replacingCharacters(in: range, with: "")
    print(result) // only the first apostrophe is removed: "Whats with this project I'm trying to build, it's cra!zy!"
}
ALSO TRIED (good, but it also removes punctuation at the start and end of words, which fails the requirement above):
extension String {
    func removingCharacters(inCharacterSet forbiddenCharacters: CharacterSet) -> String {
        var filteredString = self
        // Keep removing until no forbidden character is left
        while let forbiddenCharRange = filteredString.rangeOfCharacter(from: forbiddenCharacters) {
            filteredString.removeSubrange(forbiddenCharRange)
        }
        return filteredString
    }
}
let resultMyString = testString.removingCharacters(inCharacterSet: .punctuationCharacters)
print(resultMyString) // outputs = "Whats with this project Im trying to build its crazy"
DESIRED OUTPUT: "Whats with this project Im trying to build, its crazy!"

The regular-expression character class that matches any punctuation character is [:punct:].
Put a word boundary (\b) on each side so that it only matches punctuation inside a word: "\b[:punct:]\b"
let testString = "What's with this project I'm trying to build, it's cra!zy!"
let result = testString
    .replacingOccurrences(of: "\\b[:punct:]\\b", with: "", options: .regularExpression)
Outputs = "Whats with this project Im trying to build, its crazy!"
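A minimal sketch of a variant of the same idea, assuming Foundation's regular-expression support: it spells the punctuation class as \p{P} and uses lookarounds instead of \b, so a punctuation character is removed only when a word character sits directly on both sides of it. The helper name removingInnerPunctuation is made up for illustration and is not part of the answer above.

```swift
import Foundation

// Hypothetical helper (not from the original answer): removes a punctuation
// character only when a word character sits directly on both sides of it.
func removingInnerPunctuation(_ text: String) -> String {
    // (?<=\w)  preceded by a word character
    // \p{P}    any Unicode punctuation character
    // (?=\w)   followed by a word character
    return text.replacingOccurrences(of: "(?<=\\w)\\p{P}(?=\\w)",
                                     with: "",
                                     options: .regularExpression)
}

let sample = "What's with this project I'm trying to build, it's cra!zy!"
print(removingInnerPunctuation(sample))
// Whats with this project Im trying to build, its crazy!
```

Note that \p{P} covers punctuation only; symbols such as $ or + belong to other Unicode categories and would be left in place.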

Related

Replacing two ranges in a String simultaneously

Say you have a string that looks like this:
let myStr = "Hello, this is a test String"
And you have two ranges:
let rangeOne = myStr.range(of: "Hello") // character offsets 0..<5
let rangeTwo = myStr.range(of: "this") // character offsets 7..<11
Now you want to replace both ranges of myStr with new text that may not be the same length as the original. You end up with this:
var myStr = "Hello, this is a test String"
let rangeOne = myStr.range(of: "Hello")!
let rangeTwo = myStr.range(of: "this")!
myStr.replaceSubrange(rangeOne, with: "Bonjour") //Bonjour, this is a test String
myStr.replaceSubrange(rangeTwo, with: "ce") //Bonjourceis is a test String
Because rangeTwo was computed on the string before it was altered, the second replacement lands in the wrong place.
I could store the length of the replacement and use it to reconstruct a new range, but there is no guarantee that rangeOne will be the first to be replaced, nor that rangeOne will actually be first in the string.

The solution is the same as when removing multiple items from an array by index in a loop: do it backwards.
Replace rangeTwo first, then rangeOne:
myStr.replaceSubrange(rangeTwo, with: "ce")
myStr.replaceSubrange(rangeOne, with: "Bonjour")
An alternative could be replacingOccurrences(of:with:).

This problem can be solved by shifting the second range by the difference in length between the first replacement and the text it replaces.
Using your code, here is how you would do it:
var myStr = "Hello, this is a test String"
let rangeOne = myStr.range(of: "Hello")!
let rangeTwo = myStr.range(of: "this")!
let shift = "Bonjour".count - "Hello".count
let shiftedTwo = myStr.index(rangeTwo.lowerBound, offsetBy: shift)..<myStr.index(rangeTwo.upperBound, offsetBy: shift)
myStr.replaceSubrange(rangeOne, with: "Bonjour") // Bonjour, this is a test String
myStr.replaceSubrange(shiftedTwo, with: "ce") // Bonjour, ce is a test String
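The same shift can also be computed on plain character offsets, which avoids building indices from the string as it was before the first replacement. This is only a sketch, and it assumes that rangeOne lies entirely before rangeTwo and that the two ranges do not overlap; the names twoStart, twoLength and oneLength are introduced here for illustration.

```swift
import Foundation

var myStr = "Hello, this is a test String"
let rangeOne = myStr.range(of: "Hello")!
let rangeTwo = myStr.range(of: "this")!

// Record both ranges as character offsets before the string changes.
let twoStart = myStr.distance(from: myStr.startIndex, to: rangeTwo.lowerBound)
let twoLength = myStr.distance(from: rangeTwo.lowerBound, to: rangeTwo.upperBound)
let oneLength = myStr.distance(from: rangeOne.lowerBound, to: rangeOne.upperBound)

let replacementOne = "Bonjour"
myStr.replaceSubrange(rangeOne, with: replacementOne)

// Rebuild rangeTwo in the updated string, shifted by the change in length.
let shift = replacementOne.count - oneLength
let newLower = myStr.index(myStr.startIndex, offsetBy: twoStart + shift)
let newUpper = myStr.index(newLower, offsetBy: twoLength)
myStr.replaceSubrange(newLower..<newUpper, with: "ce")

print(myStr) // Bonjour, ce is a test String
```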

You can sort the ranges in descending order, then replace backwards, from the end to the start, so that no replacement is affected by the ones made before it. Also, it is safer to use replacingCharacters instead of replaceSubrange when dealing with multi-codepoint characters.
import Foundation
let myStr = "Hello, this is a test String"
// Pair each range with its replacement so that sorting keeps them together
var edits = [(range: myStr.range(of: "Hello")!, text: "Bonjour😀"),
             (range: myStr.range(of: "this")!, text: "ce")]
edits.shuffle() // the input order does not matter
edits.sort { $0.range.lowerBound > $1.range.lowerBound } // Sort in reverse order
var newStr = myStr
for (i, edit) in edits.enumerated() {
    // Check overlap: after sorting, comparing each range with the next one is enough
    if i + 1 < edits.count {
        precondition(!edits[i + 1].range.overlaps(edit.range), "Some ranges overlap")
    }
    newStr = newStr.replacingCharacters(in: edit.range, with: edit.text)
}
print(newStr) // Bonjour😀, ce is a test String

My solution ended up being to take the ranges and replacement strings, then work backwards through them and replace each one:
extension String {
func replacingRanges(_ ranges: [NSRange], with insertions: [String]) -> String {
var copy = self
copy.replaceRanges(ranges, with: insertions)
return copy
}
mutating func replaceRanges(_ ranges: [NSRange], with insertions: [String]) {
var pairs = Array(zip(ranges, insertions))
pairs.sort(by: { $0.0.upperBound > $1.0.upperBound })
for (range, replacementText) in pairs {
guard let textRange = Range(range, in: self) else { continue }
replaceSubrange(textRange, with: replacementText)
}
}
}
It can then be used like this:
var myStr = "Hello, this is a test."
let rangeOne = NSRange(location: 0, length: 5) // “Hello”
let rangeTwo = NSRange(location: 7, length: 4) // “this”
myStr.replaceRanges([rangeOne, rangeTwo], with: ["Bonjour", "ce"])
print(myStr) // Bonjour, ce is a test.

Get the string up to a specific character

var hello = "hello, how are you?"
var hello2 = "hello, how are you #tom?"
I want to delete everything from the # sign onwards.
The result should be:
var hello2 = "hello, how are you #tom?"
->
hello2.trimmed()
print(hello2.trimmed())
-> "hello, how are you"
Update
I want to use this to link multiple users and replace the space after the # sign with the correct name, so I always need a reference to the last occurrence of the # sign in order to replace it.
text3 = "hey i love you #Tom #Marcus #Peter"
Example of what the final version should look like, to start off:
var text = "hello #tom #mark #mathias"
I always want to get the index of the last # sign in the text.

Expanding on @appzYourLife's answer, the following also trims the whitespace left over after removing everything from the # symbol onwards.
import Foundation
var str = "hello, how are you #tom"
if let hashRange = str.range(of: "#") {
    // Keep the text before "#" and drop the trailing whitespace
    str = String(str[..<hashRange.lowerBound]).trimmingCharacters(in: .whitespacesAndNewlines)
}
print(str) // Output - "hello, how are you"
UPDATE:
In response to finding the last occurrence of the # symbol in the string and removing everything from it onwards, here is how I would approach it:
var str = "hello, how are you #tom #tim?"
if str.contains("#") {
    // Reverse the string
    let reversedStr = String(str.reversed())
    // Find the first occurrence of # in the reversed string (the last one in the original)
    let endIndex = reversedStr.range(of: "#")!.upperBound
    // Keep what follows the # in the reversed string (what precedes it in the original)
    let newStr = String(reversedStr[endIndex...]).trimmingCharacters(in: .whitespacesAndNewlines)
    // Reverse back and store the new string over the original
    str = String(newStr.reversed())
    // str = "hello, how are you #tom"
}
Or, following @appzYourLife's answer, use range(of:options:range:locale:) instead of literally reversing the characters:
var str = "hello, how are you #tom #tim?"
// Find the last occurrence of #
if let hashRange = str.range(of: "#", options: .backwards, range: nil, locale: nil) {
    // Keep the text before the last # and store it over the original
    str = String(str[..<hashRange.lowerBound]).trimmingCharacters(in: .whitespacesAndNewlines)
    // str = "hello, how are you #tom"
}
As an added bonus, here is how I would remove every # mention, starting with the last one and working towards the start:
var str = "hello, how are you #tom and #tim?"
while str.contains("#") {
    // Reverse the string
    let reversedStr = String(str.reversed())
    // Find the first occurrence of # in the reversed string (the last one in the original)
    let endIndex = reversedStr.range(of: "#")!.upperBound
    // Keep what follows the # in the reversed string (what precedes it in the original)
    let newStr = String(reversedStr[endIndex...]).trimmingCharacters(in: .whitespacesAndNewlines)
    // Reverse back and store the new string over the original
    str = String(newStr.reversed())
}
// after the while loop, str = "hello, how are you"

let text = "hello, how are you #tom?"
let trimSpot = text.firstIndex(of: "#") ?? text.endIndex
let trimmed = text[..<trimSpot]
Since a string is a collection of Character values, it can be accessed as such. The second line finds the index of the # sign and assigns it to trimSpot; if there is no # sign, the nil-coalescing operator ?? assigns the string's endIndex instead.
A string can be subscripted with a range that tells it which characters to return. The expression inside the brackets, ..<trimSpot, is a range from the start of the string up to, but not including, trimSpot. So text[..<trimSpot] returns a Substring, which shares storage with the original String instance.
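Because a Substring shares storage with the string it was taken from, it is usually converted to a String before being kept around. A small sketch of that step; the function name textBeforeHash is hypothetical and only wraps the three lines above.

```swift
// Hypothetical helper: returns everything before the first "#",
// or the whole text when there is no "#".
func textBeforeHash(_ text: String) -> String {
    let trimSpot = text.firstIndex(of: "#") ?? text.endIndex
    let slice: Substring = text[..<trimSpot]   // a view into `text`
    return String(slice)                       // an independent copy
}

print(textBeforeHash("hello, how are you #tom?")) // "hello, how are you " (note the trailing space)
print(textBeforeHash("no hash here"))             // "no hash here"
```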

You need to find the range of the "#" and then use it to create a substring up to, but not including, that index.
import Foundation
let text = "hello, how are you #tom?"
if let range = text.range(of: "#") {
    let result = String(text[..<range.lowerBound])
    print(result) // "hello, how are you "
}
Considerations
Please note that, following the logic you described and using the input text you provided, the output string will have a blank space as its last character.
Also note that if multiple # are present in the input text, the first occurrence will be used.
Last index [Update]
I am adding this new section to answer the question you posted in the comments.
If you have a text like this
let text = "hello #tom #mark #mathias"
and you want the index of the last occurrence of "#", you can write
if let index = text.range(of: "#", options: .backwards)?.lowerBound {
    print(index)
}
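Printing a String.Index does not show a readable position. If an integer position is what is wanted, it can be derived with distance(from:to:). The following is a sketch under that assumption; lastHashOffset is a hypothetical helper name.

```swift
import Foundation

// Hypothetical helper: character offset of the last "#", or nil if there is none.
func lastHashOffset(in text: String) -> Int? {
    guard let hashRange = text.range(of: "#", options: .backwards) else { return nil }
    return text.distance(from: text.startIndex, to: hashRange.lowerBound)
}

let text = "hello #tom #mark #mathias"
if let offset = lastHashOffset(in: text) {
    let start = text.index(text.startIndex, offsetBy: offset)
    print(offset)          // 17
    print(text[start...])  // #mathias
}
```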

Try regular expressions; they are much safer (if you know what you are doing...)
import Foundation
let hello2 = "hello, how are you #tom, #my #next #victim?"
let deletedStringsAfterAtSign = hello2.replacingOccurrences(of: "#\\w+", with: "", options: .regularExpression, range: nil)
print(deletedStringsAfterAtSign)
//prints "hello, how are you ,   ?"
This removes each # together with the word that follows it and leaves the surrounding characters untouched, so you can see that the , and ? are still there. :)
EDIT: what you asked for in the comments to this answer:
let hello2 = "hello, how are you #tom, #my #next #victim?"
if let elementIwannaAfterEveryAtSign = hello2.components(separatedBy: " #").last
{
    let deletedStringsAfterAtSign = hello2.replacingOccurrences(of: "#\\w+", with: elementIwannaAfterEveryAtSign, options: .regularExpression, range: nil)
    print(deletedStringsAfterAtSign)
    //prints hello, how are you victim?, victim? victim? victim??
}

Remove suffix from filename in Swift

When trying to remove the suffix from a filename, I'm left with only the suffix, which is exactly what I don't want.
What (how many things) am I doing wrong here:
let myTextureAtlas = SKTextureAtlas(named: "demoArt")
let filename = (myTextureAtlas.textureNames.first?.characters.split{$0 == "."}.map(String.init)[1].replacingOccurrences(of: "\'", with: ""))! as String
print(filename)
This prints png which is the most dull part of the whole thing.

If by suffix you mean path extension, there is a method for this:
let filename = "demoArt.png"
let name = (filename as NSString).deletingPathExtension
// name - "demoArt"

Some people here seem to overlook that a filename can contain multiple periods, in which case only the last period separates the file extension. So stripping the extension from this.is.a.valid.image.filename.jpg should return this.is.a.valid.image.filename, not this (as two answers here would produce) or anything in between. The regex answer works correctly, but using a regex for this is overkill (and probably considerably slower than simple string processing). Here's a generic function that handles all of these cases:
func stripFileExtension ( _ filename: String ) -> String {
    var components = filename.components(separatedBy: ".")
    guard components.count > 1 else { return filename }
    components.removeLast()
    return components.joined(separator: ".")
}
print("1: \(stripFileExtension("foo"))")
print("2: \(stripFileExtension("foo.bar"))")
print("3: \(stripFileExtension("foo.bar.foobar"))")
Output:
1: foo
2: foo
3: foo.bar

You can also split the String using components(separatedBy:), like this:
let fileName = "demoArt.png"
var components = fileName.components(separatedBy: ".")
if components.count > 1 { // If there is a file extension
    components.removeLast()
}
// With no extension there is a single component, so joining returns fileName unchanged
let name = components.joined(separator: ".") // "demoArt"
To clarify:
fileName.components(separatedBy: ".")
will return an array made up of "demoArt" and "png".

Array indices start at 0. If you split the string on ".", the name without the extension is stored in the first element and the extension in the second.
Simple example:
let fileName = "demoArt.png"
let name = fileName.split(separator: ".").first.map(String.init)

If you don't care what the extension is, this is a simple way:
let ss = fileName.prefix(upTo: fileName.lastIndex(of: ".") ?? fileName.endIndex)
You may want to convert the resulting Substring to a String afterwards, with String(ss).

@Confused, with Swift 4 you can do this:
let fileName = "demoArt.png"
// or in your specific case:
// let fileName = myTextureAtlas.textureNames.first
let name = String(fileName.split(separator: ".").first!)
print(name)
Additionally, you should unwrap first safely rather than force-unwrapping it, but I didn't want to complicate the sample code.
Btw, since I've also needed this recently, if you want to remove a specific suffix you know in advance, you can do something like this:
let fileName = "demoArt.png"
let fileNameExtension = ".png"
if fileName.hasSuffix(fileNameExtension) {
let name = fileName.prefix(fileName.count - fileNameExtension.count)
print(name)
}

How about using .dropLast(k), where k is the number of characters to drop from the end?
Otherwise, to remove the extension from a filename properly, I strongly recommend using URL with .deletingPathExtension().lastPathComponent.
It may be a bit of overhead, but at least it's a rock-solid Apple API.
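A short sketch of the URL approach mentioned above, assuming Foundation is available and the input is a plain file name or path; baseName(ofFile:) is a hypothetical wrapper name.

```swift
import Foundation

// Hypothetical helper built on URL's path handling.
func baseName(ofFile fileName: String) -> String {
    return URL(fileURLWithPath: fileName).deletingPathExtension().lastPathComponent
}

print(baseName(ofFile: "demoArt.png"))                         // demoArt
print(baseName(ofFile: "this.is.a.valid.image.filename.jpg"))  // this.is.a.valid.image.filename
print(baseName(ofFile: "foo"))                                 // foo
```

Only the text after the last period is treated as the extension, so names with several periods keep everything before it.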

You can also use a regular expression to extract everything before the last dot, like this:
import Foundation
let fileName = "test.png"
let pattern = "^(.*)(\\.[a-zA-Z]+)$"
let regexp = try! NSRegularExpression(pattern: pattern, options: [])
// NSRange is measured in UTF-16 units, so build it from the string's own indices
let fullRange = NSRange(fileName.startIndex..., in: fileName)
let extractedName = regexp.stringByReplacingMatches(in: fileName, options: [], range: fullRange, withTemplate: "$1")
print(extractedName) //test

let mp3Files = ["alarm.mp3", "bubbles.mp3", "fanfare.mp3"]
// compactMap drops the nil that .first would return for an empty name
let ringtonsArray = mp3Files.compactMap { $0.components(separatedBy: ".").first }

You can return a new string by removing a fixed number of characters from the end.
let fileName = "demoArt.png"
fileName.dropLast(4)
This code returns "demoArt" (as a Substring).

One liner:
let stringWithSuffixDropped = fileName.split(separator: ".").dropLast().joined(separator: ".")
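One caveat with the one-liner: split(separator:) omits empty pieces by default, so a name without any period ends up as an empty string and consecutive periods are collapsed. A sketch of an alternative that cuts at the last period instead; droppingExtension is a hypothetical name.

```swift
// Hypothetical helper: drops the text after the last "." and
// returns the name unchanged when it contains no ".".
func droppingExtension(_ fileName: String) -> String {
    guard let dot = fileName.lastIndex(of: ".") else { return fileName }
    return String(fileName[..<dot])
}

print(droppingExtension("demoArt.png"))    // demoArt
print(droppingExtension("foo.bar.foobar")) // foo.bar
print(droppingExtension("foo"))            // foo
```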

swift: how can I delete a specific character?

Given a string such as "! !! yuahl! !", I want to delete the "!" and " " characters. When I write the code like this:
for index in InputName.characters.indices {
if String(InputName[index]) == "" || InputName.substringToIndex(index) == "!" {
InputName.removeAtIndex(index)
}
}
I get this error: "fatal error: subscript: subRange extends past String end". What should I do? Thanks :D

Swift 5+
let myString = "aaaaaaaabbbb"
let replaced = myString.replacingOccurrences(of: "bbbb", with: "") // "aaaaaaaa"

If you need to remove characters only on both ends, you can use stringByTrimmingCharactersInSet(_:)
let delCharSet = NSCharacterSet(charactersInString: "! ")
let s1 = "! aString! !"
let s1Del = s1.stringByTrimmingCharactersInSet(delCharSet)
print(s1Del) //->aString
let s2 = "! anotherString !! aString! !"
let s2Del = s2.stringByTrimmingCharactersInSet(delCharSet)
print(s2Del) //->anotherString !! aString
If you need to remove characters also in the middle, "reconstruct from the filtered output" would be a little bit more efficient than repeating single character removal.
var tempUSView = String.UnicodeScalarView()
tempUSView.appendContentsOf(s2.unicodeScalars.lazy.filter{!delCharSet.longCharacterIsMember($0.value)})
let s2DelAll = String(tempUSView)
print(s2DelAll) //->anotherStringaString
If you don't mind generating many intermediate Strings and Arrays, this single liner can generate the expected output:
let s2DelAll2 = s2.componentsSeparatedByCharactersInSet(delCharSet).joinWithSeparator("")
print(s2DelAll2) //->anotherStringaString
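The snippets above use Swift 2 method names. A sketch of how the same three operations might look in Swift 5, assuming Foundation's CharacterSet; the variable keptScalars is introduced here for illustration.

```swift
import Foundation

let delCharSet = CharacterSet(charactersIn: "! ")
let s2 = "! anotherString !! aString! !"

// Trim only at both ends
let s2Del = s2.trimmingCharacters(in: delCharSet)
print(s2Del) // anotherString !! aString

// Rebuild the string from the Unicode scalars that are not in the set
var keptScalars = String.UnicodeScalarView()
keptScalars.append(contentsOf: s2.unicodeScalars.filter { !delCharSet.contains($0) })
let s2DelAll = String(keptScalars)
print(s2DelAll) // anotherStringaString

// Split on the set and join the pieces
let s2DelAll2 = s2.components(separatedBy: delCharSet).joined()
print(s2DelAll2) // anotherStringaString
```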

I find that the filter method is a good way to go for this sort of thing:
let unfiltered = "! !! yuahl! !"
// Characters to remove
let removal: [Character] = ["!", " "]
// Keep only the characters that are not in the removal list
// (filter on a String returns a String in current Swift)
let filtered = unfiltered.filter { !removal.contains($0) }
print(filtered) // => "yuahl"
// combined into a single line
print(unfiltered.filter { !removal.contains($0) }) // => "yuahl"

Swift 3
In Swift 3, the syntax is a bit nicer. As a result of the Great Swiftification of the old APIs, the method is now called trimmingCharacters(in:). Also, you can construct the CharacterSet from an array literal of single-character literals:
let string = "! !! yuahl! !"
string.trimmingCharacters(in: [" ", "!"]) // "yuahl"
If there are characters in the middle of the string that you would like to remove as well, you can use components(separatedBy:).joined():
let string = "! !! yu !ahl! !"
string.components(separatedBy: ["!", " "]).joined() // "yuahl"
H/T @OOPer for the Swift 2 version

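extension String {
    // Removes trailing characters that belong to `chars`
    func trimLast(character chars: Set<Character>) -> String {
        let str = String(self.reversed())
        // First character from the end that should be kept
        guard let index = str.firstIndex(where: { !chars.contains($0) }) else {
            // No such character found: the string is returned unchanged
            return self
        }
        return String(str[index...].reversed())
    }
}
Note:
Adding this function in a String extension lets you trim the given characters from the end of a string.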

// Iterate from the end so that a removal does not shift the positions still to be visited
for index in InputName.indices.reversed() {
    if InputName[index] == " " || InputName[index] == "!" {
        InputName.remove(at: index)
    }
}

You can also add this helpful extension:
import Foundation
extension String {
    func exclude(find: String) -> String {
        return replacingOccurrences(of: find, with: "", options: .caseInsensitive, range: nil)
    }
    func replaceAll(find: String, with: String) -> String {
        return replacingOccurrences(of: find, with: with, options: .caseInsensitive, range: nil)
    }
}

You can use this. For example, to remove the percent sign "%" from "10%":
if let i = text.firstIndex(of: "%") {
    text.remove(at: i) // text is now "10"
}
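firstIndex(of:) only finds the first match. When every occurrence should go, removeAll(where:) is one option. A small sketch with a made-up input string:

```swift
var text = "10% of 50%"

// firstIndex(of:) removes a single occurrence
if let i = text.firstIndex(of: "%") {
    text.remove(at: i)
}
print(text) // 10 of 50%

// removeAll(where:) removes every remaining occurrence
text.removeAll { $0 == "%" }
print(text) // 10 of 50
```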

Substrings in Swift

I'm having a problem understanding how to work with substrings in Swift. Basically, I'm getting a JSON value that has a string with the following format:
<a href="...">Something</a>
I'm trying to get rid of the HTML anchor tag with Swift so I'm left with Something. My thought was to find the index of every < and > in the string so that I could then do a substringWithRange and advance to the right index.
My problem is that I can't figure out how to find the index. I've read that Swift strings don't support looking up an index (unless you extend them).
I don't want to add CPU cycles unnecessarily. So my question is, how do I find the indexes efficiently? Or is there a better way of filtering out the tags?
Edit: Converted Andrew's first code sample to a function:
func formatTwitterSource(rawStr: String) -> String {
    let unParsedString = rawStr
    var parsedString = ""
    if let firstEndIndex = unParsedString.firstIndex(of: ">") {
        // Everything after the opening tag
        let midParseString = unParsedString[unParsedString.index(after: firstEndIndex)...]
        if let secondStartIndex = midParseString.firstIndex(of: "<") {
            // Everything before the closing tag
            parsedString = String(midParseString[..<secondStartIndex])
        }
    }
    return parsedString
}
Nothing too complicated. It takes in a String that has the tags in it. Then it uses Andrew's magic to parse everything out. I renamed the variables and made them clearer so you can see which variable does what in the process. Then in the end, it returns the parsed string.

You could do something like this, though it isn't really pretty. Obviously you would want to factor this into a function and possibly allow for various start/end tokens.
let testText = "<a href=\"...\">Something</a>"
if let firstEndIndex = testText.firstIndex(of: ">") {
    // Drop everything up to and including the first ">"
    let testText2 = testText[testText.index(after: firstEndIndex)...]
    if let secondStartIndex = testText2.firstIndex(of: "<") {
        // Keep everything before the next "<"
        let testText3 = testText2[..<secondStartIndex]
    }
}
Edit
After working on this a little further, I came up with something a little more idiomatic:
let startSplits = testText.split(separator: "<")
let strippedValues = startSplits.map { s -> String? in
    if let endIndex = s.firstIndex(of: ">") {
        return String(s[s.index(after: endIndex)...])
    }
    return nil
}
let strings = strippedValues.filter { $0 != "" }.map { $0! }
It uses a more functional style at the end. I'm not sure I much enjoy the Swift style of map/filter compared to Haskell. Anyhow, the one potentially dangerous part is the forced unwrapping in the final map: a piece with no ">" becomes nil, passes the filter and crashes on $0!. If you can live with a result of [String?], the unwrapping isn't necessary; compactMap is another way to drop the nil values.

Even though this question has already been answered, I am adding a solution based on a regular expression.
import Foundation
let pattern = "<.*>(.*)<.*>"
let src = "<a href=\"...\">Something</a>"
if let regex = try? NSRegularExpression(pattern: pattern, options: .dotMatchesLineSeparators) {
    // Replace the whole match with the text captured between the two tags
    let result = regex.stringByReplacingMatches(in: src, options: [], range: NSRange(src.startIndex..., in: src), withTemplate: "$1")
    print(result)
}
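If the goal is simply to drop every tag rather than capture the text between one pair, replacingOccurrences with the .regularExpression option may be enough. This is a rough sketch, not an HTML parser; the helper name strippingTags and the example URL are made up for illustration.

```swift
import Foundation

// Hypothetical helper: removes anything that looks like <...>.
// This is a rough text cleanup, not an HTML parser.
func strippingTags(_ html: String) -> String {
    return html.replacingOccurrences(of: "<[^>]+>", with: "", options: .regularExpression)
}

print(strippingTags("<a href=\"https://example.com\">Something</a>")) // Something
```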
