How to get non-escaped apostrophe from .components(separatedBy: CharacterSet)

How can I get components(separatedBy: CharacterSet) to return substrings that do not contain escaped apostrophes or single quotes?
When I print the resulting array, I want it to not include the backslash character.
I am using a playground to manipulate text and produce output in the terminal that I can copy and use outside of Xcode, so I want to strip the escape character from the string representation produced in the terminal output.
var str = "can't,,, won't, , good-bye, Santa Claus"
var delimiters = CharacterSet.letters.inverted.subtracting(.whitespaces)
delimiters = delimiters.subtracting(CharacterSet(charactersIn: "-"))
delimiters = delimiters.subtracting(CharacterSet(charactersIn: "'"))
var result = str.components(separatedBy: delimiters)
    .map({ $0.trimmingCharacters(in: .whitespaces) })
    .filter({ !$0.isEmpty })
print(result) // ["can\'t", "won\'t", "good-bye", "Santa Claus"]

What you are asking for is a metaphysical impossibility. You cannot want anything about how print prints. It's only a representation in the log.
Your strings do not actually contain any backslashes, so what's the problem? How the print command output notates them is irrelevant. You might as well "want" the print command to translate your strings into French. No, that's not what it does. It just prints, and the way it prints is the way it prints.
Another way to look at it: An array doesn't contain square brackets at both ends. And a string doesn't contain double-quotes at both ends. Those are things you might write in order to express those things as literals, but they are not real as part of the actual object. Well, I don't see you objecting to those!
Basically, if you want to control the output of something, you write an output routine. If you're going to rely on print, just accept the funny old way it writes stuff and move on.
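To make the "write an output routine" advice concrete, here is a minimal sketch. The helper name literalStyle is hypothetical (it is not from the question or from any library); it builds the bracketed, quoted text by hand instead of relying on how print describes an array. It also checks that none of the substrings actually contains a backslash.

```swift
import Foundation

let str = "can't,,, won't, , good-bye, Santa Claus"

// Same delimiter set as in the question: every non-letter except
// whitespace, the hyphen, and the apostrophe.
let delimiters = CharacterSet.letters.inverted
    .subtracting(.whitespaces)
    .subtracting(CharacterSet(charactersIn: "-'"))

let result = str.components(separatedBy: delimiters)
    .map { $0.trimmingCharacters(in: .whitespaces) }
    .filter { !$0.isEmpty }

// Hypothetical output routine: wraps each element in double quotes
// without applying any escaping of its own.
func literalStyle(_ items: [String]) -> String {
    return "[" + items.map { "\"\($0)\"" }.joined(separator: ", ") + "]"
}

// The strings themselves never contained a backslash.
let hasBackslash = result.contains(where: { $0.contains("\\") })

print(literalStyle(result)) // ["can't", "won't", "good-bye", "Santa Claus"]
print(hasBackslash)         // false
```

Note that this sketch does no escaping at all, so an element containing a double quote would be written out verbatim; that is fine for copying text out of a terminal but not for producing valid source literals.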

Related

How can I read characters of a String with String format in Swift?

I am trying to use String(format:) to read some characters from the left or right of a string. Do we have something for this job?
For example, reading 2 characters from the left would give "AB", with something like this: "%2L@"
my code:
let stringOfText = String(format: "%@", "ABCDEF")
String(format:) is usually used to transform a different value type into a string.
Since you already have a string, you don't really need this method.
try:
https://developer.apple.com/documentation/swift/string/2894830-prefix
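As a short sketch of what the linked prefix method looks like in use (suffix is its counterpart for the right-hand side; the variable names here are only illustrative):

```swift
let text = "ABCDEF"

// prefix(_:) and suffix(_:) return a Substring, so wrap the result
// in String(...) when a String is needed.
let firstTwo = String(text.prefix(2))
let lastTwo = String(text.suffix(2))

print(firstTwo) // AB
print(lastTwo)  // EF
```

Both methods clamp to the length of the string, so asking for more characters than exist returns the whole string rather than crashing.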

How do I format a string from a string with %# in Swift

I am using Swift 4.2. I am getting extraneous characters when formatting one string (s1) from another string (s0) using the %@ format code.
I have searched extensively for details of string formatting but have come up with only partial answers including the code in the second line below. I need to be able to format s1 so that I can customize output from a Swift process. I ask this because I have not found an answer while searching for ways to format a string from a string.
I tried the following three statements:
let s0:[String] = ["abcdef"]
let s1:[String] = [String(format:"%@",s0)]
print(s1)
...
The output is shown below. It may not be clear, here, but there are four leading spaces to the left of the abcdef string.
["(\n abcdef\n)"]
How can I format s1 so it does not include the brackets, the \n escape characters, and the leading spaces?
The issue here is that s0 is an array, not a string.
So indexing into it will help:
let s0:[String] = ["abcdef"]
let s1:[String] = [String(format:" %@",s0[0])]
I am getting extraneous characters when formatting one string (s1) from another string (s0) ...
The s0 is not a string. It is an array of strings (i.e. the square brackets of [String] indicate an array, which is the same as saying Array<String>). And your s1 is also an array, but one that has one element, whose value is the string representation of the entire s0 array of strings. That’s obviously not what you intended.
How can I format s1 so it does not include the brackets, the \n escape characters, and the leading spaces?
You’re getting those brackets because s1 is an array. You’re getting the string with the \n and spaces because its first value is the string representation of yet another array, s0.
So, if you’re just trying to format a string, s0, you can do:
let s0: String = "abcdef"
let s1: String = String(format: "It is ‘%@’", s0)
Or, if you really want an array of strings, you can call String(format:) for each using the map function:
let s0: [String] = ["abcdef", "ghijkl"]
let s1: [String] = s0.map { String(format: "It is ‘%@’", $0) }
By the way, in the examples above, I didn’t use a string format of just %@, because that doesn’t accomplish anything at all, so I assumed you were formatting the string for a reason.
FWIW, we generally don’t use String(format:) very often. Usually we do “string interpolation”, with \( and ):
let s0: String = "abcdef"
let s1: String = "It is ‘\(s0)’"
Get rid of all the unnecessary arrays and let the compiler figure out the types:
let s0 = "abcdef" // a string
let s1 = String(format:"- %@ -",s0) // another string
print(s1) // prints "- abcdef -"

Carriage return character not being matched in Swift

I'm trying to parse a file that (apparently) ends its lines with carriage returns, but they aren't being matched as such in Swift, despite having the same UTF8 value. I can see possible fixes for the problem, but I'm curious as to what these characters actually are.
Here's some sample code, with the output below. (CR is set using Character("\r"), although I've tried it using "\r" as well.)
try f.forEach() { c in
print(c, terminator:" ") // DBG
if (c == "\r") {
print("Carriage return found!")
}
print(String(c).utf8.first!, terminator:" ")//DBG
print(String(describing:pstate)) // DBG
...
case .field:
switch c {
case CR,LF :
self.endline()
pstate = .eol
When it reaches the end of line (which shows up as such in my text editors), I get this:
. 46 field
0 48 field
13 field
I 73 field
It doesn't seem to be matching using == or in the switch statement. Is there another approach I should be using for this character?
(I'll note that the parsing works fine with files that terminate in newlines.)
I determined what the problem was. By looking at c.unicodeScalars I discovered that the end-of-line character was in fact "\r\n", not just "\r". As seen in my code, I was only taking the first byte when printing it out as UTF-8. I don't know if that's something from String.forEach or in the file itself.
I know that there are tests to determine if something is a newline. Swift 5 has them directly (c.isNewline), and there is also the CharacterSet approach as noted by Bill Nattaner.
I'm happier with something that will work in my switch statement (and thus I'll define each one explicitly), but that might change if I expect to deal with a wider variety of files.
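A minimal sketch of why the comparison failed, and of one way to keep the switch: Swift's Character is an extended grapheme cluster, and a carriage return followed by a line feed forms a single Character that is not equal to "\r". The sample string and the isLineEnd helper below are illustrative assumptions, not code from the original parser.

```swift
// Hypothetical input with a Windows-style line ending between two fields.
let sample = "1.0\r\nI"

// "\r\n" is one Character made of two Unicode scalars (13 and 10).
let crlf: Character = "\r\n"
let crlfScalars = crlf.unicodeScalars.map { $0.value }

// Listing the combined form explicitly lets a switch handle all three cases.
func isLineEnd(_ c: Character) -> Bool {
    switch c {
    case "\r\n", "\r", "\n":
        return true
    default:
        return false
    }
}

for c in sample where isLineEnd(c) {
    print("Line end found, scalars:", c.unicodeScalars.map { $0.value })
}

print(sample.count)                    // 5
print(crlf == "\r")                    // false
print(String(crlf).utf8.first == 13)   // true
```

This also explains the debug output in the question: printing only the first UTF-8 byte shows 13 even though the Character being compared is the two-scalar "\r\n".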
I'm a little hazy as to what the f.forEach represents, but if your variable c is of type Character then you could replace your if statement with:
if "\(c)".rangeOfCharacter( from: CharacterSet.newlines ) != nil
{
print("Carriage return found!")
}
That way you won't have to invent a list of all-possible new line characters.

Is there a function to escape all regex-relevant characters?

The regex I'm using in my application is a combination of user-input and code. Because I don't want to restrict the user I would like to escape all regex-relevant characters like "+", brackets , slashes etc. from the entry.
Is there a function for that or at least an easy way to get all those characters in an array so that I can do something like this:
for regexChar in regexCharacterArray{
myCombinedRegex = myCombinedRegex.replacingOccurrences(of: regexChar, with: "\\" + regexChar)
}
Yes, there is NSRegularExpression.escapedPattern(for:):
Returns a string by adding backslash escapes as necessary to protect any characters that would match as pattern metacharacters.
Example:
let escaped = NSRegularExpression.escapedPattern(for: "[*]+")
print(escaped) // \[\*]\+
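As a sketch of how this fits the use case in the question, the escaped user input can be concatenated with pattern fragments written in code. The input string, the anchors, and the matches helper are assumptions made for illustration.

```swift
import Foundation

// Hypothetical user input containing a regex metacharacter.
let userInput = "1+1=2"

// Escape the user's text, then combine it with code-supplied anchors.
let safePattern = "^" + NSRegularExpression.escapedPattern(for: userInput) + "$"
let unsafePattern = "^" + userInput + "$"

// Illustrative helper: true if the pattern matches anywhere in the text.
func matches(_ pattern: String, _ text: String) -> Bool {
    let regex = try! NSRegularExpression(pattern: pattern)
    let range = NSRange(text.startIndex..., in: text)
    return regex.firstMatch(in: text, range: range) != nil
}

print(matches(safePattern, "1+1=2"))   // true: the "+" is treated literally
print(matches(safePattern, "11=2"))    // false
print(matches(unsafePattern, "11=2"))  // true: "1+" means "one or more 1s"
```

The try! is acceptable here only because both patterns are known to be valid; with arbitrary unescaped user input, the initializer can throw and should be handled with do/catch.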

Adding new line to NSCharacterSet

I want to strip a string of all new lines and commas (and place it into an array), so I created this:
let results = text.componentsSeparatedByCharactersInSet(NSCharacterSet(charactersInString: ",\n"))
However, the newlines still exist in my array (the commas are being removed). What's the correct way of adding a newline to the NSCharacterSet? Or, how do I add a comma to NSCharacterSet.newlineCharacterSet?
Thanks.
Here is a janky solution, but I'm still looking for a more elegant one.
var results = text.componentsSeparatedByCharactersInSet(NSCharacterSet(charactersInString: ","))
text = results.joinWithSeparator(" ")
results = text.componentsSeparatedByCharactersInSet(NSCharacterSet.whitespaceAndNewlineCharacterSet())
(one-line) SOLUTION:
var results = text.componentsSeparatedByCharactersInSet(NSCharacterSet(charactersInString: " ,\u{000A}\u{000B}\u{000C}\u{000D}\u{0085}"))
Explanation is below.
You can unite two NSCharacterSet by first using an NSMutableCharacterSet, for example:
let charset = NSMutableCharacterSet(charactersInString: ",")
charset.formUnionWithCharacterSet(NSCharacterSet.newlineCharacterSet())
let results = text.componentsSeparatedByCharactersInSet(charset)
So MartinR brought to my attention that there are more line feeds than just "\n".
I looked at the values used in NSCharacterSet.newlineCharacterSet and added them all, giving me:
var results = text.componentsSeparatedByCharactersInSet(NSCharacterSet(charactersInString: " ,\u{000A}\u{000B}\u{000C}\u{000D}\u{0085}"))
This got rid of all the whitespace, commas, and new lines. Interestingly - when I used all the newline values separately to see if I could figure out which newline was being used in my case, none of them worked. But when used all together, it strips my new lines.
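For readers on Swift 3 or later, where these APIs were renamed, the union approach can be sketched with the CharacterSet value type. The sample text is made up for illustration; the filter step is an addition that drops the empty strings produced by adjacent separators such as "\r\n".

```swift
import Foundation

// Hypothetical input mixing commas, a line feed, and a CRLF pair.
let text = "a,b\nc\r\nd"

// Combine the built-in newline set with a comma instead of listing
// each newline code point by hand.
let separators = CharacterSet.newlines.union(CharacterSet(charactersIn: ","))

let results = text.components(separatedBy: separators)
    .filter { !$0.isEmpty }

print(results) // ["a", "b", "c", "d"]
```

Without the filter, the CRLF pair would leave an empty string in the array, because each of its two code points is treated as a separate separator.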