How to print the contents of CharacterSet.decimalDigits? - swift

I tried to print the contents of CharacterSet.decimalDigits with:
print(CharacterSet.decimalDigits)
output: CFCharacterSet Predefined DecimalDigit Set
But I expected something like this:
[1, 2, 3, 4 ...]
So my question is: how do I print the contents of CharacterSet.decimalDigits?

This is not easy. Character sets are not made to be iterated, they are made to check whether a character is inside them or not. They don't contain the characters themselves and the ranges cannot be accessed.
The only thing you can do is to iterate over all characters and check every one of them against the character set, e.g.:
let set = CharacterSet.decimalDigits
let allCharacters: ClosedRange<UInt32> = 0 ... 0x10FFFF // Unicode scalar values end at U+10FFFF
allCharacters
.lazy
.compactMap { UnicodeScalar($0) }
.filter { set.contains($0) }
.map { String($0) }
.forEach { print($0) }
However, note that such a thing takes significant time and shouldn't be used inside a production application.

I don't think you can do that, at least not directly. If you look at the output of
let data = CharacterSet.decimalDigits.bitmapRepresentation
for byte in data {
print(String(format: "%02x", byte))
}
you'll see that the set internally stores bits at the code positions where the decimal digits are.
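As a rough sketch of how that bitmap could be decoded: Apple's documentation describes the raw bitmap as 8192 bytes in which bit n stands for the character with Unicode value n. The helper below (bmpScalars is a made-up name, not a Foundation API) reads only those first 8192 bytes, so it assumes you only care about the Basic Multilingual Plane and ignores any supplementary-plane data that follows.

```swift
import Foundation

// Hypothetical helper: lists the scalars of a CharacterSet that lie in the
// Basic Multilingual Plane by decoding the first 8192 bytes of its bitmap.
func bmpScalars(in set: CharacterSet) -> [Unicode.Scalar] {
    let bitmap = [UInt8](set.bitmapRepresentation.prefix(8192))
    var result: [Unicode.Scalar] = []
    for (byteIndex, byte) in bitmap.enumerated() where byte != 0 {
        for bit in 0..<8 where byte & (1 << bit) != 0 {
            // Bit n of the bitmap corresponds to code point n.
            if let scalar = Unicode.Scalar(UInt32(byteIndex * 8 + bit)) {
                result.append(scalar)
            }
        }
    }
    return result
}

let digits = bmpScalars(in: .decimalDigits)
print(digits.prefix(10).map { String($0) })
```

This avoids testing every possible scalar value against the set, at the cost of depending on the bitmap layout.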

Related

Iterate through every word from WordsArray to take every character from it

I'm making a hangman app, so the words I use should be displayed with "?" instead of letters.
if let wordsUrl = Bundle.main.url(forResource: "start", withExtension: "txt"){
if let wordsContent = try? String(contentsOf: wordsUrl){
var allWords = wordsContent.components(separatedBy: "\n")
I don't know how to index every word from the allWords array. After that I would change the letters using another property, which I would use for display:
for letter in word {
usedLetters.append(letter)
promptWord.append("?")
I’d recommend creating a method which you can call whenever your text field needs updating due to something such as a new letter input from the user.
var wordTextField: UITextField!
var usedLetters = [String]() // Uppercased letters the user has already guessed
let word = "hangman" // Word for the user to guess
var promptWord = "" // What will be displayed in the wordTextField
func updateTextField() {
    promptWord = "" // Start from an empty prompt on every update
    for letter in word.uppercased() {
        let strLetter = String(letter)
        if usedLetters.contains(strLetter) {
            promptWord += strLetter
        } else {
            promptWord += "?"
        }
    }
    wordTextField.text = promptWord
}
A brief explanation of what the code does:
Firstly it iterates through the word inspecting each letter (uppercase so that there are no inconsistencies with the characters when the comparison is made to what the user has entered as their guess).
Secondly, it checks whether strLetter is contained in the usedLetters array; if it is, the letter is placed at the correct location in promptWord.
Whenever the letter is not contained in the usedLetters array, a “?” is added to the string instead.
Finally, the text of wordTextField is set to promptWord, showing how many letters the user has left to guess as well as which letters the user has guessed correctly.
You can convert a String to an array of characters:
let string = "a String"
let characters = Array(string)
So you could map your array of words to an array of arrays of characters like this:
var allWordsAsCharacterArrays = allWords.map { Array($0) }
You can also populate strings with question marks using String.init(repeating:count:)
When you pick a word from your words array, you could convert it to an array of characters, and a working string that you would populate with an array of question marks. As the user picks letters, you could replace the question marks in the working string with the correct letters from the word they are guessing.
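A minimal sketch of that idea, assuming the guessed letters are kept lowercased in a set. The function name maskedWord and its parameter are made up for illustration and are not part of the question's code:

```swift
// Hypothetical helper: rebuilds the prompt from the word and the guesses so far.
// guessedLetters is assumed to hold lowercased single-letter strings.
func maskedWord(_ word: String, guessedLetters: Set<String>) -> String {
    // Keep a letter if it was guessed, otherwise show a question mark.
    return String(word.map { letter -> Character in
        guessedLetters.contains(letter.lowercased()) ? letter : "?"
    })
}

print(maskedWord("Hangman", guessedLetters: ["h", "a"])) // Ha???a?
```

Rebuilding the whole prompt on every guess avoids index arithmetic on the working string; for words of hangman length the extra work is negligible.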
It looks like you are just trying to provide the user ultimately with a hidden word containing only question marks. May I suggest a more straightforward approach?
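import Foundation // needed for localizedStandardContains and replacingCharacters
let wordToGuess = "Hangman"
var hiddenWord = String(repeating: "?", count: wordToGuess.count) // var, because it is updated below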
Now, when the user guesses, you can replace the proper characters:
let guess = "h" // get from your user input
if wordToGuess.localizedStandardContains(guess) {
var location = 0
for c in wordToGuess {
if c.lowercased() == guess.lowercased() {
let index = hiddenWord.index(hiddenWord.startIndex, offsetBy: location)
hiddenWord = hiddenWord.replacingCharacters(in: index...index , with: String(c) )
print("hidden word now: \(hiddenWord)")
}
location += 1
}
}
Note that this is pretty messy code. It works, but I'm sure there is a much better way.

Swift 4 Substring Crash

I'm a little confused about the best practices for Swift 4 string manipulation.
How do you handle the following:
let str = "test"
let start = str.index(str.startIndex, offsetBy: 7)
Thread 1: Fatal error: cannot increment beyond endIndex
Imagine that you do not know the length of the variable 'str' above. And since 'start' is not an optional value, what is the best practice to prevent that crash?
If you use the variation with limitedBy parameter, that will return an optional value:
if let start = str.index(str.startIndex, offsetBy: 7, limitedBy: str.endIndex) {
...
}
That will gracefully detect whether the offset moves the index past the endIndex. Obviously, handle this optional however best in your scenario (if let, guard let, nil coalescing operator, etc.).
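As a small sketch of wrapping that check in a reusable function (the name suffix(of:fromOffset:) is hypothetical). It assumes a negative offset should also be rejected, because the endIndex limit only guards forward movement:

```swift
// Hypothetical helper: returns the part of `str` starting at `offset`,
// or nil when the offset lies outside the string.
func suffix(of str: String, fromOffset offset: Int) -> Substring? {
    guard offset >= 0,
          let start = str.index(str.startIndex, offsetBy: offset, limitedBy: str.endIndex)
    else { return nil }
    return str[start...]
}

print(suffix(of: "test", fromOffset: 2) ?? "out of range") // st
print(suffix(of: "test", fromOffset: 7) ?? "out of range") // out of range
```

Note that an offset equal to the string's length is accepted and yields an empty substring, since endIndex itself is a valid limit.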
Your code doesn't do any range checking:
let str = "test"
let start = str.index(str.startIndex, offsetBy: 7)
Write a function that tests the length of the string first. In fact, you could create an extension on String that lets you use integer subscripts, and returns a Character?:
extension String {
//Allow string[Int] subscripting. WARNING: Slow O(n) performance
subscript(index: Int) -> Character? {
guard index >= 0 && index < self.count else { return nil } // also reject negative indexes
return self[self.index(self.startIndex, offsetBy: index)]
}
}
This code:
var str = "test"
print("str[7] = \"\(str[7])\"")
Would display:
str[7] = "nil"
Edit:
Be aware, as Alexander pointed out in a comment below, that the subscript extension above has up to O(n) performance (it takes longer and longer as the index value goes up, up to the length of the string.)
If you need to loop through all the characters in a string, code like this:
for i in 0..<str.count { doSomething(string: str[i]) }
would have O(n^2) (n-squared) performance, which is really, really bad. In that case, you should instead first convert the string to an array of characters:
let chars = Array(str)
for i in 0..<chars.count { doSomething(string: chars[i]) }
or
for aChar in chars { /* do something with aChar */ }
With that code you pay the O(n) time cost of converting the string to an array of characters once, and then you can do operations on the array of characters with maximum speed. The downside of that approach is that it would more than double the memory requirements.

Count the number of lines in a Swift String

After reading a medium-sized file (about 500 kB) from a web service, I have a regular Swift String (lines) originally encoded in .isoLatin1. Before actually splitting it I would like to count the number of lines (quickly) in order to be able to initialise a progress bar.
What is the best Swift idiom to achieve this?
I came up with the following:
let linesCount = lines.reduce(into: 0) { (count, letter) in
if letter == "\r\n" {
count += 1
}
}
This does not look too bad but I am asking myself if there is a shorter/faster way to do it. The characters property provides access to a sequence of Unicode graphemes which treat \r\n as only one entity. Checking this with all CharacterSet.newlines does not work, since CharacterSet is not a set of Character but a set of Unicode.Scalar (a little counter-intuitively in my book) which is a set of code points (where \r\n counts as two code points), not graphemes. Trying
var lines = "Hello, playground\r\nhere too\r\nGalahad\r\n"
lines.unicodeScalars.reduce(into: 0) { (cnt, letter) in
if CharacterSet.newlines.contains(letter) {
cnt += 1
}
}
will count to 6 instead of 3. So this is more general than the above method, but it will not work correctly for CRLF line endings.
Is there a way to allow for more line ending conventions (as in CharacterSet.newlines) that still achieves the correct result for CRLF? Can the number of lines be computed with less code (while still remaining readable)?
If it's ok for you to use a Foundation method on an NSString, I suggest using
enumerateLines(_ block: @escaping (String, UnsafeMutablePointer<ObjCBool>) -> Void)
Here's an example:
import Foundation
let base = "Hello, playground\r\nhere too\r\nGalahad\r\n"
let ns = base as NSString
ns.enumerateLines { (str, _) in
print(str)
}
It separates the lines properly, taking into account all linefeed types, such as "\r\n", "\n", etc:
Hello, playground
here too
Galahad
In my example I print the lines but it's trivial to count them instead, as you need to - my version is just for the demonstration.
As I did not find a generic way to count newlines I ended up just solving my problem by iterating through all the characters using
let linesCount = text.reduce(into: 0) { (count, letter) in
if letter == "\r\n" { // This treats CRLF as one "letter", contrary to UnicodeScalars
count += 1
}
}
I was sure this would be a lot faster than enumerating lines for just counting, but I resolved to eventually do the measurement. Today I finally got to it and found ... that I could not have been more wrong.
A 10000-line string was counted as above in about 1.0 seconds, but counting through enumeration using
var enumCount = 0
text.enumerateLines { (str, _) in
enumCount += 1
}
only took around 0.8 seconds and was consistently faster by a little more than 20%. I do not know what tricks the Swift engineers hide up their sleeves, but they sure manage to enumerateLines very quickly. This is just for the record.
You can use the following extension
extension String {
var numberOfLines: Int {
return self.components(separatedBy: "\n").count
}
}
Swift 5 Extension
extension String {
func numberOfLines() -> Int {
return self.numberOfOccurrencesOf(string: "\n") + 1
}
func numberOfOccurrencesOf(string: String) -> Int {
return self.components(separatedBy:string).count - 1
}
}
Example:
let testString = "First line\nSecond line\nThird line"
let numberOfLines = testString.numberOfLines() // returns 3
I use this, a CharacterSet which Apple provides, made for this task:
let newLines = text.components(separatedBy: .newlines).count - 1
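One caveat with splitting on .newlines is the CRLF problem raised in the question: \r and \n are separate members of the set, so each CRLF is counted twice. A possible alternative, sketched here under the assumption that Swift 5's Character.isNewline is available, is to count newline Characters, since "\r\n" is a single Character:

```swift
import Foundation

let sample = "Hello, playground\r\nhere too\r\nGalahad\r\n"

// Splitting on the CharacterSet sees \r and \n as two separators each time.
let splitCount = sample.components(separatedBy: .newlines).count - 1

// Counting Characters treats each "\r\n" pair as one line break.
let characterCount = sample.reduce(into: 0) { count, character in
    if character.isNewline { count += 1 }
}

print(splitCount)     // 6
print(characterCount) // 3
```

This still walks the string Character by Character, so the timing comparison with enumerateLines reported above may apply to it as well.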

Get numbers characters from a string [duplicate]

How can I get the digit characters from a string? I don't want to convert them to Int.
var string = "string_1"
var string2 = "string_20_certified"
The result should be formatted like this:
newString = "1"
newString2 = "20"
Pattern matching a String's unicode scalars against Western Arabic Numerals
You could pattern match the unicodeScalars view of a String to a given UnicodeScalar pattern (covering e.g. Western Arabic numerals).
extension String {
var westernArabicNumeralsOnly: String {
let pattern = UnicodeScalar("0")..."9"
return String(unicodeScalars
.compactMap { pattern ~= $0 ? Character($0) : nil }) // compactMap replaces the deprecated optional-returning flatMap
}
}
Example usage:
let str1 = "string_1"
let str2 = "string_20_certified"
let str3 = "a_1_b_2_3_c34"
let newStr1 = str1.westernArabicNumeralsOnly
let newStr2 = str2.westernArabicNumeralsOnly
let newStr3 = str3.westernArabicNumeralsOnly
print(newStr1) // 1
print(newStr2) // 20
print(newStr3) // 12334
Extending to matching any of several given patterns
The Unicode scalar pattern matching approach above is particularly useful when extending it to match any of several given patterns, e.g. patterns describing different variations of Eastern Arabic numerals:
extension String {
var easternArabicNumeralsOnly: String {
let patterns = [UnicodeScalar("\u{0660}")..."\u{0669}", // Eastern Arabic
"\u{06F0}"..."\u{06F9}"] // Perso-Arabic variant
return String(unicodeScalars
.compactMap { uc in patterns.contains{ $0 ~= uc } ? Character(uc) : nil })
}
}
This could be used in practice e.g. if writing an Emoji filter, as ranges of unicode scalars that cover emojis can readily be added to the patterns array in the Eastern Arabic example above.
Why use the UnicodeScalar patterns approach over Character ones?
A Character in Swift consists of an extended grapheme cluster, which is made up of one or more Unicode scalar values. This means that Character instances in Swift do not have a fixed size in memory, which means random access to a character within a collection of sequentially (/contiguously) stored characters will not be available at O(1), but rather, O(n).
A Unicode scalar, on the other hand, is a fixed-size value (Unicode.Scalar wraps a 32-bit integer), so each element can be examined without assembling a grapheme cluster; note that String's unicodeScalars view is still not a random-access collection. Now, I'm not entirely sure if this is the reason for what follows, but a fact is that when benchmarking the methods above vs equivalent methods using the CharacterView (.characters property) for some test String instances, it's very apparent that the UnicodeScalar approach is faster than the Character approach; naive testing showed a factor 10-25 difference in execution times, steadily growing for growing String size.
Knowing the limitations of working with Unicode scalars vs Characters in Swift
Now, there are drawbacks to using the UnicodeScalar approach, however; namely when working with characters that cannot be represented by a single Unicode scalar, but where one of their Unicode scalars is contained in the pattern we want to match.
E.g., consider a string holding the four characters "Café". The last character, "é", is represented by two Unicode scalars, "e" and "\u{301}". If we were to implement pattern matching against, say, UnicodeScalar("a")..."e", the filtering method as applied above would allow one of the two Unicode scalars to pass.
extension String {
    var onlyLowercaseLettersAthroughE: String {
        // The range starts at "a" so that it matches the property name
        let patterns: [ClosedRange<UnicodeScalar>] = ["a"..."e"]
        return String(unicodeScalars
            .compactMap { uc in patterns.contains{ $0 ~= uc } ? Character(uc) : nil })
    }
}
let str = "Cafe\u{301}"
print(str) // Café
print(str.onlyLowercaseLettersAthroughE) // ae
/* possibly we'd want "a" or "aé"
as result here */
In the particular use case raised by the OP in this Q&A, the above is not an issue, but depending on the use case, it will sometimes be more appropriate to work with Character pattern matching over UnicodeScalar.
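For comparison, here is a sketch of the same kind of range filter written against Character instead of UnicodeScalar. The property name charactersAthroughE is made up for illustration. Because "e\u{301}" is compared as one Character, it falls outside "a"..."e" as a whole rather than being split into its scalars:

```swift
extension String {
    // Hypothetical property: keeps only the Characters in the range "a"..."e".
    var charactersAthroughE: String {
        let pattern: ClosedRange<Character> = "a"..."e"
        return filter { pattern ~= $0 }
    }
}

let cafe = "Cafe\u{301}"
print(cafe.charactersAthroughE) // a
```

Whether dropping the accented letter entirely is the desired behaviour depends on the use case; the point is only that a grapheme is kept or removed as a unit.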
Edit: Updated for Swift 4 & 5
Here's a straightforward method that doesn't require Foundation:
let newstring = string.filter { "0"..."9" ~= $0 }
or borrowing from @dfri's idea to make it a String extension:
extension String {
var numbers: String {
return filter { "0"..."9" ~= $0 }
}
}
print("3 little pigs".numbers) // "3"
print("1, 2, and 3".numbers) // "123"
import Foundation
let string = "a_1_b_2_3_c34"
let result = string.components(separatedBy: CharacterSet.decimalDigits.inverted).joined(separator: "")
print(result)
Output:
12334
Here is a Swift 2 example:
let str = "Hello 1, World 62"
let intString = str.componentsSeparatedByCharactersInSet(
NSCharacterSet
.decimalDigitCharacterSet()
.invertedSet)
.joinWithSeparator("") // Return a string with all the numbers
This method iterates through the string's characters and appends the digits to a new string:
class func getNumberFrom(string: String) -> String {
    var number = ""
    for c in string {
        // Int(String(c)) is non-nil only when c is a single digit
        if let n = Int(String(c)), n >= 0 && n <= 9 {
            number.append(c)
        }
    }
    return number
}
For example with regular expression
let text = "string_20_certified"
let pattern = "\\d+"
let regex = try! NSRegularExpression(pattern: pattern, options: [])
if let match = regex.firstMatch(in: text, options: [], range: NSRange(location: 0, length: text.utf16.count)) { // NSRange counts UTF-16 code units
let newString = (text as NSString).substring(with: match.range)
print(newString)
}
If there are multiple occurrences of the pattern use matches(in..
let matches = regex.matches(in: text, options: [], range: NSRange(location: 0, length: text.utf16.count))
for match in matches {
let newString = (text as NSString).substring(with: match.range)
print(newString)
}
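The same regular expression approach can also be written without bridging to NSString, using Foundation's conversions between NSRange and Range<String.Index>. This is only a sketch, and the function name digitRuns is an assumption for the example:

```swift
import Foundation

// Hypothetical helper: returns every run of consecutive digits in `text`.
func digitRuns(in text: String) -> [String] {
    // "\\d+" matches one or more consecutive digits.
    guard let regex = try? NSRegularExpression(pattern: "\\d+") else { return [] }
    let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)
    return regex.matches(in: text, options: [], range: fullRange).compactMap { match in
        // Convert the match's NSRange back into a Swift string range.
        Range(match.range, in: text).map { String(text[$0]) }
    }
}

print(digitRuns(in: "a_1_b_2_3_c34")) // ["1", "2", "3", "34"]
```

Taking the first element gives the result asked for in the question, for example "20" for "string_20_certified".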

How to get multiple lines of stdin Swift HackerRank?

I just tried out a HackerRank challenge, and if a question gives you x lines of input, putting x lines of let someVariable = readLine() simply doesn't cut it, because there are lots of test cases that send far more input to the code we write, so a hard-coded readLine() for each line of input won't fly.
Is there some way to get multiple lines of input into one variable?
For anyone else out there who's trying a HackerRank challenge for the first time, you might need to know a couple of things that you may have never come across. I only recently learned about this piece of magic called the readLine() command, which is a native function in Swift.
When the HackerRank system executes your code, it passes your code lines of input and this is a way of retrieving that input.
let line1 = readLine()
let line2 = readLine()
let line3 = readLine()
line1 is now given the value of the first line of input mentioned in the question (or delivered to your code by one of the test cases), with line2 being the second and so on.
Your code may work just great but may fail on a bunch of other test cases. These test cases don't send your code the same number of lines of input. Here's food for thought:
var string = ""
while let thing = readLine() {
string += thing + " "
}
print(string)
Now the string variable contains all the input there was to receive (as a String, in this case).
Hope that helps someone
:)
Definitely you shouldn't do this:
while let readString = readLine() {
s += readString
}
This is because, when standard input is never closed (for example, when it is attached to an interactive terminal), readLine() keeps waiting for more input and the loop never terminates, causing your application to die by timeout.
Instead, you should use a for loop, assuming you know how many lines you need to read, which is usually the case in HackerRank ;)
Try something like this:
let n = Int(readLine()!)! // Number of test cases
for _ in 0 ..< n { // Loop n times; unlike 1 ... n, this is safe when n is 0
let line = readLine()! // Read a single line
// do something with input
}
If you know that each line is an integer, you can use this:
let line = Int(readLine()!)!
Or if you know each line is an array of integers, use this:
let line = readLine()!.split(separator: " ").map{ Int($0)! }
Or if each line is an array of strings:
let line = readLine()!.split(separator: " ").map{ String($0) }
I hope this helps.
For newer Swift versions, to get an array of numbers separated by spaces:
let numbers = readLine()!.components(separatedBy: [" "]).map { Int($0)! }
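The force unwraps above crash on malformed input. A more forgiving sketch, assuming it is acceptable to silently skip tokens that are not integers (parseInts is a hypothetical name):

```swift
// Hypothetical helper: parses the integers on one space-separated line.
func parseInts(_ line: String) -> [Int] {
    // split(separator:) drops empty pieces, so repeated spaces are harmless;
    // compactMap drops tokens that are not valid integers.
    return line.split(separator: " ").compactMap { Int($0) }
}

print(parseInts("4 8  15 x 16")) // [4, 8, 15, 16]

// With standard input it could be used as:
// let numbers = readLine().map(parseInts) ?? []
```

Whether to skip bad tokens or fail loudly is a design choice; for contest input that is guaranteed to be well formed, the forced version is shorter.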
Using readLine() and AnyGenerator to construct a String array of the std input lines
readLine() will read from standard input line-by-line until EOF is hit, whereafter it returns nil.
Returns Characters read from standard input through the end of the
current line or until EOF is reached, or nil if EOF has already been
reached.
This is quite neat, as it makes readLine() a perfect candidate for generating a sequence using the AnyGenerator initializer init(body:), which repeatedly (as next()) invokes body and terminates once body returns nil.
AnyGenerator
init(body: () -> Element?)
Create a GeneratorType instance whose next method invokes body
and returns the result.
With this, there's no need to actually supply the amount of lines we expect from standard input, and hence, we can catch all input from standard input e.g. into a String array, where each element corresponds to an input line:
let allLines = AnyGenerator { readLine() }.map{ $0 }
// type: Array<String>
After which we can work with the String array to apply whatever operations needed to solve a given task (/HackerRank task).
// example standard input
4 3
<tag1 value = "HelloWorld">
<tag2 name = "Name1">
</tag2>
</tag1>
tag1.tag2~name
tag1~name
tag1~value
/* resulting allLines array:
["4 3", "<tag1 value = \"HelloWorld\">",
"<tag2 name = \"Name1\">",
"</tag2>",
"</tag1>",
"tag1.tag2~name",
"tag1~name",
"tag1~value"] */
I recently discovered a neat trick to get a certain amount of lines. I'm gonna assume the first line gives you the amount of lines you get:
guard let count = readLine().flatMap({ Int($0) }) else { fatalError("No count") }
let lines = AnyGenerator{ readLine() }.prefix(count)
for line in lines {
}
I usually use this form.
if let line = readLine(), let cnt = Int(line) {
for _ in 0..<cnt { // unlike 1...cnt, this does not crash when cnt is 0
if let line = readLine() {
// your code for a line
}
}
}
Following the answer from dfrib, for Swift 3+, AnyIterator can be used instead of AnyGenerator, in the same way:
let allLines = AnyIterator { readLine() }.map{ $0 }
// type: Array<String>