Swift 3.0 iterate over String.Index range - swift

The following was possible with Swift 2.2:
let m = "alpha"
for i in m.startIndex..<m.endIndex {
print(m[i])
}
a
l
p
h
a
With 3.0, we get the following error:
Type 'Range<Index>' (aka 'Range<String.CharacterView.Index>') does not conform to protocol 'Sequence'
I am trying to do a very simple operation with strings in Swift: traverse the first half of a string (or, more generally, traverse a range of a string).
I can do the following:
let s = "string"
var midIndex = s.index(s.startIndex, offsetBy: s.characters.count/2)
let r = Range(s.startIndex..<midIndex)
print(s[r])
But here I'm not really traversing the string. So the question is: how do I traverse a range of a given string? For example:
for i in Range(s.startIndex..<midIndex) {
print(s[i])
}

You can traverse a string by using the indices property of the characters property, like this:
let letters = "string"
let middle = letters.index(letters.startIndex, offsetBy: letters.characters.count / 2)
for index in letters.characters.indices {
// Stop at the middle index to traverse only the first half of the string
if index == middle { break }
print(letters[index]) // prints s, t, r (without the break above: s, t, r, i, n, g)
}
From the documentation in section Strings and Characters - Counting Characters:
Extended grapheme clusters can be composed of one or more Unicode scalars. This means that different characters—and different representations of the same character—can require different amounts of memory to store. Because of this, characters in Swift do not each take up the same amount of memory within a string’s representation. As a result, the number of characters in a string cannot be calculated without iterating through the string to determine its extended grapheme cluster boundaries.
emphasis is my own.
This will not work:
let secondChar = letters[1]
// error: subscript is unavailable, cannot subscript String with an Int
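As a small illustration of the documentation passage above, the sketch below (the sample string and variable names are my own, not from the answer) shows that the number of characters and the number of Unicode scalars can differ, which is why an Int cannot be used directly as a subscript and an index has to be computed by walking the string:

```swift
// "Café" written with a combining acute accent: "e" followed by U+0301
let cafe = "Cafe\u{301}"

// Counting characters walks the extended grapheme cluster boundaries
let characterCount = cafe.count               // 4
// The underlying scalar view has one more element
let scalarCount = cafe.unicodeScalars.count   // 5

// An index is obtained by stepping from startIndex, not from a raw Int
let thirdCharacter = cafe[cafe.index(cafe.startIndex, offsetBy: 2)]
print(characterCount, scalarCount, thirdCharacter) // prints "4 5 f"
```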

Another option is to use enumerated(), e.g.:
let string = "Hello World"
for (index, char) in string.characters.enumerated() {
print(char)
}
Or, for Swift 4, just use:
let string = "Hello World"
for (index, char) in string.enumerated() {
print(char)
}

Use the following:
for i in s.characters.indices[s.startIndex..<s.endIndex] {
print(s[i])
}
Taken from Migrating to Swift 2.3 or Swift 3 from Swift 2.2

Iterating over characters in a string is cleaner in Swift 4:
let myString = "Hello World"
for char in myString {
print(char)
}

If you want to traverse over the characters of a String, then instead of explicitly accessing the indices of the String, you could simply work with the CharacterView of the String, which conforms to CollectionType, allowing you access to neat subsequencing methods such as prefix(_:) and so on.
/* traverse the characters of your string instance,
up to middle character of the string, where "middle"
will be rounded down for strings of an odd amount of
characters (e.g. 5 characters -> traverse through 2) */
let m = "alpha"
for ch in m.characters.prefix(m.characters.count/2) {
print(ch, ch.dynamicType)
} /* a Character
l Character */
/* round odd division up instead */
for ch in m.characters.prefix((m.characters.count+1)/2) {
print(ch, ch.dynamicType)
} /* a Character
l Character
p Character */
If you'd like to treat the characters within the loop as strings, simply use String(ch) above.
With regard to your comment below: if you'd like to access a range of the CharacterView, you could implement your own extension of CollectionType (constrained to the case where Generator.Element is Character) that uses both prefix(_:) and suffix(_:) to yield a sub-collection for, e.g., a half-open (from..<to) range:
/* for values to >= count, prefixed CharacterView will be suffixed until its end */
extension CollectionType where Generator.Element == Character {
func inHalfOpenRange(from: Int, to: Int) -> Self {
guard case let to = min(to, underestimateCount()) where from <= to else {
return self.prefix(0) as! Self
}
return self.prefix(to).suffix(to-from) as! Self
}
}
/* example */
let m = "0123456789"
for ch in m.characters.inHalfOpenRange(4, to: 8) {
print(ch) /* \ */
} /* 4 a (sub-collection) CharacterView
5
6
7 */

The best way to do this is:
let name = "nick" // The string whose characters we want to print
for i in 0..<name.count {
    // name[i] is not allowed in Swift; build a String.Index from the offset instead
    let index = name.index(name.startIndex, offsetBy: i)
    print(name[index])
}

Swift 4.2
Simply:
let m = "alpha"
for i in m.indices {
print(m[i])
}

Swift 4:
let mi: String = "hello how are you?"
for i in mi {
print(i)
}

To concretely demonstrate how to traverse through a range in a string in Swift 4, we can use the where filter in a for loop to filter its execution to the specified range:
func iterateStringByRange(_ sentence: String, from: Int, to: Int) {
let startIndex = sentence.index(sentence.startIndex, offsetBy: from)
let endIndex = sentence.index(sentence.startIndex, offsetBy: to)
for position in sentence.indices where (position >= startIndex && position < endIndex) {
let char = sentence[position]
print(char)
}
}
iterateStringByRange("string", from: 1, to: 3) will print t and r (the upper bound is excluded).
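An alternative to filtering every index is to slice the string with a range of indices and iterate over the slice. The sketch below is only one way to do it; the function name characters(in:from:to:) is hypothetical, and returning an empty array for out-of-range offsets is an assumption of mine rather than part of the answer above:

```swift
// Hypothetical helper: collects the characters in the half-open
// offset range from..<to, or returns [] if the offsets are invalid.
func characters(in sentence: String, from: Int, to: Int) -> [Character] {
    guard from >= 0, from <= to,
          let start = sentence.index(sentence.startIndex, offsetBy: from,
                                     limitedBy: sentence.endIndex),
          let end = sentence.index(sentence.startIndex, offsetBy: to,
                                   limitedBy: sentence.endIndex)
    else { return [] }

    var result: [Character] = []
    // Iterating the Substring visits only the requested range
    for char in sentence[start..<end] {
        result.append(char)
    }
    return result
}

print(characters(in: "string", from: 1, to: 3)) // prints ["t", "r"]
```

Slicing first means the loop body runs only for the characters in the range, instead of testing every index of the string.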

When iterating over the indices of characters in a string, you seldom need only the index. You probably also need the character at the given index. As specified by Paulo (updated for Swift 4+), string.indices will give you the indices of the characters. zip can be used to combine index and character:
let string = "string"
// Define the range to conform to your needs
let range = string.startIndex..<string.index(string.startIndex, offsetBy: string.count / 2)
let substring = string[range]
// If the range is in the type "first x characters", like until the middle, you can use:
// let substring = string.prefix(string.count / 2)
for (index, char) in zip(substring.indices, substring) {
// index is the index in the substring
print(char)
}
Note that using enumerated() will produce a pair of index and character, but the index is not the index of the character in the string. It is the index in the enumeration, which can be different.
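To make the difference concrete, here is a minimal sketch (the sample string and variable names are assumptions for illustration) comparing the offsets produced by enumerated() on a substring with the positions of the same characters in the original string:

```swift
let text = "string"
let tail = text.dropFirst(3) // Substring "ing", sharing indices with text

// enumerated() always counts from zero
let offsets = tail.enumerated().map { $0.offset }

// The substring's indices still refer to positions in the original string
let positions = tail.indices.map {
    text.distance(from: text.startIndex, to: $0)
}

print(offsets)   // prints [0, 1, 2]
print(positions) // prints [3, 4, 5]
```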

Related

Swift get decimal value for String characters

Is it possible to get the decimal value for String characters in Swift?
something like:
let words:String = "1 ring to rule them all"
var value:Int = 0
for i in 0..<words.count {
let char = words[words.index(words.startIndex,offsetBy:i)]
value += Int(char.decimal)
}
where the first character in "1 ring to rule them all" is 49. Possible?
You could try this:
let words = "1 ring to rule them all"
var value: Int = 0
for i in 0..<words.count {
let char = words[words.index(words.startIndex,offsetBy:i)]
if let val = char.asciiValue {
print("----> char: \(char) val: \(val)") // char and its "decimal" value
value += Int(val)
}
}
print("\n----> value: \(value) \n") // meaningless total value
OK, it looks like this is the way:
let characterString = "蜇"
let scalars = characterString.unicodeScalars
let ui32:UInt32 = scalars[scalars.startIndex].value
If you want to add up the Unicode values associated with a string, it would be:
var value = 0
for character in string {
for scalar in character.unicodeScalars {
value += Int(scalar.value)
}
}
Or, alternatively:
let value = string
.flatMap { $0.unicodeScalars }
.map { Int($0.value) }
.reduce(0, +)
While the above adds the values, as requested, if you are trying to get a numeric representation for a string, you might consider using hashValue, or checksum, or CRC, or the like. Simply summing the values will not be able to detect, for example, character transpositions. It just depends upon your use-case for this numeric representation of your string.
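The point about transpositions can be checked with a short sketch. The helper name scalarSum is hypothetical; it simply sums the Unicode scalar values as in the code above:

```swift
// Hypothetical helper: sums the Unicode scalar values of a string
func scalarSum(_ string: String) -> Int {
    return string.unicodeScalars.reduce(0) { $0 + Int($1.value) }
}

print(scalarSum("1"))                             // prints 49
// Two different strings with the same characters give the same sum
print(scalarSum("listen") == scalarSum("silent")) // prints true
```

Because addition is order-independent, any rearrangement of the same characters produces the same total, so the sum cannot serve as a reliable fingerprint of the string.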

Swift 4 Substring Crash

I'm a little confused about the best practices for Swift 4 string manipulation.
How do you handle the following:
let str = "test"
let start = str.index(str.startIndex, offsetBy: 7)
Thread 1: Fatal error: cannot increment beyond endIndex
Imagine that you do not know the length of the variable 'str' above. And since 'start' is not an optional value, what is the best practice to prevent that crash?
If you use the variation with the limitedBy parameter, that will return an optional value:
if let start = str.index(str.startIndex, offsetBy: 7, limitedBy: str.endIndex) {
...
}
That will gracefully detect whether the offset moves the index past the endIndex. Obviously, handle this optional however best in your scenario (if let, guard let, nil coalescing operator, etc.).
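As a sketch of how the optional might be handled with guard let, the hypothetical helper below (its name and its choice of returning nil are my own assumptions) returns the part of a string starting at a given offset, or nil when the offset is out of range:

```swift
// Hypothetical helper: the substring starting at `offset`,
// or nil if the offset is negative or past the end of the string.
func tail(of str: String, fromOffset offset: Int) -> Substring? {
    guard offset >= 0,
          let start = str.index(str.startIndex, offsetBy: offset,
                                limitedBy: str.endIndex)
    else { return nil }
    return str[start...]
}

print(tail(of: "test", fromOffset: 2) ?? "out of range") // prints "st"
print(tail(of: "test", fromOffset: 7) ?? "out of range") // prints "out of range"
```

An offset equal to the length of the string yields endIndex, which is still a valid lower bound, so the result in that case is an empty substring rather than nil.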
Your code doesn't do any range checking:
let str = "test"
let start = str.index(str.startIndex, offsetBy: 7)
Write a function that tests the length of the string first. In fact, you could create an extension on String that lets you use integer subscripts, and returns a Character?:
extension String {
// Allow string[Int] subscripting. WARNING: slow O(n) performance
subscript(index: Int) -> Character? {
// Reject negative offsets as well as offsets past the end
guard index >= 0, index < self.count else { return nil }
return self[self.index(self.startIndex, offsetBy: index)]
}
}
This code:
var str = "test"
print("str[7] = \"\(str[7])\"")
Would display:
str[7] = "nil"
EDIT:
Be aware, as Alexander pointed out in a comment below, that the subscript extension above has up to O(n) performance (it takes longer and longer as the index value goes up, up to the length of the string).
If you need to loop through all the characters in a string, code like this:
for i in 0..<str.count { doSomething(string: str[i]) }
would have O(n^2) (n-squared) performance, which is really, really bad. In that case, you should instead first convert the string to an array of characters:
let chars = Array(str)
for i in 0..<chars.count { doSomething(string: chars[i]) }
or
for aChar in chars { /* do something with aChar */ }
With that code you pay the O(n) time cost of converting the string to an array of characters once, and then you can do operations on the array of characters with maximum speed. The downside of that approach is that it would more than double the memory requirements.

Why does swift substring with range require a special type of Range

Consider this function to build a string of random characters:
func makeToken(length: Int) -> String {
let chars: String = "abcdefghijklmnopqrstuvwxyz0123456789!?##$%ABCDEFGHIJKLMNOPQRSTUVWXYZ"
var result: String = ""
for _ in 0..<length {
let idx = Int(arc4random_uniform(UInt32(chars.characters.count)))
let idxEnd = idx + 1
let range: Range = idx..<idxEnd
let char = chars.substring(with: range)
result += char
}
return result
}
This throws an error on the substring method:
Cannot convert value of type 'Range<Int>' to expected argument
type 'Range<String.Index>' (aka 'Range<String.CharacterView.Index>')
I'm confused why I can't simply provide a Range with 2 integers, and why it's making me go the roundabout way of making a Range<String.Index>.
So I have to change the Range creation to this very over-complicated way:
let idx = Int(arc4random_uniform(UInt32(chars.characters.count)))
let start = chars.index(chars.startIndex, offsetBy: idx)
let end = chars.index(chars.startIndex, offsetBy: idx + 1)
let range: Range = start..<end
Why isn't it good enough for Swift for me to simply create a range with 2 integers and the half-open range operator? (..<)
Quite the contrast to "swift": in JavaScript I can simply do chars.substr(idx, 1)
I suggest converting your String to [Character] so that you can index it easily with Int:
func makeToken(length: Int) -> String {
let chars = Array("abcdefghijklmnopqrstuvwxyz0123456789!?##$%ABCDEFGHIJKLMNOPQRSTUVWXYZ".characters)
var result = ""
for _ in 0..<length {
let idx = Int(arc4random_uniform(UInt32(chars.count)))
result += String(chars[idx])
}
return result
}
Swift takes great care to provide a fully Unicode-compliant, type-safe, String abstraction.
Indexing a given Character, in an arbitrary Unicode string, is far from a trivial task. Each Character is a sequence of one or more Unicode scalars that (when combined) produce a single human-readable character. In particular, hiding all this complexity behind a simple Int based indexing scheme might result in the wrong performance mental model for programmers.
Having said that, you can always convert your string to an Array<Character> once for easy (and fast!) indexing. For instance:
let chars: String = "abcdefghijklmnop"
var charsArray = Array(chars.characters)
...
let resultingString = String(charsArray)
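For later Swift versions (4.2 and up, where randomElement() is available on collections), the same idea can be sketched without arc4random_uniform. The function name makeRandomToken and the default alphabet are assumptions for illustration, not part of the answers above:

```swift
// Hypothetical variant of makeToken: converts the alphabet to an array
// once, then picks `length` random elements from it.
func makeRandomToken(length: Int,
                     from alphabet: String = "abcdefghijklmnopqrstuvwxyz0123456789") -> String {
    let symbols = Array(alphabet) // [Character], indexable in constant time
    guard length > 0, !symbols.isEmpty else { return "" }
    // symbols is non-empty here, so randomElement() cannot return nil
    return String((0..<length).map { _ in symbols.randomElement()! })
}

print(makeRandomToken(length: 12)) // output differs on every run
```

Converting to an array first keeps each pick cheap; calling randomElement() directly on the String would have to walk the string for every character.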

Get numbers characters from a string [duplicate]

How do I get the numeric characters from a string? I don't want to convert to Int.
var string = "string_1"
var string2 = "string_20_certified"
My result has to be formatted like this:
newString = "1"
newString2 = "20"
Pattern matching a String's unicode scalars against Western Arabic Numerals
You could pattern match the unicodeScalars view of a String to a given UnicodeScalar pattern (covering e.g. Western Arabic numerals).
extension String {
var westernArabicNumeralsOnly: String {
let pattern = UnicodeScalar("0")..."9"
return String(unicodeScalars
.flatMap { pattern ~= $0 ? Character($0) : nil })
}
}
Example usage:
let str1 = "string_1"
let str2 = "string_20_certified"
let str3 = "a_1_b_2_3_c34"
let newStr1 = str1.westernArabicNumeralsOnly
let newStr2 = str2.westernArabicNumeralsOnly
let newStr3 = str3.westernArabicNumeralsOnly
print(newStr1) // 1
print(newStr2) // 20
print(newStr3) // 12334
Extending to matching any of several given patterns
The Unicode scalar pattern matching approach above is particularly useful when extended to match any of several given patterns, e.g. patterns describing different variations of Eastern Arabic numerals:
extension String {
var easternArabicNumeralsOnly: String {
let patterns = [UnicodeScalar("\u{0660}")..."\u{0669}", // Eastern Arabic
"\u{06F0}"..."\u{06F9}"] // Perso-Arabic variant
return String(unicodeScalars
.flatMap { uc in patterns.contains{ $0 ~= uc } ? Character(uc) : nil })
}
}
This could be used in practice e.g. if writing an Emoji filter, as ranges of unicode scalars that cover emojis can readily be added to the patterns array in the Eastern Arabic example above.
Why use the UnicodeScalar patterns approach over Character ones?
A Character in Swift consists of an extended grapheme cluster, which is made up of one or more Unicode scalar values. This means that Character instances in Swift do not have a fixed size in memory, so random access to a character within a collection of sequentially (contiguously) stored characters is not available in O(1), but rather in O(n).
Unicode scalars in Swift, on the other hand, are stored in fixed-size UTF-32 code units, which should allow O(1) random access. Now, I'm not entirely sure whether this is a fact, or the reason for what follows, but it is a fact that when benchmarking the methods above against an equivalent method using the CharacterView (.characters property) for some test String instances, it's very apparent that the UnicodeScalar approach is faster than the Character approach; naive testing showed a factor of 10-25 difference in execution times, steadily growing with String size.
Knowing the limitations of working with Unicode scalars vs Characters in Swift
There are drawbacks to the UnicodeScalar approach, however; namely when working with characters that cannot be represented by a single Unicode scalar, but where one of their Unicode scalars is contained in the pattern we want to match.
E.g., consider a string holding the four characters "Café". The last character, "é", is represented by two Unicode scalars, "e" and "\u{301}". If we were to implement pattern matching against, say, UnicodeScalar("a")..."e", the filtering method as applied above would allow one of the two Unicode scalars to pass.
extension String {
var onlyLowercaseLettersAthroughE: String {
let patterns = [UnicodeScalar("1")..."e"]
return String(unicodeScalars
.flatMap { uc in patterns.contains{ $0 ~= uc } ? Character(uc) : nil })
}
}
let str = "Cafe\u{301}"
print(str) // Café
print(str.onlyLowercaseLettersAthroughE) // Cae
/* possibly we'd want "Ca" or "Caé"
as result here */
In the particular use case queried by the OP in this Q&A, the above is not an issue, but depending on the use case, it will sometimes be more appropriate to work with Character pattern matching over UnicodeScalar.
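The trade-off can be sketched side by side in current Swift. Everything below is an illustration under my own assumptions: the range "a"..."e" and the allowed set are chosen for the example, and the Character-based variant uses set membership rather than a range to keep the comparison rules simple:

```swift
let str = "Cafe\u{301}" // "Café" with a combining acute accent

// Scalar-based filtering: keeps any scalar inside the range,
// which lets the "e" through and drops its combining accent.
let scalarRange: ClosedRange<Unicode.Scalar> = "a"..."e"
var keptScalars = String.UnicodeScalarView()
for scalar in str.unicodeScalars where scalarRange.contains(scalar) {
    keptScalars.append(scalar)
}
let scalarFiltered = String(keptScalars)

// Character-based filtering: "é" is compared as a whole character,
// so it is not mistaken for a plain "e".
let allowed: Set<Character> = ["a", "b", "c", "d", "e"]
let characterFiltered = str.filter { allowed.contains($0) }

print(scalarFiltered)    // prints "ae"
print(characterFiltered) // prints "a"
```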
Edit: Updated for Swift 4 & 5
Here's a straightforward method that doesn't require Foundation:
let newstring = string.filter { "0"..."9" ~= $0 }
or, borrowing from @dfri's idea, to make it a String extension:
extension String {
var numbers: String {
return filter { "0"..."9" ~= $0 }
}
}
print("3 little pigs".numbers) // "3"
print("1, 2, and 3".numbers) // "123"
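On Swift 5 or later, a variant of the same filter can be written with the Character properties isASCII and isNumber, which restricts the result to the ASCII digits 0 through 9. This is only a sketch; the function name asciiDigits(in:) is hypothetical:

```swift
// Hypothetical helper: keeps only the ASCII digits "0"..."9"
func asciiDigits(in string: String) -> String {
    return string.filter { $0.isASCII && $0.isNumber }
}

print(asciiDigits(in: "string_1"))            // prints "1"
print(asciiDigits(in: "string_20_certified")) // prints "20"
print(asciiDigits(in: "a_1_b_2_3_c34"))       // prints "12334"
```

The isASCII check excludes digits from other scripts (such as Eastern Arabic numerals), which isNumber alone would accept.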
import Foundation
let string = "a_1_b_2_3_c34"
let result = string.components(separatedBy: CharacterSet.decimalDigits.inverted).joined(separator: "")
print(result)
Output:
12334
Here is a Swift 2 example:
let str = "Hello 1, World 62"
let intString = str.componentsSeparatedByCharactersInSet(
NSCharacterSet
.decimalDigitCharacterSet()
.invertedSet)
.joinWithSeparator("") // Return a string with all the numbers
This method iterates through the string's characters and appends the digits to a new string:
class func getNumberFrom(string: String) -> String {
var number: String = ""
for var c : Character in string.characters {
if let n: Int = Int(String(c)) {
if n >= Int("0")! && n <= Int("9")! { // inclusive upper bound so that 9 is kept
number.append(c)
}
}
}
return number
}
For example, with a regular expression:
let text = "string_20_certified"
let pattern = "\\d+"
let regex = try! NSRegularExpression(pattern: pattern, options: [])
if let match = regex.firstMatch(in: text, options: [], range: NSRange(location: 0, length: text.utf16.count)) { // NSRange counts UTF-16 code units
let newString = (text as NSString).substring(with: match.range)
print(newString)
}
If there are multiple occurrences of the pattern, use matches(in:options:range:):
let matches = regex.matches(in: text, options: [], range: NSRange(location: 0, length: text.utf16.count))
for match in matches {
let newString = (text as NSString).substring(with: match.range)
print(newString)
}
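In Swift 4 and later, the conversions between NSRange and Range<String.Index> can be done with the initializers Foundation provides, which avoids bridging to NSString. The following is a sketch under that assumption; the function name digitRuns(in:) is hypothetical:

```swift
import Foundation

// Hypothetical helper: returns every run of digits found in `text`
func digitRuns(in text: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: "\\d+") else { return [] }
    // Build the NSRange from the string's own indices
    let whole = NSRange(text.startIndex..<text.endIndex, in: text)
    return regex.matches(in: text, options: [], range: whole).compactMap { match in
        // Convert each match back to a Range<String.Index> before slicing
        Range(match.range, in: text).map { String(text[$0]) }
    }
}

print(digitRuns(in: "string_20_certified")) // prints ["20"]
print(digitRuns(in: "a_1_b_2_3_c34"))       // prints ["1", "2", "3", "34"]
```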

Convert String.CharacterView.Index to int [duplicate]

I want to convert the index of a letter contained within a string to an integer value. Attempted to read the header files but I cannot find the type for Index, although it appears to conform to protocol ForwardIndexType with methods (e.g. distanceTo).
var letters = "abcdefg"
let index = letters.characters.indexOf("c")!
// ERROR: Cannot invoke initializer for type 'Int' with an argument list of type '(String.CharacterView.Index)'
let intValue = Int(index) // I want the integer value of the index (e.g. 2)
Any help is appreciated.
edit/update:
Xcode 11 • Swift 5.1 or later
extension StringProtocol {
func distance(of element: Element) -> Int? { firstIndex(of: element)?.distance(in: self) }
func distance<S: StringProtocol>(of string: S) -> Int? { range(of: string)?.lowerBound.distance(in: self) }
}
extension Collection {
func distance(to index: Index) -> Int { distance(from: startIndex, to: index) }
}
extension String.Index {
func distance<S: StringProtocol>(in string: S) -> Int { string.distance(to: self) }
}
Playground testing
let letters = "abcdefg"
let char: Character = "c"
if let distance = letters.distance(of: char) {
print("character \(char) was found at position #\(distance)") // "character c was found at position #2\n"
} else {
print("character \(char) was not found")
}
let string = "cde"
if let distance = letters.distance(of: string) {
print("string \(string) was found at position #\(distance)") // "string cde was found at position #2\n"
} else {
print("string \(string) was not found")
}
Works for Xcode 13 and Swift 5
let myString = "Hello World"
if let i = myString.firstIndex(of: "o") {
let index: Int = myString.distance(from: myString.startIndex, to: i)
print(index) // Prints 4
}
The function func distance(from start: String.Index, to end: String.Index) -> String.IndexDistance returns an IndexDistance which is just a typealias for Int
Swift 4
var str = "abcdefg"
let index = str.index(of: "c")?.encodedOffset // Result: 2
Note: If the String contains the same character multiple times, this returns the offset of the first occurrence from the left:
var str = "abcdefgc"
let index = str.index(of: "c")?.encodedOffset // Result: 2
encodedOffset has been deprecated since Swift 4.2.
Deprecation message:
encodedOffset has been deprecated as most common usage is incorrect. Use utf16Offset(in:) to achieve the same behavior.
So we can use utf16Offset(in:) like this:
var str = "abcdefgc"
let index = str.index(of: "c")?.utf16Offset(in: str) // Result: 2
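Note that a UTF-16 offset is not necessarily the same as the character position. The sketch below (sample string and names are my own) compares utf16Offset(in:), available in Swift 5, with distance(from:to:) on a string that starts with an emoji:

```swift
let text = "👍c" // the emoji occupies two UTF-16 code units

let position = text.firstIndex(of: "c")

// Offset measured in UTF-16 code units
let utf16Offset = position.map { $0.utf16Offset(in: text) }
// Offset measured in characters
let characterOffset = position.map {
    text.distance(from: text.startIndex, to: $0)
}

print(utf16Offset as Any, characterOffset as Any) // prints "Optional(2) Optional(1)"
```

For plain ASCII strings such as "abcdefg" the two values coincide, which is why the difference is easy to miss.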
When searching for index like this
⛔️ guard let index = (positions.firstIndex { position <= $0 }) else {
it is treated as Array.Index. You have to give compiler a clue you want an integer
✅ guard let index: Int = (positions.firstIndex { position <= $0 }) else {
Swift 5
You can convert the string to an array of characters; the index returned by firstIndex(of:) on the array is then already an Int (the advanced(by: 0) call below leaves it unchanged).
let myString = "Hello World"
if let i = Array(myString).firstIndex(of: "o") {
let index: Int = i.advanced(by: 0)
print(index) // Prints 4
}
To perform string operations based on an index, you cannot use the traditional numeric index approach, because String.Index values are obtained from the string's indices and are not of type Int. Even though a String is a collection of characters, we still can't read an element by an integer index.
This is frustrating.
So, to create a new string from every other character of a string (those at even offsets), check the code below.
let mystr = "abcdefghijklmnopqrstuvwxyz"
let mystrArray = Array(mystr)
let strLength = mystrArray.count
var resultStrArray : [Character] = []
var i = 0
while i < strLength {
if i % 2 == 0 {
resultStrArray.append(mystrArray[i])
}
i += 1
}
let resultString = String(resultStrArray)
print(resultString)
Output : acegikmoqsuwy
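The same result can also be sketched without a manual counter by pairing each character with its offset using enumerated(). This is just an alternative formulation, with names of my own choosing:

```swift
let alphabet = "abcdefghijklmnopqrstuvwxyz"

// Keep the characters whose zero-based offset is even
let everyOther = String(
    alphabet.enumerated()
        .filter { $0.offset % 2 == 0 }
        .map { $0.element }
)

print(everyOther) // prints "acegikmoqsuwy"
```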
Here is an extension that will let you access the bounds of a substring as Ints instead of String.Index values:
import Foundation
/// This extension is available at
/// https://gist.github.com/zackdotcomputer/9d83f4d48af7127cd0bea427b4d6d61b
extension StringProtocol {
/// Access the range of the search string as integer indices
/// in the rendered string.
/// - NOTE: This is "unsafe" because it may not return what you expect if
/// your string contains single symbols formed from multiple scalars.
/// - Returns: A `CountableRange<Int>` that will align with the Swift String.Index
/// from the result of the standard function range(of:).
func countableRange<SearchType: StringProtocol>(
of search: SearchType,
options: String.CompareOptions = [],
range: Range<String.Index>? = nil,
locale: Locale? = nil
) -> CountableRange<Int>? {
guard let trueRange = self.range(of: search, options: options, range: range, locale: locale) else {
return nil
}
let intStart = self.distance(from: startIndex, to: trueRange.lowerBound)
let intEnd = self.distance(from: trueRange.lowerBound, to: trueRange.upperBound) + intStart
return Range(uncheckedBounds: (lower: intStart, upper: intEnd))
}
}
Just be aware that this can lead to weirdness, which is why Apple has chosen to make it hard. (Though that's a debatable design decision - hiding a dangerous thing by just making it hard...)
You can read more in the String documentation from Apple, but the tldr is that it stems from the fact that these "indices" are actually implementation-specific. They represent the indices into the string after it has been rendered by the OS, and so can shift from OS-to-OS depending on what version of the Unicode spec is being used. This means that accessing values by index is no longer a constant-time operation, because the UTF spec has to be run over the data to determine the right place in the string. These indices will also not line up with the values generated by NSString, if you bridge to it, or with the indices into the underlying UTF scalars. Caveat developer.
In case you get an "index is out of bounds" error, you may try this approach. It works in Swift 5:
extension String {
    // Returns the zero-based position of the first occurrence of char,
    // or -1 if the string does not contain it
    func countIndex(_ char: Character) -> Int {
        var count = 0
        for c in self {
            if c == char {
                return count
            }
            count += 1
        }
        return -1
    }
}