My current attempts at creating a random Unicode character generator have failed with errors such as those mentioned in my other question here. It's obviously not as simple as just generating a random number.
Question: How can I generate a random Unicode character in Swift?
Unicode Scalar Value
Any Unicode code point except high-surrogate and low-surrogate code
points. In other words, the ranges of integers 0 to D7FF and E000
to 10FFFF inclusive.
So, I've made a small code snippet. See below.
This code works
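func randomUnicodeCharacter() -> String {
    // 1114112 == 0x110000, so i covers 0...0x10FFFF inclusive
    let i = arc4random_uniform(1114112)
    // Draw again if i falls in the surrogate range 0xD800...0xDFFF
    return (i > 55295 && i < 57344) ? randomUnicodeCharacter() : String(UnicodeScalar(i))
}
randomUnicodeCharacter()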
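For newer Swift versions (4.2 and later), the same retry idea can be sketched with UInt32.random(in:) and the failable Unicode.Scalar initializer, which returns nil for surrogate values. The function name below is made up for illustration and is not from the original answer.

```swift
/// Hypothetical helper (not from the original post): returns a String
/// that contains exactly one randomly chosen Unicode scalar.
func randomUnicodeScalarString() -> String {
    while true {
        // Closed range, so U+10FFFF itself can be drawn.
        let value = UInt32.random(in: 0...0x10FFFF)
        // Unicode.Scalar(_: UInt32) is failable: it yields nil for the
        // surrogate range 0xD800...0xDFFF, so we simply draw again.
        if let scalar = Unicode.Scalar(value) {
            return String(scalar)
        }
    }
}

print(randomUnicodeScalarString())
```

Many of the values drawn this way are unassigned or non-printing code points, so the output will often render as a placeholder glyph.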
This code doesn't work!
let N: UInt32 = 65536
let i = arc4random_uniform(N)
var c = String(UnicodeScalar(i))
print(c, appendNewline: false)
I was a little bit confused with this and this. [Maximum value: 65535]
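A likely explanation for the second snippet failing, assuming the crash is the surrogate issue from the definition quoted above: with N = 65536 the random value can land in 0xD800...0xDFFF, which is not a Unicode scalar value. In current Swift the UInt32 initializer is failable, so this can be checked directly. The two sample values are chosen only for illustration.

```swift
// Sample values picked for illustration.
let surrogate: UInt32 = 0xD800     // first high surrogate
let privateUse: UInt32 = 0xE000    // first value after the surrogate block

// The failable initializer rejects surrogates and accepts other values
// up to 0x10FFFF.
let fromSurrogate = Unicode.Scalar(surrogate)
let fromPrivateUse = Unicode.Scalar(privateUse)

print(fromSurrogate == nil)    // true
print(fromPrivateUse != nil)   // true
```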
static func randomCharacters(withLength length: Int = 20) -> String {
let base = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
var randomString: String = ""
for _ in 0..<length {
let randomValue = arc4random_uniform(UInt32(base.characters.count))
randomString += "\(base[base.index(base.startIndex, offsetBy: Int(randomValue))])"
}
return randomString
}
Here you can modify length (Int) and use this for generating random characters.
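In Swift 4.2 and later the same idea can be written without arc4random_uniform or index arithmetic. This is only a sketch; the function name is hypothetical and the alphabet is copied from the snippet above.

```swift
/// Hypothetical modern variant of randomCharacters(withLength:).
func randomAlphanumericString(length: Int = 20) -> String {
    let base = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
    // randomElement() returns nil only for an empty collection; `base` is a
    // non-empty literal, so force-unwrapping is safe here.
    return String((0..<length).map { _ in base.randomElement()! })
}

print(randomAlphanumericString(length: 8))
```

Note that randomElement() on a String walks the characters each time; for very long alphabets, converting to an Array once would avoid that.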
Related
Is it possible to get the decimal value for String characters in Swift?
something like:
let words:String = "1 ring to rule them all"
var value:Int = 0
for i in 0..<words.count {
let char = words[words.index(words.startIndex,offsetBy:i)]
value += Int(char.decimal)
}
where the first character in "1 ring to rule them all" is 49. Possible?
you could try this:
let words = "1 ring to rule them all"
var value: Int = 0
for i in 0..<words.count {
let char = words[words.index(words.startIndex,offsetBy:i)]
if let val = char.asciiValue {
print("----> char: \(char) val: \(val)") // char and its "decimal" value
value += Int(val)
}
}
print("\n----> value: \(value) \n") // meaningless total value
ok, looks like this is the way:
let characterString = "蜇"
let scalars = characterString.unicodeScalars
let ui32:UInt32 = scalars[scalars.startIndex].value
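To tie the two approaches together: asciiValue only covers ASCII, while unicodeScalars works for any character. A minimal sketch, with a helper name invented for this example:

```swift
/// Hypothetical helper: the numeric value of every Unicode scalar in `text`.
func scalarValues(of text: String) -> [UInt32] {
    return text.unicodeScalars.map { $0.value }
}

print(scalarValues(of: "1 ring"))   // [49, 32, 114, 105, 110, 103]

// Character.asciiValue is an Optional and is nil outside the ASCII range.
let accented: Character = "\u{E9}"  // é
print(accented.asciiValue == nil)   // true
```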
If you want to add up the Unicode values associated with a string, it would be:
var value = 0
for character in string {
for scalar in character.unicodeScalars {
value += Int(scalar.value)
}
}
Or, alternatively:
let value = string
    .flatMap { $0.unicodeScalars }
    .map { Int($0.value) } // value is a non-optional UInt32, so map (not compactMap) and convert to Int
    .reduce(0, +)
While the above adds the values, as requested, if you are trying to get a numeric representation for a string, you might consider using hashValue, or checksum, or CRC, or the like. Simply summing the values will not be able to detect, for example, character transpositions. It just depends upon your use-case for this numeric representation of your string.
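The transposition point can be seen with two tiny inputs. The sketch below compares the plain sum with a 32-bit FNV-1a hash; FNV-1a is swapped in here only as one simple order-sensitive example, and both function names are made up. Also keep in mind that since Swift 4.2 hashValue is seeded per process, so it is not suitable if the number has to be stable between runs.

```swift
/// Sum of Unicode scalar values; ignores character order.
func scalarSum(_ text: String) -> Int {
    return text.unicodeScalars.reduce(0) { $0 + Int($1.value) }
}

/// 32-bit FNV-1a over the UTF-8 bytes; order-sensitive and stable across runs.
func fnv1a32(_ text: String) -> UInt32 {
    var hash: UInt32 = 2166136261       // FNV offset basis
    for byte in text.utf8 {
        hash ^= UInt32(byte)
        hash = hash &* 16777619         // FNV prime, wrapping multiply
    }
    return hash
}

print(scalarSum("ab") == scalarSum("ba"))   // true: the sum cannot tell them apart
print(fnv1a32("ab") == fnv1a32("ba"))       // false
```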
Consider this function to build a string of random characters:
func makeToken(length: Int) -> String {
let chars: String = "abcdefghijklmnopqrstuvwxyz0123456789!?##$%ABCDEFGHIJKLMNOPQRSTUVWXYZ"
var result: String = ""
for _ in 0..<length {
let idx = Int(arc4random_uniform(UInt32(chars.characters.count)))
let idxEnd = idx + 1
let range: Range = idx..<idxEnd
let char = chars.substring(with: range)
result += char
}
return result
}
This throws an error on the substring method:
Cannot convert value of type 'Range<Int>' to expected argument
type 'Range<String.Index>' (aka 'Range<String.CharacterView.Index>')
I'm confused why I can't simply provide a Range with 2 integers, and why it's making me go the roundabout way of making a Range<String.Index>.
So I have to change the Range creation to this very over-complicated way:
let idx = Int(arc4random_uniform(UInt32(chars.characters.count)))
let start = chars.index(chars.startIndex, offsetBy: idx)
let end = chars.index(chars.startIndex, offsetBy: idx + 1)
let range: Range = start..<end
Why isn't it good enough for Swift for me to simply create a range with 2 integers and the half-open range operator? (..<)
Quite the contrast to "swift": in JavaScript I can simply do chars.substr(idx, 1).
I suggest converting your String to [Character] so that you can index it easily with Int:
func makeToken(length: Int) -> String {
let chars = Array("abcdefghijklmnopqrstuvwxyz0123456789!?##$%ABCDEFGHIJKLMNOPQRSTUVWXYZ".characters)
var result = ""
for _ in 0..<length {
let idx = Int(arc4random_uniform(UInt32(chars.count)))
result += String(chars[idx])
}
return result
}
Swift takes great care to provide a fully Unicode-compliant, type-safe, String abstraction.
Indexing a given Character, in an arbitrary Unicode string, is far from a trivial task. Each Character is a sequence of one or more Unicode scalars that (when combined) produce a single human-readable character. In particular, hiding all this complexity behind a simple Int based indexing scheme might result in the wrong performance mental model for programmers.
Having said that, you can always convert your string to an Array<Character> once for easy (and fast!) indexing. For instance:
let chars: String = "abcdefghijklmnop"
var charsArray = Array(chars.characters)
...
let resultingString = String(charsArray)
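To make the "one or more Unicode scalars" point concrete, here is a small sketch showing that the same visible character can have different lengths in each view of the string. This is why an Int offset cannot be mapped to a storage position without walking the string.

```swift
let precomposed = "\u{E9}"     // é as a single scalar
let decomposed = "e\u{301}"    // e followed by a combining acute accent

// Both are one Character...
print(precomposed.count, decomposed.count)                                 // 1 1
// ...but they differ in scalars and in encoded size.
print(precomposed.unicodeScalars.count, decomposed.unicodeScalars.count)   // 1 2
print(precomposed.utf8.count, decomposed.utf8.count)                       // 2 3

// String comparison uses canonical equivalence, so they still compare equal.
print(precomposed == decomposed)                                           // true
```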
If I have a long range of numbers such as 1...1000000, what would be an efficient way to convert them to strings with the following mapping?
1->A, 2->B, 3->C, ... 10->A0, 11->AA, 12->AB etc.
I took the approach of splitting each number into digits (using modulus) and using it to get a character from an array to build the strings. Takes about 5 seconds for 1...1000. Is there a faster approach?
My code:
let numbers = 1...1000000
let charArray:[Character] = ["0","A","B","C","D","E","F","G","H","I"]
var results: [String] = []
func transformNumbers() {
for number in numbers {
var string = ""
var i = number
while i > 0 {string.insert(charArray[(i%10)], at: string.startIndex); i/=10}
results.append(string)
}
}
Your code took about 15 seconds on my old MacBook for 1...1000000, and the code below, less than 1 second:
(Using Xcode 8.3.3 with Release build on macOS 10.12.5)
let unicodeScalarArray:[UnicodeScalar] = ["0","A","B","C","D","E","F","G","H","I"]
let utf16CodeUnitArray:[UInt16] = unicodeScalarArray.map{UInt16($0.value)}
var results: [String] = []
func transformNumbers7() {
results = numbers.map {number in
var digits: [UInt16] = []
var i = number
while i > 0 {digits.append(utf16CodeUnitArray[i%10]); i/=10}
digits.reverse()
return String(utf16CodeUnits: digits, count: digits.count)
}
}
Generally,
Repeated insert(_:at:) can be slower than repeated append(_:) and reverse()
Working with Characters may be less efficient than UnicodeScalar, UTF-16 Code Units or UTF-8 Code Units.
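The two points above can be combined in a UTF-8 variant for current Swift. This is a sketch, not a benchmarked replacement: the names digitBytes and encode are invented here, and String(decoding:as:) is used instead of the UTF-16 initializer.

```swift
// Lookup table of UTF-8 bytes for the digits 0...9 mapped to "0ABCDEFGHI".
let digitBytes: [UInt8] = Array("0ABCDEFGHI".utf8)

/// Hypothetical helper: maps 1 -> "A", 10 -> "A0", 11 -> "AA", and so on.
/// Like the original loop, it returns an empty string for 0.
func encode(_ number: Int) -> String {
    var bytes: [UInt8] = []
    var remaining = number
    while remaining > 0 {
        // Appending is cheap; the digits come out least significant first.
        bytes.append(digitBytes[remaining % 10])
        remaining /= 10
    }
    bytes.reverse()
    return String(decoding: bytes, as: UTF8.self)
}

let results = (1...12).map(encode)
print(results)
```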
Not sure if it is the fastest way, but switching to a map expression instead of mutating the results list speeds things up a bit over 10x on my machine:
let results = numbers.map { (val: Int) -> String in
var string = ""
var i = val
while i > 0 {string.insert(charArray[(i%10)], at: string.startIndex); i/=10}
return string
}
The following was possible with Swift 2.2:
let m = "alpha"
for i in m.startIndex..<m.endIndex {
print(m[i])
}
a
l
p
h
a
With 3.0, we get the following error:
Type 'Range<String.Index>' (aka 'Range<String.CharacterView.Index>') does not conform to protocol 'Sequence'
I am trying to do a very simple operation with strings in Swift: traverse the first half of a string (or, more generally, traverse a range of a string).
I can do the following:
let s = "string"
var midIndex = s.index(s.startIndex, offsetBy: s.characters.count/2)
let r = Range(s.startIndex..<midIndex)
print(s[r])
But here I'm not really traversing the string. So the question is: how do I traverse through a range of a given string. Like:
for i in Range(s.startIndex..<s.midIndex) {
print(s[i])
}
You can traverse a string by using indices property of the characters property like this:
let letters = "string"
let middle = letters.index(letters.startIndex, offsetBy: letters.characters.count / 2)
for index in letters.characters.indices {
    // Stop at the middle index to traverse only the first half
    if index == middle { break }
    print(letters[index]) // prints s, t, r (without the break: s, t, r, i, n, g)
}
From the documentation in section Strings and Characters - Counting Characters:
Extended grapheme clusters can be composed of one or more Unicode scalars. This means that different characters—and different representations of the same character—can require different amounts of memory to store. Because of this, characters in Swift do not each take up the same amount of memory within a string’s representation. As a result, the number of characters in a string cannot be calculated without iterating through the string to determine its extended grapheme cluster boundaries.
emphasis is my own.
This will not work:
let secondChar = letters[1]
// error: subscript is unavailable, cannot subscript String with an Int
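Int subscripting is unavailable, but an offset can be converted to a String.Index first. Below is a minimal sketch with a hypothetical helper name; it uses index(_:offsetBy:limitedBy:) so that an out-of-range offset gives nil instead of a crash.

```swift
/// Hypothetical helper: the character at a given offset, or nil if the
/// offset is negative or past the last character.
func character(in text: String, at offset: Int) -> Character? {
    guard offset >= 0,
          let index = text.index(text.startIndex, offsetBy: offset, limitedBy: text.endIndex),
          index < text.endIndex   // endIndex itself is not subscriptable
    else { return nil }
    return text[index]
}

let letters = "string"
print(character(in: letters, at: 1) as Any)    // Optional("t")
print(character(in: letters, at: 10) as Any)   // nil
```

Each call walks the string from the start, so this is fine for occasional lookups but not for indexing inside a loop.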
Another option is to use enumerated() e.g:
let string = "Hello World"
for (index, char) in string.characters.enumerated() {
print(char)
}
or for Swift 4 just use
let string = "Hello World"
for (index, char) in string.enumerated() {
print(char)
}
Use the following:
for i in s.characters.indices[s.startIndex..<s.endIndex] {
print(s[i])
}
Taken from Migrating to Swift 2.3 or Swift 3 from Swift 2.2
Iterating over characters in a string is cleaner in Swift 4:
let myString = "Hello World"
for char in myString {
print(char)
}
If you want to traverse over the characters of a String, then instead of explicitly accessing the indices of the String, you could simply work with the CharacterView of the String, which conforms to CollectionType, allowing you access to neat subsequencing methods such as prefix(_:) and so on.
/* traverse the characters of your string instance,
up to middle character of the string, where "middle"
will be rounded down for strings of an odd amount of
characters (e.g. 5 characters -> traverse through 2) */
let m = "alpha"
for ch in m.characters.prefix(m.characters.count/2) {
print(ch, ch.dynamicType)
} /* a Character
l Character */
/* round odd division up instead */
for ch in m.characters.prefix((m.characters.count+1)/2) {
print(ch, ch.dynamicType)
} /* a Character
l Character
p Character */
If you'd like to treat the characters within the loop as strings, simply use String(ch) above.
With regard to your comment below: if you'd like to access a range of the CharacterView, you could easily implement your own extension of CollectionType (specified for when Generator.Element is Character) making use of both prefix(_:) and suffix(_:) to yield a sub-collection given e.g. a half-open (from..<to) range
/* for values to >= count, prefixed CharacterView will be suffixed until its end */
extension CollectionType where Generator.Element == Character {
func inHalfOpenRange(from: Int, to: Int) -> Self {
guard case let to = min(to, underestimateCount()) where from <= to else {
return self.prefix(0) as! Self
}
return self.prefix(to).suffix(to-from) as! Self
}
}
/* example */
let m = "0123456789"
for ch in m.characters.inHalfOpenRange(4, to: 8) {
print(ch) /* \ */
} /* 4 a (sub-collection) CharacterView
5
6
7 */
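The extension above is Swift 2 code (CollectionType, Generator, underestimateCount()). In Swift 4 and later, where String is itself a Collection, a similar half-open slice can be sketched with dropFirst(_:) and prefix(_:). The function name is invented for this example, and clamping out-of-range arguments is an assumption about the desired behaviour.

```swift
/// Hypothetical helper: the characters at offsets from..<to, clamped to the
/// bounds of the string. Returns a Substring that shares storage with `text`.
func halfOpenSlice(of text: String, from: Int, to: Int) -> Substring {
    let lower = max(0, from)
    let upper = max(lower, to)
    // dropFirst and prefix both stop at the end of the string on their own.
    return text.dropFirst(lower).prefix(upper - lower)
}

let digits = "0123456789"
for ch in halfOpenSlice(of: digits, from: 4, to: 8) {
    print(ch)   // 4, 5, 6, 7 on separate lines
}
```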
One way to do this is:
let name = "nick" // The String which we want to print.
for i in 0..<name.count {
    // name[i] is not allowed in Swift; build a String.Index from the offset instead
    let index = name.index(name.startIndex, offsetBy: i)
    print(name[index])
}
for more details visit here
Swift 4.2
Simply:
let m = "alpha"
for i in m.indices {
print(m[i])
}
Swift 4:
let mi: String = "hello how are you?"
for i in mi {
print(i)
}
To concretely demonstrate how to traverse through a range in a string in Swift 4, we can use the where filter in a for loop to filter its execution to the specified range:
func iterateStringByRange(_ sentence: String, from: Int, to: Int) {
let startIndex = sentence.index(sentence.startIndex, offsetBy: from)
let endIndex = sentence.index(sentence.startIndex, offsetBy: to)
for position in sentence.indices where (position >= startIndex && position < endIndex) {
let char = sentence[position]
print(char)
}
}
iterateStringByRange("string", from: 1, to: 3) will print t and r (the upper bound is exclusive).
When iterating over the indices of characters in a string, you seldom only need the index. You probably also need the character at the given index. As specified by Paulo (updated for Swift 4+), string.indices will give you the indices of the characters. zip can be used to combine index and character:
let string = "string"
// Define the range to conform to your needs
let range = string.startIndex..<string.index(string.startIndex, offsetBy: string.count / 2)
let substring = string[range]
// If the range is in the type "first x characters", like until the middle, you can use:
// let substring = string.prefix(string.count / 2)
for (index, char) in zip(substring.indices, substring) {
// index is the index in the substring
print(char)
}
Note that using enumerated() will produce a pair of index and character, but the index is not the index of the character in the string. It is the index in the enumeration, which can be different.
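The difference between the enumeration offset and the real index shows up as soon as the collection is a slice. A small sketch, with sample values chosen only for illustration:

```swift
let text = "string"
let tail = text.dropFirst(3)   // Substring "ing", sharing indices with `text`

// enumerated() always counts from 0, regardless of where the slice starts.
let offsets = tail.enumerated().map { $0.offset }

// The slice's own indices still refer to positions in the original string.
let positions = tail.indices.map { text.distance(from: text.startIndex, to: $0) }

print(offsets)     // [0, 1, 2]
print(positions)   // [3, 4, 5]
```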
I'm trying to get a valid substring of at most 255 UTF8 code units from a Swift string (the idea is to be able to store it in a database VARCHAR(255) field).
The standard way of getting a substring is this :
let string: String = "Hello world!"
let startIndex = string.startIndex
let endIndex = string.startIndex.advancedBy(255, limit: string.endIndex)
let databaseSubstring1 = string[startIndex..<endIndex]
But obviously that would give me a string of 255 characters that may require more than 255 bytes in UTF8 representation.
For UTF8 I can write this :
let utf8StartIndex = string.utf8.startIndex
let utf8EndIndex = utf8StartIndex.advancedBy(255, limit: string.utf8.endIndex)
let databaseSubstringUTF8View = string.utf8[utf8StartIndex..<utf8EndIndex]
let databaseSubstring2 = String(databaseSubstringUTF8View)
But I run the risk of having half a character at the end, which means my UTF8View would not be a valid UTF8 sequence.
And as expected databaseSubstring2 is an optional string because the initializer can fail (it is defined as public init?(_ utf8: String.UTF8View)).
So I need some way of stripping the incomplete UTF8 sequence at the end, or, if possible, a built-in way of doing what I'm trying to do here.
EDIT
Turns out that databases understand characters, so I should not try to count UTF8 code units, but rather how many characters the database will count in my string (which will probably depend on the database).
According to @OOPer, MySQL counts characters as UTF-16 code units. I have come up with the following implementation:
private func databaseStringForString(string: String, maxLength: Int = 255) -> String
{
// Start by clipping to 255 characters
let startIndex = string.startIndex
let endIndex = startIndex.advancedBy(maxLength, limit: string.endIndex)
var string = string[startIndex..<endIndex]
// Remove characters from the end one by one until we have no more than
// the maximum number of UTF-16 code units
while (string.utf16.count > maxLength) {
let startIndex = string.startIndex
let endIndex = string.endIndex.advancedBy(-1, limit: startIndex)
string = string[startIndex..<endIndex]
}
return string
}
The idea is to count UTF-16 code units, but to remove whole characters (as Swift defines a Character) from the end.
EDIT 2
Still according to @OOPer, PostgreSQL counts characters as Unicode scalars, so this should probably work:
private func databaseStringForString(string: String, maxLength: Int = 255) -> String
{
// Start by clipping to 255 characters
let startIndex = string.startIndex
let endIndex = startIndex.advancedBy(maxLength, limit: string.endIndex)
var string = string[startIndex..<endIndex]
// Remove characters from the end one by one until we have no more than
// the maximum number of Unicode scalars
while (string.unicodeScalars.count > maxLength) {
let startIndex = string.startIndex
let endIndex = string.endIndex.advancedBy(-1, limit: startIndex)
string = string[startIndex..<endIndex]
}
return string
}
As I wrote in my comment, you may need your databaseStringForString(_:maxLength:) to truncate the string according to how your DBMS counts length. PostgreSQL with utf8 and MySQL with utf8mb4 both count Unicode scalars.
And I would write the same functionality as your EDIT 2:
func databaseStringForString(string: String, maxUnicodeScalarLength: Int = 255) -> String {
let start = string.startIndex
for index in start..<string.endIndex {
if string[start..<index.successor()].unicodeScalars.count > maxUnicodeScalarLength {
return string[start..<index]
}
}
return string
}
This may be less efficient, but a little bit shorter.
let s = "abc\u{1D122}\u{1F1EF}\u{1F1F5}" //->"abc𝄢🇯🇵"
let dbus = databaseStringForString(s, maxUnicodeScalarLength: 5) //->"abc𝄢"(=="abc\u{1D122}")
So, someone who works with MySQL with utf8(=utf8mb3) needs something like this:
func databaseStringForString(string: String, maxUTF16Length: Int = 255) -> String {
let start = string.startIndex
for index in start..<string.endIndex {
if string[start..<index.successor()].utf16.count > maxUTF16Length {
return string[start..<index]
}
}
return string
}
let dbu16 = databaseStringForString(s, maxUTF16Length: 4) //->"abc"
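The functions above use Swift 2 APIs (successor(), iterating a range of String.Index) and recount the prefix on every step. For current Swift, the same truncation can be sketched in a single pass by summing a per-character size. The function name and the closure parameter are assumptions made for this example; the closure decides which unit the database counts in.

```swift
/// Hypothetical helper: the longest prefix of whole Characters whose total
/// size, as measured by `unitCount`, does not exceed `maxUnits`.
func truncated(_ text: String, maxUnits: Int, unitCount: (Character) -> Int) -> String {
    var result = ""
    var used = 0
    for character in text {
        let size = unitCount(character)
        // Stop before the first character that would exceed the limit,
        // so a character is never split.
        if used + size > maxUnits { break }
        result.append(character)
        used += size
    }
    return result
}

let s = "abc\u{1D122}\u{1F1EF}\u{1F1F5}"   // "abc𝄢🇯🇵"

let byScalars = truncated(s, maxUnits: 5) { $0.unicodeScalars.count }   // "abc𝄢"
let byUTF16 = truncated(s, maxUnits: 4) { String($0).utf16.count }      // "abc"
let byUTF8 = truncated(s, maxUnits: 6) { String($0).utf8.count }        // "abc"
print(byScalars, byUTF16, byUTF8)
```

The UTF-8 call covers the original byte-limited case; because only whole characters are appended, the result is always valid UTF-8.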