How to get a character from its ASCII value? - swift

I want to get letters from their ASCII code in Swift 3. I would do it like this in Java :
for(int i = 65 ; i < 75 ; i++)
{
System.out.print((char)i);
}
Which would log letters from A to J.
Now I tried this in Swift :
let s = String(describing: UnicodeScalar(i))
Instead of only getting the letter, I get this :
Optional("A")
Optional("B")
Optional("C")
...
What am I doing wrong? Thanks for your help.

UnicodeScalar has a few failable initialisers for integer types that can reprent values that aren't valid Unicode code points. Therefore you'll need to unwrap the UnicodeScalar? returned, as in the case that you pass an invalid code point, the initialiser will return nil.
However, given that you're dealing exclusively with ASCII characters, you can simply annotate i as a UInt8 and take advantage of the fact that UnicodeScalar has a non-failable initialiser for a UInt8 input (as it will always represent a valid code point):
for i : UInt8 in 65..<75 {
print(UnicodeScalar(i))
}

Related

How to convert String to UTF-8 to Integer in Swift

I'm trying to take each character (individual number, letter, or symbol) from a string file name without the extension and put each one into an array index as an integer of the utf-8 code (i.e. if the file name is "A1" without the extension, I would want "A" as an int "41" in first index, and "1" as int "31" in second index)
Here is the code I have but I'm getting this error "No exact matches in call to instance method 'append'", my guess is because .utf8 still keeps it as a string type:
for i in allNoteFiles {
var CharacterArray : [Int] = []
for character in i {
var utf8Character = String(character).utf8
CharacterArray.append(utf8Character) //error is here
}
....`//more code down here within the for in loop using CharacterArray indexes`
I'm sure the answer is probably simple, but I'm very new to Swift.
I've tried appending var number instead with:
var number = Int(utf8Character)
and
var number = (utf8Character).IntegerValue
but I get errors "No exact matches in call to initializer" and "Value of type 'String.UTF8View' has no member 'IntegerValue'"
Any help at all would be greatly appreciated. Thanks!
The reason
var utf8Character = String(character).utf8
CharacterArray.append(utf8Character)
doesn't work for you is because utf8Character is not a single integer, but a UTF8View: a lightweight way to iterate over the UTF-8 codepoints in a string. Every Character in a String can be made up of any number of UTF-8 bytes (individual integers) β€” while ASCII characters like "A" and "1" map to a single UTF-8 byte, the vast majority of characters do not: every UTF-8 code point maps to between 1 and 4 individual bytes. The Encoding section of UTF-8 on Wikipedia has a few very illustrative examples of how this works.
Now, assuming that you do want to split a string into individual UTF-8 bytes (either because you can guarantee your original string is ASCII-only, so the assumption that "character = byte" holds, or because you actually care about the bytes [though this is rarely the case]), there's a short and idiomatic solution to what you're looking for.
String.UTF8View is a Sequence of UInt8 values (individual bytes), and as such, you can use the Array initializer which takes a Sequence:
let characterArray: [UInt8] = Array(i.utf8)
If you need an array of Int values instead of UInt8, you can map the individual bytes ahead of time:
let characterArray: [Int] = Array(i.utf8.lazy.map { Int($0) })
(The .lazy avoids creating and storing an array of values in the middle of the operation.)
However, do note that if you aren't careful (e.g., your original string is not ASCII), you're bound to get very unexpected results from this operation, so keep that in mind.

Getting character ASCII value as an Integer in Swift

I have been trying to get the character ascii code as an int so as then I can modify it and change the character by doing some math. However I am finding it difficult to do so as I get conversion errors between the different types of integers and can't seem to find an answer
var n:Character = pass[I] //using the string protocol extension
if n.isASCII
{
var tempo:Int = Int(n.asciiValue)
temp += (tempo | key) //key and temp are of type int
}
In Swift, a Character may not necessarily be an ASCII one. It would for example have no sense to return the ascii value of "πŸͺ‚" which requires a large unicode encoding. This is why asciiValue property has an optional UInt8 value, which is annotated UInt8?.
The simplest solution
Since you checked yourself that the character isAscii, you can safely go for an unconditional unwrapping with !:
var tempo:Int = Int(n.asciiValue!) // <--- just change this line
A more elegant alternative
You could also take advantage of optional binding that uses the fact that the optional is nil when there is no ascii value (i.e. n was not an ASCII character):
if let tempo = n.asciiValue // is true only if there is an ascii value
{
temp += (Int(tempo) | key)
}

If a sequence of code points forms a Unicode character, does every non-empty prefix of that sequence also form a valid character?

The problem I have is that given a sequence of bytes, I want to determine its longest prefix which forms a valid Unicode character (extended grapheme cluster) assuming UTF8 encoding.
I am using Swift, so I would like to use Swift's built-in function(s) to do so. But these functions only decode a complete sequence of bytes. So I was thinking to convert prefixes of the byte sequence via Swift and take the last prefix that didn't fail and consists of 1 character only. Obviously, this might lead to trying out the entire sequence of bytes, which I want to avoid. A solution would be to stop trying out prefixes after 4 prefixes in a row failed. If the property asked in my question holds, this would then guarantee that all longer prefixes must also fail.
I find the Unicode Text Segmentation Standard unreadable, otherwise I would try to directly implement boundary detection of extended grapheme clusters...
After taking a long hard look at the specification for computing the boundaries for extended grapheme clusters (EGCs) at https://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundary_Rules,
it is obvious that the rules for EGCs all have the shape of describing when it is allowed to append a code point to an existing EGC to form a longer EGC. From that fact alone my two questions follow: 1) Yes, every non-empty prefix of code points which form an EGC is also an EGC. 2) No, by adding a code point to a valid Unicode string you will not decrease its length in terms of number of EGCs it consists of.
So, given this, the following Swift code will extract the longest Unicode character from the start of a byte sequence (or return nil if there is no valid Unicode character there):
func lex<S : Sequence>(_ input : S) -> (length : Int, out: Character)? where S.Element == UInt8 {
// This code works under three assumptions, all of which are true:
// 1) If a sequence of codepoints does not form a valid character, then appending codepoints to it does not yield a valid character
// 2) Appending codepoints to a sequence of codepoints does not decrease its length in terms of extended grapheme clusters
// 3) a codepoint takes up at most 4 bytes in an UTF8 encoding
var chars : [UInt8] = []
var result : String = ""
var resultLength = 0
func value() -> (length : Int, out : Character)? {
guard let character = result.first else { return nil }
return (length: resultLength, out: character)
}
var length = 0
var iterator = input.makeIterator()
while length - resultLength <= 4 {
guard let char = iterator.next() else { return value() }
chars.append(char)
length += 1
guard let s = String(bytes: chars, encoding: .utf8) else { continue }
guard s.count == 1 else { return value() }
result = s
resultLength = length
}
return value()
}

How can I convert a single Character type to uppercase?

All I want to do is convert a single Character to uppercase without the overhead of converting to a String and then calling .uppercased(). Is there any built-in way to do this, or a way for me to call the toupper() function from C without any bridging? I really don't think I should have to go out of my way for something so simple.
To call the C toupper() you need to get the Unicode code point of the Character. But Character has no method for getting its code point (a Character may consist of multiple code points), so you have to convert the Character into a String to obtain any of its code points.
So you really have to convert to String to get anywhere. Unless you store the character as a UnicodeScalar instead of a Character. In this case you can do this:
assert(unicodeScalar.isASCII) // toupper argument must be "representable as an unsigned char"
let uppercase = UnicodeScalar(toupper(CInt(unicodeScalar.value)))
But this isn't really more readable than simply using String:
let uppercase = Character(String(character).uppercased())
just add this to your program
extension Character {
//converts a character to uppercase
func convertToUpperCase() -> Character {
if(self.isUppercase){
return self
}
return Character(self.uppercased())
}
}

Convert Character to Integer in Swift

I am creating an iPhone app and I need to convert a single digit number into an integer.
My code has a variable called char that has a type Character, but I need to be able to do math with it, therefore I think I need to convert it to a string, however I cannot find a way to do that.
In the latest Swift versions (at least in Swift 5) there is a more straighforward way of converting Character instances. Character has property wholeNumberValue which tries to convert a character to Int and returns nil if the character does not represent and integer.
let char: Character = "5"
if let intValue = char.wholeNumberValue {
print("Value is \(intValue)")
} else {
print("Not an integer")
}
With a Character you can create a String. And with a String you can create an Int.
let char: Character = "1"
if let number = Int(String(char)) {
// use number
}
The String middleman type conversion isn’t necessary if you use the unicodeScalars property of Swift 4.0’s Character type.
let myChar: Character = "3"
myChar.unicodeScalars.first!.value - Unicode.Scalar("0")!.value // 3: UInt32
This uses a trick commonly seen in C code of subtracting the value of the char ’0’ literal to convert from ascii values to decimal values. See this site for the conversions: https://www.asciitable.com
Also there are some implicit unwraps in my answer. To avoid those, you can validate that you have a decimal digit with CharacterSet.decimalDigits, and/or use guard lets around the first property. You can also subtract 48 directly rather than converting ”0” through Unicode.Scalar.