Invalid dog face scalar - swift

I thought I understood Unicode scalars in Swift pretty well, but the dog face emoji proved me wrong.
for code in "🐶".utf16 {
print(code)
}
The UTF-16 codes are 55357 and 56374. In hex, that's d83d and dc36.
Now:
let dog = "\u{d83d}\u{dc36}"
Instead of getting a string with "🐶", I'm getting an error:
Invalid unicode scalar
I tried with the UTF-8 codes and that didn't work either. It didn't throw an error, but it returned "ð¶" instead of the dog face.
What is wrong here?

The \u{nnnn} escape sequence expects a Unicode scalar value, not the UTF-16 representation (with high and low surrogates):
for code in "🐶".unicodeScalars {
print(String(code.value, radix: 16))
}
// 1f436
let dog = "\u{1F436}"
print(dog) // 🐶
Solutions to reconstruct a string from its UTF-16 representation can be found at Is there a way to create a String from utf16 array in swift?. For example:
let utf16: [UInt16] = [ 0xd83d, 0xdc36 ]
let dog = String(utf16CodeUnits: utf16, count: utf16.count)
print(dog) // 🐶
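For reference, the relationship between the two surrogates and the scalar value is plain arithmetic. The following is a minimal sketch; the helper name scalarValue(high:low:) is made up for illustration and is not part of the standard library. It also shows why pasting the UTF-8 bytes into \u{...} escapes produced "ð¶": each byte was treated as its own scalar instead of as part of one encoded sequence.

```swift
// Hypothetical helper: combine a UTF-16 surrogate pair into one scalar value.
func scalarValue(high: UInt16, low: UInt16) -> UInt32? {
    // Reject anything that is not a well-formed high/low pair.
    guard (0xD800...0xDBFF).contains(high),
          (0xDC00...0xDFFF).contains(low) else { return nil }
    return 0x10000 + ((UInt32(high) - 0xD800) << 10) + (UInt32(low) - 0xDC00)
}

let value = scalarValue(high: 0xD83D, low: 0xDC36)
let hex = value.map { String($0, radix: 16) }
print(hex ?? "invalid pair") // 1f436

// UTF-8 bytes must be decoded together, not escaped one by one.
let utf8Bytes: [UInt8] = [0xF0, 0x9F, 0x90, 0xB6]
let dogFromUTF8 = String(decoding: utf8Bytes, as: UTF8.self)
print(dogFromUTF8) // 🐶
```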

Related

Swift: Simple method to replace a single character in a String?

I wanted to replace the first character of a String and got it to work like this:
s.replaceSubrange(Range(NSMakeRange(0,1),in:s)!, with:".")
I wonder if there is a simpler method to achieve the same result?
[edit]
Get nth character of a string in Swift programming language doesn't provide a mutable substring. And it requires writing a String extension, which isn't really helping when trying to shorten code.
To replace the first character, you can use String concatenation with dropFirst():
var s = "😃hello world!"
s = "." + s.dropFirst()
print(s)
Result:
.hello world!
Note: This will not crash if the String is empty; it will just create a String with the replacement character.
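If you would rather mutate the string in place, something along these lines should also work. This is only a sketch, assuming Swift 4 or later; unlike the concatenation above, it leaves an empty string untouched instead of producing ".".

```swift
var s = "😃hello world!"

// s.indices.first is nil for an empty string, so nothing is replaced in that case.
if let first = s.indices.first {
    // A one-element closed range covers exactly the first Character,
    // however many bytes or scalars it is made of.
    s.replaceSubrange(first...first, with: ".")
}
print(s) // .hello world!

var empty = ""
if let first = empty.indices.first {
    empty.replaceSubrange(first...first, with: ".")
}
print(empty.isEmpty) // true
```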
Strings work very differently in Swift than many other languages. In Swift, a character is not a single byte but instead a single visual element. This is very important when working with multibyte characters like emoji (see: Why are emoji characters like 👩‍👩‍👧‍👦 treated so strangely in Swift strings?)
If you really do want to set a single random byte of your string to an arbitrary value as you expanded on in the comments of your question, you'll need to drop out of the string abstraction and work with your data as a buffer. This is sort of gross in Swift thanks to various safety features but it's doable:
var input = "Hello, world!"
//access the byte buffer
var utf8Buffer = input.utf8CString
//replace the first byte with whatever random data we want
utf8Buffer[0] = 46 //ascii encoding of '.'
//now convert back to a Swift string
var output:String! = nil //buffer for holding our new target
utf8Buffer.withUnsafeBufferPointer { (ptr) in
//Load the byte buffer into a Swift string
output = String.init(cString: ptr.baseAddress!)
}
print(output!) //.ello, world!
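A variant of the same idea that avoids the unsafe pointer, assuming Swift 4 or later: copy the UTF-8 view into an array, patch the byte, and decode the array again. Treat it as a sketch. If the patched bytes are no longer valid UTF-8 (easy to do with non-ASCII input), String(decoding:as:) repairs them with U+FFFD replacement characters rather than failing.

```swift
let input = "Hello, world!"

// Copy the UTF-8 code units into a mutable array (no trailing NUL here).
var bytes = Array(input.utf8)
if !bytes.isEmpty {
    bytes[0] = 46 // ASCII encoding of '.'
}

// Decode the patched bytes back into a String.
let output = String(decoding: bytes, as: UTF8.self)
print(output) // .ello, world!
```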

How to create a single character String

I've reproduced this problem in a Swift playground but haven't solved it yet...
I'd like to print one of a range of characters in a UILabel. If I explicitly declare the character, it works:
// This works.
let value: String = "\u{f096}"
label.text = value // Displays the referenced character.
However, I want to construct the String. The code below appears to produce the same result as the line above, except that it doesn't. It just produces the String \u{f096} and not the character it references.
// This doesn't work
let n: Int = 0x95 + 1
print(String(n, radix: 16)) // Prints "96".
let value: String = "\\u{f0\(String(n, radix: 16))}"
label.text = value // Displays the String "\u{f096}".
I'm probably missing something simple. Any ideas?
How about dropping the string conversion voodoo and using the standard library type UnicodeScalar?
You can also create Unicode scalar values directly from their numeric representation.
let airplane = UnicodeScalar(9992)
print(airplane)
// Prints "✈︎"
The UnicodeScalar initializer used there actually returns an optional value, so you must unwrap it.
If you need a String, just convert it via the Character type.
let airplaneString: String = String(Character(airplane)) // Assuming that airplane here is unwrapped
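Applied to the code in the question, that could look roughly like this. The helper name string(fromCodePoint:) is invented for the example; the only library calls are the failable Unicode.Scalar(UInt32) initializer and the Character and String initializers mentioned above.

```swift
// Hypothetical helper: build a one-character String from a numeric code point.
func string(fromCodePoint codePoint: UInt32) -> String? {
    // The initializer returns nil for values that are not valid scalars
    // (surrogates, or anything above U+10FFFF).
    guard let scalar = Unicode.Scalar(codePoint) else { return nil }
    return String(Character(scalar))
}

let n: UInt32 = 0x95 + 1
let value = string(fromCodePoint: 0xF000 + n)  // the character U+F096
let surrogate = string(fromCodePoint: 0xD83D)  // nil, a lone surrogate is not a scalar
```

The optional result can then be assigned to label.text directly, since that property is optional as well.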

Guard Statement Parameter Error [duplicate]

How do you get the length of a String? For example, I have a variable defined like:
var test1: String = "Scott"
However, I can't seem to find a length method on the string.
As of Swift 4+
It's just:
test1.count
for reasons.
(Thanks to Martin R)
As of Swift 2:
With Swift 2, Apple has changed global functions to protocol extensions, extensions that match any type conforming to a protocol. Thus the new syntax is:
test1.characters.count
(Thanks to JohnDifool for the heads up)
As of Swift 1
Use the count characters method:
let unusualMenagerie = "Koala 🐨, Snail 🐌, Penguin 🐧, Dromedary 🐪"
println("unusualMenagerie has \(count(unusualMenagerie)) characters")
// prints "unusualMenagerie has 40 characters"
right from the Apple Swift Guide
(note, for versions of Swift earlier than 1.2, this would be countElements(unusualMenagerie) instead)
for your variable, it would be
length = count(test1) // was countElements in earlier versions of Swift
Or you can use test1.utf16count
TLDR:
For Swift 2.0 and 3.0, use test1.characters.count. But, there are a few things you should know. So, read on.
Counting characters in Swift
Before Swift 2.0, count was a global function. As of Swift 2.0, it can be called as a member function.
test1.characters.count
It will return the actual number of Unicode characters in a String, so it's the most correct alternative in the sense that, if you'd print the string and count characters by hand, you'd get the same result.
However, because of the way Strings are implemented in Swift, characters don't always take up the same amount of memory, so be aware that this behaves quite differently than the usual character count methods in other languages.
For example, you can also use test1.utf16.count
But, as noted below, the returned value is not guaranteed to be the same as that of calling count on characters.
From the language reference:
Extended grapheme clusters can be composed of one or more Unicode
scalars. This means that different characters—and different
representations of the same character—can require different amounts of
memory to store. Because of this, characters in Swift do not each take
up the same amount of memory within a string’s representation. As a
result, the number of characters in a string cannot be calculated
without iterating through the string to determine its extended
grapheme cluster boundaries. If you are working with particularly long
string values, be aware that the characters property must iterate over
the Unicode scalars in the entire string in order to determine the
characters for that string.
The count of the characters returned by the characters property is not
always the same as the length property of an NSString that contains
the same characters. The length of an NSString is based on the number
of 16-bit code units within the string’s UTF-16 representation and not
the number of Unicode extended grapheme clusters within the string.
An example that perfectly illustrates the situation described above is that of checking the length of a string containing a single emoji character, as pointed out by n00neimp0rtant in the comments.
var emoji = "👍"
emoji.characters.count //returns 1
emoji.utf16.count //returns 2
Swift 1.2 Update: There's no longer a countElements for counting the size of collections. Just use the count function as a replacement: count("Swift")
Swift 2.0, 3.0 and 3.1:
let strLength = string.characters.count
Swift 4.2 (4.0 onwards): [Apple Documentation - Strings]
let strLength = string.count
Swift 1.1
extension String {
var length: Int { return countElements(self) } //
}
Swift 1.2
extension String {
var length: Int { return count(self) } //
}
Swift 2.0
extension String {
var length: Int { return characters.count } //
}
Swift 4.2
extension String {
var length: Int { return self.count }
}
let str = "Hello"
let count = str.length // returns 5 (Int)
Swift 4
"string".count
;)
Swift 3
extension String {
var length: Int {
return self.characters.count
}
}
usage
"string".length
If you are just trying to see if a string is empty or not (checking for length of 0), Swift offers a simple boolean test method on String
myString.isEmpty
The other side of this coin was people asking in ObjectiveC how to ask if a string was empty where the answer was to check for a length of 0:
NSString is empty
Swift 5.1, 5
let flag = "🇵🇷"
print(flag.count)
// Prints "1" -- Counts the characters and emoji as length 1
print(flag.unicodeScalars.count)
// Prints "2" -- Counts the Unicode scalars (ex. "A" is the single scalar 65)
print(flag.utf16.count)
// Prints "4"
print(flag.utf8.count)
// Prints "8"
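The same four views also disagree for ordinary accented letters, not just emoji. A small sketch, assuming Swift 5, comparing the precomposed and decomposed spellings of "é":

```swift
let precomposed = "\u{E9}"   // é as a single scalar
let decomposed = "e\u{301}"  // e followed by COMBINING ACUTE ACCENT

// Swift compares strings by canonical equivalence, so these are equal.
print(precomposed == decomposed) // true

print(precomposed.count, decomposed.count)                               // 1 1
print(precomposed.unicodeScalars.count, decomposed.unicodeScalars.count) // 1 2
print(precomposed.utf16.count, decomposed.utf16.count)                   // 1 2
print(precomposed.utf8.count, decomposed.utf8.count)                     // 2 3
```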
tl;dr If you want the length of a String type in terms of the number of human-readable characters (extended grapheme clusters), use countElements(). If you want to know the length in terms of the underlying code units that make up those characters, use endIndex. Read on for details.
The String type is implemented as an ordered collection (i.e., sequence) of Unicode characters, and it conforms to the CollectionType protocol, which conforms to the _CollectionType protocol, which is the input type expected by countElements(). Therefore, countElements() can be called, passing a String type, and it will return the count of characters.
However, in conforming to CollectionType, which in turn conforms to _CollectionType, String also implements the startIndex and endIndex computed properties, which actually represent the position of the index before the first character cluster, and position of the index after the last character cluster, respectively. So, in the string "ABC", the position of the index before A is 0 and after C is 3. Therefore, endIndex = 3, which is also the length of the string.
So, endIndex can be used to get the length of any String type, then, right?
Well, not always...Unicode characters are actually extended grapheme clusters, which are sequences of one or more Unicode scalars combined to create a single human-readable character.
let circledStar: Character = "\u{2606}\u{20DD}" // ☆⃝
circledStar is a single character made up of U+2606 (a white star), and U+20DD (a combining enclosing circle). Let's create a String from circledStar and compare the results of countElements() and endIndex.
let circledStarString = "\(circledStar)"
countElements(circledStarString) // 1
circledStarString.endIndex // 2
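For anyone reading this on a current toolchain: countElements() is gone and endIndex is an opaque String.Index, so the comparison above has to be written with the string's views. A rough equivalent, assuming Swift 5:

```swift
let circledStar: Character = "\u{2606}\u{20DD}" // ☆⃝
let circledStarString = String(circledStar)

// One extended grapheme cluster...
print(circledStarString.count)                // 1
// ...made of two Unicode scalars, each one UTF-16 code unit wide.
print(circledStarString.unicodeScalars.count) // 2
print(circledStarString.utf16.count)          // 2
```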
In Swift 2.0 count doesn't work anymore. You can use this instead:
var testString = "Scott"
var length = testString.characters.count
Here's something shorter, and more natural than using a global function:
aString.utf16count
I don't know if it's available in beta 1, though. But it's definitely there in beta 2.
Updated for Xcode 6 beta 4, change method utf16count --> utf16Count
var test1: String = "Scott"
var length = test1.utf16Count
Or
var test1: String = "Scott"
var length = test1.lengthOfBytesUsingEncoding(NSUTF16StringEncoding)
As of Swift 1.2 utf16Count has been removed. You should now use the global count() function and pass the UTF16 view of the string. Example below...
let string = "Some string"
count(string.utf16)
For Xcode 7.3 and Swift 2.2.
let str = "🐶"
1. If you want the number of visual characters:
str.characters.count
2. If you want the "16-bit code units within the string’s UTF-16 representation":
str.utf16.count
Most of the time, 1 is what you need.
When would you need 2? I've found a use case for 2:
let regex = try! NSRegularExpression(pattern:"🐶",
options: NSRegularExpressionOptions.UseUnixLineSeparators)
let str = "🐶🐶🐶🐶🐶🐶"
let result = regex.stringByReplacingMatchesInString(str,
options: NSMatchingOptions.WithTransparentBounds,
range: NSMakeRange(0, str.utf16.count), withTemplate: "dog")
print(result) // dogdogdogdogdogdog
If you use 1 (str.characters.count), the result is incorrect:
let result = regex.stringByReplacingMatchesInString(str,
options: NSMatchingOptions.WithTransparentBounds,
range: NSMakeRange(0, str.characters.count), withTemplate: "dog")
print(result) // dogdogdog🐶🐶🐶
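In newer Swift versions the same replacement can be written without counting anything by hand, because Foundation can build the NSRange from a range of String indices. A sketch, assuming Swift 4 or later with Foundation available:

```swift
import Foundation

let regex = try! NSRegularExpression(pattern: "🐶")
let str = "🐶🐶🐶🐶🐶🐶"

// NSRange(_:in:) converts String indices to UTF-16 offsets for us.
let fullRange = NSRange(str.startIndex..<str.endIndex, in: str)

let result = regex.stringByReplacingMatches(in: str,
                                            options: [],
                                            range: fullRange,
                                            withTemplate: "dog")
print(result)           // dogdogdogdogdogdog
print(fullRange.length) // 12
```

The length of 12 is the UTF-16 count (two code units per dog), which is why passing the character count of 6 only covered half the string.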
You could try like this
var test1: String = "Scott"
var length = test1.bridgeToObjectiveC().length
in Swift 2.x the following is how to find the length of a string
let findLength = "This is a string of text"
findLength.characters.count
returns 24
Swift 2.0:
Get a count: yourString.text.characters.count
Fun example of how this is useful would be to show a character countdown from some number (150 for example) in a UITextView:
func textViewDidChange(textView: UITextView) {
yourStringLabel.text = String(150 - yourStringTextView.text.characters.count)
}
In Swift 4 I had always used string.count, until today I found that
string.endIndex.encodedOffset
is a better substitute because it is faster: for a 50,000-character string it is about 6 times faster than .count. The cost of .count depends on the string length, but that of .endIndex.encodedOffset doesn't.
But there is one catch: it is not good for strings with emojis. It will give the wrong result there, so only .count is correct.
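A note for later readers: as far as I know, encodedOffset was deprecated in Swift 5 in favour of utf16Offset(in:). The sketch below assumes Swift 5 and shows the mismatch described above: the offset counts UTF-16 code units, so it only agrees with .count when every character fits in one code unit.

```swift
let str = "a🐶"

let characters = str.count                          // grapheme clusters
let utf16Length = str.endIndex.utf16Offset(in: str) // UTF-16 code units

print(characters)                     // 2
print(utf16Length)                    // 3
print(utf16Length == str.utf16.count) // true
```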
In Swift 4 :
If the string contains only ASCII characters, then use the following:
let str : String = "abcd"
let count = str.count // output 4
If the string contains non-ASCII characters, note that count and utf8.count differ:
let spain = "España"
let count1 = spain.count // output 6
let count2 = spain.utf8.count // output 7
In Xcode 6.1.1
extension String {
var length : Int { return self.utf16Count }
}
I think that brainiacs will change this on every minor version.
Get string value from your textview or textfield:
let textlengthstring = (yourtextview?.text)! as String
Find the count of the characters in the string:
let numberOfChars = textlengthstring.characters.count
Here is what I ended up doing
let replacementTextAsDecimal = Double(string)
if string.characters.count > 0 &&
replacementTextAsDecimal == nil &&
replacementTextHasDecimalSeparator == nil {
return false
}
Swift 4 update comparing with swift 3
Swift 4 removes the need for a characters array on String. This means that you can directly call count on a string without getting characters array first.
"hello".count // 5
Whereas in Swift 3, you have to get the characters array and then count the elements in that array. Note that the following method is still available in Swift 4.0, as you can still call characters to access the characters array of the given string:
"hello".characters.count // 5
Swift 4.0 also adopts Unicode 9 and can now interpret grapheme clusters correctly. For example, counting an emoji will give you 1, while in Swift 3.0 you may get counts greater than 1.
"👍🏽".count // Swift 4.0 prints 1, Swift 3.0 prints 2
"👨‍❤️‍💋‍👨".count // Swift 4.0 prints 1, Swift 3.0 prints 4
Swift 4
let str = "Your name"
str.count
Remember: Space is also counted in the number
You can get the length simply by writing an extension:
extension String {
// MARK: Use if it's Swift 2
func stringLength(str: String) -> Int {
return str.characters.count
}
// MARK: Use if it's Swift 3
func stringLength(_ str: String) -> Int {
return str.characters.count
}
// MARK: Use if it's Swift 4
func stringLength(_ str: String) -> Int {
return str.count
}
}
Best way to count String in Swift is this:
var str = "Hello World"
var length = count(str.utf16)
String and NSString are toll-free bridged, so you can use all the methods available to NSString with a Swift String. Note that NSString exposes its length (in UTF-16 code units) as length, not count:
let x = "test" as NSString
let y : NSString = "string 2"
let lenx = x.length
let leny = y.length
test1.characters.count
will get you the number of letters/numbers etc in your string.
ex:
test1 = "StackOverflow"
print(test1.characters.count)
(prints "13")
Apple made it different from other major languages. The current way is to call:
test1.characters.count
However, be careful: here length means the count of characters, not the count of bytes, and those two can differ when you use non-ASCII characters.
For example;
"你好啊hi".characters.count will give you 5 but this is not the count of the bytes.
To get the real count of bytes, you need to do "你好啊hi".lengthOfBytes(using: String.Encoding.utf8). This will give you 11.
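If you don't want to pull in Foundation just for this, the UTF-8 view gives the same byte count. A short sketch, assuming Swift 4 or later:

```swift
let greeting = "你好啊hi"

print(greeting.count)       // 5, characters
print(greeting.utf8.count)  // 11, bytes in UTF-8 (3 per Chinese character, 1 per ASCII letter)
print(greeting.utf16.count) // 5, UTF-16 code units
```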
Right now (in Swift 2.3) if you use:
myString.characters.count
the method will return a "Distance" type, if you need the method to return an Integer you should type cast like so:
var count = myString.characters.count as Int
My two cents for Swift 3/4:
If you need to conditionally compile:
#if swift(>=4.0)
let len = text.count
#else
let len = text.characters.count
#endif

Convert unicode symbols \uXXXX in String to Character in Swift

I'm receiving via a REST API a string which contains unicode encoded characters in form of \uXXXX
e.g. Ain\u2019t which should be Ain’t
Is there a nice way to convert these?
You can use \u{my_unicode}:
print("Ain\u{2019}t this a beautiful day")
// Prints "Ain’t this a beautiful day"
From the Language Guide - Strings and Characters - Unicode:
String literals can include the following special characters:
...
An arbitrary Unicode scalar, written as \u{n}, where n is a 1–8 digit
hexadecimal number with a value equal to a valid Unicode code point
You can apply a string transform StringTransform:
extension String {
var decodingUnicodeCharacters: String { applyingTransform(.init("Hex-Any"), reverse: false) ?? "" }
}
let string = #"Ain\u2019t"#
print(string.decodingUnicodeCharacters) // "Ain’t\n"
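Another option, if the \uXXXX escapes come from JSON in the first place, is to let a JSON decoder handle them, since \uXXXX is a standard JSON string escape. The snippet below is only a sketch: it wraps the raw text in a one-element JSON array and decodes that, and it assumes the text contains no unescaped quotes or other characters that would make the wrapped JSON invalid.

```swift
import Foundation

let raw = #"Ain\u2019t"#

// Assumption: raw is valid JSON string content, so wrapping it in [" "] gives valid JSON.
let json = Data("[\"\(raw)\"]".utf8)

// try? turns any decoding error into nil instead of throwing.
let decoded = (try? JSONDecoder().decode([String].self, from: json))?.first
print(decoded ?? "could not decode") // Ain’t
```

In practice, if the whole REST response is decoded with JSONDecoder or JSONSerialization, these escapes should already arrive as real characters and no extra step is needed.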

How can I get the Unicode codepoint represented by an integer in Swift?

So I know how to convert String to utf8 format like this
for character in strings.utf8 {
// for example, A will be converted to 65
var utf8Value = character
}
I already read the guide but can't find how to convert a Unicode code point represented by an integer to a String, for example converting 65 to A. I already tried "\u"+utf8Value, but it failed.
Is there any way to do this?
If you look at the enum definition for Character you can see the following initializer:
init(_ scalar: UnicodeScalar)
If we then look at the struct UnicodeScalar, we see this initializer:
init(_ v: UInt32)
We can put them together, and we get a whole character
Character(UnicodeScalar(65))
and if we want it in a string, it's just another initializer away...
1> String(Character(UnicodeScalar(65)))
$R1: String = "A"
Or (although I can't figure out why this one works) you can do
String(UnicodeScalar(65))
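With a literal like 65 the non-failable UInt8 initializer is picked, but for arbitrary integers you will usually go through the failable UInt32 one, which returns nil for values that are not valid scalars. A minimal sketch of building a string from a list of code points, assuming Swift 4 or later (the variable names are just for illustration):

```swift
let codePoints: [UInt32] = [72, 105, 0x1F436, 0xD83D] // the last one is a lone surrogate

var text = ""
var rejected: [UInt32] = []

for codePoint in codePoints {
    if let scalar = Unicode.Scalar(codePoint) {
        // Append through the scalar view so no Character conversion is needed.
        text.unicodeScalars.append(scalar)
    } else {
        rejected.append(codePoint)
    }
}

print(text)     // Hi🐶
print(rejected) // [55357]
```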