Is there a difference between String(validatingUTF8:) and String(utf8String:)? - swift

Those two functions seem to have very close signatures and very close descriptions:
https://developer.apple.com/documentation/swift/string/init(utf8string:)-3mcco
String.init(utf8String:)
Creates a string by copying the data from a given null-terminated C array of UTF8-encoded bytes.
https://developer.apple.com/documentation/swift/string/init(validatingutf8:)-208fn
String.init(validatingUTF8:)
Creates a new string by copying and validating the null-terminated UTF-8 data referenced by the given pointer.
Since both are nullable initializers, what are their actual differences? Is there a possible input that would give a different output for each? If they are identical in behavior, then which one is recommended to use?

String.init(validatingUTF8:) is an initializer from the Swift standard library; its implementation is in CString.swift:
public init?(validatingUTF8 cString: UnsafePointer<CChar>) {
let len = UTF8._nullCodeUnitOffset(in: cString)
guard let str = cString.withMemoryRebound(to: UInt8.self, capacity: len, {
String._tryFromUTF8(UnsafeBufferPointer(start: $0, count: len))
})
else { return nil }
self = str
}
String.init(utf8String:) is implemented in NSStringAPI.swift:
/// Creates a string by copying the data from a given
/// C array of UTF8-encoded bytes.
public init?(utf8String bytes: UnsafePointer<CChar>) {
if let str = String(validatingUTF8: bytes) {
self = str
return
}
if let ns = NSString(utf8String: bytes) {
self = String._unconditionallyBridgeFromObjectiveC(ns)
} else {
return nil
}
}
and is the Swift overlay for the Foundation NSString initializer
- (nullable instancetype)initWithUTF8String:(const char *)nullTerminatedCString
which in turn is implemented on non-Apple platforms in swift-corelibs-foundation/Sources/Foundation/NSString.swift as
public convenience init?(utf8String nullTerminatedCString: UnsafePointer<Int8>) {
guard let str = String(validatingUTF8: nullTerminatedCString) else { return nil }
self.init(str)
}
So there is no difference in how the two methods convert C strings, but String.init(utf8String:) needs import Foundation whereas String.init(validatingUTF8:) does not need additional imports.
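To see the validating behavior in isolation, here is a minimal sketch that feeds well-formed and ill-formed null-terminated bytes to String(validatingUTF8:) and contrasts it with String(cString:), which repairs instead of failing. The helper names validated and repaired are made up for this illustration; note that newer toolchains may emit a deprecation warning suggesting String(validatingCString:).

```swift
/// Hypothetical helper: runs String(validatingUTF8:) on raw bytes.
/// The array must end with a 0 byte.
func validated(_ bytes: [UInt8]) -> String? {
    return bytes.withUnsafeBufferPointer { buffer -> String? in
        buffer.baseAddress!.withMemoryRebound(to: CChar.self, capacity: buffer.count) {
            String(validatingUTF8: $0)
        }
    }
}

/// Hypothetical helper: runs String(cString:) on the same bytes.
/// Ill-formed sequences are replaced with U+FFFD instead of failing.
func repaired(_ bytes: [UInt8]) -> String {
    return bytes.withUnsafeBufferPointer { buffer -> String in
        String(cString: buffer.baseAddress!)
    }
}

let wellFormed: [UInt8] = [0x63, 0x61, 0x66, 0xC3, 0xA9, 0]  // "café"
let illFormed: [UInt8] = [0x61, 0xFF, 0]                     // 0xFF never occurs in UTF-8

print(validated(wellFormed) as Any)  // Optional("café")
print(validated(illFormed) as Any)   // nil
print(repaired(illFormed))           // "a" followed by U+FFFD
```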

Related

Copying swift string to fixed size char[][]

I have a C struct like this.
struct someStruct {
char path[10][MAXPATHLEN];
};
I'd like to copy a list of Swift strings into the char[10][] array.
I find it very challenging to handle a two-dimensional C char array in Swift. Could anyone share some code that works with Swift 5? Thanks!
C Arrays are imported to Swift as tuples. Here we have a two-dimensional C array, which becomes a nested tuple in Swift:
public struct someStruct {
public var path: (
(Int8, ..., Int8),
(Int8, ..., Int8),
...
(Int8, ..., Int8)
)
}
There is no really “nice” solution that I am aware of, but using the fact that Swift preserves the memory layout of imported C structures (source), one can achieve the goal with some pointer magic:
var s = someStruct()
let totalSize = MemoryLayout.size(ofValue: s.path)
let itemSize = MemoryLayout.size(ofValue: s.path.0)
let numItems = totalSize / itemSize
withUnsafeMutablePointer(to: &s.path) {
$0.withMemoryRebound(to: Int8.self, capacity: totalSize) { ptr in
for i in 0..<numItems {
let itemPtr = ptr + i * itemSize
strlcpy(itemPtr, "String \(i)", itemSize)
}
print(ptr)
}
}
ptr is a pointer to s.path, and itemPtr is a pointer to s.path[i]. strlcpy copies the string; here we use the fact that one can pass a Swift string directly to a C function taking a const char* argument (a temporary null-terminated UTF-8 representation is created automatically).
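The same idea can be tried without a C header. The sketch below uses a nested Swift tuple as a stand-in for the imported array; PathTable, Row, setPath, and getPath are hypothetical names, and the 2 x 8 size is chosen only to keep the example short. It relies on homogeneous tuples being laid out contiguously, like the C array they model.

```swift
// Stand-in for an imported C array `char path[2][8]` (hypothetical sizes).
typealias Row = (CChar, CChar, CChar, CChar, CChar, CChar, CChar, CChar)

struct PathTable {
    var path: (Row, Row) = ((0, 0, 0, 0, 0, 0, 0, 0), (0, 0, 0, 0, 0, 0, 0, 0))
}

let rowSize = MemoryLayout<Row>.size

/// Copies `string` into row `index`, truncating to fit and always null-terminating.
/// Truncation is byte-wise, so it may cut a multi-byte UTF-8 character.
func setPath(_ string: String, at index: Int, in table: inout PathTable) {
    withUnsafeMutableBytes(of: &table.path) { raw in
        precondition(index >= 0 && (index + 1) * rowSize <= raw.count, "index out of bounds")
        let row = UnsafeMutableRawBufferPointer(
            rebasing: raw[index * rowSize ..< (index + 1) * rowSize])
        let utf8 = Array(string.utf8.prefix(rowSize - 1))
        row.copyBytes(from: utf8)
        row[utf8.count] = 0
    }
}

/// Reads row `index` back as a Swift string, stopping at the first null byte.
func getPath(at index: Int, in table: PathTable) -> String {
    return withUnsafeBytes(of: table.path) { raw -> String in
        precondition(index >= 0 && (index + 1) * rowSize <= raw.count, "index out of bounds")
        let row = raw[index * rowSize ..< (index + 1) * rowSize]
        return String(decoding: row.prefix { $0 != 0 }, as: UTF8.self)
    }
}

var table = PathTable()
setPath("/tmp/a", at: 1, in: &table)
print(getPath(at: 1, in: table))  // "/tmp/a"
```

Working on raw bytes keeps the index dynamic, which the tuple itself does not allow.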
I strongly encourage you to use some kind of helper methods.
Example:
/* Writes str to the someStruct instance at index, truncating if it does not fit. */
void writePathToStruct(struct someStruct* s, size_t index, const char* str) {
assert(index < 10 && "Specified index is out of bounds");
strlcpy(s->path[index], str, sizeof s->path[index]);
}
Now, when calling this function, filling the array looks much cleaner:
var someStructInstance = someStruct()
let pathIndex: Int = 3
let path = "/dev/sda1"
let encoding = String.Encoding.ascii
withUnsafeMutablePointer(to: &someStructInstance) { pointer -> Void in
writePathToStruct(pointer, pathIndex, path.cString(using: encoding)!)
}
By design, tuples can not be accessed by variable index. Reading statically can thus be done without a helper function.
let pathRead = withUnsafeBytes(of: &someStructInstance.path.3) { pointer -> String? in
return String(cString: pointer.baseAddress!.assumingMemoryBound(to: CChar.self), encoding: encoding)
}
print(pathRead ?? "<Empty path>")
However, I assume you will definitely have to read the array with a dynamic index.
In that case, I encourage you to use a helper method as well:
const char* readPathFromStruct(const struct someStruct* s, size_t index) {
assert(index < 10 && "Specified index is out of bounds");
return s->path[index];
}
which will result in much cleaner Swift code:
let dynamicPathRead = withUnsafePointer(to: &someStructInstance) { pointer -> String? in
return String(cString: readPathFromStruct(pointer, 3), encoding: encoding)
}

Convert Swift String to wchar_t

For context: I'm trying to use the very handy LibXL. I've used it with success in Obj-C and C++ but am now trying to port over to Swift. In order to better support Unicode, I need to send all strings to the LibXL API as wchar_t*.
So, for this purpose I've cobbled together this code:
extension String {
///Function to convert a String into a wchar_t buffer.
///Don't forget to free the buffer!
var wideChar: UnsafeMutablePointer<wchar_t>? {
get {
guard let _cString = self.cString(using: .utf16) else {
return nil
}
let buffer = UnsafeMutablePointer<wchar_t>.allocate(capacity: _cString.count)
memcpy(buffer, _cString, _cString.count)
return buffer
}
}
}
The calls to LibXL appear to be working (printing the error messages returns 'Ok'), except when I try to actually write to a cell in a test spreadsheet. Then I get "can't write row 0 in trial version":
if let name = "John Doe".wideChar, let passKey = "mac-f.....lots of characters...3".wideChar {
xlBookSetKeyW(book, name, passKey)
print(">: " + String.init(cString: xlBookErrorMessageW(book)))
}
if let sheetName = "Output".wideChar, let path = savePath.wideChar, let test = "Hello".wideChar {
let sheet: SheetHandle = xlBookAddSheetW(book, sheetName, nil)
xlSheetWriteStrW(sheet, 0, 0, test, sectionTitleFormat)
print(">: " + String.init(cString: xlBookErrorMessageW(book)))
let success = xlBookSaveW(book, path)
dump(success)
print(">: " + String.init(cString: xlBookErrorMessageW(book)))
}
I'm presuming that my code for converting to wchar_t* is incorrect. Can someone point me in the right direction for that..?
ADDENDUM: Thanks to @MartinR for the answer. It appears that the block 'consumes' any pointers that are used in it. So, for example, when writing a string using
"Hello".withWideChars { wCharacters in
xlSheetWriteStrW(newSheet, destRow, destColumn, wCharacters, aFormatHandle)
}
The aFormatHandle will become invalid after the writeStr line executes and isn't re-useable. It's necessary to create a new FormatHandle for each write command.
There are different problems here. First, String.cString(using:) does
not work well with multi-byte encodings:
print("ABC".cString(using: .utf16)!)
// [65, 0] ???
Second, wchar_t holds UTF-32 code points on Apple platforms (it is a 32-bit type there), not UTF-16 code units.
Finally, in
let buffer = UnsafeMutablePointer<wchar_t>.allocate(capacity: _cString.count)
memcpy(buffer, _cString, _cString.count)
the allocation size does not include the trailing null character,
and the copy copies _cString.count bytes, not characters.
All that can be fixed, but I would suggest a different API
(similar to the String.withCString(_:) method):
extension String {
/// Calls the given closure with a pointer to the contents of the string,
/// represented as a null-terminated wchar_t array.
func withWideChars<Result>(_ body: (UnsafePointer<wchar_t>) -> Result) -> Result {
let u32 = self.unicodeScalars.map { wchar_t(bitPattern: $0.value) } + [0]
return u32.withUnsafeBufferPointer { body($0.baseAddress!) }
}
}
which can then be used like
let name = "John Doe"
let passKey = "secret"
name.withWideChars { wname in
passKey.withWideChars { wpass in
xlBookSetKeyW(book, wname, wpass)
}
}
and the clean-up is automatic.
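To check the UTF-32 representation, the conversion can be exercised in both directions. This is a sketch only: wideCharArray and init(wideChars:) are hypothetical helpers, not LibXL or standard library API, and the code assumes a platform where wchar_t is a 32-bit signed integer (as on macOS and Linux).

```swift
import Foundation

extension String {
    /// Hypothetical helper: the string as a null-terminated array of UTF-32 code points.
    var wideCharArray: [wchar_t] {
        return unicodeScalars.map { wchar_t(bitPattern: $0.value) } + [0]
    }

    /// Hypothetical inverse: builds a string from a null-terminated wchar_t array.
    /// Values that are not valid Unicode scalars become U+FFFD.
    init(wideChars: UnsafePointer<wchar_t>) {
        var scalars = String.UnicodeScalarView()
        var cursor = wideChars
        while cursor.pointee != 0 {
            if let scalar = Unicode.Scalar(UInt32(bitPattern: cursor.pointee)) {
                scalars.append(scalar)
            } else {
                scalars.append("\u{FFFD}")
            }
            cursor += 1
        }
        self = String(scalars)
    }
}

let wide = "A😀".wideCharArray
print(wide)  // [65, 128512, 0]: one element per scalar plus the terminator

let roundTrip = wide.withUnsafeBufferPointer { String(wideChars: $0.baseAddress!) }
print(roundTrip == "A😀")  // true
```

The emoji occupies a single wchar_t element here, whereas a UTF-16 encoding would need a surrogate pair.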

Converting a String to UnsafeMutablePointer<UInt16>

I'm trying to use a library which was written in C. I've imported the .a and .h files into the Xcode project and checked that it works properly. I've already got them working in Objective-C, and now I need the same for Swift.
The problem is the functions' arguments. One function requires an argument of type widechar (defined as typedef unsigned short int in the library), which becomes UnsafeMutablePointer<UInt16> in Swift. The function translates it and returns the result.
So I need to convert a String to UnsafeMutablePointer<UInt16>. I tried to find the right way to convert it, but I've only managed to convert it to UnsafeMutablePointer<UInt8>. I couldn't find any information about converting a String to UnsafeMutablePointer<UInt16>.
Here's a source code I've written.
extension String{
var utf8CString: UnsafePointer<Int8> {
return UnsafePointer((self as NSString).utf8String!)
}
}
func translate(toBraille: String, withTable: String) -> [String]? {
let filteredString = toBraille.onlyAlphabet
let table = withTable.utf8CString
var inputLength = CInt(filteredString.count)
var outputLength = CInt(maxBufferSize)
let inputValue = UnsafeMutablePointer<widechar>.allocate(capacity: Int(outputLength))
let outputValue = UnsafeMutablePointer<widechar>.allocate(capacity: Int(outputLength))
lou_translateString(table, inputValue, &inputLength, outputValue, &outputLength, nil, nil, 0)
//This is a function that I should use.
let result:[String] = []
return result
}
You have to create an array with the UTF-16 representation of the Swift
string that you can pass to the function, and on return create
a Swift string from the UTF-16 array result.
Let's assume for simplicity that the C function is imported to Swift as
func translateString(_ source: UnsafeMutablePointer<UInt16>, _ sourceLen: UnsafeMutablePointer<CInt>,
_ dest: UnsafeMutablePointer<UInt16>, _ destLen: UnsafeMutablePointer<CInt>)
Then the following should work (explanations inline):
// Create array with UTF-16 representation of source string:
let sourceString = "Hello world"
var sourceUTF16 = Array(sourceString.utf16)
var sourceLength = CInt(sourceUTF16.count)
// Allocate array for UTF-16 representation of destination string:
let maxBufferSize = 1000
var destUTF16 = Array<UInt16>(repeating: 0, count: maxBufferSize)
var destLength = CInt(destUTF16.count)
// Call translation function:
translateString(&sourceUTF16, &sourceLength, &destUTF16, &destLength)
// Create Swift string from UTF-16 representation in destination buffer:
let destString = String(utf16CodeUnits: destUTF16, count: Int(destLength))
I have assumed that the C function updates destLength to reflect
the actual length of the translated string on return.
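The round trip can be tested without the C library by substituting a stand-in function. In the sketch below, mockTranslate is a made-up replacement for the imported function (it just uppercases ASCII letters and updates the destination length, which is the assumed contract), and translate is a hypothetical wrapper. It uses String(decoding:as:) so that no Foundation import is needed.

```swift
/// Made-up stand-in for the imported C function: copies the input to the
/// output, uppercasing ASCII letters, and stores the produced length in destLen.
func mockTranslate(_ source: UnsafeMutablePointer<UInt16>, _ sourceLen: UnsafeMutablePointer<CInt>,
                   _ dest: UnsafeMutablePointer<UInt16>, _ destLen: UnsafeMutablePointer<CInt>) {
    let count = max(0, min(sourceLen.pointee, destLen.pointee))
    for i in 0..<Int(count) {
        let unit = source[i]
        dest[i] = (unit >= 97 && unit <= 122) ? unit - 32 : unit
    }
    destLen.pointee = count
}

/// Hypothetical wrapper: String in, String out, UTF-16 buffers in between.
func translate(_ input: String, maxBufferSize: Int = 1000) -> String {
    var sourceUTF16 = Array(input.utf16)
    var sourceLength = CInt(sourceUTF16.count)
    var destUTF16 = [UInt16](repeating: 0, count: maxBufferSize)
    var destLength = CInt(destUTF16.count)

    mockTranslate(&sourceUTF16, &sourceLength, &destUTF16, &destLength)

    // Only the first destLength code units are meaningful.
    return String(decoding: destUTF16.prefix(Int(destLength)), as: UTF16.self)
}

print(translate("Hello world"))  // HELLO WORLD
```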

Swift - converting from UnsafePointer<UInt8> with length to String

I considered a lot of similar questions, but still can't get the compiler to accept this.
Socket Mobile API (in Objective-C) passes ISktScanDecodedData into a delegate method in Swift (the data may be binary, which I suppose is why it's not provided as string):
func onDecodedData(device: DeviceInfo?, DecodedData d: ISktScanDecodedData?) {
let symbology: String = d!.Name()
let rawData: UnsafePointer<UInt8> = d!.getData()
let rawDataSize: UInt32 = d!.getDataSize()
// want a String (UTF8 is OK) or Swifty byte array...
}
In C#, this code converts the raw data into a string:
string s = Marshal.PtrToStringAuto(d.GetData(), d.GetDataSize());
In Swift, I can get as far as UnsafeArray, but then I'm stuck:
let rawArray = UnsafeArray<UInt8>(start: rawData, length: Int(rawDataSize))
Alternatively I see String.fromCString and NSString.stringWithCharacters, but neither will accept the types of arguments at hand. If I could convert from UnsafePointer<UInt8> to UnsafePointer<()>, for example, then this would be available (though I'm not sure if it would even be safe):
NSData(bytesNoCopy: UnsafePointer<()>, length: Int, freeWhenDone: Bool)
Is there an obvious way to get a string out of all this?
This should work:
let data = NSData(bytes: rawData, length: Int(rawDataSize))
let str = String(data: data, encoding: NSUTF8StringEncoding)
Update for Swift 3:
let data = Data(bytes: rawData, count: Int(rawDataSize))
let str = String(data: data, encoding: String.Encoding.utf8)
The resulting string is nil if the data does not represent
a valid UTF-8 sequence.
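In current Swift the intermediate Data object can be skipped by wrapping the pointer and length in an UnsafeBufferPointer. The following is a sketch with two hypothetical helper names: lossyString repairs invalid bytes with U+FFFD, while strictString returns nil for them, like the Data-based version.

```swift
import Foundation

/// Hypothetical helper: decodes `count` bytes as UTF-8, replacing ill-formed
/// sequences with U+FFFD. Uses only the standard library.
func lossyString(from pointer: UnsafePointer<UInt8>, count: Int) -> String {
    return String(decoding: UnsafeBufferPointer(start: pointer, count: count), as: UTF8.self)
}

/// Hypothetical helper: decodes `count` bytes as UTF-8 and returns nil if the
/// bytes are not valid UTF-8. Uses Foundation's String(bytes:encoding:).
func strictString(from pointer: UnsafePointer<UInt8>, count: Int) -> String? {
    return String(bytes: UnsafeBufferPointer(start: pointer, count: count), encoding: .utf8)
}

let scanned: [UInt8] = [0x41, 0x42, 0x43, 0x44]  // "ABCD", no null terminator
let text = scanned.withUnsafeBufferPointer { lossyString(from: $0.baseAddress!, count: 3) }
print(text)  // ABC
```

Neither variant requires a null terminator, so no extra buffer has to be allocated.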
How about this, 'pure' Swift 2.2 instead of using NSData:
public extension String {
static func fromCString
(cs: UnsafePointer<CChar>, length: Int!) -> String?
{
if length == .None { // no length given, use \0 standard variant
return String.fromCString(cs)
}
let buflen = length + 1
var buf = UnsafeMutablePointer<CChar>.alloc(buflen)
memcpy(buf, cs, length)
buf[length] = 0 // zero terminate
let s = String.fromCString(buf)
buf.dealloc(buflen)
return s
}
}
and Swift 3:
public extension String {
static func fromCString
(cs: UnsafePointer<CChar>, length: Int!) -> String?
{
if length == nil { // no length given, use \0 standard variant
return String(cString: cs)
}
let buflen = length + 1
let buf = UnsafeMutablePointer<CChar>.allocate(capacity: buflen)
memcpy(buf, cs, length)
buf[length] = 0 // zero terminate
let s = String(cString: buf)
buf.deallocate(capacity: buflen)
return s
}
}
Admittedly it's a bit stupid to alloc a buffer and copy the data just to add the zero terminator.
Obviously, as mentioned by Zaph, you need to make sure your assumptions about the string encoding are going to be right.

Pointers, Pointer Arithmetic, and Raw Data in Swift

My application uses a somewhat complex immutable data structure that is encoded in a binary file. I need to have access to it at the byte level, avoiding any copying. Normally, I would use C or C++ pointer arithmetic and typecasts to access and interpret the raw byte values. I would like to do the same with Swift.
I have found that the following works:
class RawData {
var data: NSData!
init(rawData: NSData) {
data = rawData
}
func read<T>(byteLocation: Int) -> T {
let bytes = data.subdataWithRange(NSMakeRange(byteLocation, sizeof(T))).bytes
return UnsafePointer<T>(bytes).memory
}
func example_ReadAnIntAtByteLocation5() -> Int {
return read(5) as Int
}
}
However, I am not sure how efficient it is. Do data.subdataWithRange and NSMakeRange allocate objects every time I call them, or are they just syntactic sugar for dealing with pointers?
Is there a better way to do this in Swift?
EDIT:
I have created a small Objective-C class that just encapsulates a function to offset a pointer by a given number of bytes:
@implementation RawDataOffsetPointer
inline void* offsetPointer(void* ptr, int bytes){
return (char*)ptr + bytes;
}
@end
If I include this class in the bridging header, then I can change my read method to
func read<T>(byteLocation: Int) -> T {
let ptr = offsetPointer(data.bytes, CInt(byteLocation))
return UnsafePointer<T>(ptr).memory
}
which will not copy data from my buffer, or allocate other objects.
However, it would still be nice to do some pointer arithmetic from Swift, if it were possible.
If you just want to do it directly, UnsafePointer<T> can be manipulated arithmetically:
let oldPointer = UnsafePointer<()>
let newPointer = oldPointer + 10
You can also cast a pointer like so (UnsafePointer<()> is equivalent to void *)
let castPointer = UnsafePointer<MyStruct>(oldPointer)
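The snippets above use early Swift syntax. In current Swift, a comparable read at an arbitrary byte offset could look like the sketch below. readLittleEndian is a hypothetical helper; it assumes the file stores integers in little-endian order and copies only the few bytes of the value itself, so the offset does not need to be aligned.

```swift
/// Hypothetical helper: reads a fixed-width integer stored in little-endian
/// byte order at `offset`. The bytes are copied into a local value, so the
/// offset may be unaligned.
func readLittleEndian<T: FixedWidthInteger>(_ type: T.Type, at offset: Int,
                                            from buffer: UnsafeRawBufferPointer) -> T {
    let size = MemoryLayout<T>.size
    precondition(offset >= 0 && offset + size <= buffer.count, "read out of bounds")
    var value: T = 0
    withUnsafeMutableBytes(of: &value) { destination in
        destination.copyMemory(
            from: UnsafeRawBufferPointer(rebasing: buffer[offset ..< offset + size]))
    }
    return T(littleEndian: value)
}

// One padding byte, then 0x12345678 in little-endian order.
let fileBytes: [UInt8] = [0xAA, 0x78, 0x56, 0x34, 0x12]
let number = fileBytes.withUnsafeBytes { readLittleEndian(UInt32.self, at: 1, from: $0) }
print(String(number, radix: 16))  // 12345678
```

The same function works on the buffer obtained from Data.withUnsafeBytes, so the file contents themselves are never copied.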
I would recommend looking into NSInputStream, which allows you to read NSData as a series of bytes (UInt8 in Swift).
Here is a little sample I put together in the playground:
func generateRandomData(count:Int) -> NSData
{
var array = Array<UInt8>(count: count, repeatedValue: 0)
arc4random_buf(&array, UInt(count))
return NSData(bytes: array, length: count)
}
let randomData = generateRandomData(256 * 1024)
let stream = NSInputStream(data: randomData)
stream.open() // IMPORTANT
var readBuffer = Array<UInt8>(count: 16 * 1024, repeatedValue: 0)
var totalBytesRead = 0
while (totalBytesRead < randomData.length)
{
let numberOfBytesRead = stream.read(&readBuffer, maxLength: readBuffer.count)
// Do something with the data
totalBytesRead += numberOfBytesRead
}
You can create an extension to read primitive types like so:
extension NSInputStream
{
func readInt32() -> Int
{
var readBuffer = Array<UInt8>(count:sizeof(Int32), repeatedValue: 0)
var numberOfBytesRead = self.read(&readBuffer, maxLength: readBuffer.count)
return Int(readBuffer[0]) << 24 |
Int(readBuffer[1]) << 16 |
Int(readBuffer[2]) << 8 |
Int(readBuffer[3])
}
}
I would recommend the simple approach of using UnsafeBufferPointer (called UnsafeArray in early Swift betas).
let data = NSData(contentsOfFile: filename)
let ptr = UnsafePointer<UInt8>(data.bytes)
let bytes = UnsafeBufferPointer<UInt8>(start:ptr, count:data.length)