Is Swift's handling of CVarArg for String buggy? - swift

While writing a Swift wrapper for a C wrapper of a C++ library, I've stumbled on some weird bugs regarding Swift's CVarArg. The C wrapper I already have uses variadic functions which I converted to functions using va_list as an argument so they could be imported (since Swift cannot import C variadic functions). When passing arguments to such a function, once bridged to Swift, it uses the private _cVarArgEncoding property of types conforming to CVarArg to "encode" the values which are then sent as a pointer to the C function. It seems however that this encoding is faulty for Swift Strings.
To demonstrate, I've created the following package:
Package.swift
// swift-tools-version:5.2
import PackageDescription
let package = Package(
name: "CVarArgTest",
products: [
.executable(
name: "CVarArgTest",
targets: ["CVarArgTest"]),
],
targets: [
.target(
name: "CLib"),
.target(
name: "CVarArgTest",
dependencies: ["CLib"])
]
)
CLib
CTest.h
#ifndef CTest_h
#define CTest_h
#include <stdio.h>
/// Prints out the strings provided in args
/// #param num The number of strings in `args`
/// #param args A `va_list` of strings
void test_va_arg_str(int num, va_list args);
/// Prints out the integers provided in args
/// #param num The number of integers in `args`
/// #param args A `va_list` of integers
void test_va_arg_int(int num, va_list args);
/// Just prints the string
/// #param str The string
void test_str_print(const char * str);
#endif /* CTest_h */
CTest.c
#include "CTest.h"
#include <stdarg.h>
void test_va_arg_str(int num, va_list args)
{
printf("Printing %i strings...\n", num);
for (int i = 0; i < num; i++) {
const char * str = va_arg(args, const char *);
puts(str);
}
}
void test_va_arg_int(int num, va_list args)
{
printf("Printing %i integers...\n", num);
for (int i = 0; i < num; i++) {
int foo = va_arg(args, int);
printf("%i\n", foo);
}
}
void test_str_print(const char * str)
{
puts(str);
}
main.swift
import Foundation
import CLib
// The literal String is perfectly bridged to the CChar pointer expected by the function
test_str_print("Hello, World!")
// Prints the integers as expected
let argsInt: [CVarArg] = [123, 456, 789]
withVaList(argsInt) { listPtr in
test_va_arg_int(Int32(argsInt.count), listPtr)
}
// ERROR: Thread 1: EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
let argsStr: [CVarArg] = ["Test", "Testing", "The test"]
withVaList(argsStr) { listPtr in
test_va_arg_str(Int32(argsStr.count), listPtr)
}
The package is available here as well.
As commented in the code above, printing a String via C or a va_list containing Ints works as expected, but when converted to const char *, there's an exception (EXC_BAD_ACCESS (code=EXC_I386_GPFLT)).
So, in short: did I mess up the C side of it or is Swift doing something wrong here? I've tested this in Xcode 11.5 and 12.0b2. If it's a bug, I'll be happy to report it.

This one's a bit tricky: your string is actually being bridged to an Objective-C NSString * rather than a C char *:
(lldb) p str
(const char *) $0 = 0x3cbe9f4c5d32b745 ""
(lldb) p (id)str
(NSTaggedPointerString *) $1 = 0x3cbe9f4c5d32b745 #"Test"
(If you're wondering why it's an NSTaggedPointerString rather than just an NSString, this article is a great read -- in short, the string is short enough to be stored directly in the bytes of the pointer variable rather than in an object on the heap.
Looking at the source code for withVaList, we see that a type's va_list representation is determined by its implementation of the _cVarArgEncoding property of the CVarArg protocol. The standard library has some implementations of this protocol for some basic integer and pointer types, but there's nothing for String here. So who's converting our string to an NSString?
Searching around the Swift repo on GitHub, we find that Foundation is the culprit:
//===----------------------------------------------------------------------===//
// CVarArg for bridged types
//===----------------------------------------------------------------------===//
extension CVarArg where Self: _ObjectiveCBridgeable {
/// Default implementation for bridgeable types.
public var _cVarArgEncoding: [Int] {
let object = self._bridgeToObjectiveC()
_autorelease(object)
return _encodeBitsAsWords(object)
}
}
In plain English: any object which can be bridged to Objective-C is encoded as a vararg by converting to an Objective-C object and encoding a pointer to that object. C varargs are not type-safe, so your test_va_arg_str just assumes it's a char* and passes it to puts, which crashes.
So is this a bug? I don't think so -- I suppose this behavior is probably intentional for compatibility with functions like NSLog that are more commonly used with Objective-C objects than C ones. However, it's certainly a surprising pitfall, and it's probably one of the reasons why Swift doesn't like to let you call C variadic functions.
You'll want to work around this by manually converting your strings to C-strings. This can get a bit ugly if you have an array of strings that you want to convert without making unnecessary copies, but here's a function that should be able to do it.
extension Collection where Element == String {
/// Converts an array of strings to an array of C strings, without copying.
func withCStrings<R>(_ body: ([UnsafePointer<CChar>]) throws -> R) rethrows -> R {
return try withCStrings(head: [], body: body)
}
// Recursively call withCString on each of the strings.
private func withCStrings<R>(head: [UnsafePointer<CChar>],
body: ([UnsafePointer<CChar>]) throws -> R) rethrows -> R {
if let next = self.first {
// Get a C string, add it to the result array, and recurse on the remainder of the collection
return try next.withCString { cString in
var head = head
head.append(cString)
return try dropFirst().withCStrings(head: head, body: body)
}
} else {
// Base case: no more strings; call the body closure with the array we've built
return try body(head)
}
}
}
func withVaListOfCStrings<R>(_ args: [String], body: (CVaListPointer) -> R) -> R {
return args.withCStrings { cStrings in
withVaList(cStrings, body)
}
}
let argsStr: [String] = ["Test", "Testing", "The test"]
withVaListOfCStrings(argsStr) { listPtr in
test_va_arg_str(Int32(argsStr.count), listPtr)
}
// Output:
// Printing 3 strings...
// Test
// Testing
// The test

Related

Freeglut doesn't initialize when using it from Swift

I've tried to use the Freeglut library in a Swift 4 Project. When the
void glutInit(int *argcp, char **argv);
function is shifted to Swift, its declaration is
func glutInit(_ pargc: UnsafeMutablePointer<Int32>!, _ argv: UnsafeMutablePointer<UnsafeMutablePointer<Int8>?>!)
Since I don't need the real arguments from the command line I want to make up the two arguments. I tried to define **argv in the Bridging-Header.h file
#include <OpenGL/gl.h>
#include <GL/glut.h>
char ** argv[1] = {"t"};
and use them in main.swift
func main() {
var argcp: Int32 = 1
glutInit(&argcp, argv!) // EXC_BAD_ACCESS
glutInitDisplayMode(UInt32(GLUT_DOUBLE | GLUT_RGB | GLUT_DEPTH));
glutCreateWindow("my project")
glutDisplayFunc(display)
initOpenGL()
glutMainLoop()
but with that I get Thread 1: EXC_BAD_ACCESS (code=1, address=0x74) at the line with glutInit().
How can I initialize glut properly? How can I get an UnsafeMutablePointer<UnsafeMutablePointer<Int8>?>! so that it works?
The reason the right code in C char * argv[1] = {"t"}; does not work is because Swift imports fixed size C-array as a tuple, not a pointer to the first element.
But your char ** argv[1] = {"t"}; is completely wrong. Each Element of argv needs to be char **, but you assign char * ("t"). Xcode must have shown you a warning at first build:
warning: incompatible pointer types initializing 'char **' with an expression of type 'char [2]'
You should better take incompatible pointer types warning as error, unless you know what you are doing completely.
Generally, you should better not write some codes generating actual code/data like char * argv[1] = {"t"}; in a header file.
You can try it with Swift code.
As you know, when you want to pass a pointer to single element T, you declare a var of type T and pass &varName to the function you call.
As argcp in your code.
As well, when you want to pass a pointer to multiple element T, you declare a var of type [T] (Array<T>) and pass &arrName to the function you call.
(Ignoring immutable case to simplify.)
The parameter argv matches this case, where T == UnsafeMutablePointer<Int8>?.
So declare a var of type [UnsafeMutablePointer<Int8>?].
func main() {
var argc: Int32 = 1
var argv: [UnsafeMutablePointer<Int8>?] = [
strdup("t")
]
defer { argv.forEach{free($0)} }
glutInit(&argc, &argv)
//...
}
But I wonder if you really want to pass something to glutInit().
You can try something like this:
func main() {
var argc: Int32 = 0 //<- 0
glutInit(&argc, nil)
//...
}
I'm not sure if freeglut accept this, but you can find some articles on the web saying that this works in some implementation of Glut.

Assigning UnsafeMutablePointer value to UnsafePointer without unsafeBitCast

I am working with a C API that defines a structure with const char* and a function that returns a char* and am trying to find the best way to do the assignment.
Is there a way to do this without using unsafeBitCast? If I don't put the cast then I get this error:
Cannot assign value of type 'UnsafeMutablePointer<pchar>'
(aka 'UnsafeMutablePointer<UInt8>')
to type 'UnsafePointer<pchar>!'
(aka 'ImplicitlyUnwrappedOptional<UnsafePointer<UInt8>>')
Also, will the initialization of pairPtr below using pair() allocate a pair struct on the stack to initialize the allocated pair on the heap because this seems inefficient for the case where the structure just has to be zero'd out.
Here is sample code:
C library header (minimized to demonstrate the problem):
#ifndef __PAIR_INCLUDE__
#define __PAIR_INCLUDE__
typedef unsigned char pchar;
pchar*
pstrdup(const pchar* str);
typedef struct _pair {
const pchar* left;
const pchar* right;
} pair;
#endif // __PAIR_INCLUDE__
My Swift code:
import pair
let leftVal = pstrdup("left")
let rightVal = pstrdup("right")
let pairPtr = UnsafeMutablePointer<pair>.allocate(capacity: 1)
pairPtr.initialize(to: pair())
// Seems like there should be a better way to handle this:
pairPtr.pointee.left = unsafeBitCast(leftVal, to: UnsafePointer<pchar>.self)
pairPtr.pointee.right = unsafeBitCast(rightVal, to: UnsafePointer<pchar>.self)
The C code:
#include "pair.h"
#include <string.h>
pchar*
pstrdup(const pchar* str) {
return strdup(str);
}
The module definition:
module pair [extern_c] {
header "pair.h"
export *
}
You can create an UnsafePointer<T> from an UnsafeMutablePtr<T>
simply with
let ptr = UnsafePointer(mptr)
using the
/// Creates an immutable typed pointer referencing the same memory as the
/// given mutable pointer.
///
/// - Parameter other: The pointer to convert.
public init(_ other: UnsafeMutablePointer<Pointee>)
initializer of UnsafePointer. In your case that would for example be
p.left = UnsafePointer(leftVal)

What difference in withUnsafePointer and Unmanaged.passUnretained

I want to see how swift String struct pointer changes when I append new string to original one, comparing to Objective C. For Objective C I use this code:
NSMutableString *st1 = [[NSMutableString alloc] initWithString:#"123"];
[st1 appendString:#"456"];
In Objective C st1 string object changes internaly (adds 456 and becomes 123456), but st1 pointer lefts the same and points to the same object. In Swift, since String is not mutable, var st1 must change its address after addition, because it will hold another string, with my strings summ (123456). Is this all correct?
This is my playground code for swift tests:
import Cocoa
var str = "123"
withUnsafePointer(to: &str) { print($0) }
str += "456"
withUnsafePointer(to: &str) { print($0) }
var str2 = "123"
print(Unmanaged<AnyObject>.passUnretained(str2 as AnyObject).toOpaque())
str2 += "456"
print(Unmanaged<AnyObject>.passUnretained(str2 as AnyObject).toOpaque())
and this is results:
0x000000010e228c40 // this are same
0x000000010e228c40
0x00007fb26ed5a790 // this are not
0x00007fb26ed3f6d0
// and they completly different from first two
Why I get same pointer when I use withUnsafePointer? Why I get different pointers when I use Unmanaged.passUnretained? And why pointers received from this methods are completly different?
To better explain the behaviour you're seeing, we can actually look at the String source code.
Here's the full definition of String
public struct String {
/// Creates an empty string.
public init() {
_core = _StringCore()
}
public // #testable
init(_ _core: _StringCore) {
self._core = _core
}
public // #testable
var _core: _StringCore
}
So String is just a wrapper around some type called _StringCore. We can find its definition here. Here are the relevant parts:
public struct _StringCore {
//...
public var _baseAddress: UnsafeMutableRawPointer?
var _countAndFlags: UInt
//...
}
As you can see, the _StringCore doesn't directly contain the buffer of memory that stores the string's content. Instead, it references it externally, via a UnsafeMutableRawPointer.
The first time you declare str, it was given some memory on the stack, at address 0x000000010e228c40. When you made a change to str, you actually had no effect on this String struct's location. Instead, you caused the _baseAddress of the String's _core to change. Array works a very similar way. This is how the string's copy-on-write behaviour is implemented, too.
As for the Unmanaged behaviour, str2 as AnyObject creates a copy of str2, so you end up making 2 different copies, hence the difference in
the printed addressed.

Swift convert string to UnsafeMutablePointer<Int8>

I have a C function mapped to Swift defined as:
func swe_set_eph_path(path: UnsafeMutablePointer<Int8>) -> Void
I am trying to pass a path to the function and have tried:
var path = [Int8](count: 1024, repeatedValue: 0);
for i in 0...NSBundle.mainBundle().bundlePath.lengthOfBytesUsingEncoding(NSUTF16StringEncoding)-1
{
var range = i..<i+1
path[i] = String.toInt(NSBundle.mainBundle().bundlePath[range])
}
println("\(path)")
swe_set_ephe_path(&path)
but on the path[i] line I get the error:
'subscript' is unavailable: cannot subscript String with a range of
Int
swe_set_ephe_path(NSBundle.mainBundle().bundlePath)
nor
swe_set_ephe_path(&NSBundle.mainBundle().bundlePath)
don't work either
Besides not working, I feel there has got to be a better, less convoluted way of doing this. Previous answers on StackOverflow using CString don't seem to work anymore. Any suggestions?
Previous answers on StackOverflow using CString don't seem to work anymore
Nevertheless, UnsafePointer<Int8> is a C string. If your context absolutely requires an UnsafeMutablePointer, just coerce, like this:
let s = NSBundle.mainBundle().bundlePath
let cs = (s as NSString).UTF8String
var buffer = UnsafeMutablePointer<Int8>(cs)
swe_set_ephe_path(buffer)
Of course I don't have your swe_set_ephe_path, but it works fine in my testing when it is stubbed like this:
func swe_set_ephe_path(path: UnsafeMutablePointer<Int8>) {
println(String.fromCString(path))
}
In current version of Swift language you can do it like this (other answers are outdated):
let path = Bundle.main.bundlePath
let param = UnsafeMutablePointer<Int8>(mutating: (path as NSString).utf8String)
It’s actually extremely irritating of the library you’re using that it requires (in the C declaration) a char * path rather than const char * path. (this is assuming the function doesn’t mutate the input string – if it does, you’re in a whole different situation).
If it didn’t, the function would come over to Swift as:
// note, UnsafePointer not UnsafeMutablePointer
func swe_set_eph_path(path: UnsafePointer<Int8>) -> Void
and you could then rely on Swift’s implicit conversion:
let str = "blah"
swe_set_eph_path(str) // Swift implicitly converts Strings
// to const C strings when calling C funcs
But you can do an unsafe conversion quite easily, in combination with the withCString function:
str.withCString { cstr in
swe_set_eph_path(UnsafeMutablePointer(cstr))
}
I had a static library (someLibrary.a) written in C++ compiled for iOS.
The header file (someLibrary.h) had a function exposed like this:
extern long someFunction(char* aString);
The declaration in Swift looks like this:
Int someFunction(aString: UnsafeMutablePointer<Int8>)
I made an extension to String:
extension String {
var UTF8CString: UnsafeMutablePointer<Int8> {
return UnsafeMutablePointer((self as NSString).UTF8String)
}
}
So then I can call the method like so:
someFunction(mySwiftString.UTF8CString)
Update: Make String extension (swift 5.7)
extension String {
var UTF8CString: UnsafeMutablePointer<Int8> {
return UnsafeMutablePointer(mutating: (self as NSString).utf8String!)
}
}

String value to UnsafePointer<UInt8> function parameter behavior

I found the following code compiles and works:
func foo(p:UnsafePointer<UInt8>) {
var p = p
for p; p.memory != 0; p++ {
print(String(format:"%2X", p.memory))
}
}
let str:String = "今日"
foo(str)
This prints E4BB8AE697A5 and that is a valid UTF8 representation of 今日
As far as I know, this is undocumented behavior. from the document:
When a function is declared as taking a UnsafePointer argument, it can accept any of the following:
nil, which is passed as a null pointer
An UnsafePointer, UnsafeMutablePointer, or AutoreleasingUnsafeMutablePointer value, which is converted to UnsafePointer if necessary
An in-out expression whose operand is an lvalue of type Type, which is passed as the address of the lvalue
A [Type] value, which is passed as a pointer to the start of the array, and lifetime-extended for the duration of the call
In this case, str is non of them.
Am I missing something?
ADDED:
And it doesn't work if the parameter type is UnsafePointer<UInt16>
func foo(p:UnsafePointer<UInt16>) {
var p = p
for p; p.memory != 0; p++ {
print(String(format:"%4X", p.memory))
}
}
let str:String = "今日"
foo(str)
// ^ 'String' is not convertible to 'UnsafePointer<UInt16>'
Even though the internal String representation is UTF16
let str = "今日"
var p = UnsafePointer<UInt16>(str._core._baseAddress)
for p; p.memory != 0; p++ {
print(String(format:"%4X", p.memory)) // prints 4ECA65E5 which is UTF16 今日
}
This is working because of one of the interoperability changes the Swift team has made since the initial launch - you're right that it looks like it hasn't made it into the documentation yet. String works where an UnsafePointer<UInt8> is required so that you can call C functions that expect a const char * parameter without a lot of extra work.
Look at the C function strlen, defined in "shims.h":
size_t strlen(const char *s);
In Swift it comes through as this:
func strlen(s: UnsafePointer<Int8>) -> UInt
Which can be called with a String with no additional work:
let str = "Hi."
strlen(str)
// 3
Look at the revisions on this answer to see how C-string interop has changed over time: https://stackoverflow.com/a/24438698/59541