Why do Swift's malloc/MemoryLayout.size take/return signed integers?

public func malloc(_ __size: Int) -> UnsafeMutableRawPointer!
@frozen public enum MemoryLayout<T> {
public static func size(ofValue value: T) -> Int
...
In C, malloc/sizeof take/return size_t, which is unsigned. Why the difference?
Isn't Swift calling libc under the hood?
EDIT: is this the reason why? https://qr.ae/pvFOQ6
Are they basically trying to get away from C's legacy?

Yes, it's calling the libc functions under the hood.
The StdlibRationales.rst document in the Swift repo explains why it imports size_t as Int:
Converging APIs to use Int as the default integer type allows users to write fewer explicit type conversions.
Importing size_t as a signed Int type would not be a problem for 64-bit platforms. The only concern is about 32-bit platforms, and only about operating on array-like data structures that span more than half of the address space. Even today, in 2015, there are enough 32-bit platforms that are still interesting, and x32 ABIs for 64-bit CPUs are also important. We agree that 32-bit platforms are important, but the usecase for an unsigned size_t on 32-bit platforms is pretty marginal, and for code that nevertheless needs to do that there is always the option of doing a bitcast to UInt or using C.
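For illustration, here is a minimal sketch (not from the rationale document; the byte count and the import are assumptions) of what this looks like in practice: the imported malloc takes an Int, MemoryLayout reports sizes as Int, and a bit-pattern conversion to UInt covers the marginal cases mentioned above.
import Foundation   // Darwin on Apple platforms; use Glibc on Linux for the C shims

let byteCount = 1_024                           // Int, Swift's currency integer type
if let buffer = malloc(byteCount) {             // imported as malloc(_: Int) -> UnsafeMutableRawPointer!
    let wordSize = MemoryLayout<Int>.size       // also an Int, e.g. 8 on 64-bit platforms
    let unsigned = UInt(bitPattern: byteCount)  // explicit bit-cast when unsigned semantics are needed
    print(wordSize, unsigned)
    free(buffer)
}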

Related

Precondition failed: Negative count not allowed

Error:
Precondition failed: Negative count not allowed: file /BuildRoot/Library/Caches/com.apple.xbs/Sources/swiftlang/swiftlang-900.0.74.1/src/swift/stdlib/public/core/StringLegacy.swift, line 49
Code:
String(repeating: "a", count: -1)
Thinking:
Well, it doesn't make sense to repeat some string a negative number of times. Since we have types in Swift, why not use a UInt?
Here we have some documentation about it.
Use UInt only when you specifically need an unsigned integer type with
the same size as the platform’s native word size. If this isn’t the
case, Int is preferred, even when the values to be stored are known to
be nonnegative. A consistent use of Int for integer values aids code
interoperability, avoids the need to convert between different number
types, and matches integer type inference, as described in Type Safety
and Type Inference.
Apple Docs
OK, so Int is preferred and the API is just following the rules, but why is the String API designed like that? Why isn't this initializer private, with a public one taking a UInt or something like that? Is there a "real" reason? Is this some "undefined behavior" kind of thing?
Also: https://forums.developer.apple.com/thread/98594
This isn't undefined behavior — in fact, a precondition indicates the exact opposite: an explicit check was made to ensure that the given count is positive.
As to why the parameter is an Int and not a UInt — this is a consequence of two decisions made early in the design of Swift:
1. Unlike C and Objective-C, Swift does not allow implicit (or even explicit) casting between integer types. You cannot pass an Int to a function that takes a UInt, and vice versa, nor will the following cast succeed: myInt as? UInt. Swift's preferred method of converting is using initializers: UInt(myInt)
2. Since Ints are more generally applicable than UInts, they would be the preferred integer type
As such, since converting between Ints and UInts can be cumbersome and verbose, the easiest way to interoperate between the largest number of APIs is to write them all in terms of the common integer currency type: Int. As the docs you quote mention, this "aids code interoperability, avoids the need to convert between different number types, and matches integer type inference"; trapping at runtime on invalid input is a tradeoff of this decision.
In fact, Int is so strongly ingrained in Swift that when Apple framework interfaces are imported into Swift from Objective-C, NSUInteger parameters and return types are converted to Int and not UInt, for significantly easier interoperability.
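To make the friction concrete, here is a minimal sketch (not from the thread itself) of the initializer-based conversions and the runtime precondition discussed above:
let count: Int = 3
let unsignedCount = UInt(count)       // explicit conversion; traps if count were negative
let backToInt = Int(unsignedCount)    // explicit conversion back; traps if the value didn't fit

let ok = String(repeating: "a", count: backToInt)   // "aaa"
// String(repeating: "a", count: -1)                // traps at runtime: "Negative count not allowed"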

What is the correct type for returning a C99 `bool` to Rust via the FFI?

A colleague and I have been scratching our heads over how to return a bool from <stdbool.h> (a.k.a. _Bool) back to Rust via the FFI.
We have our C99 code we want to use from Rust:
bool
myfunc(void) {
...
}
We let Rust know about myfunc using an extern C block:
extern "C" {
fn myfunc() -> T;
}
What concrete type should T be?
Rust doesn't have a c_bool in the libc crate, and if you search the internet, you will find various GitHub issues and RFCs where people discuss this, but don't really come to any consensus as to what is both correct and portable:
https://github.com/rust-lang/rfcs/issues/1982#issuecomment-297534238
https://github.com/rust-lang/rust/issues/14608
https://github.com/rust-lang/rfcs/issues/992
https://github.com/rust-lang/rust/pull/46156
As far as I can gather:
- The size of a bool in C99 is undefined, other than the fact that it must be at least large enough to store true (1) and false (0). In other words, at least one bit long.
- It could even be one bit wide.
- Its size might be ABI defined.
- This comment suggests that if a C99 bool is passed into a function as a parameter, or out of a function as the return value, and the bool is smaller than a C int, then it is promoted to the same size as an int. Under this scenario, we can tell Rust that T is u32.
All right, but what if (for some reason) a C99 bool is 64 bits wide? Is u32 still safe? Perhaps under this scenario we truncate the 4 most significant bytes, which would be fine, since the 4 least significant bytes are more than enough to represent true and false.
Is my reasoning correct? Until Rust gets a libc::c_bool, what would you use for T and why is it safe and portable for all possible sizes of a C99 bool (>=1 bit)?
As of 2018-02-01, the size of Rust's bool is officially the same as C's _Bool.
This means that bool is the correct type to use in FFI.
The rest of this answer applies to versions of Rust before the official decision was made
Until Rust gets a libc::c_bool, what would you use for T and why is it safe and portable for all possible sizes of a C99 bool (>=1 bit)?
As you've already linked to, the official answer is still "to be determined". That means that the only possibility that is guaranteed to be correct is: nothing.
That's right, as sad as it may be. The only truly safe thing would be to convert your bool to a known, fixed-size integral type, such as u8, for the purpose of FFI. That means you need to marshal it on both sides.
Practically, I'd keep using bool in my FFI code. As people have pointed out, it magically lines up on all the platforms that are in wide use at the moment. If the language decides to make bool FFI compatible, you are good to go. If they decide something else, I'd be highly surprised if they didn't introduce a lint to allow us to catch the errors quickly.
See also:
Is bool guaranteed to be 1 byte?
After a lot of thought, I'm going to try answering my own question. Please comment if you can find a hole in the following reasoning.
This is not the correct answer -- see the comments below
I think a Rust u8 is always safe for T.
We know that a C99 bool is an integer large enough to store 0 or 1, which means it's free to be an unsigned integer of at least 1 bit, or (if you are feeling weird) a signed integer of at least 2 bits.
Let's break it down by case:
1. If the C99 bool is 8 bits, then a Rust u8 is perfect. Even in the signed case, the top bit will be zero, since representing 0 and 1 never requires a negative power of two.
2. If the C99 bool is larger than a Rust u8, then by "casting it down" to an 8-bit size we only ever discard leading zeros, so this is safe too.
3. Now consider the case where the C99 bool is smaller than the Rust u8. When returning a value from a C function, it's not possible to return a value smaller than one byte, due to the underlying calling convention: the return value has to be loaded into a register or into a location on the stack. Since the smallest register or memory location is one byte, the return value will be extended (with zeros) to at least a one-byte value. (I believe the same is true of function arguments, which must also adhere to the calling convention.) If the value is extended to a one-byte value, then it's the same as case 1. If it's extended to a larger size, then it's the same as case 2.

Should I prefer to use specifically-sized Int (Int8 and Int16) in Swift?

I'm converting projects from Java to Swift. My Java code uses small data types (short, byte). Should I use the Int16 and Int8 equivalents in Swift, or just use the Int type for everything? Is there a memory or speed benefit to the smaller types?
Use Double and Int unless compelled by circumstances to do otherwise. The other types are all for compatibility with externalities.
For example, you have to use CGFloat to interchange with Core Graphics, and an occasional UIKit object requires a Float instead of a Double; and you might have to use Int8 for purposes of interchange with some C API or to deal with data downloaded from the network.
But Double and Int are the "natural" Swift types, and Swift numerics are very rigid and clumsy, so you should stick with those types wherever you can.
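As a minimal sketch of that advice (the Sensor type and encode function below are purely hypothetical), keep Int in your own model and convert to a fixed-width type only at the boundary that requires it:
struct Sensor {
    var reading: Int                       // the "natural" Swift type for general use
}

// A hypothetical wire format that requires a signed 8-bit value, e.g. for a C API or network payload.
func encode(_ sensor: Sensor) -> Int8? {
    return Int8(exactly: sensor.reading)   // returns nil instead of trapping if the value doesn't fit
}

let packed = encode(Sensor(reading: 100))     // Optional(100)
let tooBig = encode(Sensor(reading: 1_000))   // nil: 1000 doesn't fit in Int8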

What's the rationale of Swift's size methods taking `Int`s?

I've noticed that a lot of Swift built-ins take or return Ints and not UInts:
Here are some examples from Array:
mutating func reserveCapacity(minimumCapacity: Int)
var capacity: Int { get }
init(count: Int, repeatedValue: T)
mutating func removeAtIndex(index: Int) -> T
Given that the language is completely new, and assuming that this design choice was not arbitrary, I'm wondering: why do Swift built-ins take Ints and not UInts?
Some notes: I'm asking because I'm working on a few collections myself and I'm wondering what types I should use for things like reserveCapacity, etc. What I'd naturally expect is for reserveCapacity to take a UInt instead.
UInt is a common cause of bugs. It is very easy to accidentally generate a -1 and wind up with infinite loops or similar problems. (Many C and C++ programmers have learned the hard way that you really should just use int unless there's a need for unsigned.) Because of how Swift manages type conversion, this is even more important. If you've ever worked with NSUInteger with "signed-to-unsigned" warnings turned on (which are an error in Swift, not an optional warning like in C), you know what a pain that is.
The Swift Programming Guide explains their specific rationale in the section on UInt:
NOTE
Use UInt only when you specifically need an unsigned integer type with the same size as the platform’s native word size. If this is not the case, Int is preferred, even when the values to be stored are known to be non-negative. A consistent use of Int for integer values aids code interoperability, avoids the need to convert between different number types, and matches integer type inference, as described in Type Safety and Type Inference.
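A minimal sketch of the kind of bug being described (the array and loop are only illustrative): with UInt, arithmetic that dips below zero traps at runtime in Swift rather than quietly producing a -1, and indexing forces conversions anyway.
let items = ["a", "b"]
var i = UInt(items.count)
while i > 0 {
    i -= 1
    print(items[Int(i)])   // Array indices are Int, so a conversion back is needed
}
// A carelessly written countdown that decrements past 0 would trap with an
// arithmetic overflow error instead of yielding -1 and ending the loop.

// The Int version needs no conversions, and a stray -1 is just a testable value:
for j in stride(from: items.count - 1, through: 0, by: -1) {
    print(items[j])
}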
Here is a possible explanation (I am no expert on this subject): Suppose you have this code
let x = 3
test(t: x)
func test(t: Int) {
}
This will compile without a problem, since the type of 'x' is inferred to be Int.
However, if you change the function to
func test(t: UInt) {
}
The compiler will give you a build error (cannot convert value of type 'Int' to expected argument type 'UInt')
So my guess is that it is just for convenience, because Swift's type safety would otherwise require you to convert them manually each time.
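For your own collections, a minimal sketch (the Stack type below is just an example, not from the question) would follow the same convention: take Int and validate with a precondition, as the standard library does.
struct Stack<Element> {
    private var storage: [Element] = []

    mutating func reserveCapacity(_ minimumCapacity: Int) {
        precondition(minimumCapacity >= 0, "Negative capacity not allowed")
        storage.reserveCapacity(minimumCapacity)
    }

    mutating func push(_ element: Element) { storage.append(element) }
    mutating func pop() -> Element? { return storage.popLast() }
}

var stack = Stack<Int>()
stack.reserveCapacity(16)      // fine
// stack.reserveCapacity(-1)   // traps, mirroring the standard library's behavior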

Best practice for typedef of uint32

On a system where both long and int are 4 bytes, which is the best and why?
typedef unsigned long u32;
or
typedef unsigned int u32;
note: uint32_t is not an option
Nowadays every platform has stdint.h, or its C++ equivalent cstdint, which defines uint32_t. Please use the standard type rather than creating your own.
http://pubs.opengroup.org/onlinepubs/7999959899/basedefs/stdint.h.html
http://www.cplusplus.com/reference/cstdint/
http://msdn.microsoft.com/en-us/library/hh874765.aspx
The size will be the same for both, so it depends only on your use. (Note that neither type stores decimal values; both are integer types. For decimals you would need a floating-point type such as double.)
A better and more complete answer is here:
https://stackoverflow.com/questions/271076/what-is-the-difference-between-an-int-and-a-long-in-c/271132
Since you said the standard uint32_t is not an option: while long and int are both correct on 32-bit machines, I'd say
typedef unsigned int u32;
is a little better, because on the two popular 64-bit data models (LLP64 and LP64), int is still 32-bit, while long can be either 32-bit or 64-bit. See 64-bit data models.