I recently asked a question about generic enum/struct patterns in C and realized that, although I drew a comparison with what enums can do in e.g. Swift and Rust, I don't really understand how those languages handle them internally.
For Rust, I found a (rather roundabout) article titled Peeking inside a Rust enum. Look for the "But Rust enums aren't just that." heading and then keep scrolling until the "In Rust, it's called a discriminant." part. Eventually that part gets around to saying that Rust enums are roughly equivalent to something like this in C:
struct {
enum actual_options discriminant;
union {
/* … various data types/sub-structs corresponding to each option's need… */
};
};
Is it basically the same with a Swift enum under the hood? I.e. that I should expect an enum to have basically the same memory overhead as a struct of the largest possible option in my enum, plus at least one extra byte to store the tag/discriminant of the overarching case?
I'm also interested in what code gets generated to use whatever the underlying structure is. I'm assuming it can't really be much more fancy/optimized than what you'd do in C for the structure shown above? E.g.
struct raw_enum {
enum { case1, case2, case3 } tag;
union {
struct { int x; int y; } case1_data;
const char* case2_data;
struct { float a; double b; void* c; char d; } case3_data;
};
};
struct raw_enum d;
fill_in_some_value(&d);
if (d.tag == case1) {
// use `d.case1_data`…
} else if (d.tag == case2) {
// use `d.case2_data`…
} else if (d.tag == case3) {
// use `d.case3_data`…
} else {
// any runtime assertion for an unknown tag that could somehow sneak in???
}
Is that a reasonable approximation to what Swift does in the code it generates around enums?
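One way to probe this empirically is to ask the compiler for the sizes. The sketch below is only an experiment, not a statement about guaranteed layout: RawEnum and LargestPayload are hypothetical names mirroring the C raw_enum above, and the exact numbers printed are implementation details that can differ between platforms and compiler versions. It does show two things that hold in practice: the enum is at least as large as its largest payload, and Optional (itself an enum) adds no tag byte for pointers because nil reuses the null bit pattern.

```swift
// Hypothetical enum mirroring the C `raw_enum` from the question.
enum RawEnum {
    case case1(x: Int32, y: Int32)
    case case2(UnsafePointer<CChar>)
    case case3(a: Float, b: Double, c: UnsafeRawPointer, d: Int8)
}

// The largest payload on its own, for comparison.
typealias LargestPayload = (a: Float, b: Double, c: UnsafeRawPointer, d: Int8)

print("largest payload size:", MemoryLayout<LargestPayload>.size)
print("whole enum size:     ", MemoryLayout<RawEnum>.size)

// Optional<UnsafeRawPointer> needs no separate discriminant.
print("optional pointer size == pointer size:",
      MemoryLayout<UnsafeRawPointer?>.size == MemoryLayout<UnsafeRawPointer>.size)

// A `switch` over an enum must be exhaustive, so there is no
// "unknown tag" branch to write by hand.
func describe(_ value: RawEnum) -> String {
    switch value {
    case .case1(let x, let y): return "case1 \(x) \(y)"
    case .case2: return "case2"
    case .case3: return "case3"
    }
}
```

Compared with the C version, the notable difference at the source level is the missing else branch: the compiler checks that every case is handled.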
Related
I am converting a program written in Pascal to Swift and some Pascal features do not have direct Swift equivalents such as variant records and defining sets as types. A variant record in Pascal enables you to assign different field types to the same area of memory in a record. In other words, one particular location in a record could be either of type A or of type B. This can be useful in either/or cases, where a record can have either one field or the other field, but not both. What are the Swift equivalents for a variant record and a set type like setty in the Pascal fragment?
The Pascal code fragment to be converted is:
const
strglgth = 16;
sethigh = 47;
setlow = 0;
type
setty = set of setlow..sethigh;
cstclass = (reel,pset,strg);
csp = ^constant; (* pointer to constant type *)
constant = record case cclass: cstclass of
reel: (rval: packed array [1..strglgth] of char);
pset: (pval: setty);
strg: (slgth: 0..strglgth;
sval: packed array [1..strglgth] of char)
end;
var
lvp: csp
My partial Swift code is
let strglgth = 16
let sethigh = 47
let setlow = 0
enum cstclass : Int {case reel = 0, pset, strg}
var lvp: csp
Any advice is appreciated. Thanks in advance.
Variant records in Pascal are very similar to unions in C.
So this link will probably be helpful:
https://developer.apple.com/documentation/swift/imported_c_and_objective-c_apis/using_imported_c_structs_and_unions_in_swift
In case the link ever goes dead, here's the relevant example:
union SchroedingersCat {
bool isAlive;
bool isDead;
};
In Swift, it’s imported like this:
struct SchroedingersCat {
var isAlive: Bool { get set }
var isDead: Bool { get set }
init(isAlive: Bool)
init(isDead: Bool)
init()
}
That would be more like a functional port. It does not take care of the fact that variant records are actually meant to use the same piece of memory in different ways, so if you have low-level code that reads from a stream, or you have a pointer to such a structure, this might not help you.
In that case you might want to try just reserving some bytes, and write different getters/setters to access them. That would even work if you'd have to port more complex structures, like nested variant types.
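A minimal sketch of that idea, assuming a 4-byte region that is viewed either as a UInt32 or as a Float. RawVariant and its property names are made up for illustration; a real port would size the storage for the largest variant and add one accessor pair per field.

```swift
// Hypothetical type: one block of reserved bytes, several typed views on it.
struct RawVariant {
    // The shared memory area, here exactly 4 bytes.
    private var storage = [UInt8](repeating: 0, count: 4)

    // View the bytes as an unsigned 32-bit integer (native byte order).
    var asUInt32: UInt32 {
        get {
            var result: UInt32 = 0
            withUnsafeMutableBytes(of: &result) { $0.copyBytes(from: storage) }
            return result
        }
        set {
            storage = withUnsafeBytes(of: newValue) { Array($0) }
        }
    }

    // View the same bytes as a Float.
    var asFloat: Float {
        get { Float(bitPattern: asUInt32) }
        set { asUInt32 = newValue.bitPattern }
    }
}

var variant = RawVariant()
variant.asFloat = 1.0
print(String(variant.asUInt32, radix: 16))  // the IEEE 754 bit pattern of 1.0
```

Copying bytes in and out avoids any alignment requirements on the storage, at the cost of a small copy per access.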
But overall, if possible, I'd recommend avoiding porting such structures too literally, and using idioms that match Swift better.
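As a sketch of what a more idiomatic port could look like, assuming the memory-overlay behaviour is not needed: an enum with associated values can stand in for the variant record, a small bit-mask struct for the set type, and a class for the pointer. SetTy, Constant and ConstantRef are hypothetical names chosen to echo the Pascal fragment; they are not a definitive translation.

```swift
let strglgth = 16

// Stand-in for `setty = set of 0..47`: 48 members fit in a 64-bit mask.
struct SetTy: Equatable {
    static let range = 0...47
    private var bits: UInt64 = 0

    mutating func insert(_ member: Int) {
        precondition(SetTy.range.contains(member), "member out of range")
        bits |= UInt64(1) << UInt64(member)
    }

    func contains(_ member: Int) -> Bool {
        return SetTy.range.contains(member)
            && bits & (UInt64(1) << UInt64(member)) != 0
    }
}

// Stand-in for the variant record: the case plays the role of `cclass`.
enum Constant {
    case reel(String)   // rval
    case pset(SetTy)    // pval
    case strg(String)   // sval; slgth becomes the string's count
}

// Stand-in for `csp = ^constant`: a reference to a mutable constant.
final class ConstantRef {
    var value: Constant
    init(_ value: Constant) { self.value = value }
}

var members = SetTy()
members.insert(3)
let lvp = ConstantRef(.pset(members))
if case .pset(let stored) = lvp.value {
    print(stored.contains(3))
}
```

The 16-character limit from strglgth is not enforced here; it would have to be checked wherever a string case is constructed.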
Our project depends on a C library that declares a general struct
typedef struct
{
SomeType a_field;
char payload[248];
} GeneralStruct;
and a more specific one:
typedef struct
{
SomeType a_field;
OtherType other_field;
AnotherType another_field;
YetAnotherType yet_another_field;
} SpecificStruct;
We have some examples of its usage in C++ and in some cases it's needed to cast the general one to the specific one like:
GeneralStruct generalStruct = // ...
SpecificStruct specificStruct = reinterpret_cast<SpecificStruct&>(generalStruct);
Is there something like reinterpret_cast available in Swift? I guess I could read the bytes from payload manually, but I'm looking for an idiomatic way.
withMemoryRebound(to:capacity:_:) can be used
... when you have a pointer to memory bound to one type and you need to access that memory as instances of another type.
Example: Take the address of the general struct, then rebind and dereference the pointer:
let general = GeneralStruct()
let specific = withUnsafePointer(to: general) {
$0.withMemoryRebound(to: SpecificStruct.self, capacity: 1) {
$0.pointee
}
}
If both types have the same size and a compatible memory layout then you can also use unsafeBitCast(_:to:):
Use this function only to convert the instance passed as x to a layout-compatible type when conversion through other means is not possible.
Warning: Calling this function breaks the guarantees of the Swift type system; use with extreme care.
Example:
let specific = unsafeBitCast(general, to: SpecificStruct.self)
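Since unsafeBitCast traps at runtime when the two types differ in size, it can help to check the sizes first. The helper below is a hypothetical wrapper (reinterpret is not a standard library function), shown with Double and UInt64 because their layouts are known; it only checks size, so layout compatibility remains the caller's responsibility.

```swift
// Hypothetical helper: bit-cast only when the sizes match, otherwise nil.
func reinterpret<From, To>(_ value: From, as type: To.Type) -> To? {
    guard MemoryLayout<From>.size == MemoryLayout<To>.size else { return nil }
    return unsafeBitCast(value, to: To.self)
}

// Same size (8 bytes each): yields the IEEE 754 bit pattern of 1.0.
let bits = reinterpret(Double(1.0), as: UInt64.self)

// Different sizes (1 byte vs 8 bytes): yields nil instead of trapping.
let mismatch = reinterpret(UInt8(1), as: UInt64.self)
```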
I am making a structure that acts like a String, except that it only deals with Unicode UTF-32 scalar values. Thus, it is an array of UInt32. (See this question for more background.)
What I want to do
I want to be able to use my custom ScalarString struct as a key in a dictionary. For example:
var suffixDictionary = [ScalarString: ScalarString]() // Unicode key, rendered glyph value
// populate dictionary
suffixDictionary[keyScalarString] = valueScalarString
// ...
// check if dictionary contains Unicode scalar string key
if let renderedSuffix = suffixDictionary[unicodeScalarString] {
// do something with value
}
Problem
In order to do that, ScalarString needs to implement the Hashable Protocol. I thought I would be able to do something like this:
struct ScalarString: Hashable {
private var scalarArray: [UInt32] = []
var hashValue : Int {
get {
return self.scalarArray.hashValue // error
}
}
}
func ==(left: ScalarString, right: ScalarString) -> Bool {
return left.hashValue == right.hashValue
}
but then I discovered that Swift arrays don't have a hashValue.
What I read
The article Strategies for Implementing the Hashable Protocol in Swift had a lot of great ideas, but I didn't see any that seemed like they would work well in this case. Specifically,
Object property (array does not have a hashValue)
ID property (not sure how this could be implemented well)
Formula (seems like any formula for a string of 32 bit integers would be processor heavy and have lots of integer overflow)
ObjectIdentifier (I'm using a struct, not a class)
Inheriting from NSObject (I'm using a struct, not a class)
Here are some other things I read:
Implementing Swift's Hashable Protocol
Swift Comparison Protocols
Perfect hash function
Membership of custom objects in Swift Arrays and Dictionaries
How to implement Hashable for your custom class
Writing a good Hashable implementation in Swift
Question
Swift Strings have a hashValue property, so I know it is possible to do.
How would I create a hashValue for my custom structure?
Updates
Update 1: I would like to do something that does not involve converting to String and then using String's hashValue. My whole point for making my own structure was so that I could avoid doing lots of String conversions. String gets its hashValue from somewhere. It seems like I could get it using the same method.
Update 2: I've been looking into implementations of string hash code algorithms from other contexts. I'm having a little difficulty knowing which is best and expressing them in Swift, though.
Java hashCode algorithm
C algorithms
hash function for string (SO question and answers in C)
Hashing tutorial (Virginia Tech Algorithm Visualization Research Group)
General Purpose Hash Function Algorithms
Update 3
I would prefer not to import any external frameworks unless that is the recommended way to go for these things.
I submitted a possible solution using the DJB Hash Function.
Update
Martin R writes:
As of Swift 4.1, the compiler can synthesize Equatable and Hashable
conformance for types automatically, if all members conform to
Equatable/Hashable (SE-0185). And as of Swift 4.2, a high-quality hash
combiner is built into the Swift standard library (SE-0206).
Therefore there is no need anymore to define your own hashing
function, it suffices to declare the conformance:
struct ScalarString: Hashable, ... {
private var scalarArray: [UInt32] = []
// ...
}
Thus, the answer below needs to be rewritten (yet again). Until that happens refer to Martin R's answer from the link above.
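For reference, a minimal sketch of what that looks like with Swift 4.2 or later. The initializer and the sample scalar values are assumptions added so the example is self-contained; the question's struct may expose its storage differently.

```swift
struct ScalarString: Hashable {
    private var scalarArray: [UInt32] = []

    // Assumed convenience initializer, added for the example only.
    init(_ scalars: [UInt32]) {
        scalarArray = scalars
    }
    // No hashValue and no == needed: both are synthesized
    // because [UInt32] is itself Hashable.
}

var suffixDictionary = [ScalarString: ScalarString]()
let keyScalarString = ScalarString([0x1820, 0x1821])
suffixDictionary[keyScalarString] = ScalarString([0xE000])

if let renderedSuffix = suffixDictionary[ScalarString([0x1820, 0x1821])] {
    print("found:", renderedSuffix)
}
```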
Old Answer:
This answer has been completely rewritten after submitting my original answer to code review.
How to implement the Hashable protocol
The Hashable protocol allows you to use your custom class or struct as a dictionary key. In order to implement this protocol you need to
Implement the Equatable protocol (Hashable inherits from Equatable)
Return a computed hashValue
These points follow from the axiom given in the documentation:
x == y implies x.hashValue == y.hashValue
where x and y are values of some Type.
Implement the Equatable protocol
In order to implement the Equatable protocol, you define how your type uses the == (equivalence) operator. In your example, equivalence can be determined like this:
func ==(left: ScalarString, right: ScalarString) -> Bool {
return left.scalarArray == right.scalarArray
}
The == function is global so it goes outside of your class or struct.
Return a computed hashValue
Your custom class or struct must also have a computed hashValue variable. A good hash algorithm will provide a wide range of hash values. However, it should be noted that you do not need to guarantee that the hash values are all unique. When two different values have identical hash values, this is called a hash collision. It requires some extra work when there is a collision (which is why a good distribution is desirable), but some collisions are to be expected. As I understand it, the == function does that extra work. (Update: It looks like == may do all the work.)
There are a number of ways to calculate the hash value. For example, you could do something as simple as returning the number of elements in the array.
var hashValue: Int {
return self.scalarArray.count
}
This would give a hash collision every time two arrays had the same number of elements but different values. NSArray apparently uses this approach.
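A small sketch of why collisions cost time but not correctness. CountHashed is a hypothetical type that hashes only the element count; it uses the newer hash(into:) requirement (Swift 4.2+) rather than the hashValue property shown in this answer. Two different values collide, yet the dictionary still tells them apart because it falls back on ==.

```swift
// Hypothetical type with a deliberately weak hash: only the count is hashed.
struct CountHashed: Hashable {
    var scalars: [UInt32]

    static func == (left: CountHashed, right: CountHashed) -> Bool {
        return left.scalars == right.scalars
    }

    func hash(into hasher: inout Hasher) {
        hasher.combine(scalars.count)
    }
}

let first = CountHashed(scalars: [1, 2, 3])
let second = CountHashed(scalars: [4, 5, 6])

// Same count, so the hashes collide, but the values are not equal.
print(first.hashValue == second.hashValue, first == second)

// Both keys are stored and retrieved independently.
let table = [first: "first", second: "second"]
print(table[first] ?? "missing", table[second] ?? "missing")
```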
DJB Hash Function
A common hash function that works with strings is the DJB hash function. This is the one I will be using, but check out some others here.
A Swift implementation provided by @MartinR follows:
var hashValue: Int {
return self.scalarArray.reduce(5381) {
($0 << 5) &+ $0 &+ Int($1)
}
}
This is an improved version of my original implementation, but let me also include the older expanded form, which may be more readable for people not familiar with reduce. This is equivalent, I believe:
var hashValue: Int {
// DJB Hash Function
var hash = 5381
for scalar in self.scalarArray {
hash = ((hash << 5) &+ hash) &+ Int(scalar)
}
return hash
}
The &+ operator allows Int to overflow and start over again for long strings.
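To make the wrap-around behaviour concrete, here is the same computation as a free function (djb2 is just a name picked for this sketch), together with a direct demonstration of &+ at the edge of the Int range. The hash of a one-element array is 5381 * 33 plus that element.

```swift
// The DJB hash from above as a standalone function.
func djb2(_ scalars: [UInt32]) -> Int {
    return scalars.reduce(5381) { hash, scalar in
        ((hash << 5) &+ hash) &+ Int(scalar)
    }
}

// Plain `+` would trap here; `&+` wraps around to the minimum value.
let wrapped = Int.max &+ 1
print(wrapped == Int.min)

// A long input overflows many times without crashing.
let longInput = [UInt32](repeating: 0xFFFF_FFFF, count: 1_000)
print(djb2(longInput))
```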
Big Picture
We have looked at the pieces, but let me now show the whole example code as it relates to the Hashable protocol. ScalarString is the custom type from the question. This will be different for different people, of course.
// Declare conformance to the Hashable protocol after the class/struct name
struct ScalarString: Hashable {
private var scalarArray: [UInt32] = []
// required var for the Hashable protocol
var hashValue: Int {
// DJB hash function
return self.scalarArray.reduce(5381) {
($0 << 5) &+ $0 &+ Int($1)
}
}
}
// required function for the Equatable protocol, which Hashable inherits from
func ==(left: ScalarString, right: ScalarString) -> Bool {
return left.scalarArray == right.scalarArray
}
Other helpful reading
Which hashing algorithm is best for uniqueness and speed?
Overflow Operators
Why are 5381 and 33 so important in the djb2 algorithm?
How are hash collisions handled?
Credits
A big thanks to Martin R over in Code Review. My rewrite is largely based on his answer. If you found this helpful, then please give him an upvote.
Update
Swift is open source now so it is possible to see how hashValue is implemented for String from the source code. It appears to be more complex than the answer I have given here, and I have not taken the time to analyze it fully. Feel free to do so yourself.
Edit (31 May '17): Please refer to the accepted answer. This answer is pretty much just a demonstration on how to use the CommonCrypto Framework
Okay, I went ahead and extended all arrays with the Hashable protocol by using the SHA-256 hashing algorithm from the CommonCrypto framework. You have to put
#import <CommonCrypto/CommonDigest.h>
into your bridging header for this to work. It's a shame that pointers have to be used though:
extension Array : Hashable, Equatable {
public var hashValue : Int {
var hash = [Int](count: Int(CC_SHA256_DIGEST_LENGTH) / sizeof(Int), repeatedValue: 0)
withUnsafeBufferPointer { ptr in
hash.withUnsafeMutableBufferPointer { (inout hPtr: UnsafeMutableBufferPointer<Int>) -> Void in
CC_SHA256(UnsafePointer<Void>(ptr.baseAddress), CC_LONG(count * sizeof(Element)), UnsafeMutablePointer<UInt8>(hPtr.baseAddress))
}
}
return hash[0]
}
}
Edit (31 May '17): Don't do this, even though SHA256 has pretty much no hash collisions, it's the wrong idea to define equality by hash equality
public func ==<T>(lhs: [T], rhs: [T]) -> Bool {
return lhs.hashValue == rhs.hashValue
}
This is as good as it gets with CommonCrypto. It's ugly, but fast, and with pretty much no hash collisions for sure.
Edit (15 July '15): I just made some speed tests:
Hashing randomly filled Int arrays of size n took, averaged over 1000 runs:
n -> time
1000 -> 0.000037 s
10000 -> 0.000379 s
100000 -> 0.003402 s
Whereas with the string hashing method:
n -> time
1000 -> 0.001359 s
10000 -> 0.011036 s
100000 -> 0.122177 s
So the SHA-256 way is about 33 times faster than the string way. I'm not saying that using a string is a very good solution, but it's the only one we can compare it to right now
It is not a very elegant solution but it works nicely:
"\(scalarArray)".hashValue
or
scalarArray.description.hashValue
Which just uses the textual representation as a hash source
One suggestion - since you are modeling a String, would it work to convert your [UInt32] array to a String and use the String's hashValue? Like this:
var hashValue : Int {
get {
return String(self.scalarArray.map { UnicodeScalar($0) }).hashValue
}
}
That could conveniently allow you to compare your custom struct against Strings as well, though whether or not that is a good idea depends on what you are trying to do...
Note also that, using this approach, instances of ScalarString would have the same hashValue if their String representations were canonically equivalent, which may or may not be what you desire.
So I suppose that if you want the hashValue to represent a unique String, my approach would be good. If you want the hashValue to represent a unique sequence of UInt32 values, @Kametrixom's answer is the way to go...
In C++, one can introduce an alias reference as follows:
StructType & alias = lengthyExpresionThatEvaluatesToStuctType;
alias.anAttribute = value; // modify "anAttribute" on the original struct
Is there a similar syntactic sugar for manipulating a (value typed) struct in Swift?
Update 1: For example: Let's say the struct is contained in a dictionary of kind [String:StructType], and that I'd like to modify several attributes in the struct myDict["hello"]. I could make a temporary copy of that entry, modify the copy, and then copy the temporary struct back to the dictionary, as follows:
var temp = myDict["hello"]!
temp.anAttribute = 1
temp.anotherAttribute = "hej"
myDict["hello"] = temp
However, if my function has several exit points I would have to write myDict["hello"] = temp before each exit point, and it would therefore be more convenient if I could just introduce an alias (reference) for myDict["hello"], as follows:
var & alias = myDict["hello"]! // how to do this in swift ???
alias.anAttribute = 1
alias.anotherAttribute = "hej"
Update 2: Before down- or close-voting this question: Please look at Building Better Apps with Value Types in Swift (from WWDC15)!! Value types are an important feature of Swift! As you may know, Swift has borrowed several features from C++, and value types are maybe the most important feature of C++ (when C++ is compared to Java and such languages). When it comes to value types, C++ has some syntactic sugar, and my question is: Does Swift have a similar sugar hidden in its language? I am sure Swift will have, eventually... Please, do not close-vote this question if you do not understand it!
I have just read Deitel's book on Swift. While I'm not an expert (yet), I am not a complete novice. I am trying to use Swift as efficiently as possible!
Generally speaking, Swift doesn't allow reference semantics for value types, except when they are passed as function parameters declared inout. You can pass the struct to a function that takes it as an inout parameter (the language guide describes inout as copy-in copy-out, although the compiler may optimize this into passing the memory address directly). You can also capture variables in nested functions for similar semantics. In both cases you can return early from the mutating function, while still guaranteeing appropriate assignment. Here is a sample playground that I ran in Xcode 6.3.2 and Xcode 7-beta1:
//: Playground - noun: a place where people can play
import Foundation
var str = "Hello, playground"
struct Foo {
var value: Int
}
var d = ["nine": Foo(value: 9), "ten": Foo(value: 10)]
func doStuff(key: String) {
let myNewValue = Int(arc4random())
func doMutation(inout temp: Foo) {
temp.value = myNewValue
}
if d[key] != nil {
doMutation(&d[key]!)
}
}
doStuff("nine")
d // d["nine"] has changed... unless you're really lucky
// alternate approach without using inout
func doStuff2(key: String) {
if var temp = d[key] {
func updateValues() {
temp.value = Int(arc4random())
}
updateValues()
d[key] = temp
}
}
doStuff2("ten")
d // d["ten"] has changed
You don't have to make the doMutation function nested in your outer function; I just did that to demonstrate that you can capture values like myNewValue from the surrounding function, which might make implementation easier. updateValues, however, must be nested because it captures temp.
Despite the fact that this works, based on your sample code, I think that using a class here (possibly a final class if you are concerned about performance) is really more idiomatic imperative-flavored Swift.
You can, if you really want to, get a raw pointer using the standard library function withUnsafeMutablePointer. You can probably also chuck the value into an inner class that only has a single member. There are also functional-flavored approaches that might mitigate the early-return issue.
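For the specific "several exit points" concern, one option in current Swift is a defer block, which runs on every path out of the scope. The sketch below is an illustration under assumptions: StructType's fields follow the question, while configure and its stopEarly flag are hypothetical and exist only to force an early return.

```swift
struct StructType {
    var anAttribute = 0
    var anotherAttribute = ""
}

var myDict = ["hello": StructType()]

// Hypothetical function with more than one exit point.
func configure(_ dict: inout [String: StructType], key: String, stopEarly: Bool) {
    guard var temp = dict[key] else { return }
    // Written once; executed on every return below.
    defer { dict[key] = temp }

    temp.anAttribute = 1
    if stopEarly { return }
    temp.anotherAttribute = "hej"
}

configure(&myDict, key: "hello", stopEarly: true)

// For a single change, optional chaining mutates the entry in place.
myDict["hello"]?.anAttribute = 5
```

This is still a copy out and a copy back, not a true alias, but the write-back cannot be forgotten on an early return.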
MusicPlayer's API relies on variable length arrays as the last member of a struct to handle passing around data of unknown size. Looking at the generated interface for MusicPlayer, the structs used in this method present their last element in a single value tuple.
example:
struct MusicEventUserData {
var length: UInt32
var data: (UInt8)
}
I doubt that any of this has been officially exposed, but has anyone figured out whether this syntax is a red herring or actually significant? I don't think that there is a means to hand arbitrarily sized things around via Swift, but does this help when calling from C?
After testing in a playground, I can see there is no difference between the (Int) and Int types.
Here are my tests:
func testMethod(param1: Int, param2: (Int)) -> Int{
return param1 + param2
}
testMethod(2, 3) // return 5
testMethod(3, (6)) // return 9
As for calling from C, I think it is just a little bug in the bridging from ObjC to Swift.
MusicPlayer is no longer exported as above. As of Xcode 6.3b1 the declaration is:
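The same observation can be checked at the type level. This is only a quick sketch; the variable names are made up for the example.

```swift
let plain: Int = 6
// A parenthesized type is the same type, so this needs no conversion.
let parenthesized: (Int) = plain

// Compare the two metatypes directly.
print((Int).self == Int.self)
```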
typedef struct MusicEventUserData
{
UInt32 length;
UInt8 data[1];
} MusicEventUserData;
This is much closer to the C declaration. It still does not completely explain how to deal with the API in Swift, but that is another question.
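For what it's worth, the general pattern for a struct that ends in a one-element array is to allocate more memory than the struct's nominal size and treat the last field as the start of the payload. The sketch below is not the MusicPlayer API: UserDataHeader is a hypothetical Swift stand-in for such a struct, and roundTrip only demonstrates writing a payload behind the header and reading it back.

```swift
// Hypothetical stand-in for a C struct that ends in `UInt8 data[1]`.
struct UserDataHeader {
    var length: UInt32
    var data: UInt8   // first byte of the variable-length payload
}

// Writes `payload` behind a header in raw memory, then reads it back.
func roundTrip(_ payload: [UInt8]) -> [UInt8] {
    let lengthOffset = MemoryLayout<UserDataHeader>.offset(of: \UserDataHeader.length)!
    let dataOffset = MemoryLayout<UserDataHeader>.offset(of: \UserDataHeader.data)!

    // Room for the header plus however many payload bytes follow it.
    let byteCount = max(MemoryLayout<UserDataHeader>.size, dataOffset + payload.count)
    let raw = UnsafeMutableRawPointer.allocate(
        byteCount: byteCount,
        alignment: MemoryLayout<UserDataHeader>.alignment)
    defer { raw.deallocate() }

    raw.storeBytes(of: UInt32(payload.count), toByteOffset: lengthOffset, as: UInt32.self)
    payload.withUnsafeBytes { source in
        if let base = source.baseAddress {
            (raw + dataOffset).copyMemory(from: base, byteCount: source.count)
        }
    }

    // Read the length back, then view that many bytes starting at `data`.
    let storedLength = Int(raw.load(fromByteOffset: lengthOffset, as: UInt32.self))
    let bytes = UnsafeRawBufferPointer(start: UnsafeRawPointer(raw + dataOffset),
                                       count: storedLength)
    return Array(bytes)
}

print(roundTrip([10, 20, 30]))
```

With a struct imported from C the offsets come from the C layout, and the pointer would be handed to the C function instead of being read back.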